I encountered this question last October in one of the forums I’m frequent. The question was: given the table with the keywords, find all records from some table which would include all these words (Implementing search with AND keyword). From the first glance the problem looks trivial, but for me the solution was not very obvious and I used Borislav Borissov’s help in solving it.

I want to also note, that we can pass the list of keywords as a delimited list and split into table using any of the available splitting functions (you can explore this problem at Arrays and Lists in SQL Server blog)

The code bellow illustrates the problem and the solution I found.

T-SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
DECLARE @MyTable TABLE (Id int identity(1,1), Searched varchar(200))
DECLARE @Keys    TABLE (Word varchar(200), WordId int identity(1,1))
 
INSERT INTO @MyTable VALUES ('Mother Father Daughter Son')
INSERT INTO @MyTable VALUES ('Mother Daughter Son')
INSERT INTO @MyTable VALUES ('Mother Son')
INSERT INTO @MyTable VALUES ('Daughter Son')
INSERT INTO @MyTable VALUES ('Mother Father Son')
INSERT INTO @MyTable VALUES ('Son Daughter Father')
INSERT INTO @MyTable VALUES ('Mother Son')
INSERT INTO @MyTable VALUES ('Other Word')
INSERT INTO @MyTable VALUES ('Mother Father Daughter Brother Son')
INSERT INTO @MyTable VALUES ('Mother Daughter Son Stepdaughter')
INSERT INTO @MyTable VALUES ('Mother Son And Stepson and Daughter and Father and Grandfather')
INSERT INTO @MyTable VALUES ('Daughter Son Family')
INSERT INTO @MyTable VALUES ('Mother Brother Father Son Orphan')
INSERT INTO @MyTable VALUES ('Son or Daughter or Father')
INSERT INTO @MyTable VALUES ('Mother And Son')
INSERT INTO @MyTable VALUES ('Other Word One More')
 
INSERT INTO @Keys VALUES ('Mother')
INSERT INTO @Keys VALUES ('Father')
INSERT INTO @Keys VALUES ('Son')
INSERT INTO @Keys VALUES ('Daughter')
 
DECLARE @nAllWords int
SELECT @nAllWords = COUNT(*) FROM @Keys
 
SELECT MyTable.*
FROM @MyTable MyTable
INNER JOIN (SELECT MyTable.Id
                   FROM @MyTable MyTable
            INNER JOIN @Keys KeyWords ON ' ' + MyTable.Searched + ' ' LIKE '% ' + KeyWords.Word  + ' %'
            GROUP BY MyTable.Id
            HAVING COUNT(Distinct(KeyWords.Word)) = @nAllWords) Tbl1 ON MyTable.Id = Tbl1.Id
DECLARE @MyTable TABLE (Id int identity(1,1), Searched varchar(200))
DECLARE @Keys    TABLE (Word varchar(200), WordId int identity(1,1))

INSERT INTO @MyTable VALUES ('Mother Father Daughter Son')
INSERT INTO @MyTable VALUES ('Mother Daughter Son')
INSERT INTO @MyTable VALUES ('Mother Son')
INSERT INTO @MyTable VALUES ('Daughter Son')
INSERT INTO @MyTable VALUES ('Mother Father Son')
INSERT INTO @MyTable VALUES ('Son Daughter Father')
INSERT INTO @MyTable VALUES ('Mother Son')
INSERT INTO @MyTable VALUES ('Other Word')
INSERT INTO @MyTable VALUES ('Mother Father Daughter Brother Son')
INSERT INTO @MyTable VALUES ('Mother Daughter Son Stepdaughter')
INSERT INTO @MyTable VALUES ('Mother Son And Stepson and Daughter and Father and Grandfather')
INSERT INTO @MyTable VALUES ('Daughter Son Family')
INSERT INTO @MyTable VALUES ('Mother Brother Father Son Orphan')
INSERT INTO @MyTable VALUES ('Son or Daughter or Father')
INSERT INTO @MyTable VALUES ('Mother And Son')
INSERT INTO @MyTable VALUES ('Other Word One More')

INSERT INTO @Keys VALUES ('Mother')
INSERT INTO @Keys VALUES ('Father')
INSERT INTO @Keys VALUES ('Son')
INSERT INTO @Keys VALUES ('Daughter')

DECLARE @nAllWords int
SELECT @nAllWords = COUNT(*) FROM @Keys

SELECT MyTable.*
FROM @MyTable MyTable
INNER JOIN (SELECT MyTable.Id
                   FROM @MyTable MyTable
            INNER JOIN @Keys KeyWords ON ' ' + MyTable.Searched + ' ' LIKE '% ' + KeyWords.Word  + ' %'
            GROUP BY MyTable.Id
            HAVING COUNT(Distinct(KeyWords.Word)) = @nAllWords) Tbl1 ON MyTable.Id = Tbl1.Id

The above is the original test case and the solution we implemented with Borislav.

Based on Nikola’s comments I re-worked the test case. The last suggestion performs the best. We can get further improvement if we include ‘% ‘ in our Keys and WordsToExclude tables. For the explanation of the last Nikola’s query see http://msdn.microsoft.com/en-us/library/ms178543.aspx

See also this blog ALL, ANY and SOME: The Three Stooges

T-SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
USE [AllTests]
GO
 
/****** Object:  Table [dbo].[RealTest]    Script Date: 08/19/2009 19:39:24 ******/
SET ANSI_NULLS ON
GO
 
SET QUOTED_IDENTIFIER ON
GO
 
SET ANSI_PADDING ON
GO
 
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[RealTest](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Searched] [varchar](200) NULL,
 CONSTRAINT [PK_RealTest] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
END
GO
 
--ALTER TABLE [dbo].[RealTest] DISABLE CHANGE_TRACKING 
GO
 
SET ANSI_PADDING OFF
GO
 
USE [AllTests]
GO
 
USE [AllTests]
GO
 
/****** Object:  Index [PK_RealTest]    Script Date: 08/19/2009 19:53:06 ******/
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND name = N'PK_RealTest')
ALTER TABLE [dbo].[RealTest] ADD  CONSTRAINT [PK_RealTest] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
 
 
/****** Object:  Index [IX_RealTest]    Script Date: 08/19/2009 19:52:29 ******/
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND name = N'IX_RealTest')
CREATE NONCLUSTERED INDEX [IX_RealTest] ON [dbo].[RealTest] 
(
    [Searched] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
 
 
 
 
DECLARE @Keys    TABLE (Word VARCHAR(200), WordId INT IDENTITY(1,1))
DECLARE @WordsToExclude TABLE (Word VARCHAR(200), WordID INT IDENTITY(1,1))
DECLARE @I INT = 1
SET NOCOUNT ON
select COUNT(*) from RealTest  
WHILE @I<10000 
BEGIN
INSERT INTO RealTest  VALUES ('Mother Father Daughter Son iteration' + CAST(@i AS VARCHAR(10)))
INSERT INTO RealTest VALUES ('Mother Daughter Son iteration' + CAST(@i AS VARCHAR(10)))
INSERT INTO RealTest VALUES ('Mother Son iteration' + CAST(@i AS VARCHAR(10)))
INSERT INTO RealTest VALUES ('Daughter Son')
INSERT INTO RealTest VALUES ('Mother Father Son')
INSERT INTO RealTest VALUES ('Son Daughter Father')
INSERT INTO RealTest VALUES ('Mother Son')
INSERT INTO RealTest VALUES ('Other Word')
INSERT INTO RealTest VALUES ('Mother Father Daughter Brother Son')
INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Brother Son Orphan')
INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Brother Son Orphan')
INSERT INTO RealTest VALUES ('MotherFatherDaughterBrotherSon')
INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Son Stepdaughter')
INSERT INTO RealTest VALUES ('Brother Mother Father Daughter Son Stepdaughter')
INSERT INTO RealTest VALUES ('Mother Son And Stepson and Daughter and Father and Grandfather')
INSERT INTO RealTest VALUES ('Daughter Son Family')
INSERT INTO RealTest VALUES ('Mother Brother Father Daughter Son Orphan')
INSERT INTO RealTest VALUES ('Son or Daughter or Father')
INSERT INTO RealTest VALUES ('Mother And Son')
INSERT INTO RealTest VALUES ('Other Word One More')
 
SET @i = @i +1
END
 
INSERT INTO @Keys VALUES ('Mother')
INSERT INTO @Keys VALUES ('Father')
INSERT INTO @Keys VALUES ('Son')
INSERT INTO @Keys VALUES ('Daughter')
 
INSERT INTO @WordsToExclude VALUES ('Exclude') 
INSERT INTO @WordsToExclude VALUES ('Stepdaughter') 
 
SELECT COUNT(*) FROM RealTest 
 
SET STATISTICS TIME ON 
DECLARE @nAllWords INT
SELECT @nAllWords = COUNT(*) FROM @Keys
 
SELECT MyTable.*
FROM RealTest MyTable
INNER JOIN (SELECT MyTable.Id
                   FROM RealTest MyTable
            INNER JOIN @Keys KeyWords ON ' ' + MyTable.Searched + ' ' LIKE '% ' + KeyWords.Word  + ' %'
            WHERE not exists (SELECT 1 FROM RealTest M INNER JOIN @WordsToExclude W ON ' ' + M.Searched + ' ' LIKE '%' + W.Word + ' %' and M.Id = MyTable.Id)
            GROUP BY MyTable.Id
            HAVING COUNT(DISTINCT(KeyWords.Word)) = @nAllWords) Tbl1 ON MyTable.Id = Tbl1.Id --order by MyTable.Id 
 
PRINT 'Regular INNER JOIN ' +  CAST(@@ROWCOUNT AS VARCHAR(10))
            
SELECT x.Id,
       x.Searched
  FROM ( SELECT m.Id,
                m.Searched, 
                (SELECT COUNT(*) 
                   FROM @Keys k 
                  WHERE ' '+ m.Searched+' ' like '% ' + k.Word + ' %'
                ) AS n, (SELECT COUNT(*) FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %') AS WE
           FROM RealTest m
       ) AS x
 WHERE n=@nAllWords and isnull(WE,0) = 0 --order by x.Id 
 
PRINT 'Subquery solution ' +  CAST(@@ROWCOUNT AS VARCHAR(10))        
 
SELECT x.Id,
       x.Searched
  FROM ( SELECT m.Id,
                m.Searched,
                ( SELECT COUNT(*)
                    FROM @Keys k
                   WHERE ' '+ m.Searched+' ' like '% ' + k.Word + ' %'
                ) AS n
           FROM RealTest m
          WHERE not exists ( SELECT *
                               FROM @WordsToExclude W
                              WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %'
                           )
       ) AS x
 WHERE n=@nAllWords
 
PRINT 'Optimized Nikola''s  solution #1 ' +  CAST(@@ROWCOUNT AS VARCHAR(10))        
 
SELECT m.Id, m.Searched
FROM RealTest m
WHERE not exists (SELECT * FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %')
and 1 = ALL ( SELECT Case When ' ' + m.Searched + ' ' like '% ' + k.Word + ' %' Then 1 Else 0 End FROM @Keys k)
 
PRINT 'Optimized Nikola''s  solution #2 ' +  CAST(@@ROWCOUNT AS VARCHAR(10))        
 
    
SET STATISTICS TIME OFF
USE [AllTests]
GO

/****** Object:  Table [dbo].[RealTest]    Script Date: 08/19/2009 19:39:24 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[RealTest](
	[Id] [int] IDENTITY(1,1) NOT NULL,
	[Searched] [varchar](200) NULL,
 CONSTRAINT [PK_RealTest] PRIMARY KEY CLUSTERED 
(
	[Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
END
GO

--ALTER TABLE [dbo].[RealTest] DISABLE CHANGE_TRACKING 
GO

SET ANSI_PADDING OFF
GO

USE [AllTests]
GO

USE [AllTests]
GO

/****** Object:  Index [PK_RealTest]    Script Date: 08/19/2009 19:53:06 ******/
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND name = N'PK_RealTest')
ALTER TABLE [dbo].[RealTest] ADD  CONSTRAINT [PK_RealTest] PRIMARY KEY CLUSTERED 
(
	[Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO


/****** Object:  Index [IX_RealTest]    Script Date: 08/19/2009 19:52:29 ******/
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND name = N'IX_RealTest')
CREATE NONCLUSTERED INDEX [IX_RealTest] ON [dbo].[RealTest] 
(
	[Searched] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO




DECLARE @Keys    TABLE (Word VARCHAR(200), WordId INT IDENTITY(1,1))
DECLARE @WordsToExclude TABLE (Word VARCHAR(200), WordID INT IDENTITY(1,1))
DECLARE @I INT = 1
SET NOCOUNT ON
select COUNT(*) from RealTest  
WHILE @I<10000 
BEGIN
INSERT INTO RealTest  VALUES ('Mother Father Daughter Son iteration' + CAST(@i AS VARCHAR(10)))
INSERT INTO RealTest VALUES ('Mother Daughter Son iteration' + CAST(@i AS VARCHAR(10)))
INSERT INTO RealTest VALUES ('Mother Son iteration' + CAST(@i AS VARCHAR(10)))
INSERT INTO RealTest VALUES ('Daughter Son')
INSERT INTO RealTest VALUES ('Mother Father Son')
INSERT INTO RealTest VALUES ('Son Daughter Father')
INSERT INTO RealTest VALUES ('Mother Son')
INSERT INTO RealTest VALUES ('Other Word')
INSERT INTO RealTest VALUES ('Mother Father Daughter Brother Son')
INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Brother Son Orphan')
INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Brother Son Orphan')
INSERT INTO RealTest VALUES ('MotherFatherDaughterBrotherSon')
INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Son Stepdaughter')
INSERT INTO RealTest VALUES ('Brother Mother Father Daughter Son Stepdaughter')
INSERT INTO RealTest VALUES ('Mother Son And Stepson and Daughter and Father and Grandfather')
INSERT INTO RealTest VALUES ('Daughter Son Family')
INSERT INTO RealTest VALUES ('Mother Brother Father Daughter Son Orphan')
INSERT INTO RealTest VALUES ('Son or Daughter or Father')
INSERT INTO RealTest VALUES ('Mother And Son')
INSERT INTO RealTest VALUES ('Other Word One More')
 
SET @i = @i +1
END
 
INSERT INTO @Keys VALUES ('Mother')
INSERT INTO @Keys VALUES ('Father')
INSERT INTO @Keys VALUES ('Son')
INSERT INTO @Keys VALUES ('Daughter')
 
INSERT INTO @WordsToExclude VALUES ('Exclude') 
INSERT INTO @WordsToExclude VALUES ('Stepdaughter') 
 
SELECT COUNT(*) FROM RealTest 
 
SET STATISTICS TIME ON 
DECLARE @nAllWords INT
SELECT @nAllWords = COUNT(*) FROM @Keys
 
SELECT MyTable.*
FROM RealTest MyTable
INNER JOIN (SELECT MyTable.Id
                   FROM RealTest MyTable
            INNER JOIN @Keys KeyWords ON ' ' + MyTable.Searched + ' ' LIKE '% ' + KeyWords.Word  + ' %'
            WHERE not exists (SELECT 1 FROM RealTest M INNER JOIN @WordsToExclude W ON ' ' + M.Searched + ' ' LIKE '%' + W.Word + ' %' and M.Id = MyTable.Id)
            GROUP BY MyTable.Id
            HAVING COUNT(DISTINCT(KeyWords.Word)) = @nAllWords) Tbl1 ON MyTable.Id = Tbl1.Id --order by MyTable.Id 
 
PRINT 'Regular INNER JOIN ' +  CAST(@@ROWCOUNT AS VARCHAR(10))
            
SELECT x.Id,
       x.Searched
  FROM ( SELECT m.Id,
                m.Searched, 
                (SELECT COUNT(*) 
                   FROM @Keys k 
                  WHERE ' '+ m.Searched+' ' like '% ' + k.Word + ' %'
                ) AS n, (SELECT COUNT(*) FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %') AS WE
           FROM RealTest m
       ) AS x
 WHERE n=@nAllWords and isnull(WE,0) = 0 --order by x.Id 
 
PRINT 'Subquery solution ' +  CAST(@@ROWCOUNT AS VARCHAR(10))        

SELECT x.Id,
       x.Searched
  FROM ( SELECT m.Id,
                m.Searched,
                ( SELECT COUNT(*)
                    FROM @Keys k
                   WHERE ' '+ m.Searched+' ' like '% ' + k.Word + ' %'
                ) AS n
           FROM RealTest m
          WHERE not exists ( SELECT *
                               FROM @WordsToExclude W
                              WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %'
                           )
       ) AS x
 WHERE n=@nAllWords

PRINT 'Optimized Nikola''s  solution #1 ' +  CAST(@@ROWCOUNT AS VARCHAR(10))        

SELECT m.Id, m.Searched
FROM RealTest m
WHERE not exists (SELECT * FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %')
and 1 = ALL ( SELECT Case When ' ' + m.Searched + ' ' like '% ' + k.Word + ' %' Then 1 Else 0 End FROM @Keys k)

PRINT 'Optimized Nikola''s  solution #2 ' +  CAST(@@ROWCOUNT AS VARCHAR(10))        

    
SET STATISTICS TIME OFF

with the result on my PC
SQL Server Execution Times:

SQL Server Execution Times:
CPU time = 6599 ms, elapsed time = 9802 ms.
Regular INNER JOIN 79992

SQL Server Execution Times:
CPU time = 4961 ms, elapsed time = 6697 ms.
Subquery solution 79992

SQL Server Execution Times:
CPU time = 5366 ms, elapsed time = 6716 ms.
Optimized Nikola's solution #1 79992

SQL Server Execution Times:
CPU time = 2901 ms, elapsed time = 3670 ms.
Optimized Nikola's solution #2 79992

*** Remember, if you have a SQL related question, try our Microsoft SQL Server Programming forum or our Microsoft SQL Server Admin forum