I encountered this question last October in one of the forums I’m frequent. The question was: given the table with the keywords, find all records from some table which would include all these words (Implementing search with AND keyword). From the first glance the problem looks trivial, but for me the solution was not very obvious and I used Borislav Borissov’s help in solving it.
I want to also note, that we can pass the list of keywords as a delimited list and split into table using any of the available splitting functions (you can explore this problem at Arrays and Lists in SQL Server blog)
The code bellow illustrates the problem and the solution I found.
T-SQL | |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 | DECLARE @MyTable TABLE (Id int identity(1,1), Searched varchar(200)) DECLARE @Keys TABLE (Word varchar(200), WordId int identity(1,1)) INSERT INTO @MyTable VALUES ('Mother Father Daughter Son') INSERT INTO @MyTable VALUES ('Mother Daughter Son') INSERT INTO @MyTable VALUES ('Mother Son') INSERT INTO @MyTable VALUES ('Daughter Son') INSERT INTO @MyTable VALUES ('Mother Father Son') INSERT INTO @MyTable VALUES ('Son Daughter Father') INSERT INTO @MyTable VALUES ('Mother Son') INSERT INTO @MyTable VALUES ('Other Word') INSERT INTO @MyTable VALUES ('Mother Father Daughter Brother Son') INSERT INTO @MyTable VALUES ('Mother Daughter Son Stepdaughter') INSERT INTO @MyTable VALUES ('Mother Son And Stepson and Daughter and Father and Grandfather') INSERT INTO @MyTable VALUES ('Daughter Son Family') INSERT INTO @MyTable VALUES ('Mother Brother Father Son Orphan') INSERT INTO @MyTable VALUES ('Son or Daughter or Father') INSERT INTO @MyTable VALUES ('Mother And Son') INSERT INTO @MyTable VALUES ('Other Word One More') INSERT INTO @Keys VALUES ('Mother') INSERT INTO @Keys VALUES ('Father') INSERT INTO @Keys VALUES ('Son') INSERT INTO @Keys VALUES ('Daughter') DECLARE @nAllWords int SELECT @nAllWords = COUNT(*) FROM @Keys SELECT MyTable.* FROM @MyTable MyTable INNER JOIN (SELECT MyTable.Id FROM @MyTable MyTable INNER JOIN @Keys KeyWords ON ' ' + MyTable.Searched + ' ' LIKE '% ' + KeyWords.Word + ' %' GROUP BY MyTable.Id HAVING COUNT(Distinct(KeyWords.Word)) = @nAllWords) Tbl1 ON MyTable.Id = Tbl1.Id |
DECLARE @MyTable TABLE (Id int identity(1,1), Searched varchar(200)) DECLARE @Keys TABLE (Word varchar(200), WordId int identity(1,1)) INSERT INTO @MyTable VALUES ('Mother Father Daughter Son') INSERT INTO @MyTable VALUES ('Mother Daughter Son') INSERT INTO @MyTable VALUES ('Mother Son') INSERT INTO @MyTable VALUES ('Daughter Son') INSERT INTO @MyTable VALUES ('Mother Father Son') INSERT INTO @MyTable VALUES ('Son Daughter Father') INSERT INTO @MyTable VALUES ('Mother Son') INSERT INTO @MyTable VALUES ('Other Word') INSERT INTO @MyTable VALUES ('Mother Father Daughter Brother Son') INSERT INTO @MyTable VALUES ('Mother Daughter Son Stepdaughter') INSERT INTO @MyTable VALUES ('Mother Son And Stepson and Daughter and Father and Grandfather') INSERT INTO @MyTable VALUES ('Daughter Son Family') INSERT INTO @MyTable VALUES ('Mother Brother Father Son Orphan') INSERT INTO @MyTable VALUES ('Son or Daughter or Father') INSERT INTO @MyTable VALUES ('Mother And Son') INSERT INTO @MyTable VALUES ('Other Word One More') INSERT INTO @Keys VALUES ('Mother') INSERT INTO @Keys VALUES ('Father') INSERT INTO @Keys VALUES ('Son') INSERT INTO @Keys VALUES ('Daughter') DECLARE @nAllWords int SELECT @nAllWords = COUNT(*) FROM @Keys SELECT MyTable.* FROM @MyTable MyTable INNER JOIN (SELECT MyTable.Id FROM @MyTable MyTable INNER JOIN @Keys KeyWords ON ' ' + MyTable.Searched + ' ' LIKE '% ' + KeyWords.Word + ' %' GROUP BY MyTable.Id HAVING COUNT(Distinct(KeyWords.Word)) = @nAllWords) Tbl1 ON MyTable.Id = Tbl1.Id
The above is the original test case and the solution we implemented with Borislav.
Based on Nikola’s comments I re-worked the test case. The last suggestion performs the best. We can get further improvement if we include ‘% ‘ in our Keys and WordsToExclude tables. For the explanation of the last Nikola’s query see http://msdn.microsoft.com/en-us/library/ms178543.aspx
See also this blog ALL, ANY and SOME: The Three Stooges
T-SQL | |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 | USE [AllTests] GO /****** Object: Table [dbo].[RealTest] Script Date: 08/19/2009 19:39:24 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO SET ANSI_PADDING ON GO IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND type in (N'U')) BEGIN CREATE TABLE [dbo].[RealTest]( [Id] [int] IDENTITY(1,1) NOT NULL, [Searched] [varchar](200) NULL, CONSTRAINT [PK_RealTest] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] END GO --ALTER TABLE [dbo].[RealTest] DISABLE CHANGE_TRACKING GO SET ANSI_PADDING OFF GO USE [AllTests] GO USE [AllTests] GO /****** Object: Index [PK_RealTest] Script Date: 08/19/2009 19:53:06 ******/ IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND name = N'PK_RealTest') ALTER TABLE [dbo].[RealTest] ADD CONSTRAINT [PK_RealTest] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO /****** Object: Index [IX_RealTest] Script Date: 08/19/2009 19:52:29 ******/ IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND name = N'IX_RealTest') CREATE NONCLUSTERED INDEX [IX_RealTest] ON [dbo].[RealTest] ( [Searched] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO DECLARE @Keys TABLE (Word VARCHAR(200), WordId INT IDENTITY(1,1)) DECLARE @WordsToExclude TABLE (Word VARCHAR(200), WordID INT IDENTITY(1,1)) DECLARE @I INT = 1 SET NOCOUNT ON select COUNT(*) from RealTest WHILE @I<10000 BEGIN INSERT INTO RealTest VALUES ('Mother Father Daughter Son iteration' + CAST(@i AS VARCHAR(10))) INSERT INTO RealTest VALUES ('Mother Daughter Son iteration' + CAST(@i AS VARCHAR(10))) INSERT INTO RealTest VALUES ('Mother Son iteration' + CAST(@i AS VARCHAR(10))) INSERT INTO RealTest VALUES ('Daughter Son') INSERT INTO RealTest VALUES ('Mother Father Son') INSERT INTO RealTest VALUES ('Son Daughter Father') INSERT INTO RealTest VALUES ('Mother Son') INSERT INTO RealTest VALUES ('Other Word') INSERT INTO RealTest VALUES ('Mother Father Daughter Brother Son') INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Brother Son Orphan') INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Brother Son Orphan') INSERT INTO RealTest VALUES ('MotherFatherDaughterBrotherSon') INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Son Stepdaughter') INSERT INTO RealTest VALUES ('Brother Mother Father Daughter Son Stepdaughter') INSERT INTO RealTest VALUES ('Mother Son And Stepson and Daughter and Father and Grandfather') INSERT INTO RealTest VALUES ('Daughter Son Family') INSERT INTO RealTest VALUES ('Mother Brother Father Daughter Son Orphan') INSERT INTO RealTest VALUES ('Son or Daughter or Father') INSERT INTO RealTest VALUES ('Mother And Son') INSERT INTO RealTest VALUES ('Other Word One More') SET @i = @i +1 END INSERT INTO @Keys VALUES ('Mother') INSERT INTO @Keys VALUES ('Father') INSERT INTO @Keys VALUES ('Son') INSERT INTO @Keys VALUES ('Daughter') INSERT INTO @WordsToExclude VALUES ('Exclude') INSERT INTO @WordsToExclude VALUES ('Stepdaughter') SELECT COUNT(*) FROM RealTest SET STATISTICS TIME ON DECLARE @nAllWords INT SELECT @nAllWords = COUNT(*) FROM @Keys SELECT MyTable.* FROM RealTest MyTable INNER JOIN (SELECT MyTable.Id FROM RealTest MyTable INNER JOIN @Keys KeyWords ON ' ' + MyTable.Searched + ' ' LIKE '% ' + KeyWords.Word + ' %' WHERE not exists (SELECT 1 FROM RealTest M INNER JOIN @WordsToExclude W ON ' ' + M.Searched + ' ' LIKE '%' + W.Word + ' %' and M.Id = MyTable.Id) GROUP BY MyTable.Id HAVING COUNT(DISTINCT(KeyWords.Word)) = @nAllWords) Tbl1 ON MyTable.Id = Tbl1.Id --order by MyTable.Id PRINT 'Regular INNER JOIN ' + CAST(@@ROWCOUNT AS VARCHAR(10)) SELECT x.Id, x.Searched FROM ( SELECT m.Id, m.Searched, (SELECT COUNT(*) FROM @Keys k WHERE ' '+ m.Searched+' ' like '% ' + k.Word + ' %' ) AS n, (SELECT COUNT(*) FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %') AS WE FROM RealTest m ) AS x WHERE n=@nAllWords and isnull(WE,0) = 0 --order by x.Id PRINT 'Subquery solution ' + CAST(@@ROWCOUNT AS VARCHAR(10)) SELECT x.Id, x.Searched FROM ( SELECT m.Id, m.Searched, ( SELECT COUNT(*) FROM @Keys k WHERE ' '+ m.Searched+' ' like '% ' + k.Word + ' %' ) AS n FROM RealTest m WHERE not exists ( SELECT * FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %' ) ) AS x WHERE n=@nAllWords PRINT 'Optimized Nikola''s solution #1 ' + CAST(@@ROWCOUNT AS VARCHAR(10)) SELECT m.Id, m.Searched FROM RealTest m WHERE not exists (SELECT * FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %') and 1 = ALL ( SELECT Case When ' ' + m.Searched + ' ' like '% ' + k.Word + ' %' Then 1 Else 0 End FROM @Keys k) PRINT 'Optimized Nikola''s solution #2 ' + CAST(@@ROWCOUNT AS VARCHAR(10)) SET STATISTICS TIME OFF |
USE [AllTests] GO /****** Object: Table [dbo].[RealTest] Script Date: 08/19/2009 19:39:24 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO SET ANSI_PADDING ON GO IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND type in (N'U')) BEGIN CREATE TABLE [dbo].[RealTest]( [Id] [int] IDENTITY(1,1) NOT NULL, [Searched] [varchar](200) NULL, CONSTRAINT [PK_RealTest] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] END GO --ALTER TABLE [dbo].[RealTest] DISABLE CHANGE_TRACKING GO SET ANSI_PADDING OFF GO USE [AllTests] GO USE [AllTests] GO /****** Object: Index [PK_RealTest] Script Date: 08/19/2009 19:53:06 ******/ IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND name = N'PK_RealTest') ALTER TABLE [dbo].[RealTest] ADD CONSTRAINT [PK_RealTest] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO /****** Object: Index [IX_RealTest] Script Date: 08/19/2009 19:52:29 ******/ IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RealTest]') AND name = N'IX_RealTest') CREATE NONCLUSTERED INDEX [IX_RealTest] ON [dbo].[RealTest] ( [Searched] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO DECLARE @Keys TABLE (Word VARCHAR(200), WordId INT IDENTITY(1,1)) DECLARE @WordsToExclude TABLE (Word VARCHAR(200), WordID INT IDENTITY(1,1)) DECLARE @I INT = 1 SET NOCOUNT ON select COUNT(*) from RealTest WHILE @I<10000 BEGIN INSERT INTO RealTest VALUES ('Mother Father Daughter Son iteration' + CAST(@i AS VARCHAR(10))) INSERT INTO RealTest VALUES ('Mother Daughter Son iteration' + CAST(@i AS VARCHAR(10))) INSERT INTO RealTest VALUES ('Mother Son iteration' + CAST(@i AS VARCHAR(10))) INSERT INTO RealTest VALUES ('Daughter Son') INSERT INTO RealTest VALUES ('Mother Father Son') INSERT INTO RealTest VALUES ('Son Daughter Father') INSERT INTO RealTest VALUES ('Mother Son') INSERT INTO RealTest VALUES ('Other Word') INSERT INTO RealTest VALUES ('Mother Father Daughter Brother Son') INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Brother Son Orphan') INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Brother Son Orphan') INSERT INTO RealTest VALUES ('MotherFatherDaughterBrotherSon') INSERT INTO RealTest VALUES ('Exclude Mother Father Daughter Son Stepdaughter') INSERT INTO RealTest VALUES ('Brother Mother Father Daughter Son Stepdaughter') INSERT INTO RealTest VALUES ('Mother Son And Stepson and Daughter and Father and Grandfather') INSERT INTO RealTest VALUES ('Daughter Son Family') INSERT INTO RealTest VALUES ('Mother Brother Father Daughter Son Orphan') INSERT INTO RealTest VALUES ('Son or Daughter or Father') INSERT INTO RealTest VALUES ('Mother And Son') INSERT INTO RealTest VALUES ('Other Word One More') SET @i = @i +1 END INSERT INTO @Keys VALUES ('Mother') INSERT INTO @Keys VALUES ('Father') INSERT INTO @Keys VALUES ('Son') INSERT INTO @Keys VALUES ('Daughter') INSERT INTO @WordsToExclude VALUES ('Exclude') INSERT INTO @WordsToExclude VALUES ('Stepdaughter') SELECT COUNT(*) FROM RealTest SET STATISTICS TIME ON DECLARE @nAllWords INT SELECT @nAllWords = COUNT(*) FROM @Keys SELECT MyTable.* FROM RealTest MyTable INNER JOIN (SELECT MyTable.Id FROM RealTest MyTable INNER JOIN @Keys KeyWords ON ' ' + MyTable.Searched + ' ' LIKE '% ' + KeyWords.Word + ' %' WHERE not exists (SELECT 1 FROM RealTest M INNER JOIN @WordsToExclude W ON ' ' + M.Searched + ' ' LIKE '%' + W.Word + ' %' and M.Id = MyTable.Id) GROUP BY MyTable.Id HAVING COUNT(DISTINCT(KeyWords.Word)) = @nAllWords) Tbl1 ON MyTable.Id = Tbl1.Id --order by MyTable.Id PRINT 'Regular INNER JOIN ' + CAST(@@ROWCOUNT AS VARCHAR(10)) SELECT x.Id, x.Searched FROM ( SELECT m.Id, m.Searched, (SELECT COUNT(*) FROM @Keys k WHERE ' '+ m.Searched+' ' like '% ' + k.Word + ' %' ) AS n, (SELECT COUNT(*) FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %') AS WE FROM RealTest m ) AS x WHERE n=@nAllWords and isnull(WE,0) = 0 --order by x.Id PRINT 'Subquery solution ' + CAST(@@ROWCOUNT AS VARCHAR(10)) SELECT x.Id, x.Searched FROM ( SELECT m.Id, m.Searched, ( SELECT COUNT(*) FROM @Keys k WHERE ' '+ m.Searched+' ' like '% ' + k.Word + ' %' ) AS n FROM RealTest m WHERE not exists ( SELECT * FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %' ) ) AS x WHERE n=@nAllWords PRINT 'Optimized Nikola''s solution #1 ' + CAST(@@ROWCOUNT AS VARCHAR(10)) SELECT m.Id, m.Searched FROM RealTest m WHERE not exists (SELECT * FROM @WordsToExclude W WHERE ' ' + m.Searched + ' ' like '% ' + W.Word + ' %') and 1 = ALL ( SELECT Case When ' ' + m.Searched + ' ' like '% ' + k.Word + ' %' Then 1 Else 0 End FROM @Keys k) PRINT 'Optimized Nikola''s solution #2 ' + CAST(@@ROWCOUNT AS VARCHAR(10)) SET STATISTICS TIME OFF
with the result on my PC
SQL Server Execution Times:
SQL Server Execution Times:
CPU time = 6599 ms, elapsed time = 9802 ms.
Regular INNER JOIN 79992
SQL Server Execution Times:
CPU time = 4961 ms, elapsed time = 6697 ms.
Subquery solution 79992
SQL Server Execution Times:
CPU time = 5366 ms, elapsed time = 6716 ms.
Optimized Nikola's solution #1 79992
SQL Server Execution Times:
CPU time = 2901 ms, elapsed time = 3670 ms.
Optimized Nikola's solution #2 79992
*** Remember, if you have a SQL related question, try our Microsoft SQL Server Programming forum or our Microsoft SQL Server Admin forum
25 Comments
Excellent post!
Very nice Naomi and congratz on the first post 🙂
Welcome to the club.
What about next one, should be faster:
Select x.Id,
x.Searched
from ( Select m.Id,
m.Searched,
(Select Count(*)
From @Keys k
Where ‘ ‘+ m.Searched+’ ‘ like ‘% ‘ + k.Word + ‘ %’
) as n
from @myTable m
) as x
Where n=(Select COUNT(*) From @Keys)
Nice!
Hi Niikola,
I would need to test your solution – need to create a good test scenario and find time for this as well. To me this one is slightly harder to understand than the solution I posted.
Also today there was a little twist in the original problem – find all records including all the words, but excluding some other words (list of words). I have an idea, but don’t have time to think / test in more details.
Hi Niikola,
Your solution proved to be much faster. Also I added an ability to exclude certain words – see, if you can suggest a different idea.
Hi Naomi,
I liked much the trick you separate the sentence into words for the inner join using the below code fragment.
Excellent!
ON ‘ ‘ + MyTable.Searched + ‘ ‘ LIKE ‘% ‘ + KeyWords.Word + ‘ %’
Excellent idea… we easily extend the solution in the same way how we can use Contains/FreeText
and ContainsTable/FreeTextTable in Full Text Search.
Nice,
gotta try that one.
Slightly faster version with “Words to Exclude”
SELECT x.Id,
x.Searched
FROM ( SELECT m.Id,
m.Searched,
( SELECT COUNT(*)
FROM @Keys k
WHERE ‘ ‘+ m.Searched+’ ‘ like ‘% ‘ + k.Word + ‘ %’
) AS n
FROM @myTable m
WHERE not exists ( SELECT *
FROM @WordsToExclude W
WHERE ‘ ‘ + m.Searched + ‘ ‘ like ‘% ‘ + W.Word + ‘ %’
)
) AS x
WHERE n=@nAllWords
It would be nice if we can use next syntax:
SELECT m.Id,
m.Searched
FROM @myTable
WHERE m.Searched LIKE ALL ( SELECT ‘% ‘ + k.Word + ‘ %’ FROM @Keys k)
and not exists (SELECT * FROM @WordsToExclude W WHERE ‘ ‘ + m.Searched + ‘ ‘ like ‘% ‘ + W.Word + ‘ %’)
but unfortunately ALL does not work with LIKE operator 🙁
I forgot to add some comment why this query is faster, although I believe all of you know that already.
As WHERE clause is executed before SELECT part, it means sub query in select clause will not be executed for rows that contains at least one unwanted word, which further means less CPU and faster execution.
Changing sub query for unwanted words from (SELECT COUNT(*)…) to NOT EXISTS (…) reduces number of rows affected and number of LIKE comparisons because it will stop execution on the first match.
Effectively, instead of having nWantedWords+nUnwantedWords comparisons for each row, we will have – in case of rows that have at least one unwanted word – no one comparison with wanted words and between 1 and nUnwantedWords comparisons against unwanted ones
Rows without unwanted words will still have same number of comparisons, but without aggregates for unwanted words.
Maybe we could us ALL with a bit different sub query, although I’m not sure we would get any improvement:
SELECT m.Id, m.Searched
FROM @myTable
WHERE not exists (SELECT * FROM @WordsToExclude W WHERE ‘ ‘ + m.Searched + ‘ ‘ like ‘% ‘ + W.Word + ‘ %’)
and 1 = ALL ( SELECT Case When ‘ ‘ + m.Searched + ‘ ‘ like ‘% ‘ + k.Word + ‘ %’ Then 1 Else 0 End FROM @Keys k)
Me again 🙂
1 = ALL (Select…)
should improve performance as ALL will stop execution of sub query for the first row in @Keys that doesn’t mach LIKE criteria – which means less comparisons in total
Niikola – thanks a lot for your comments, I’ll try different variations at the first opportunity.
This is the last one, I promise 🙂
Results with real tables (not the table variables) on both, SQL Server 2005 and 2008 makes difference between original and modified queries much smaller. It seems that joins with table variables are very costly (especially in 2008). Adding PKs and indexes makes differences even smaller, but still visible.
The best one is the last one (with 1=ALL()), followed by UnwantedWords in Where clause as second, 2 subqueries in select 3rd and JOIN 4th.
Another interesting fact is that comparing Execution Plans and estimated cost for each of the queries shows completely opposite order of efficiency.
Further improvement (not a big one, but affects all solutions) would be
Update @Keys Set Word = ‘% ‘+word+’ %’
Update @WordsToExclude Set Word = ‘% ‘+word+’ %’
In that case LIKE comparisons would be
‘ ‘ + M.Searched + ‘ ‘ LIKE Word
without unnecessary concatenation in every step
Hi Niikola,
Thanks again and keep them coming – I’m sorry I didn’t have a chance to play with it yesterday and not sure would have time this week, but I plan to test and update the blog – thanks again.
I’ve updated the blog with your comments – thanks again.
Thanks for this! This absolutely saved my bacon, and provided an awesome solution where I could not use Full Text Indexing. Thanks again!
Love the article. I was wondering if you could guide me in implementing something similar, but without the requirement to match all keywords. I need a query that will take a list of keywords and return results that contain one or more of those keywords. Thanks.
If your requirement is just to get the records containing the keywords and you have the keywords as a table, then you can use this query
select T.* from myTable T INNER JOIN KeyWords W on ‘ ‘ + T.Phrase + ‘ ‘ like ‘% ‘ + KeyWord + ‘ %’
The only problem is that this is not optimized solution. If your phrases are not very long and you know the maximum number of words they can contain, then you can also try to use CROSS APPLY idea to split the phrases into words as shown here
http://bradsruminations.blogspot.com/2009/07/cool-cross-apply-tricks-part-2.html
Thank for the info, i visit daily actually mainly for articles like this. Thank again
You provide very good solution….I am a great fan of your solutions…thank you for all the contributions you make.
Very good solution pals..happy coding.