Login or Sign Up to become a member!
LessThanDot Sit Logo

LessThanDot

Data Management

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot friendfeed Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Your profile

    Search

    XML Feeds

    Google Ads

    « How Do You Check If A Temporary Table Exists In SQL Server.NET Framework 3.5 SP1: LINQ perf improvements (LINQ to Objects and LINQ to SQL) »
    comments

    Take a look at this query.

    1. select * from
    2.  
    3. (
    4.  
    5. select customer_id, 'MTD' as record_type, count(*), sum(...), avg(...)
    6.  
    7. from payment_table
    8.  
    9.  
    10. where year(payment_dt) = year(getDate())
    11.  
    12. and month(payment_dt) = month(getDate())
    13.  
    14. group by customer_id) MTD_payments
    15.  
    16. UNION ALL
    17.  
    18. (
    19.  
    20. select customer_id, 'YTD' as record_type, count(*), sum(...), avg(...)
    21.  
    22. from payment_table
    23.  
    24. where
    25.  
    26. where year(payment_dt) = year(getDate())
    27.  
    28. group by customer_id) YTD_payments
    29.  
    30. UNION ALL
    31.  
    32. (
    33.  
    34. select customer_id, 'LTD' as record_type, count(*), sum(...), avg(...)
    35.  
    36. from payment_table) LTD_payments
    37.  
    38. ) payments_report
    39.  
    40. order by customer_id, record_type

    Can you see the problem?
    A person had this query, it would run for over 24 hours. Wow, that is pretty bad, I don't think I had ever written something that ran over an hour, and the ones I did were mostly defragmentation and update statistics jobs.

    The problem is that the following piece of code

    1. where year(payment_dt) = year(getDate())
    2. and month(payment_dt) = month(getDate())

    is not sargable. First what does it mean to be sargable? A query is said to be sargable if the DBMS engine can take advantage of an index to speed up the execution of the query (using index seeks, not covering indexes). The term is derived from a contraction of Search ARGument Able.

    This query is not sargable because there is a function on the column, whenever you use a function on the column you will not get an index seek but an index scan. The difference between an index seek and an index scan can be explained like this: when searching for something in a book, you go to the index in the back find the page number and go to the page, that is an index seek. When looking for something in a book you go from page one until the last page, read all the words on all the ages and get what you need, that was an index scan. Do you see how much more expensive in terms of performance that was?

    Let's get back to the query, what can we do to make this piece of code use an index seek?

    1. where year(payment_dt) = year(getDate())
    2. and month(payment_dt) = month(getDate())

    You would change it to this:

    1. where payment_dt >= dateadd(mm, datediff(mm, 0, getdate())+0, 0)
    2. and payment_dt < dateadd(mm, datediff(mm, 0, getdate())+1, 0)

    You can see the complete question on the MSDN forum site here:
    http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/d8464d33-38cc-48ba-bd15-5e7d14eeb19c

    The Person said that his query went from over 24 hours to 36 seconds. Wow!! That is very significant. hardware cannot help you out if you have bad queries like that.

    The same exact day I answered a very similar question, take a look here: http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/6107ec5c-b671-485f-941e-4efdf3d2fc66

    The person had this

    1. AND DATEDIFF(d, '08/10/2008', DateCreated) >= 0
    2. AND DATEDIFF(d, DateCreated, '08/15/2008') >= 0

    I told him to change it to this

    1. AND DateCreated >= '08/10/2008'
    2. and DateCreated < '08/16/2008'

    And that solved that query. If you are interested in some more performance, I have written some Query Optimization items on the LessThanDot Wiki. Below are some direct links

    Case Sensitive Search
    No Functions on Left Side of Operator
    Query Optimizations With Dates
    Optimization: Set Nocount On
    No Math In Where Clause
    Don't Use (select *), but List Columns
    

    If you are interested in some blogposts about dates, take a look at these two which I wrote earlier
    How Are Dates Stored In SQL Server?
    Do You Know How Between Works With Dates?

    About the Author

    User bio imageDenis has been working with SQL Server since version 6.5. Although he worked as an ASP/JSP/ColdFusion developer before the dot com bust, he has been working exclusively as a database developer/architect since 2002. In addition to English, Denis is also fluent in Croatian and Dutch, but he can curse in many other languages and dialects (just ask the SQL optimizer) He lives in Princeton, NJ with his wife and three kids.
    Social SitingsTwitterFacebookLinkedInHomePageFlickrLTD RSS Feed
    15949 views
    InstapaperVote on HN

    19 comments

    Comment from: Adam Zochowski [Visitor] · http://apz.nofate.com
    Adam Zochowski || what can we do to make this piece
    || of code use an index seek

    || WHERE YEAR(payment_dt) = YEAR(GETDATE())
    || and MONTH(payment_dt) = MONTH(GETDATE())

    why not

    create index idx_Year on table (year(payment_dt))
    create index idx_Month on table (month(payment_dt))

    year(getDate()) is evaluted once (since getdate does not change per row, only per transaction), and year(payment_dt) would be evaluated through index?


    Why the ugly and hard to grok

    WHERE payment_dt >= DATEADD(mm, DATEDIFF(mm, 0, GETDATE())+0, 0)

    08/18/08 @ 12:48
    Comment from: Venshi [Visitor]
    *****
    Venshi I hadn't thought about this before. Is this true for all functions? I have a few statements with COALESCE on both sides but haven't noticed any issues. I'll have to see if I can improve on them.
    08/18/08 @ 13:40
    Comment from: SQLDenis [Member] Email
    SQLDenis Adam, because the more indexes you add, the slower your inserts and updates will be
    08/18/08 @ 14:15
    Comment from: SQLDenis [Member] Email
    SQLDenis Venshi,

    Look at the execution plan to see if you get a seek or a scan

    where coalesce(column,0) = 0

    should result in a scan
    08/18/08 @ 14:17
    Comment from: Kenny [Visitor]
    *****
    Kenny 240000% improvement?
    08/18/08 @ 14:39
    Comment from: jürgen [Visitor]
    ****-
    jürgen the only thing i would add is that the analogy of the book surely is a bit misleading, because reading the book from start to the item you are looking for is analogous to a table scan whereas reading the index from beginning to what you are looking for is analogous to an index scan.

    the use of functions causes this behaviour, because as in this case to find all the index items where the date has a certain month the db can't use direct access. it's analogous to finding all the index items in a book that contain 'blah'. the solution is to limit the search to a range of dates, ie between two index items that can be accessed directly. i don't think there is a way to convert my example so that the db can directly access index items.
    08/18/08 @ 14:59
    Comment from: SQLDenis [Member] Email
    SQLDenis Jürgen,

    It also depends if the index is clustered or not
    08/18/08 @ 15:45
    Comment from: Mark Brackett [Visitor] Email
    Mark Brackett While I understand the technicalities, I think it's a pretty sad statement of affairs that after 20 years of development Sql Server (and other RDBMS's) still can't perform the following logic:

    1. YEAR(GETDATE()) is deterministic (for a single statement, at least) returning 2008 - so it's effectively a constant.
    2. (YEAR(datetime) = 2008) is exactly equal to (datetime BETWEEN '1/1/2008' AND '12/31/2008').

    I learned how to "solve for x" in 4th grade, and now I can even buy a $30 calculator that'll do it for me. Is it too much to ask that my $5000 server, and $20000 software do the same?

    For a supposed declarative language that touts it's "Query Optimizer", Sql sure is brain dead quite often.
    08/18/08 @ 17:21
    Comment from: Jerry Hung [Visitor] · http://www.jerryhung.ca
    ****-
    Jerry Hung I am surprised the person did not look at the Execution Plan to figure out what's going on

    A simple UNION query that runs 24 hours = something is SERIOUSLY wrong

    I do agree, SQL should be smarter in Date-handling cases
    and YEAR(getdate()) and MONTH(getdate()) are essentially CONSTANT in the above query. The user should save them into a variable at least

    I've learned to use BETWEEN if I can, versus Date functions or DATEDIFF/DATEPART
    08/19/08 @ 09:20
    Comment from: jürgen [Visitor]
    ****-
    jürgen it's not the YEAR(GETDATE()) and MONTH(getdate()) that's the problem, but the YEAR(payment_dt) and MONTH(payment_dt).

    But you might think that the query optimiser would know that YEAR(payment_dt) = YEAR(GETDATE()) was the same as payment_date BETWEEN '1-Jan-' + YEAR(GETDATE()) and '31-Dec-' + YEAR(GETDATE()) and MONTH(payment_dt) = MONTH(GETDATE()) in combination with YEAR(payment_dt) = YEAR(GETDATE()) is the same as payment_dt BETWEEN '1-' + DATENAME(m,GETDATE()) + '-' + CONVERT(varchar,YEAR(GETDATE())) and DATEADD(d,-1,DATEADD(m,1,'1-' + DATENAME(m,GETDATE()) + '-' + CONVERT(varchar,YEAR(GETDATE())))) (there is probably a nicer way of doing this).
    08/19/08 @ 15:04
    Comment from: jürgen [Visitor]
    ****-
    jürgen i made a mistake with using BETWEEN in the previous comment, it should be:

    payment_dt >= '1-' + DATENAME(m,GETDATE()) + '-' + CONVERT(varchar,YEAR(GETDATE())) and payment_dt < DATEADD(m,1,'1-' + DATENAME(m,GETDATE()) + '-' + CONVERT(varchar,YEAR(GETDATE())))

    which is pretty much the same as your answer, but not as nice.
    08/19/08 @ 15:21
    Comment from: Erik [Member] Email
    *****
    Erik >> Why the ugly and hard to grok

    It's not ugly when the performance is so great *without adding any more indexes*! And it's only hard to grok if you're new to it. The concept is quite simple, in fact, much simpler than the code looks.
    08/20/08 @ 11:39
    Comment from: Steve Campbell [Visitor] · http://blog.perfectapi.com
    ****-
    Steve Campbell The title should be "Only in a database is compiler optimization so poor that it cannot figure out a trivial transform".

    Really, from a developer perspective, it is *ridiculous* how poor the sql compiler is at making simple optimizations.
    08/21/08 @ 11:25
    Comment from: Christiaan Baes (chrissie1) [Member]
    *****
    Christiaan Baes (chrissie1) Well Steve I'm sure MS will appreciate the patch you are going to give them ;-).
    08/21/08 @ 13:03
    Comment from: Brent Ozar [Visitor] · http://www.brentozar.com
    *****
    Brent Ozar I'm chuckling about the angry developer comments above, complaining about why SQL Server can't predict the results on year/getdate. It's a poor musician who blames his instrument....
    08/30/08 @ 10:35
    Comment from: jbr [Visitor] · http://www.dl3m.com
    ***--
    jbr thanks for posting
    09/07/08 @ 15:42
    Comment from: Naomi Nosonovsky [Member]
    *****
    Naomi Nosonovsky Great article and very important.
    07/14/09 @ 13:25
    Comment from: Will [Visitor]
    Will Great info here.. even almost 2 years later! Question - many of the tables I'm dealing with at work have broken up month, day, and year into three different fields: STAT_DAY, STAT_MONTH (indexed), and FOUR_DIGIT_YEAR (indexed). I've been tinkering around with the best way to get this year's data, but I've primarily been using:

    WHERE FOUR_DIGIT_YEAR=YEAR(GETDATE())

    How would this fit into "sargable-ness"? Any better way to do it?
    03/30/10 @ 06:58
    Comment from: SQLDenis [Member] Email
    SQLDenis Will, YEAR(GETDATE() should be fine since the functions is evaluated against a column but only one time

    you could also do this

    declare @Year int
    select @Year = YEAR(GETDATE())

    .....
    .....
    WHERE FOUR_DIGIT_YEAR= @Year
    03/30/10 @ 07:56

    Leave a comment


    Your email address will not be revealed on this site.

    To mislead the spambots.

    Your URL will be displayed.
    (Line breaks become <br />)
    (Name, email & website)
    (Allow users to contact you through a message form (your email will not be revealed.)