I was recently contacted by the fine gents of Webucator, an online training services provider. In order to promote their SQL Server classes, they are doing a free series called SQL Server Solutions from the Web where they show different SQL Server solutions found in blog posts around the web. Essentially, they are turning blog posts into videos. They asked me if they could turn my blog post Beware the defaults! (in windowing functions) into such a video and – humble as I am – I gave them permission to do so. And the result is now here for everyone to watch:
I'm quite pleased with the video – excellent Zoomit use by the way – as it highlights the most important aspects of my blog post:
- For regular aggregation functions using the OVER clause, do not use the ORDER BY unless necessary because this invokes the horrible default of RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
- If you do use the ORDER BY specify a correct sorting order.
Or even better, specify ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW or another framing option. Just not the default 🙂
The one thing missing in the video – and I must confess I didn't highlight it enough in my blog post – is the massive performance difference between ROWS and RANGE. I'm re-reading the excellent book by Itzik Ben-Gan about the T-SQL windowing functions and there Itzik explains why this is the case. In a nutshell:
When using the ROWS window frame extent, the window spool operator can use an optimized in-memory work table which speeds things up tremendously. However, when using RANGE the typical on-disk work table has to be used, which is of course much slower. In theory, RANGE is equal to ROWS when the ordering values are unique within the partition, but the optimizer doesn't check for uniqueness so RANGE will always default to the on-disk work table.
If you were not convinced to always specify ROWS instead of the default RANGE, I hope you now are.