In my Are you ready for SQL Server 2012 or are you still partying like it is 1999? post, I wrote about how you should start using SQL Server 2005 and SQL Server 2008 functionality now in order to prepare for SQL Server 2012. I still see tons of code that is written in the pre 2005 style and people still keep using those functions, procs and statements even though SQL Server 2005 and 2008 have much better functionality.
In today’s post I will cover Common Table Expressions(CTEs). Common Table Expressions were introduced in SQL Server 2005 and you can think of them as a derived table in another form of a table expression.
The syntax looks like this
[ WITH[ ,...n ] ] ::= expression_name [ ( column_name [ ,...n ] ) ] AS ( CTE_query_definition )
Here are the arguments:
expression_name
Is a valid identifier for the common table expression. expression_name must be different from the name of any other common table expression defined in the same WITH <common_table_expression> clause, but expression_name can be the same as the name of a base table or view. Any reference to expression_name in the query uses the common table expression and not the base object.
column_name
Specifies a column name in the common table expression. Duplicate names within a single CTE definition are not allowed. The number of column names specified must match the number of columns in the result set of the CTE_query_definition. The list of column names is optional only if distinct names for all resulting columns are supplied in the query definition.
CTE_query_definition
Specifies a SELECT statement whose result set populates the common table expression. The SELECT statement for CTE_query_definition must meet the same requirements as for creating a view, except a CTE cannot define another CTE.
If more than one CTE_query_definition is defined, the query definitions must be joined by one of these set operators: UNION ALL, UNION, EXCEPT, or INTERSECT.
With Common Table Expressions you can use two form of table aliasing; external and internal. If the columns are aliased in the query definition itself then this is an internal alias
--inline column alias
;WITH cte AS (SELECT 1 AS col1,2 AS col2)
SELECT * FROM cte
If you supply the column names in the parentheses right after the Common Table Expression name then this is external aliasing
-- external column alias
;WITH cte(col1,col2) AS (SELECT 1,2)
SELECT * FROM cte
If you use both external and internal then the external names will show up in the result
;WITH cte(bla,bla2) AS (SELECT 1 AS col1,2 AS col2)
SELECT * FROM cte
Output
bla bla2 ------------- 1 2
</common_table_expression>
Every Common Table Expression has to start with WITH, the statement before it has to be terminated with a semicolon. What you can do is start your Common Table Expression with semicolon followed by WITH (;WITH) this way if someone adds some code before your Common Table Expression it won’t mess up the proc or sql script
Here is an example of one Common Table Expression where the previous statement wasn’t terminated with a semicolon
SELECT 1
WITH cte AS (SELECT 1 AS col1,2 AS col2)
SELECT * FROM cte
And here is the error
_Msg 319, Level 15, State 1, Line 2
Incorrect syntax near the keyword ‘with’. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon._
Adding the semicolon before WITH fixes that
SELECT 1
;WITH cte AS (SELECT 1 AS col1,2 AS col2)
SELECT * FROM cte
Some more examples
Paging with Common Table Expressions
I will cover the windowing functions tomorrow but here is just a small example of how you can use a Common Table Expression to do paging
USE master
GO
;WITH cte AS(SELECT ROW_NUMBER() OVER(ORDER BY name) AS row,*
FROM sys.sysobjects)
SELECT row, name
FROM cte
WHERE row BETWEEN 11 AND 20
GO
Here are my results
row name
11 assembly_types
12 asymmetric_keys
13 availability_group_listener_ip_addresses
14 availability_group_listeners
15 availability_groups
16 availability_read_only_routing_lists
17 availability_replicas
18 backup_devices
19 certificates
20 change_tracking_databases
Okay that is all fine, what if I want to also add a total count? You can reference a Common Table Expression more than once in a query so that is no problem, here is what the query looks like
;WITH cte AS(SELECT ROW_NUMBER() OVER(ORDER BY name) AS row,*
FROM sys.sysobjects)
SELECT row, name, (SELECT COUNT(*) FROM cte) AS TotalCount
FROM cte
WHERE row BETWEEN 11 AND 20
GO
Here is another example that selects more than once from the same Common Table Expression
;WITH cte AS(SELECT ROW_NUMBER() OVER(ORDER BY name) AS row,*
FROM sys.sysobjects)
SELECT COUNT(*) as TheCount,
(SELECT Min(row) from cte) as MinRow,
(SELECT Max(row) from cte) as MaxRow
FROM cte
However if you reference the Common Table Expression in more than one query you will get an error, run this
;WITH cte AS(SELECT ROW_NUMBER() OVER(ORDER BY name) AS row,*
FROM sys.sysobjects)
SELECT COUNT(*) as TheCount from cte
SELECT Min(row) from cte as MinRow
Msg 208, Level 16, State 1, Line 6
Invalid object name ‘cte’.
As you can see, you got an invalid object name error.
If you put a UNION in between the two queries, it will also work fine since it is considered one query
;WITH cte AS(SELECT ROW_NUMBER() OVER(ORDER BY name) AS row,*
FROM sys.sysobjects)
SELECT COUNT(*) as TheCount from cte
union all
SELECT Min(row) from cte as MinRow
You can also do multiple levels of recursion with Common Table Expression. Here is an example from books on line. First create this table in the AdventureWorks2008R2 database
USE AdventureWorks2008R2;
GO
-- Create an Employee table.
CREATE TABLE dbo.MyEmployees
(
EmployeeID smallint NOT NULL,
FirstName nvarchar(30) NOT NULL,
LastName nvarchar(40) NOT NULL,
Title nvarchar(50) NOT NULL,
DeptID smallint NOT NULL,
ManagerID int NULL,
CONSTRAINT PK_EmployeeID PRIMARY KEY CLUSTERED (EmployeeID ASC)
);
-- Populate the table with values.
INSERT INTO dbo.MyEmployees VALUES
(1, N'Ken', N'Sánchez', N'Chief Executive Officer',16,NULL)
,(273, N'Brian', N'Welcker', N'Vice President of Sales',3,1)
,(274, N'Stephen', N'Jiang', N'North American Sales Manager',3,273)
,(275, N'Michael', N'Blythe', N'Sales Representative',3,274)
,(276, N'Linda', N'Mitchell', N'Sales Representative',3,274)
,(285, N'Syed', N'Abbas', N'Pacific Sales Manager',3,273)
,(286, N'Lynn', N'Tsoflias', N'Sales Representative',3,285)
,(16, N'David',N'Bradley', N'Marketing Manager', 4, 273)
,(23, N'Mary', N'Gibson', N'Marketing Specialist', 4, 16);
Now this is the query that will give you the first two levels of the employee hierarchy
WITH DirectReports(ManagerID, EmployeeID, Title, EmployeeLevel) AS
(
SELECT ManagerID, EmployeeID, Title, 0 AS EmployeeLevel
FROM dbo.MyEmployees
WHERE ManagerID IS NULL
UNION ALL
SELECT e.ManagerID, e.EmployeeID, e.Title, EmployeeLevel + 1
FROM dbo.MyEmployees AS e
INNER JOIN DirectReports AS d
ON e.ManagerID = d.EmployeeID
)
SELECT ManagerID, EmployeeID, Title, EmployeeLevel
FROM DirectReports
WHERE EmployeeLevel <= 2 ;
GO
ManagerID EmployeeID Title EmployeeLevel ----------- ---------- -------------------------------------------------- --- NULL 1 Chief Executive Officer 0 1 273 Vice President of Sales 1 273 16 Marketing Manager 2 273 274 North American Sales Manager 2 273 285 Pacific Sales Manager 2
You can also limit the number of recursions by applying a hint, for example OPTION (MAXRECURSION 5), that will limit it to 5 levels.
Besides just selecting from Common Table Expression you can also manipulate data through them, the underlying tables will then have the changes. See my post Use common table expressions to simplify your updates in SQL Server to learn more about that