In this blog post I want to discuss a problem I encounter frequently in SQL related forums – how to perform dynamic PIVOT involving multiple columns using pure T-SQL solution. In SQL Server Reporting Services this functionality can be easily achieved using Matrix template.
With the introduction of PIVOT operator in SQL 2005, it became very easy to write queries that transform rows into columns. There are numerous articles on the Web as how to perform PIVOT and even dynamic PIVOT. However, most of these articles explain how to perform such queries for just one column. I want to expand the transformation for multiple columns.
The idea of a solution is quite simple – you need to generate the SQL dynamically. Since we want to use pivot on multiple columns, we need to use CASE based pivot.
You need to keep in mind, that resulting query should have less than 1024 columns. You may build the check for number of columns into the query.
Now, every time I need to create a dynamic query, I need to know what is the final query I am going to arrive to. Having the idea in mind and printing the SQL command until I got it right helps to create the working query.
Let’s consider the first example based on the AdventureWorks SalesOrderHeader table.
This code creates the table we’re going to transform – it gives us summary of orders and total due per each quarter:
USE AdventureWorks
SELECT DATEPART(quarter,OrderDate) AS [Quarter],
DATEPART(YEAR,OrderDate) AS [Year],
COUNT(SalesOrderID) AS [Orders Count],
SUM(TotalDue) AS [Total Due]
INTO #SalesSummary
FROM Sales.SalesOrderHeader
GROUP BY DATEPART(quarter,OrderDate),
DATEPART(YEAR,OrderDate)
SELECT *
FROM #SalesSummary
ORDER BY [Year],
[Quarter]
The #SalesSummary table lists count of orders and total due for each quarter and year.
Quarter | Year | Orders Count | Total Due |
---|---|---|---|
3 | 2001 | 621 | $5,850,932.9483 |
4 | 2001 | 758 | $8,476,619.278 |
1 | 2002 | 741 | $7,379,686.3091 |
2 | 2002 | 825 | $8,210,285.1655 |
3 | 2002 | 1054 | $13,458,206.13 |
4 | 2002 | 1072 | $10,827,327.4904 |
1 | 2003 | 1091 | $8,550,831.8702 |
2 | 2003 | 1260 | $10,749,269.374 |
3 | 2003 | 4152 | $18,220,131.5285 |
4 | 2003 | 5940 | $16,787,382.3141 |
1 | 2004 | 6087 | $14,170,982.5455 |
2 | 2004 | 6888 | $17,969,750.9487 |
3 | 2004 | 976 | $56,178.9223 |
Now, suppose we want to see these results horizontally. The following dynamic SQL produces the desired output:
DECLARE @SQL NVARCHAR(MAX),
@Cols NVARCHAR(MAX)
SELECT @Cols = STUFF((select ',
SUM(CASE WHEN [Quarter]=' + CAST([Quarter] as varchar(3)) + ' AND [Year] = ' +
CAST([Year] as char(4)) +
' THEN [Orders Count] ELSE 0 END) AS [' +
CAST([Year] as char(4)) + '-' + CAST([Quarter] as varchar(3)) +
' Orders],
SUM(CASE WHEN [Quarter]=' + CAST([Quarter] as varchar(3)) +
' AND [Year] = ' + CAST([Year] as char(4)) +
' THEN [Total Due] ELSE 0 END) AS [' + CAST([Year] as char(4)) + '-' +
CAST([Quarter] as varchar(3)) + ' Sales]'
FROM #SalesSummary
ORDER BY [Year],[Quarter] FOR XML PATH(''),type).value('.','varchar(max)'),1,2,'')
SET @SQL = 'SELECT ' + @Cols + ' FROM #SalesSummary'
--print @SQL
EXECUTE( @SQL)
It produces the transformed result (I show only few first columns):
2001-3 Orders | 2001-3 Sales | 2001-4 Orders | 2001-4 Sales | 2002-1 Orders | 2002-1 Sales | 2002-2 Orders | 2002-2 Sales |
---|---|---|---|---|---|---|---|
621 | $5,850,932.9483 | 758 | $8,476,619.278 | 741 | $7,379,686.3091 | 825 | $8,210,285.1655 |
You may want to un-comment PRINT @SQL statement if you want to see the generated command. I used XML PATH to concatenate rows into a string based on this blog post by Brad Schulz Making a list and checking it twice.
Another interesting problem was presented in this MSDN Transact-SQL forum thread:
For the claims table that has many integer columns, transform rows into columns. I demonstrate just the solution with the table creation script:
USE tempdb
IF OBJECT_ID('Claims','U') IS NOT NULL
DROP TABLE Claims
GO
CREATE TABLE Claims (
Claim INT,
HCPC INT,
[mod] INT,
charge INT,
paid INT,
Qty INT)
INSERT INTO Claims
SELECT 12345,
99245,
90,
20,
10,
1
UNION ALL
SELECT 12345,
99112,
NULL,
30,
20,
1
UNION ALL
SELECT 12345,
99111,
80,
50,
25,
2
UNION ALL
SELECT 11112,
99911,
60,
50,
20,
1
UNION ALL
SELECT 12222,
99454,
NULL,
50,
20,
1
SELECT * FROM Claims
The output is:
Claim | HCPC | mod | Charge | Paid | Qty |
---|---|---|---|---|---|
12345 | 99245 | 90 | 20 | 10 | 1 |
12345 | 99112 | NULL | 30 | 20 | 1 |
12345 | 99111 | 80 | 50 | 25 | 2 |
11112 | 99911 | 60 | 50 | 20 | 1 |
12222 | 99454 | NULL | 50 | 20 | 1 |
And the following code transforms the result (as you see, I am using a loop and ROW_NUMBER() approach):
DECLARE @SQL NVARCHAR(MAX),
@Loop INT,
@MaxRows INT
SET @Sql = ''
SELECT @MaxRows = MAX(MaxRow)
FROM (SELECT COUNT(* ) AS MaxRow,
Claim
FROM Claims
GROUP BY Claim) X
SET @Loop = 1
WHILE @Loop <= @MaxRows
BEGIN
SELECT @SQL = @SQL + ', SUM(CASE WHEN Row = ' + CAST(@Loop AS VARCHAR(10)) + ' THEN ' + QUOTENAME(Column_Name) + ' END) AS [' + COLUMN_NAME + CAST(@Loop AS VARCHAR(10)) + ']'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_Name = 'Claims'
AND COLUMN_NAME NOT IN ('Row','Claim')
SET @Loop = @Loop + 1
END
--PRINT @SQL
SET @SQL = 'SELECT Claim' + @SQL + ' FROM (select *, row_number() over (partition by Claim ORDER BY Claim) as Row FROM Claims) X GROUP BY Claim '
PRINT @SQL
EXECUTE( @SQL)
Claim | HCPC1 | mod1 | Charge1 | Paid1 | Qty1 | HCPC2 | mod2 | Charge2 | Paid2 | Qty2 | HCPC3 | mod3 | Charge3 | Paid3 | Qty3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
11112 | 99911 | 60 | 50 | 20 | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
12222 | 99454 | NULL | 50 | 20 | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
12345 | 99245 | 90 | 20 | 10 | 1 | 99112 | NULL | 30 | 20 | 1 | 99111 | 80 | 50 | 25 | 2 |
I'll post another recent problem found in the following MSDN thread Basic Crosstab Query
SET NOCOUNT ON;
GO
USE tempdb;
GO
CREATE TABLE #T (
StoreID char(5) NOT NULL,
CriteriaNo int NOT NULL,
Result int NOT NULL,
Position int NOT NULL,
PRIMARY KEY (StoreID, CriteriaNo)
);
GO
INSERT INTO #T(StoreID, CriteriaNo, Result, Position)
SELECT '0001', 9, 10, 1 UNION ALL
SELECT '0002', 9, 12, 2 UNION ALL
SELECT '0001', 10, 5, 1 UNION ALL
SELECT '0002', 10, 6, 2;
GO
-- dynamic
DECLARE @sql nvarchar(MAX), @Cols nvarchar(max);
SELECT @Cols = (select ', ' + 'MAX(case when CriteriaNo = ' + CONVERT(varchar(20), CriteriaNo) +
' then Result else 0 end) AS CriteriaNo' + CONVERT(varchar(20), CriteriaNo) +
', MAX(case when CriteriaNo = ' + CONVERT(varchar(20), CriteriaNo) +
' then Position else 0 end) AS CriteriaPosition' + CONVERT(varchar(20), CriteriaNo)
from (select distinct CriteriaNo from #T) X ORDER By CriteriaNo
FOR XML PATH(''))
SET @sql = 'SELECT StoreID' + @Cols + ', SUM(Result) as PerformancePrint
FROM #T
GROUP BY StoreID'
EXECUTE(@sql)
See also my TechNet WiKi article on this same topic with more code samples
T-SQL: Dynamic Pivot on Multiple Columns
*** Remember, if you have a SQL related question, try our Microsoft SQL Server Programming forum or our Microsoft SQL Server Admin forum