Login or Sign Up to become a member!
LessThanDot Sit Logo

LessThanDot

Data Management

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot friendfeed Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Your profile

    Search

    XML Feeds

    Google Ads

    « Renaming Physical Database FilesT-SQL Tuesday#15 – Automation with PowerShell (Replace with RegEx) »
    comments

    Pivoting and Unpivoting data has been one of the parts of tsql that has been eluding my grasp for a while now. I’ve seen plenty of examples for both of these, but none of those examples really helped me in understanding exactly what’s going on. We have a wiki article for each here on LTD, Pivot and Unpivot,
    but again these don’t tell me a whole lot about what’s going on, just the before and after pictures. Well, I was messing around with the unpivot code the other day, and how it worked finally dawned on me. So, I would like to share what I learned.

    At my part time job, we were working on a Workforce Availability study. As part of the study, we hired out to a phone survey company. We created a survey script for their callers and gave it to them. The company made a lot of phone calls and when all was said and done they provided us with a spreadsheet of the answers. Needless to say this spreadsheet was in no way normalized for data import. It looked a something like this:

    1. declare @survey table (Respondent int identity(1,1), Q1 int, Q2 int, Q3 int, Q4 int, Q5 int)
    2.  
    3. insert into @survey values (1,3,1,5,8)
    4. insert into @survey values (2,4,1,6,8)
    5. insert into @survey values (1,4,1,7,9)
    6. insert into @survey values (2,3,2,6,9)
    7. insert into @survey values (1,3,1,5,10)
    8.  
    9. select * from @survey

    Pivot 1

    Well, right now that’s a decent sized mess to clean up using union alls. However, we don’t have to make that big mess thanks to unpivot. Here’s what the unpivot code looks like.

    1. select
    2.     Respondent,
    3.     Q,
    4.     A
    5. from @survey
    6. unpivot
    7. (
    8.     A for Q in (Q1, Q2, Q3, Q4, Q5)
    9. ) as u

    The way this works is in the unpivot statement, the A for Q statement breaks apart each column you list in the in statement. So for columns Q1, Q2, Q3, Q4, and Q5, the unpivot function breaks it apart, putting the column name in Q, and putting the actual value into A.

    Now that the data’s normalized we can join to it rather easily.

    1. declare @survey table (Respondent int identity(1,1), Q1 int, Q2 int, Q3 int, Q4 int, Q5 int)
    2. declare @questions table (PK_quID int identity(1,1), quName varchar(255))
    3. declare @answers table (PK_anID int identity(1,1), anName varchar(255))
    4.  
    5. insert into @questions values ('Do you have a drivers license?')
    6. insert into @questions values ('What best describes your current employment conditions?')
    7. insert into @questions values ('Do you have hair?')
    8. insert into @questions values ('What kind of vehicle do you drive?')
    9. insert into @questions values ('What state are you from?')
    10. insert into @answers values ('yes')
    11. insert into @answers values ('no')
    12. insert into @answers values ('working')
    13. insert into @answers values ('not working')
    14. insert into @answers values ('car')
    15. insert into @answers values ('truck')
    16. insert into @answers values ('suv')
    17. insert into @answers values ('MO')
    18. insert into @answers values ('IL')
    19. insert into @answers values ('TN')
    20. insert into @survey values (1,3,1,5,8)
    21. insert into @survey values (2,4,1,6,8)
    22. insert into @survey values (1,4,1,7,9)
    23. insert into @survey values (2,3,2,6,9)
    24. insert into @survey values (1,3,1,5,10)
    25.  
    26.  
    27.  
    28. select
    29.     s.*,
    30.     qu.*,
    31.     an.*
    32. from
    33. (
    34.     select
    35.         Respondent,
    36.         right(Q,1) as Q,
    37.         A
    38.     from @survey
    39.     unpivot
    40.     (
    41.         A for Q in (Q1, Q2, Q3, Q4, Q5)
    42.     ) as u
    43. ) s
    44.     left outer join @questions qu
    45.         on s.Q=qu.PK_quID
    46.     left outer join @answers an
    47.         on s.a = an.PK_anID

    Well, seeing as my data is now normalized I can put it into my responses table.

    1. declare @responses table (Respondent int, FK_quID int, FK_anID int)
    2.  
    3. insert into @responses
    4. (
    5.     Respondent,
    6.     FK_quID,
    7.     FK_anID
    8. )
    9. select
    10.     Respondent,
    11.     right(Q,1) as Q,
    12.     A
    13. from @survey
    14. unpivot
    15. (
    16.     A for Q in (Q1, Q2, Q3, Q4, Q5)
    17. ) as u

    Alright, now I’ve got my survey data stored and I want to run some analytics. One of the ways to analyze the data is to run a crosstab off of the data. Here’s a crosstab using pivot:

    1. select
    2. *
    3. from
    4. (
    5.     select
    6.         Respondent,
    7.         quName,
    8.         anName
    9.     from @responses r
    10.         inner join @questions q
    11.             on r.FK_quID=q.PK_quID
    12.         inner join @answers a
    13.             on r.FK_anID=a.PK_anID
    14. ) a
    15. pivot
    16. (
    17.     count(a.Respondent)
    18.     for anName in
    19.         (
    20.             [yes],
    21.             [no],
    22.             [working],
    23.             [not working],
    24.             [car],
    25.             [truck],
    26.             [suv],
    27.             [MO],
    28.             [IL],
    29.             [TN]
    30.         )
    31. ) as pvt

    In this I’m joining my response table to my questions and answers table in a subquery. I then pivot all of that data using the pivot function. The pivot function takes an aggregate and applies that across the columns and rows. In this example, I specify count(a.Respondent) as my aggregate, and then I specify that I want that across anName (and I provide a list of values that I want it applied to) on the X axis. Unpivot then takes the last column and applies that to the Y column, and then applies the aggregate data to any pieces of that that match up on the X and Y axis.

    Pivot 2

    So, in the entire survey, three respondents answered yes to having a driver’s license, while two answered no, four answered as having hair while one answered no.

    If you remove quName from the subquery, pivot doesn’t have a third data point to aggregate the data to, so it’ll just aggregate to the X axis.

    Pivot 3

    You can play with the aggregate in the query a little bit. If you change it to avg, it’ll average all the respondent ids. If you change it to min, it’ll show you which respondent provided each answer for each question first.

    About the Author

    I have an A.S. in Computer Programming. I am currently a Database Developer (April 2008-Present) for a development group at a contract research lab, where we build in-house applications using Microsoft SQL Server (2005 and 2008) and ASP.Net 3.5. I have a part-time job (April 2007-Present) as a dba/db developer/web developer/designer. I use: SQL Server 2005/08/R2, HTML, CSS, PHP, and Java Script. Penguins are awesome, but I'm not a linux person. If you have any questions feel free to email me at dforck@gmail.com
    Social SitingsTwitterFacebookLTD RSS Feed
    Instapaper

    4 comments

    Comment from: SQLDenis [Member] Email
    SQLDenis Yep PIVOT is one of my favorite 'new' things....The other day I had to do a 2000+ column dynamic Pivot..was interesting
    02/08/11 @ 17:21
    SQLKohai this was so helpful! Thanks so much!
    07/22/11 @ 08:42
    SQLKohai Just out of curiosity, how to you account for null responses? I have some in my dataset, and it doesn't bother listing the question when I do.
    07/22/11 @ 10:06
    Comment from: David Forck (thirster42) [Member]
    @SQLKohai, if there's the potential of an answer not getting answered i typically greate a catch all "No Response" answer and apply that where i have missing answers.
    07/22/11 @ 13:35

    Leave a comment


    Your email address will not be revealed on this site.

    Your URL will be displayed.
    (Line breaks become <br />)
    (Name, email & website)
    (Allow users to contact you through a message form (your email will not be revealed.)