Login or Sign Up to become a member!
LessThanDot Sit Logo

LessThanDot

Data Management

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot friendfeed Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Your profile

    Search

    XML Feeds

    Google Ads

    « how to make sure that the view will have the underlying table changes by using sp_refreshviewPerformancePoint Server To Become PerformancePoint Services In SharePoint Server »
    comments

    It's not often that we (as programmers) need random numbers, especially from a database. When you are testing your queries for performance, it's best to use a large table to do it. Sometimes you just don't have access to a large table to test with, so you may think about creating many rows in your existing table.

    SQL Server has a rand() function that will return a random (fractional) number between 0 and 1.
    For example:

    1. Select Rand()
    2. --  0.686350654426017

    The problem with the rand function occurs when you use set based operations.

    1. Select Rand() As RandomNumber
    2. From   (Select 1 As NUM Union All
    3.         Select 2 Union All
    4.         Select 3) As Alias
    RandomNumber
    ----------------------
    0.920057583532051
    0.920057583532051
    0.920057583532051

    Not very random, is it? I mean, the number is random, but it's also the same for each row. The purpose of this blog is to demonstrate how to get a random value in set based operations.

    There is an interesting technique that you can use to accomplish this. There is a NewId() function in SQL Server that returns a GUID, for example: '94344EE4-5D7A-45EB-9EBC-7A596B7F90F3'. NewId does work well for set based operations, for example:

    1. Select Rand() As RandomNumber, NewId() As GUID
    2. From   (Select 1 As NUM Union All
    3.         Select 2 Union All
    4.         Select 3) As Alias
    RandomNumber           GUID
    ---------------------- ------------------------------------
    0.130542187318667      D12D6274-CB93-4CDB-B848-9F0FA8B88614
    0.130542187318667      C708217F-5204-4284-B76B-A55D87BEFBC0
    0.130542187318667      B42D7973-B3E2-420F-8D4D-DC72442D0824
    

    Notice that the 'Random Number' column is the same for each row, but the GUID column is different. We can use this interesting fact to generate random numbers by combining this with another function available in SQL Server. The CHECKSUM function will return an integer hash value based on its argument. In this case, we can pass in a GUID, and checksum will return an integer.

    1. Select Rand() As RandomNumber,
    2.        NewId() As GUID,
    3.        Checksum(NewId()) As RandomInteger
    4. From   (Select 1 As NUM Union All
    5.         Select 2 Union All
    6.         Select 3) As Alias
    RandomNumber           GUID                                 RandomInteger
    ---------------------- ------------------------------------ --------------
    0.152440960483059      CDEB9F5D-E8A2-4FC7-A5A4-0E0A9CCF9786     84,364,212
    0.152440960483059      9ABCEC8F-6C0C-4CEF-BA16-E8F0B50FB399 -1,317,220,961
    0.152440960483059      6BD23217-F3FF-4F19-AA7F-C127CA7A4763    976,389,102
    

    Before we continue, let's take a look at the output, because there are some interesting observations that we need to consider before continuing. RandomInteger can be positive or negative because it's limited to the range of an integer, so the values must fall between -2,147,483,648 and 2,147,483,647.

    Most of the time, we want a random number within a certain range of numbers. In most languages, we simply multiply the result of the Rand() function to get this number. Since our RandomInteger is already a whole number, we really can't do this. However, we could use the mod operator to guarantee a range of numbers. Mod is the remainder of a division operation, so if we mod a number by 10, we are guaranteed to get a number between -9 and +9. Unfortunately, this is a little misleading because there are 19 possible numbers we can get for this. So, to make sure we get a range to 10 numbers, we need to take the absolute value of the number, and then mod 10. Like this:

    1. Select Rand() As RandomNumber,
    2.        NewId() As GUID,
    3.        Abs(Checksum(NewId())) % 10 As RandomInteger
    4. From   (Select 1 As NUM Union All
    5.         Select 2 Union All
    6.         Select 3) As Alias
    RandomNumber           GUID                                 RandomInteger
    ---------------------- ------------------------------------ -------------
    0.490769895131745      8237A3F3-98C6-4B31-8F4B-8A80ECC8FBFF 1
    0.490769895131745      C231EB65-FA81-4536-B909-44FEC91953D9 9
    0.490769895131745      F0AD28C5-8AE4-46A1-B8AC-D8C9F90A5747 4
    

    Now, we are guaranteed to get a Random number between 0 and 9. Suppose you want to get a random number between 10 and 15. The range of numbers is 6 (10,11,12,13,14,15). The mod value for this needs to be 6, to get a number in the range of 0 to 5. Then, we add 10 to the result to get a number in the range of 10 to 15.

    If you want to generate a random number between -5 and 5, don’t be tempted to remove the absolute value function, because you will get numbers that appears to be random, but are not 'as random' as they should be.

    In my database, I have a numbers table with 1,000,000 rows. When I run this code:

    1. Select  RandomNumber, Count(*) As NumberCount
    2. From    (
    3.         Select Checksum(NewId()) % 6 As RandomNumber
    4.         From Numbers
    5.         ) As A
    6. Group By RandomNumber
    7. Order By RandomNumber
    RandomNumber NumberCount
    ------------ -----------
    -5           83539
    -4           83443
    -3           83372
    -2           82931
    -1           82950
    0            166254
    1            83207
    2            83635
    3            83741
    4            83417
    5            83511
    

    Notice that 0 has double the number of occurrences. Instead, we should write it like this:

    1. Select  RandomNumber, Count(*) As NumberCount
    2. From    (
    3.         Select Abs(Checksum(NewId())) % 11 - 5 As RandomNumber
    4.         From Numbers
    5.         ) As A
    6. Group By RandomNumber
    7. Order By RandomNumber
    RandomNumber NumberCount
    ------------ -----------
    -5           90889
    -4           90794
    -3           91365
    -2           90476
    -1           91104
    0            90730
    1            90895
    2            90815
    3            90762
    4            91133
    5            91037
    

    Notice now that the values are 'more' random than the previous version (where 0 had double the number of occurrences). Since we are talking about random numbers, we wouldn't expect there to be the same number of occurrences for each number, but with the previous version, there were double the number of zero's, which isn't truly random. This version has approximately the same number of zero's as any other number, making it 'more random'.

    About the Author

    George has been developing software professionally for 19 years, first for the department of defense, and then for various other companies. In 1998, George started his software company, Orbit Software, specializing in School Bus Transportation software. His specialty is refining SQL Server queries to deliver optimal performance.
    Social SitingsTwitterLTD RSS Feed
    InstapaperVote on HN

    9 comments

    Comment from: SQLDenis [Member] Email
    *****
    SQLDenis Interesting, why are you doing the % 11 - 5 part?
    01/24/09 @ 13:53
    Comment from: George Mastros (gmmastros) [Member]
    George Mastros (gmmastros) Because there are 11 possible numbers between -5 and 5. Since I am taking the absolute value, you get a range between 0 and 10, subtract 5 and your range becomes -5 to 5.

    01/24/09 @ 13:57
    Comment from: Web [Visitor] · http://www.webdigi.co.uk
    *****
    Web great tip.
    01/26/09 @ 13:11
    Comment from: Jackie [Visitor]
    *****
    Jackie very very helpful!
    05/11/09 @ 07:20
    Comment from: Bud [Visitor] · http://www.campuscollusion.com
    Bud Great, this is exactly what i needed to server up questions in a random order. Thanks.
    11/18/09 @ 07:13
    Comment from: Naomi Nosonovsky [Member]
    Naomi Nosonovsky Great blog, thanks.
    11/18/09 @ 09:52
    Comment from: Jack Tipsword [Visitor] · http://www.impactscs.com
    Jack Tipsword Using your technique I created a trigger on a table that I needed a random password for. The trigger is in SQL Server 2008. For this application I didn't need a strong password so this solution worked very well.

    Here is the code

    CREATE TRIGGER [dbo].[trg_tblResponsesPW]
    ON [dbo].[tblResponses]
    AFTER INSERT
    AS
    BEGIN

    DECLARE
    @MyCursor CURSOR,
    @intID int,
    @PW varchar (8)

    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    SET @MyCursor = CURSOR FAST_FORWARD
    FOR
    SELECT intID FROM inserted

    -- Loop through each record in the inserted table to get the unique ID field. In each loop update the table
    -- that the record was appended to with a new random password. I loop here so that when bulk inserts occur each new record get's it's own unique password

    OPEN @MyCursor
    FETCH NEXT FROM @MyCursor
    -- Populate the variables with the values from the SQL Statement
    INTO @intID

    WHILE @@FETCH_STATUS = 0
    BEGIN

    -- The NewId function returns a GUID Like "D12D6274-CB93-4CDB-B848-9F0FA8B88614". Here I pull the 6 characters
    -- from the right but I could pull from anywhere or pull any number of characters depending on how long I wanted
    -- my password to be.
    SET @PW = Right(NewID(),6)

    UPDATE tblResponses
    SET txtPW = @PW
    WHERE (intID = @intID)

    -- Now we are finished doing the work so move to the next inserted record
    FETCH NEXT FROM @MyCursor
    -- and populate the next variable
    INTO @intID

    END

    CLOSE @MyCursor
    DEALLOCATE @MyCursor

    END
    03/12/10 @ 05:48
    Comment from: David Manheim [Visitor]
    David Manheim The ABS(CHECKSUM(NewId())) is less random than is needed for some applications. There is an article that can be found online describing the differences in how "random" the different techniques are: Random Sampling in T-SQL
    by Brian Connolly.

    An alternative in terms of generating the numbers so that the same is not returned multiple times, is described well in an article "Returning Random Numbers in a SELECT statement" on Jeff's SQL Server Blog, on the Sqlteam web site.
    03/08/11 @ 07:26
    Comment from: Phoenix [Visitor] · http://www.jwt.com
    Phoenix Thanks for posting this. In my case, I needed to fill a map table with random IDs from other tables. I was able to adapt your technique to achieve what I needed (identifiers munged):

    DECLARE @count INT
    SET @count = (SELECT COUNT(*) FROM [ProductType])

    INSERT INTO [ProductTypeMap]
    (
    [ProductId],
    [ProductTypeId],
    )
    SELECT
    [p].[ProductId],
    [t].[ProductTypeId]
    FROM
    (
    SELECT
    [ProductId],
    (ABS(CHECKSUM(NEWID())) % @count + 1) AS [RowNumber]
    FROM
    [Product]
    ) AS [p]
    LEFT JOIN
    (
    SELECT
    [ProductTypeId],
    (ROW_NUMBER() OVER (ORDER BY [ProductTypeId])) AS [RowNumber]
    FROM
    [ProductType]
    ) AS [p]
    ON ([t].[RowNumber] = [p].[RowNumber])
    04/06/11 @ 17:11

    Leave a comment


    Your email address will not be revealed on this site.

    To mislead the spambots.

    Your URL will be displayed.
    (Line breaks become <br />)
    (Name, email & website)
    (Allow users to contact you through a message form (your email will not be revealed.)