It’s not often that we (as programmers) need random numbers, especially from a database. When you are testing your queries for performance, it’s best to use a large table to do it. Sometimes you just don’t have access to a large table to test with, so you may think about creating many rows in your existing table.

SQL Server has a rand() function that will return a random (fractional) number between 0 and 1.

For example:

Select Rand()
--	0.686350654426017

The problem with the rand function occurs when you use set based operations.

Select Rand() As RandomNumber
From   (Select 1 As NUM Union All
        Select 2 Union All
        Select 3) As Alias
RandomNumber
----------------------
0.920057583532051
0.920057583532051
0.920057583532051

Not very random, is it? I mean, the number is random, but it’s also the same for each row. The purpose of this blog is to demonstrate how to get a random value in set based operations.

There is an interesting technique that you can use to accomplish this. There is a NewId() function in SQL Server that returns a GUID, for example: ‘94344EE4-5D7A-45EB-9EBC-7A596B7F90F3’. NewId does work well for set based operations, for example:

Select Rand() As RandomNumber, NewId() As GUID
From   (Select 1 As NUM Union All
        Select 2 Union All
        Select 3) As Alias
RandomNumber           GUID
---------------------- ------------------------------------
0.130542187318667      D12D6274-CB93-4CDB-B848-9F0FA8B88614
0.130542187318667      C708217F-5204-4284-B76B-A55D87BEFBC0
0.130542187318667      B42D7973-B3E2-420F-8D4D-DC72442D0824

Notice that the ‘Random Number’ column is the same for each row, but the GUID column is different. We can use this interesting fact to generate random numbers by combining this with another function available in SQL Server. The CHECKSUM function will return an integer hash value based on its argument. In this case, we can pass in a GUID, and checksum will return an integer.

Select Rand() As RandomNumber, 
       NewId() As GUID, 
       Checksum(NewId()) As RandomInteger
From   (Select 1 As NUM Union All
        Select 2 Union All
        Select 3) As Alias
RandomNumber           GUID                                 RandomInteger
---------------------- ------------------------------------ --------------
0.152440960483059      CDEB9F5D-E8A2-4FC7-A5A4-0E0A9CCF9786     84,364,212
0.152440960483059      9ABCEC8F-6C0C-4CEF-BA16-E8F0B50FB399 -1,317,220,961
0.152440960483059      6BD23217-F3FF-4F19-AA7F-C127CA7A4763    976,389,102

Before we continue, let’s take a look at the output, because there are some interesting observations that we need to consider before continuing. RandomInteger can be positive or negative because it’s limited to the range of an integer, so the values must fall between -2,147,483,648 and 2,147,483,647.

Most of the time, we want a random number within a certain range of numbers. In most languages, we simply multiply the result of the Rand() function to get this number. Since our RandomInteger is already a whole number, we really can’t do this. However, we could use the mod operator to guarantee a range of numbers. Mod is the remainder of a division operation, so if we mod a number by 10, we are guaranteed to get a number between -9 and +9. Unfortunately, this is a little misleading because there are 19 possible numbers we can get for this. So, to make sure we get a range to 10 numbers, we need to take the absolute value of the number, and then mod 10. Like this:

Select Rand() As RandomNumber, 
       NewId() As GUID, 
       Abs(Checksum(NewId())) % 10 As RandomInteger
From   (Select 1 As NUM Union All
        Select 2 Union All
        Select 3) As Alias
RandomNumber           GUID                                 RandomInteger
---------------------- ------------------------------------ -------------
0.490769895131745      8237A3F3-98C6-4B31-8F4B-8A80ECC8FBFF 1
0.490769895131745      C231EB65-FA81-4536-B909-44FEC91953D9 9
0.490769895131745      F0AD28C5-8AE4-46A1-B8AC-D8C9F90A5747 4

Now, we are guaranteed to get a Random number between 0 and 9. Suppose you want to get a random number between 10 and 15. The range of numbers is 6 (10,11,12,13,14,15). The mod value for this needs to be 6, to get a number in the range of 0 to 5. Then, we add 10 to the result to get a number in the range of 10 to 15.

If you want to generate a random number between -5 and 5, don’t be tempted to remove the absolute value function, because you will get numbers that appears to be random, but are not ‘as random’ as they should be.

In my database, I have a numbers table with 1,000,000 rows. When I run this code:

Select	RandomNumber, Count(*) As NumberCount
From	(
		Select Checksum(NewId()) % 6 As RandomNumber
		From Numbers
		) As A
Group By RandomNumber
Order By RandomNumber
RandomNumber NumberCount
------------ -----------
-5           83539
-4           83443
-3           83372
-2           82931
-1           82950
0            166254
1            83207
2            83635
3            83741
4            83417
5            83511

Notice that 0 has double the number of occurrences. Instead, we should write it like this:

Select	RandomNumber, Count(*) As NumberCount
From	(
		Select Abs(Checksum(NewId())) % 11 - 5 As RandomNumber
		From Numbers
		) As A
Group By RandomNumber
Order By RandomNumber
RandomNumber NumberCount
------------ -----------
-5           90889
-4           90794
-3           91365
-2           90476
-1           91104
0            90730
1            90895
2            90815
3            90762
4            91133
5            91037

Notice now that the values are ‘more’ random than the previous version (where 0 had double the number of occurrences). Since we are talking about random numbers, we wouldn’t expect there to be the same number of occurrences for each number, but with the previous version, there were double the number of zero’s, which isn’t truly random. This version has approximately the same number of zero’s as any other number, making it ‘more random’.