Generally when we are defining tables, the more specific the column definition the better. Yesterday, however, I ran into a case where better definition actually has increased storage use for no appreciable benefit.
Integers - Using Numeric vs Int
As I was working on the database I came across a curious sight, multiple columns defined as numeric(7,0), numeric(9,0), and so on. It seemed like someone was trying to provide the database with the most specific definition possible for a number of different pieces of data. Having never run into this particular practice, I immediately started searching for a reason. Was it smaller? faster? better?
Storage
Using a very specific, well-defined numeric has actually cost us storage space, not reduced it. A numeric (or decimal) with a precision value of 1 to 9 requires 5 bytes and with 10-19 requires 9 bytes. Compare this to the many varieties of int:
| Digits | Int Variety | Int Bytes | Numeric(*,0) Bytes | Difference |
|---|---|---|---|---|
| 1 - 2 | tinyint | 1 | 5 | 4 bytes |
| 3 - 4 | smallint | 2 | 5 | 3 bytes |
| 5 - 9 | int | 4 | 5 | 1 byte |
| 10 - 18 | bigint | 8 | 9 | 1 byte |
So for every row and every index that includes this value, we lose storage space.
References: Int reference on MSDN and Numeric/Decimal reference on MSDN
Performance #1
When SQL Server is asked to execute a math function (+,-,*,/), it uses a defined set of rules to determine the output type, then implicitly converts the arguments to that type (see this article for a subset that relates to decimals). This means that in many cases there could be implicit conversions to numeric from int, so it's possible someone believed we could try and tweak our performance by defining the field as numeric instead of an int.
Let's test out implicit conversions:
- /* ****** Creation of some number tables ****** */
- Create Table NumberIntTest(Num Int Identity(1,1) Primary Key)
- go
- Set NOCOUNT ON
- Begin Tran
- Insert Into NumberIntTest Default Values
- Insert Into NumberIntTest Default Values
- Insert Into NumberIntTest Default Values
- Insert Into NumberIntTest Default Values
- Insert Into NumberIntTest Default Values
- Insert Into NumberIntTest Default Values
- Insert Into NumberIntTest Default Values
- Insert Into NumberIntTest Default Values
- Insert Into NumberIntTest Default Values
- Insert Into NumberIntTest Default Values
- Commit Tran
- go 100000
- Create Table NumberNumericTest(Num numeric(7,0) Primary Key)
- Go
- Set NOCOUNT ON
- Begin Tran
- Insert Into NumberNumericTest(Num) Select Num From NumberIntTest WHERE Num > 10001
- Commit Tran
- Go
- /* ****** Execute a variety of test scripts ****** */
- DECLARE @Start DateTime;
- DECLARE @Garbage int, @Junk numeric(7,0);
- DECLARE @Int int; SET @Int = 1;
- DECLARE @Num numeric(7,0); SET @Num = 1;
- -- Divide an int by an int
- SELECT @start = GETDATE();
- SELECT @Garbage = Num/@Int FROM NumberIntTest n;
- SELECT DateDiff(Millisecond, @Start, GetDate());
- -- Divide a numeric(7,0) by an int
- SELECT @start = GETDATE();
- SELECT @Junk = Num/@Int FROM NumberNumericTest n;
- SELECT DateDiff(Millisecond, @Start, GetDate());
- -- Divide a numeric(7,0) by a numeric(7,0)
- SELECT @start = GETDATE();
- SELECT @Junk = Num/@Num FROM NumberNumericTest n;
- SELECT DateDiff(Millisecond, @Start, GetDate());
- -- Divide an int by an int w/ explicit casting to numeric
- SELECT @start = GETDATE();
- SELECT @Junk = Num/CAST(@Int as numeric(7,0)) FROM NumberIntTest n;
- SELECT DateDiff(Millisecond, @Start, GetDate());
- -- Divide an numeric(7,0) by an int w/ explicit casting
- SELECT @start = GETDATE();
- SELECT @Junk = Num/CAST(@Int as numeric(7,0)) FROM NumberNumericTest n;
- SELECT DateDiff(Millisecond, @Start, GetDate());
Initially I compared the execution plans and didn't see much difference, but after some modifications (thanks George!) and additions we can see the differences between a number of different situations.
Sample Results:
| Test | time (ms) |
|---|---|
| int/int - No Cast | 170 |
| numeric/int - Implicit Cast | 313 |
| numeric/numeric - Implicit Cast | 296 |
| int/CAST(int as numeric) - Implicit Cast | 320 |
| numeric/CAST(int as numeric) | 290 |
In the second test's plan we can see an example of that implicit cast:
numeric/int w/ Implicit Cast:
|--Compute Scalar(DEFINE:([Expr1002]=CONVERT_IMPLICIT(numeric(7,0),[utils].[dbo].[NumberNumericTest].[Num] as [n].[Num]/CONVERT_IMPLICIT(numeric(10,0),[@Int],0),0)))
|--Clustered Index Scan(OBJECT:([utils].[dbo].[NumberNumericTest].[PK__NumberNu__C7D08B630AD2A005] AS [n]))
So if we did have an integer that we need to operate on with a float value (the last case), the addition of a simple cast on the integer argument will bring the execution performance in line with having two numerics, meaning there is no gain in storing the value as a numeric(7,0).
Performance #2
The other potential performance impact is with auto-parameterization. Auto-parameterization occurs when you provide SQL Server with a non-parameterized SQL statement. The server determines a type for those parameters and parameterizes them (part of the magic that makes plans reusable for non-parameterized queries). I couldn't find anything terribly recent, but as far back as SQL Server 6.5 and 7.0 the engine was documented as using the int type for any non-decimal value of 9 digits or less. This means that in the unlikely situation that you're executing inline, non-parameterized SQL statements and have used numeric(*,0) types in your table definitions, you will actually be taking a performance hit for the implicit conversion from auto-parameterized integer to the numeric(*,0) field.
And if that wasn't bad enough, the same SQL Server documentation says that SQL Server treats integers as more exact than numeric and decimal types. It doesn't specify why the document goes out of its way to share this information with us, but generally when someone goes out of their way to point out something like this in a document, I get a little nervous and tend to focus more heavily on their 'recommended' practice (use int).
More information on Parameterization and SQL 7 Comparison Optimization
The Wrap-up
So at the end of the day, using a numeric(*,0) requires more space, provides no appreciable benefit over explicit casting, and can actually harm you if you are executing non-parameterized SQL statements against your server.
There are two options for finding these columns, using a SQL query like the one below or downloading SQLCop to check for this and many other common situations.
- SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE, NUMERIC_PRECISION, NUMERIC_SCALE
- FROM INFORMATION_SCHEMA.COLUMNS C
- WHERE C.DATA_TYPE IN ('numeric','decimal') AND NUMERIC_SCALE = 0 AND NUMERIC_PRECISION <= 18

Eli delivers software and technology solutions for a living. His roles have included lone developer, accidental DBA, team lead, and even unintentional Solaris consultant once. With experience in adhoc, Lean, and Agile environments across NSF grants, SaaS products, and in-house IT groups, he is just as willing to chat about the principles of Lean or Continuous Delivery as he is to dive into Azure, SQL Server, or the last ATDD project he created. 



LTD Social Sitings
Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.