Login or Sign Up to become a member!

EXPERTS, INFORMATION, IDEAS & KNOWLEDGE

Social bookmarker Add this

Your profile

Search

November 2008
Mon Tue Wed Thu Fri Sat Sun
 << <   > >>
          1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30

XML Feeds

Tags: indexing

The Data Management Journal

Finding Fragmentation Of An Index And Fixing It

by SQLDenis


Permalink 07 Nov 2008 09:45 , Categories: Data Modelling & Design Tags: fragmentation, indexing, maintenance, sql server 2005, sql server 2008

A lof of time your index will get framented over time if you do a lot of updates or insert and deletes.
We will look at an example by creating a table, fragmenting the heck out of it and then doing a reorganize and rebuild on the index

First create this table

  1. CREATE TABLE TestIndex (name1 VARCHAR(500)
  2. not null,id INT
  3. not null,userstat  INT not null,
  4. name2 VARCHAR(500) not null,
  5. SomeVal UNIQUEIDENTIFIER not null)

Now insert 50000 rows

  1. INSERT TestIndex
  2. SELECT TOP 50000 s.name,s.id,s.userstat,s2.name,newid()
  3. FROM master..sysobjects s
  4. CROSS JOIN master..sysobjects s2

Now create this index

  1. CREATE CLUSTERED INDEX IX_TestIndex_Index ON TestIndex(SomeVal)

Now let us look at some data by using the sys.dm_db_index_physical_stats DMV. Keep this query handy, we will run it many times

  1. SELECT OBJECT_NAME(OBJECT_ID) AS Tablename,s.name AS Indexname
  2. ,index_type_desc
  3. ,avg_fragmentation_in_percent
  4. ,page_count
  5. FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL , NULL, N‘LIMITED’) d
  6. join sysindexes s ON d.OBJECT_ID = s.id
  7. and d.index_id = s.indid
  8. and s.name =‘IX_TestIndex_Index’


(Result Set)

Tablename Indexname index_type_desc avg_fragmentation_in_percent page_count
TestIndex IX_TestIndex_Index CLUSTERED INDEX 0.22172949002217296 451

That is good, almost no fragmentation. Let’s change that shall we?

  1. UPDATE TestIndex
  2. SET SomeVal = NEWID()

Okay, now you can see that the index is completely fragmented, we are also using 955 pages to store the data instead of 451
(Result Set)

Tablename Indexname index_type_desc avg_fragmentation_in_percent page_count
TestIndex IX_TestIndex_Index CLUSTERED INDEX 99.3717277486911 955

There are two ways to fix fragmentation, one is to reorganize the index and the other is to rebuild the index. Reorganize is an online operation while rebuild is not unless you specify ONLINE = ON, ONLINE = ON will only work on Enterprise editions of SQL Server.
Here is how to do a reorganize

  1. ALTER INDEX IX_TestIndex_Index ON TestIndex
  2. REORGANIZE;

(Result Set)

Tablename Indexname index_type_desc avg_fragmentation_in_percent page_count
TestIndex IX_TestIndex_Index CLUSTERED INDEX 2.8824833702882482 451


As you can see after the reorganize(DBCC INDEXDEFRAG for you SQL Server 2000 folks) fragmentation levels dropped to less than 3 percent.

Just for fun let’s also rebuild(Drop and recreate the index for you SQL Server 2000 folks) the index

  1. ALTER INDEX IX_TestIndex_Index ON TestIndex
  2. REBUILD;

(Result Set)

Tablename Indexname index_type_desc avg_fragmentation_in_percent page_count
TestIndex IX_TestIndex_Index CLUSTERED INDEX 0.22222222222222221 450

As you can see the rebuild made fragmentation almost 0

Here are two differences between REBUILD ONLINE = ON and REBUILD ONLINE = OFF
ON
Long-term table locks are not held for the duration of the index operation. During the main phase of the index operation, only an Intent Share (IS) lock is held on the source table. This allows queries or updates to the underlying table and indexes to continue. At the start of the operation, a Shared (S) lock is very briefly held on the source object. At the end of the operation, an S lock is very briefly held on the source if a nonclustered index is being created, or an SCH-M (Schema Modification) lock is acquired when a clustered index is created or dropped online, or when a clustered or nonclustered index is being rebuilt. ONLINE cannot be set to ON when an index is being created on a local temporary table.

OFF
Table locks are applied for the duration of the index operation. An offline index operation that creates, rebuilds, or drops a clustered, spatial, or XML index, or rebuilds or drops a nonclustered index, acquires a Schema modification (Sch-M) lock on the table. This prevents all user access to the underlying table for the duration of the operation. An offline index operation that creates a nonclustered index acquires a Shared (S) lock on the table. This prevents updates to the underlying table but allows read operations, such as SELECT statements.

Of course you will not run rebuild/reorganize manually for every index in your database, Michelle Ufford from the SQL Fool blog has a nice post with just a script which can do this automatically, you can find that here: Index Defrag Script

This post is also on our SQL Server Admin Hacks

1 comment »Send a trackback » 1812 views

Only In A Database Can You Get 1000% + Improvement By Changing A Few Lines Of Code

by SQLDenis


Permalink 17 Aug 2008 07:55 , Categories: Data Modelling & Design Tags: database, dates, indexing, performance tuning, rdbms, sql, t-sql, temporal data

Take a look at this query.

  1. SELECT * FROM
  2.  
  3. (
  4.  
  5. SELECT customer_id, ‘MTD’ AS record_type, COUNT(*), SUM(), AVG()
  6.  
  7. FROM payment_table
  8.  
  9.  
  10. WHERE YEAR(payment_dt) = YEAR(GETDATE())
  11.  
  12. and MONTH(payment_dt) = MONTH(GETDATE())
  13.  
  14. GROUP BY customer_id) MTD_payments
  15.  
  16. UNION ALL
  17.  
  18. (
  19.  
  20. SELECT customer_id, ‘YTD’ AS record_type, COUNT(*), SUM(), AVG()
  21.  
  22. FROM payment_table
  23.  
  24. WHERE
  25.  
  26. WHERE YEAR(payment_dt) = YEAR(GETDATE())
  27.  
  28. GROUP BY customer_id) YTD_payments
  29.  
  30. UNION ALL
  31.  
  32. (
  33.  
  34. SELECT customer_id, ‘LTD’ AS record_type, COUNT(*), SUM(), AVG()
  35.  
  36. FROM payment_table) LTD_payments
  37.  
  38. ) payments_report
  39.  
  40. ORDER BY customer_id, record_type

Can you see the problem?
A person had this query, it would run for over 24 hours. Wow, that is pretty bad, I don’t think I had ever written something that ran over an hour, and the ones I did were mostly defragmentation and update statistics jobs.

The problem is that the following piece of code

  1. WHERE YEAR(payment_dt) = YEAR(GETDATE())
  2. and MONTH(payment_dt) = MONTH(GETDATE())

is not sargable. First what does it mean to be sargable? A query is said to be sargable if the DBMS engine can take advantage of an index to speed up the execution of the query (using index seeks, not covering indexes). The term is derived from a contraction of Search ARGument Able.

This query is not sargable because there is a function on the column, whenever you use a function on the column you will not get an index seek but an index scan. The difference between an index seek and an index scan can be explained like this: when searching for something in a book, you go to the index in the back find the page number and go to the page, that is an index seek. When looking for something in a book you go from page one until the last page, read all the words on all the ages and get what you need, that was an index scan. Do you see how much more expensive in terms of performance that was?

Let’s get back to the query, what can we do to make this piece of code use an index seek?

  1. WHERE YEAR(payment_dt) = YEAR(GETDATE())
  2. and MONTH(payment_dt) = MONTH(GETDATE())

You would change it to this:

  1. WHERE payment_dt >= DATEADD(mm, DATEDIFF(mm, 0, GETDATE())+0, 0)
  2. and payment_dt < DATEADD(mm, DATEDIFF(mm, 0, GETDATE())+1, 0)

You can see the complete question on the MSDN forum site here:
http://forums.microsoft.com/msdn/ShowPost.aspx?PostID=3746751&SiteID=1

The Person said that his query went from over 24 hours to 36 seconds. Wow!! That is very significant. hardware cannot help you out if you have bad queries like that.

The same exact day I answered a very similar question, take a look here: http://forums.microsoft.com/msdn/ShowPost.aspx?PostID=3752248&SiteID=1

The person had this

  1. AND DATEDIFF(d, ‘08/10/2008′, DateCreated) >= 0
  2. AND DATEDIFF(d, DateCreated, ‘08/15/2008′) >= 0

I told him to change it to this

  1. AND DateCreated >= ‘08/10/2008′
  2. and DateCreated <= ‘08/15/2008′

And that solved that query. If you are interested in some more performance, I have written some Query Optimization items on the LessThanDot Wiki. Below are some direct links

Case Sensitive Search
No Functions on Left Side of Operator
Query Optimizations With Dates
Optimization: Set Nocount On
No Math In Where Clause
Don't Use (select *), but List Columns

If you are interested in some blogposts about dates, take a look at these two which I wrote earlier
How Are Dates Stored In SQL Server?
Do You Know How Between Works With Dates?

16 comments »1 trackback » 5852 views