I once had a boss whose desk looked something like this:
Shudder. I like things organized, from the files on my desk to the files in my database. There’s a mechanism in SQL Server to help you separate and organize files: filegroups.
What is a Filegroup?
A filegroup is a logical structure to group objects in a database. Don’t confuse filegroups with actual files (.mdf, .ddf, .ndf, .ldf, etc.). You can have multiple filegroups per database. One filegroup will be the primary, and all system tables are stored on it. Then, you add additional filegroups. You can specify one filegroup as the default, and objects not specifically assigned to a filegroup will exist in the default. In a filegroup, you can have multiple files.
Only data files can be assigned to filegroups. Log space is managed separately from data space.
Why Should I Create Multiple Filegroups?
There are two primary reasons for creating filegroups: performance and recovery.
Filegroups that contain files created on specific disks can alleviate disk performance issues. For example, you may have one very large table in your database with a lot of read and write activity – an orders table, perhaps. You can create a filegroup, create a file in the filegroup, and then move a table to the filegroup by moving the clustered index. (I’ll cover how to do this later in this post.) If the file is created on a disk separate from other files, you are going to have better performance. This is similar to the logic behind separating data and log files in a database. Performance improves when you spread files across multiple disks because you have multiple heads reading and writing, rather than one doing all the work.
Filegroups can be backed up and restored separately as well. This can enable faster object recovery in the case of a disaster. It can also help the administration of large databases.
How Do I Create Multiple Filegroups?
If you are creating a new database, you can specify the filegroups in the CREATE DATABASE statement.
Here, I will create a database named FilegroupTest. It has two filegroups, PRIMARY and FGTestFG2. There are two data files, FGTest1_dat, assigned to PRIMARY; and FGTest2_dat, assigned to FGTestFG2.
T-SQL | |
1 2 3 4 5 6 7 8 9 10 | CREATE DATABASE FilegroupTest ON PRIMARY (NAME = FGTest1_dat, FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL10_50.MSSQLSERVERMSSQLDATAFGTest1_dat.mdf'), FILEGROUP FGTestFG2 (NAME = FGTest2_dat, FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL10_50.MSSQLSERVERMSSQLDATAFGTest2_dat.mdf') LOG ON (NAME = FGTest_log, FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL10_50.MSSQLSERVERMSSQLDATAFGTest_log.ldf') |
CREATE DATABASE FilegroupTest ON PRIMARY (NAME = FGTest1_dat, FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL10_50.MSSQLSERVERMSSQLDATAFGTest1_dat.mdf'), FILEGROUP FGTestFG2 (NAME = FGTest2_dat, FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL10_50.MSSQLSERVERMSSQLDATAFGTest2_dat.mdf') LOG ON (NAME = FGTest_log, FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL10_50.MSSQLSERVERMSSQLDATAFGTest_log.ldf')
If you have an existing database, use can use the ALTER DATABASE statement to add a filegroup. I’m going to add FGTestFG3 to FilegroupTest.
T-SQL | |
1 2 | ALTER DATABASE FilegroupTest ADD FILEGROUP FGTestFG3 |
ALTER DATABASE FilegroupTest ADD FILEGROUP FGTestFG3
I can view the filegroups in a database using sys.filegroups.
T-SQL | |
1 2 3 4 | USE FilegroupTest; GO SELECT * FROM sys.filegroups |
USE FilegroupTest; GO SELECT * FROM sys.filegroups
To create a new file, FGTest3_dat, and assign it to FGTestFG3, I’ll use ALTER DATABASE again.
T-SQL | |
1 2 3 4 5 | ALTER DATABASE FilegroupTest ADD FILE (NAME = FGTest3_dat, FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL10_50.MSSQLSERVERMSSQLDATAFGTest3_dat.mdf') TO FILEGROUP FGTestFG3 |
ALTER DATABASE FilegroupTest ADD FILE (NAME = FGTest3_dat, FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL10_50.MSSQLSERVERMSSQLDATAFGTest3_dat.mdf') TO FILEGROUP FGTestFG3
Right now, my PRIMARY filegroup is the default filegroup. I can change that to FGTestFG3 using ALTER DATABASE.
T-SQL | |
1 2 | ALTER DATABASE FilegroupTest MODIFY FILEGROUP FGTestFG3 DEFAULT |
ALTER DATABASE FilegroupTest MODIFY FILEGROUP FGTestFG3 DEFAULT
When I re-run my sys.filegroups query, is_default value has changed.
How Do I Move An Object to a Different Filegroup?
You can move a table from one filegroup to another, provided the table has a clustered index on it.
Note: You can move a heap (a table with no clustered index). To do so, you would create an index, move it, and drop the index.
First, I create a table with no indexes.
T-SQL | |
1 2 3 | CREATE TABLE StuffAndJunk (StuffHere INT NOT NULL, JunkHere INT NOT NULL) |
CREATE TABLE StuffAndJunk (StuffHere INT NOT NULL, JunkHere INT NOT NULL)
I can use sp_help to see which filegroup this was created on. It was created on the default, FGTestFG3.
T-SQL | |
1 | exec sp_help 'dbo.StuffAndJunk' |
exec sp_help 'dbo.StuffAndJunk'
I can also return this information using the sys.filegroups, sys.allocation_units and sys.partitions tables.
T-SQL | |
1 2 3 4 5 6 | SELECT PA.object_id, FG.name FROM sys.filegroups FG INNER JOIN sys.allocation_units AU ON AU.data_space_id = FG.data_space_id INNER JOIN sys.partitions PA ON PA.partition_id = AU.container_id WHERE PA.object_id = (SELECT object_id(N'FilegroupTest.dbo.StuffAndJunk')) |
SELECT PA.object_id, FG.name FROM sys.filegroups FG INNER JOIN sys.allocation_units AU ON AU.data_space_id = FG.data_space_id INNER JOIN sys.partitions PA ON PA.partition_id = AU.container_id WHERE PA.object_id = (SELECT object_id(N'FilegroupTest.dbo.StuffAndJunk'))
I cannot move the table only. That is simply not part of the ALTER TABLE syntax. According to BOL, MOVE TO “Specifies a location to move the data rows currently in the leaf level of the clustered index.”
Note: It is possible to create a table in a secondary filegroup, move the data from the first filegroup to the second, and then drop the table from the primary. Be aware that these types of operations cause a high level of transaction log entries. Ensure that the transaction log is properly sized to prevent a large amount of growth in the logs or inadvertently affect things that rely on the transaction logs such as log shipping or mirroring.
I’m going to add a clustered index to the table. When I do this, I specify which filegroup I want it created on. I create StuffJunk on FGTestFG2.
T-SQL | |
1 2 3 | CREATE CLUSTERED INDEX StuffJunk ON StuffAndJunk (StuffHere, JunkHere) ON FGTestFG2 |
CREATE CLUSTERED INDEX StuffJunk ON StuffAndJunk (StuffHere, JunkHere) ON FGTestFG2
If I run my sys.filegroups query again, I can see I have the same object_id, but it has moved to a different filegroup.
How would I move a table with an existing clustered index? Let’s move StuffAndJunk back to FGTestFG3. I would issue a create clustered index command with the option to drop existing, like this.
T-SQL | |
1 2 3 4 | CREATE CLUSTERED INDEX StuffJunk ON StuffAndJunk (StuffHere, JunkHere) WITH (DROP_EXISTING = ON) ON FGTestFG3 |
CREATE CLUSTERED INDEX StuffJunk ON StuffAndJunk (StuffHere, JunkHere) WITH (DROP_EXISTING = ON) ON FGTestFG3
Re-running my sys.filegroups query shows that the index, and thus the data and table, are on FGTestFG3.
Organize, Organize, Organize!
Help your databases look like this:
Filegroups are a great way to organize your data, increasing performance and providing additional disaster recovery. If you can do this in the planning stages, it’s great, but be aware that you can add filegroups in later, too.
11 Comments
Good post!!
In general I will create non clustered indexes on a different file group than the one that the heap/clustered table is created on..of course placed on a different spindle
Of course I always get a chuckle when people try to place the clustered index on a different file group than the base table…you know what will happen
So is a schema 🙂
Great write up Jes!
Great post Jes!
Good post Jes!
Good post. The availability you touch upon is particularly relevant in SQL Enterprise edition where it will allow for partial availability and piecemeal restore. Best practice dictates that the Primary filegroup should only be used for system objects so any user objects ideally should be created on additional filegroups. I’m sure you already know all this Jes but thought I’d add it for the benefit of your readers.
Thanks guys!
Denis – I agree, schemas are another logical organizer. The difference is that users see schemas, whereas filegroups are really an internal structure that users wouldn’t worry about (other than creating objects on the correct one).
Mark – thanks for the great points! I will probably do a filegroup backup/restore post in the near future, since that is a great topic.
great post.
When we create a new database inside the SQL Server, it is typical that SQl Server creates two physical files in the Operating System: one with .MDF Extension, and another with .LDF Extension.
If you add one or more data files to a database, the physical file that will be created in the Operating System will have an extension of .NDF, which is called as Secondary Data File;
but in your script it is .mdf extension for Secondary file,can u please clarify on that.Which will be beter option?
I am planning to design a database which has
–> read only data
–> read and write data(booking table,customer table)
–> log data (deleting the data after 4 months)
–> system data(some user configuration settings for running the application)
the last one would be primary file group and other three would be secondary filegroup.
Is it a good design and also how to backup the DB with this design
Thank you
with regards
Dilip D
Dilip – the .mdf, .ndf, .ddf and .ldf file extensions are recommended extensions, but not required. You could place any extension on the files. The best thing you can do is create a standard for your environment, and stick to that standard.
For reference: http://msdn.microsoft.com/en-us/library/ms179316.aspx.
Great post, thanks!!
We should mention “The trouble with Text Filegroups” though.. lol … if you have EVER had a varchar(max), text, ntext or image datatype in the table, sql server will fill in the Text Filegroup to the default filegroup, usually PRIMARY. If someone then just deletes that field later, the property field will not be changed. Changing the datatype of the offending field will null the property correctly.
So, if you try to do the filegroup switch, even if you no longer have one of those datatypes in your table, that property for the table(text filegroup) is still filled in, which can cause script issues if doing the switch via scritping.
Is this fixed in any new release/patch that anyone is aware of?
Thanks!
Russ
Hi Jes,
If you have existing non-clustered indexes on the table you are moving, will they be moved as well?
Thanks a lot for your post, much appreciated.
Travis – no, nonclustered indexes are separate objects, and those would need to be moved separately.