Someone at my job needed to grab a filepath, take out the filename and then split all the pieces into their own columns which were separated by either a dot or an underscore. If you had the following table
CREATE TABLE #test (id INT primary key, PATH VARCHAR(1000))
INSERT #test VALUES(1,'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK')
INSERT #test VALUES(2,'Y:dgfdbla_2010_10_15_Pre_Calc.BAK')
INSERT #test VALUES(4,'Y:dgfdgdgdgdgdgdgdSome_Databla_2010_16_Pre_Calc.BAK')
INSERT #test VALUES(3,'Y:dgfdgdgdgdgdgdgdSome_Databla_2010_11_12_Pre_Calc.BAK')
Then this was the required output.
| id |
|---|
I thought that this was an interesting exercise, the person who asked me about this wrote some procedures and user defined functions but it was too slow because he was using a lot of charindex and substring functions and storing results in intermediate variables, then continued parsing until the end of the string.
In order to attack this problem you have to divide and conquer. Here are the four steps that are needed
**
- Take the dot out
- Get just the filename
- Split the string
- Pivot the results
1 Take the dot out
This is pretty easy, all you have to do is use the replace function and replace ‘.’ with ‘_’
declare @FilePath varchar(100)
SELECT @FilePath = 'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK'
SELECT REPLACE(@FilePath,'.','_')
Result
————————————————-
Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc_BAK
2 Get just the filename
To get just the filename, you use the reverse function, then you look for the position of the first backslash, you will then use the right function and use the position of the first backslash as the number of characters to keep. You add -1 to the length because you do not want the backslash included.
declare @FilePath varchar(100)
SELECT @FilePath = 'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK'
SELECT RIGHT(@FilePath,PATINDEX('%%',REVERSE(@FilePath))-1) AS PATH
Result
————————————————-
bla_2010_10_14_Pre_Calc.BAK
2A combine 1 and 2
This is just a combination of 1 and 2, take the dot out and get just the file name
declare @FilePath varchar(100)
SELECT @FilePath = 'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK'
SELECT RIGHT(REPLACE(@FilePath,'.','_'),PATINDEX('%%',REVERSE(@FilePath))-1) AS PATH
Result
————————————–
bla_2010_10_14_Pre_Calc_BAK
3 Split the string
To split the string we can use the built in table of numbers master..spt_values. Running the code below
declare @FilePath varchar(100)
SELECT @FilePath = 'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK'
SELECT @FilePath = RIGHT(REPLACE(@FilePath,'.','_'),PATINDEX('%%',REVERSE(@FilePath))-1)
SELECT @FilePath as FilePath, SUBSTRING('_' + @FilePath + '_', Number + 1,
CHARINDEX('_', '_' + @FilePath + '_', Number + 1) - Number -1)AS VALUE
FROM master..spt_values v
WHERE TYPE = 'P'
AND Number <= LEN('_' + @FilePath + '_') - 1
AND SUBSTRING('_' + @FilePath + '_', Number, 1) = '_'
This is the output
| FilePath |
|---|
We are almost there, instead of just one string we want to grab the whole table. We do a cross join with the numbers table and also use the ROW_NUMBER function to assign a column number which we will use as the column name later
Run this to see what we have so far
SELECT id,PATH, SUBSTRING('_' + PATH + '_', Number + 1,
CHARINDEX('_', '_' + PATH + '_', Number + 1) - Number -1)AS VALUE,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY id,number ) AS ROW
FROM master..spt_values v
CROSS JOIN (SELECT id,RIGHT(REPLACE(PATH,'.','_'),PATINDEX('%%',REVERSE(PATH))-1) AS PATH
FROM #test ) AS #test
WHERE TYPE = 'P'
AND Number <= LEN('_' + PATH + '_') - 1
AND SUBSTRING('_' + PATH + '_', Number, 1) = '_'
Here is the output
| id |
|---|
The missing piece is transposing the columns, this is easily accomplished by using the PIVOT operator. As you can see we use the ROW_NUMBER function to create a number for us, this number is then used as a column name by the Pivot operator. Here is the final query which produced the desired output
SELECT * FROM
(
SELECT id,PATH, SUBSTRING('_' + PATH + '_', Number + 1,
CHARINDEX('_', '_' + PATH + '_', Number + 1) - Number -1)AS VALUE,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY id,number ) AS ROW
FROM master..spt_values v
CROSS JOIN (SELECT id,RIGHT(REPLACE(PATH,'.','_'),PATINDEX('%%',REVERSE(PATH))-1) AS PATH
FROM #test ) AS #test
WHERE TYPE = 'P'
AND Number <= LEN('_' + PATH + '_') - 1
AND SUBSTRING('_' + PATH + '_', Number, 1) = '_') AS pivTemp
PIVOT
( MAX(VALUE)
FOR ROW IN ([1],[2],[3],[4],[5],[6],[7],[8],[9])
) AS pivTable
The query produces the desired output.
| id |
|---|
Conclusion
As you can see, if you apply small steps and divide the problem, you can easily solve it even if it looks very difficult at first glance. Remember most likely you can already use something which is built into the product, try not to reinvent the wheel.
*** Remember, if you have a SQL related question, try our Microsoft SQL Server Programming forum or our Microsoft SQL Server Admin forum

Denis has been working with SQL Server since version 6.5. Although he worked as an ASP/JSP/ColdFusion developer before the dot com bust, he has been working exclusively as a database developer/architect since 2002. In addition to English, Denis is also fluent in Croatian and Dutch, but he can curse in many other languages and dialects (just ask the SQL optimizer) He lives in Princeton, NJ with his wife and three kids.