Someone at my job needed to grab a filepath, take out the filename and then split all the pieces into their own columns which were separated by either a dot or an underscore. If you had the following table

CREATE TABLE #test (id INT primary key, PATH VARCHAR(1000))
INSERT #test VALUES(1,'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK')
INSERT #test VALUES(2,'Y:dgfdbla_2010_10_15_Pre_Calc.BAK')
INSERT #test VALUES(4,'Y:dgfdgdgdgdgdgdgdSome_Databla_2010_16_Pre_Calc.BAK')
INSERT #test VALUES(3,'Y:dgfdgdgdgdgdgdgdSome_Databla_2010_11_12_Pre_Calc.BAK')

Then this was the required output.

id PATH 1 2 3 4 5 6 7 8 9
1 bla_2010_10_14_Pre_Calc_BAK bla 2010 10 14 Pre Calc BAK NULL NULL
2 bla_2010_10_15_Pre_Calc_BAK bla 2010 10 15 Pre Calc BAK NULL NULL
3 bla_2010_11_12_Pre_Calc_BAK bla 2010 11 12 Pre Calc BAK NULL NULL
4 bla_2010_16_Pre_Calc_BAK bla 2010 16 Pre Calc BAK NULL NULL NULL

I thought that this was an interesting exercise, the person who asked me about this wrote some procedures and user defined functions but it was too slow because he was using a lot of charindex and substring functions and storing results in intermediate variables, then continued parsing until the end of the string.

In order to attack this problem you have to divide and conquer. Here are the four steps that are needed

**

  1. Take the dot out
  2. Get just the filename
  3. Split the string
  4. Pivot the results

1 Take the dot out

This is pretty easy, all you have to do is use the replace function and replace ‘.’ with ‘_’

declare @FilePath varchar(100)
SELECT @FilePath = 'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK'
SELECT REPLACE(@FilePath,'.','_')

Result

————————————————-

Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc_BAK

2 Get just the filename

To get just the filename, you use the reverse function, then you look for the position of the first backslash, you will then use the right function and use the position of the first backslash as the number of characters to keep. You add -1 to the length because you do not want the backslash included.

declare @FilePath varchar(100)
SELECT @FilePath = 'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK'
SELECT RIGHT(@FilePath,PATINDEX('%%',REVERSE(@FilePath))-1) AS PATH

Result

————————————————-

bla_2010_10_14_Pre_Calc.BAK

2A combine 1 and 2

This is just a combination of 1 and 2, take the dot out and get just the file name

declare @FilePath varchar(100)
SELECT @FilePath = 'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK'
SELECT RIGHT(REPLACE(@FilePath,'.','_'),PATINDEX('%%',REVERSE(@FilePath))-1) AS PATH

Result

————————————–

bla_2010_10_14_Pre_Calc_BAK

3 Split the string

To split the string we can use the built in table of numbers master..spt_values. Running the code below

declare @FilePath varchar(100)
SELECT @FilePath = 'Y:dgfdgdgd_Databla_2010_10_14_Pre_Calc.BAK'
SELECT @FilePath = RIGHT(REPLACE(@FilePath,'.','_'),PATINDEX('%%',REVERSE(@FilePath))-1) 

SELECT @FilePath as FilePath, SUBSTRING('_' + @FilePath + '_', Number + 1,
    CHARINDEX('_', '_' + @FilePath + '_', Number + 1) - Number -1)AS VALUE
    FROM master..spt_values v
    WHERE TYPE = 'P'
    AND Number <= LEN('_' + @FilePath  + '_') - 1
    AND SUBSTRING('_' + @FilePath  + '_', Number, 1) = '_'

This is the output

FilePath VALUE
bla_2010_10_14_Pre_Calc_BAK bla
bla_2010_10_14_Pre_Calc_BAK 2010
bla_2010_10_14_Pre_Calc_BAK 10
bla_2010_10_14_Pre_Calc_BAK 14
bla_2010_10_14_Pre_Calc_BAK Pre
bla_2010_10_14_Pre_Calc_BAK Calc
bla_2010_10_14_Pre_Calc_BAK BAK

We are almost there, instead of just one string we want to grab the whole table. We do a cross join with the numbers table and also use the ROW_NUMBER function to assign a column number which we will use as the column name later

Run this to see what we have so far

SELECT id,PATH, SUBSTRING('_' + PATH + '_', Number + 1,
    CHARINDEX('_', '_' + PATH + '_', Number + 1) - Number -1)AS VALUE,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY id,number ) AS ROW
    FROM master..spt_values v
    CROSS JOIN (SELECT id,RIGHT(REPLACE(PATH,'.','_'),PATINDEX('%%',REVERSE(PATH))-1) AS PATH  
    FROM #test ) AS #test
    WHERE TYPE = 'P'
    AND Number <= LEN('_' + PATH  + '_') - 1
    AND SUBSTRING('_' + PATH  + '_', Number, 1) = '_'
    

Here is the output

id PATH VALUE ROW
1 bla_2010_10_14_Pre_Calc_BAK bla 1
1 bla_2010_10_14_Pre_Calc_BAK 2010 2
1 bla_2010_10_14_Pre_Calc_BAK 10 3
1 bla_2010_10_14_Pre_Calc_BAK 14 4
1 bla_2010_10_14_Pre_Calc_BAK Pre 5
1 bla_2010_10_14_Pre_Calc_BAK Calc 6
1 bla_2010_10_14_Pre_Calc_BAK BAK 7
2 bla_2010_10_15_Pre_Calc_BAK bla 1
2 bla_2010_10_15_Pre_Calc_BAK 2010 2
2 bla_2010_10_15_Pre_Calc_BAK 10 3
2 bla_2010_10_15_Pre_Calc_BAK 15 4
2 bla_2010_10_15_Pre_Calc_BAK Pre 5
2 bla_2010_10_15_Pre_Calc_BAK Calc 6
2 bla_2010_10_15_Pre_Calc_BAK BAK 7
3 bla_2010_11_12_Pre_Calc_BAK bla 1
3 bla_2010_11_12_Pre_Calc_BAK 2010 2
3 bla_2010_11_12_Pre_Calc_BAK 11 3
3 bla_2010_11_12_Pre_Calc_BAK 12 4
3 bla_2010_11_12_Pre_Calc_BAK Pre 5
3 bla_2010_11_12_Pre_Calc_BAK Calc 6
3 bla_2010_11_12_Pre_Calc_BAK BAK 7
4 bla_2010_16_Pre_Calc_BAK bla 1
4 bla_2010_16_Pre_Calc_BAK 2010 2
4 bla_2010_16_Pre_Calc_BAK 16 3
4 bla_2010_16_Pre_Calc_BAK Pre 4
4 bla_2010_16_Pre_Calc_BAK Calc 5
4 bla_2010_16_Pre_Calc_BAK BAK 6

The missing piece is transposing the columns, this is easily accomplished by using the PIVOT operator. As you can see we use the ROW_NUMBER function to create a number for us, this number is then used as a column name by the Pivot operator. Here is the final query which produced the desired output

SELECT * FROM
   ( 
    SELECT id,PATH, SUBSTRING('_' + PATH + '_', Number + 1,
    CHARINDEX('_', '_' + PATH + '_', Number + 1) - Number -1)AS VALUE,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY id,number ) AS ROW
    FROM master..spt_values v
    CROSS JOIN (SELECT id,RIGHT(REPLACE(PATH,'.','_'),PATINDEX('%%',REVERSE(PATH))-1) AS PATH  
    FROM #test ) AS #test
    WHERE TYPE = 'P'
    AND Number <= LEN('_' + PATH  + '_') - 1
    AND SUBSTRING('_' + PATH  + '_', Number, 1) = '_') AS pivTemp
    PIVOT
(   MAX(VALUE)
    FOR ROW IN ([1],[2],[3],[4],[5],[6],[7],[8],[9])
  ) AS pivTable

The query produces the desired output.

id PATH 1 2 3 4 5 6 7 8 9
1 bla_2010_10_14_Pre_Calc_BAK bla 2010 10 14 Pre Calc BAK NULL NULL
2 bla_2010_10_15_Pre_Calc_BAK bla 2010 10 15 Pre Calc BAK NULL NULL
3 bla_2010_11_12_Pre_Calc_BAK bla 2010 11 12 Pre Calc BAK NULL NULL
4 bla_2010_16_Pre_Calc_BAK bla 2010 16 Pre Calc BAK NULL NULL NULL

Conclusion

As you can see, if you apply small steps and divide the problem, you can easily solve it even if it looks very difficult at first glance. Remember most likely you can already use something which is built into the product, try not to reinvent the wheel.

*** Remember, if you have a SQL related question, try our Microsoft SQL Server Programming forum or our Microsoft SQL Server Admin forum