This post will illustrate two methods for removing old files from directories using SSIS. This task is often used to delete old backup files and other ETL files that are not required any longer. We’ll step through two methods. First method uses a script task entirely for the removal and the logging events. This method will also have some comments in for logging and using the FireInformation method to mimic the normal logging abilities of SSIS. The FireInformation method didn’t provide much more of a performance boost so it wasn’t used here. Second method uses a Foreach Loop Container, Script Task for logic and a File System Task for the delete event. SSIS Logging will be utilized with the OnPreExecute and OnPostExecute events in the second method over the System.IO method of AppendText.

The use of System.IO in a script task performed much better than the use of the File System Task.

Both methods were tested on folders containing 300 files with size ranging from 200MB to 1GB. 168 files were placed among the 300 in order to meet the properties that will require a delete event to fire.

Deletion by System.IO and manually logging to flat files elapsed in 156 Milliseconds.

Deletion by means of a Foreach Loop Container containing a Script Task to validate the file meets criteria (File System Task is limited in doing that) and then the File System Task to force the delete took a whopping 3.962 seconds!

Have fun playing with this yourself and hope it comes in handy when you need to perform this type of task.

If anyone has other methods or improvements to these, please feel free to comment or start a thread in the LessThanDot SQL Server forums.

Method one: Script Task

  1. Bring over a Script Task into the Control Flow
  2. Name the task, “Remove all files based on backdays”
  3. Create the following variables
  4. Double click the script task to open the editor.
  5. Add the variable to the ReadOnlyVariables area
  6. Click Edit Script
  7. Add the following code
  8. C#
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    
    using System;
    using System.IO;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
     
    namespace ST_aaf20a0bd5b94ad394ab80cf5d585c41.csproj
    {
        [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
        public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
        {
     
            #region VSTA generated code
            enum ScriptResults
            {
                Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
                Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
            };
            #endregion
     
     
            public void Main()
            {
                try
                {
                    string logfile = Dts.Variables["User::logfile"].Value.ToString();
                    DateTime olderThanDate = (DateTime)Dts.Variables["User::daysback"].Value;
                    string folder = Dts.Variables["User::folder"].Value.ToString();
     
                        if (!(File.Exists(logfile)))
                        {
                            FileInfo fi = new FileInfo(logfile.ToString());
                            FileStream fstr = fi.Create();
                            fstr.Close();
                        }
                        ActionJackson(folder, logfile, olderThanDate);
                }
                catch (Exception Ex)
                {
                    Dts.Events.FireError(1, "", "BOOM!!!  " + Ex.Message.ToString(), "", 0);
                }
     
                Dts.TaskResult = (int)ScriptResults.Success;
            }
     
            private void ActionJackson(string folder, string path, DateTime olderThanDate)
            {
                    DirectoryInfo dirInfo = new DirectoryInfo(folder);
                    TextWriter tw = File.AppendText(path);
     
                    FileInfo[] files = dirInfo.GetFiles();
                    foreach (FileInfo file in files)
                    {
                        if (file.LastWriteTime < olderThanDate)
                        {
                            //or use Dts.Events.FireInformation(0, "", "File Deleted Succesfully", "", 0, True)
                            //with a logging file destination setup
                            //the two methods did not vary in performance much.  FireInformation allows for 
                            //the same descriptive logging as OnPreExecute/OnPostExecute descriptions 
                            tw.WriteLine("File " + file.FullName.ToString() + " Deleted on " + System.DateTime.Now.ToString());
                            file.IsReadOnly = false;
                            file.Delete();
                        }
                    }
                    tw.Close();
            }
        }
    }
    using System;
    using System.IO;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    
    namespace ST_aaf20a0bd5b94ad394ab80cf5d585c41.csproj
    {
        [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
        public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
        {
    
            #region VSTA generated code
            enum ScriptResults
            {
                Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
                Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
            };
            #endregion
    
    
            public void Main()
            {
                try
                {
                    string logfile = Dts.Variables["User::logfile"].Value.ToString();
                    DateTime olderThanDate = (DateTime)Dts.Variables["User::daysback"].Value;
                    string folder = Dts.Variables["User::folder"].Value.ToString();
    
                        if (!(File.Exists(logfile)))
                        {
                            FileInfo fi = new FileInfo(logfile.ToString());
                            FileStream fstr = fi.Create();
                            fstr.Close();
                        }
                        ActionJackson(folder, logfile, olderThanDate);
                }
                catch (Exception Ex)
                {
                    Dts.Events.FireError(1, "", "BOOM!!!  " + Ex.Message.ToString(), "", 0);
                }
    
                Dts.TaskResult = (int)ScriptResults.Success;
            }
    
            private void ActionJackson(string folder, string path, DateTime olderThanDate)
            {
                    DirectoryInfo dirInfo = new DirectoryInfo(folder);
                    TextWriter tw = File.AppendText(path);
    
                    FileInfo[] files = dirInfo.GetFiles();
                    foreach (FileInfo file in files)
                    {
                        if (file.LastWriteTime < olderThanDate)
                        {
                            //or use Dts.Events.FireInformation(0, "", "File Deleted Succesfully", "", 0, True)
                            //with a logging file destination setup
                            //the two methods did not vary in performance much.  FireInformation allows for 
                            //the same descriptive logging as OnPreExecute/OnPostExecute descriptions 
                            tw.WriteLine("File " + file.FullName.ToString() + " Deleted on " + System.DateTime.Now.ToString());
                            file.IsReadOnly = false;
                            file.Delete();
                        }
                    }
                    tw.Close();
            }
        }
    }
  9. Close the code editor and click OK to the script task editor.
  10. Right click an empty space in the Control Flow and select Package Configurations
  11. Check Enable package configurations
  12. Click Add and enter C:DelConfig.xml to the specify configuration settings directly textbox
  13. Click Finish and Close to the configurations editor.
  14. You can open the XML file to edit the folder location and the log file location now
  15. Import the package into SSIS and execute it to run.

Method two: Foreach Loop Container

  1. Bring over a Foreach Loop Container in the Control Flow
  2. Create these variables
  3. Bring over a Script Task and drop it into the Foreach Loop Container
  4. Double click the Script Task and add these ReadOnly variables
  5. Click the Edit Script button and add this code
  6. C#
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    
    using System;
    using System.Data;
    using System.IO;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
     
    namespace ST_216af79b563f4866bb64f6043b232b4e.csproj
    {
        [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
        public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
        {
     
            #region VSTA generated code
            enum ScriptResults
            {
                Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
                Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
            };
            #endregion
     
            public void Main()
            {
                try
                {
                    FileInfo fi = new FileInfo(Dts.Variables["User::filefocus"].Value.ToString());
     
                    if (fi.LastWriteTime < (DateTime)Dts.Variables["User::daysback"].Value)
                    {
                        Dts.TaskResult = (int)ScriptResults.Success;
                    }
                }
                catch (Exception Ex)
                {
                    Dts.Events.FireError(1, "", "BOOM!!!  " + Ex.Message.ToString(), "", 0);
                }
            }
        }
    }
    using System;
    using System.Data;
    using System.IO;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    
    namespace ST_216af79b563f4866bb64f6043b232b4e.csproj
    {
        [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
        public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
        {
    
            #region VSTA generated code
            enum ScriptResults
            {
                Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
                Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
            };
            #endregion
    
            public void Main()
            {
                try
                {
                    FileInfo fi = new FileInfo(Dts.Variables["User::filefocus"].Value.ToString());
    
                    if (fi.LastWriteTime < (DateTime)Dts.Variables["User::daysback"].Value)
                    {
                        Dts.TaskResult = (int)ScriptResults.Success;
                    }
                }
                catch (Exception Ex)
                {
                    Dts.Events.FireError(1, "", "BOOM!!!  " + Ex.Message.ToString(), "", 0);
                }
            }
        }
    }
  7. Close the code editor and hit OK to exit and save the script task settings.
  8. Bring over and drop a File System Task into the Foreach Loop Container
  9. Connect the Script Task to the File System Task
  10. Double click the Foreach Loop Container and go to Collection.
  11. Enter in a default folder
  12. In Variable Mappings, drop down the Variable and select the User::filefocus variable and leave the index at 0
  13. Click OK to close the editor
  14. Double click the File System Task
  15. Change Operation to Delete
  16. Change IsSourcePathVariable to True
  17. SourceVariable to User::filefocus
  18. Click OK to exit and save
  19. Add a package configurations file as an XML file
  20. Right click an empty space in the Control Flow window
  21. Enter FileEnumConfig.xml and click, Next
  22. Click the variables filefocus and folder
  23. Name the configuration
  24. Click Finish and then Close to exit and save
  25. Click SSIS in the menu strip and select Logging
  26. Check container FileEnumTest
  27. Select SSIS log provider for Text Files
  28. Drop down configuration and select New Connection
  29. Select Create File and enter C:DelbyFilEnum.txt
  30. Click OK and then OK to exit and save
  31. You can open the XML file to edit the folder location and the log file location now
  32. Import the package into SSIS and execute it to run.

Logging from the first method appears in blocks like this

File C:TestDeleteSystemIOFile_20100603121501.trn Deleted on 6/4/2010 7:11:38 AM
File C:TestDeleteSystemIOFile_20100603123001.trn Deleted on 6/4/2010 7:11:38 AM

compared to the SSIS logging method of

Diagnostic,onpnt,onpnt,Delete Old File,{E4830550-A51A-46F3-A374-243888315707},{969D771F-24A9-4A57-B98D-42B025BE2573},6/4/2010 7:31:25 AM,6/4/2010 7:31:25 AM,0,(null),Trying to delete file ‘C:TestDeleteFileSystemTaskFile_20100603191001.trn’.
Diagnostic,onpnt,onpnt,Delete Old File,{E4830550-A51A-46F3-A374-243888315707},{969D771F-24A9-4A57-B98D-42B025BE2573},6/4/2010 7:31:25 AM,6/4/2010 7:31:25 AM,0,(null),Finished deleting file ‘C:TestDeleteFileSystemTaskFile_20100603191001.trn’.

Conclusions

The use of the script task and going to the System.IO namespace to do the work has some benefits over the second method. The largest was speed. The second was the administrative and development time needed.

Both of these methods are sound and good options. The File System Task usage removes the need for more extensive programming over the Script Task. There are also benefits in the logging method used and the amount of information we can easily obtain during the execution progress. Picking either method will be decided on comfort level with coding in .NET and performance requirements while giving up or gaining functionality.