Login or Sign Up to become a member!
LessThanDot Sit Logo

LessThanDot

Data Management

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot friendfeed Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Your profile

    Search

    XML Feeds

    Google Ads

    « How to refresh the local IntelliSense cache in SQL Server Management StudioError: The GetBytes function can only be used on columns of type Text, NText, or Image »
    comments

    This post will illustrate two methods for removing old files from directories using SSIS. This task is often used to delete old backup files and other ETL files that are not required any longer. We’ll step through two methods. First method uses a script task entirely for the removal and the logging events. This method will also have some comments in for logging and using the FireInformation method to mimic the normal logging abilities of SSIS. The FireInformation method didn’t provide much more of a performance boost so it wasn’t used here. Second method uses a Foreach Loop Container, Script Task for logic and a File System Task for the delete event. SSIS Logging will be utilized with the OnPreExecute and OnPostExecute events in the second method over the System.IO method of AppendText.

    The use of System.IO in a script task performed much better than the use of the File System Task.

    Both methods were tested on folders containing 300 files with size ranging from 200MB to 1GB. 168 files were placed among the 300 in order to meet the properties that will require a delete event to fire.

    Deletion by System.IO and manually logging to flat files elapsed in 156 Milliseconds.

    Deletion by means of a Foreach Loop Container containing a Script Task to validate the file meets criteria (File System Task is limited in doing that) and then the File System Task to force the delete took a whopping 3.962 seconds!

    Have fun playing with this yourself and hope it comes in handy when you need to perform this type of task.

    If anyone has other methods or improvements to these, please feel free to comment or start a thread in the LessThanDot SQL Server forums.

    Method one: Script Task

    1. Bring over a Script Task into the Control Flow
    2. Name the task, "Remove all files based on backdays"
    3. Create the following variables
    4. Double click the script task to open the editor.
    5. Add the variable to the ReadOnlyVariables area
    6. Click Edit Script
    7. Add the following code
      1. using System;
      2. using System.IO;
      3. using System.Data;
      4. using Microsoft.SqlServer.Dts.Runtime;
      5. using System.Windows.Forms;
      6.  
      7. namespace ST_aaf20a0bd5b94ad394ab80cf5d585c41.csproj
      8. {
      9.     [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
      10.     public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
      11.     {
      12.  
      13.         #region VSTA generated code
      14.         enum ScriptResults
      15.         {
      16.             Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
      17.             Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
      18.         };
      19.         #endregion
      20.  
      21.  
      22.         public void Main()
      23.         {
      24.             try
      25.             {
      26.                 string logfile = Dts.Variables["User::logfile"].Value.ToString();
      27.                 DateTime olderThanDate = (DateTime)Dts.Variables["User::daysback"].Value;
      28.                 string folder = Dts.Variables["User::folder"].Value.ToString();
      29.  
      30.                     if (!(File.Exists(logfile)))
      31.                     {
      32.                         FileInfo fi = new FileInfo(logfile.ToString());
      33.                         FileStream fstr = fi.Create();
      34.                         fstr.Close();
      35.                     }
      36.                     ActionJackson(folder, logfile, olderThanDate);
      37.             }
      38.             catch (Exception Ex)
      39.             {
      40.                 Dts.Events.FireError(1, "", "BOOM!!!  " + Ex.Message.ToString(), "", 0);
      41.             }
      42.  
      43.             Dts.TaskResult = (int)ScriptResults.Success;
      44.         }
      45.  
      46.         private void ActionJackson(string folder, string path, DateTime olderThanDate)
      47.         {
      48.                 DirectoryInfo dirInfo = new DirectoryInfo(folder);
      49.                 TextWriter tw = File.AppendText(path);
      50.  
      51.                 FileInfo[] files = dirInfo.GetFiles();
      52.                 foreach (FileInfo file in files)
      53.                 {
      54.                     if (file.LastWriteTime < olderThanDate)
      55.                     {
      56.                         //or use Dts.Events.FireInformation(0, "", "File Deleted Succesfully", "", 0, True)
      57.                         //with a logging file destination setup
      58.                         //the two methods did not vary in performance much.  FireInformation allows for
      59.                         //the same descriptive logging as OnPreExecute/OnPostExecute descriptions
      60.                         tw.WriteLine("File " + file.FullName.ToString() + " Deleted on " + System.DateTime.Now.ToString());
      61.                         file.IsReadOnly = false;
      62.                         file.Delete();
      63.                     }
      64.                 }
      65.                 tw.Close();
      66.         }
      67.     }
      68. }
    8. Close the code editor and click OK to the script task editor.
    9. Right click an empty space in the Control Flow and select Package Configurations
    10. Check Enable package configurations
    11. Click Add and enter C:\DelConfig.xml to the specify configuration settings directly textbox
    12. Click Finish and Close to the configurations editor.
    13. You can open the XML file to edit the folder location and the log file location now
    14. Import the package into SSIS and execute it to run.

    Method two: Foreach Loop Container

    1. Bring over a Foreach Loop Container in the Control Flow
    2. Create these variables
    3. Bring over a Script Task and drop it into the Foreach Loop Container
    4. Double click the Script Task and add these ReadOnly variables
    5. Click the Edit Script button and add this code
      1. using System;
      2. using System.Data;
      3. using System.IO;
      4. using Microsoft.SqlServer.Dts.Runtime;
      5. using System.Windows.Forms;
      6.  
      7. namespace ST_216af79b563f4866bb64f6043b232b4e.csproj
      8. {
      9.     [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
      10.     public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
      11.     {
      12.  
      13.         #region VSTA generated code
      14.         enum ScriptResults
      15.         {
      16.             Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
      17.             Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
      18.         };
      19.         #endregion
      20.  
      21.         public void Main()
      22.         {
      23.             try
      24.             {
      25.                 FileInfo fi = new FileInfo(Dts.Variables["User::filefocus"].Value.ToString());
      26.  
      27.                 if (fi.LastWriteTime < (DateTime)Dts.Variables["User::daysback"].Value)
      28.                 {
      29.                     Dts.TaskResult = (int)ScriptResults.Success;
      30.                 }
      31.             }
      32.             catch (Exception Ex)
      33.             {
      34.                 Dts.Events.FireError(1, "", "BOOM!!!  " + Ex.Message.ToString(), "", 0);
      35.             }
      36.         }
      37.     }
      38. }
    6. Close the code editor and hit OK to exit and save the script task settings.
    7. Bring over and drop a File System Task into the Foreach Loop Container
    8. Connect the Script Task to the File System Task
    9. Double click the Foreach Loop Container and go to Collection.
    10. Enter in a default folder
    11. In Variable Mappings, drop down the Variable and select the User::filefocus variable and leave the index at 0
    12. Click OK to close the editor
    13. Double click the File System Task
    14. Change Operation to Delete
    15. Change IsSourcePathVariable to True
    16. SourceVariable to User::filefocus
    17. Click OK to exit and save
    18. Add a package configurations file as an XML file
    19. Right click an empty space in the Control Flow window
    20. Enter FileEnumConfig.xml and click, Next
    21. Click the variables filefocus and folder
    22. Name the configuration
    23. Click Finish and then Close to exit and save
    24. Click SSIS in the menu strip and select Logging
    25. Check container FileEnumTest
    26. Select SSIS log provider for Text Files
    27. Drop down configuration and select New Connection
    28. Select Create File and enter C:\DelbyFilEnum.txt
    29. Click OK and then OK to exit and save
    30. You can open the XML file to edit the folder location and the log file location now
    31. Import the package into SSIS and execute it to run.

    Logging from the first method appears in blocks like this

    File C:\TestDeleteSystemIO\File_20100603121501.trn Deleted on 6/4/2010 7:11:38 AM
    File C:\TestDeleteSystemIO\File_20100603123001.trn Deleted on 6/4/2010 7:11:38 AM

    compared to the SSIS logging method of

    Diagnostic,onpnt,onpnt,Delete Old File,{E4830550-A51A-46F3-A374-243888315707},{969D771F-24A9-4A57-B98D-42B025BE2573},6/4/2010 7:31:25 AM,6/4/2010 7:31:25 AM,0,(null),Trying to delete file 'C:\TestDeleteFileSystemTask\File_20100603191001.trn'.
    Diagnostic,onpnt,onpnt,Delete Old File,{E4830550-A51A-46F3-A374-243888315707},{969D771F-24A9-4A57-B98D-42B025BE2573},6/4/2010 7:31:25 AM,6/4/2010 7:31:25 AM,0,(null),Finished deleting file 'C:\TestDeleteFileSystemTask\File_20100603191001.trn'.

    Conclusions

    The use of the script task and going to the System.IO namespace to do the work has some benefits over the second method. The largest was speed. The second was the administrative and development time needed.

    Both of these methods are sound and good options. The File System Task usage removes the need for more extensive programming over the Script Task. There are also benefits in the logging method used and the amount of information we can easily obtain during the execution progress. Picking either method will be decided on comfort level with coding in .NET and performance requirements while giving up or gaining functionality.

    About the Author

    Ted Krueger is a SQL Server MVP and has been working in development and database administration for 13+ years. Specialties range from High Availability and Disaster / Recovery setup and testing methods down to custom assembly development for SQL Server Reporting Services. Ted blogs and is also one of the founders of LessThanDot.com technology community. Some of the articles focused on are Backup / Recovery, Security, SSIS and working on SQL Server and using all of the SQL Server features available to create stable and scalable database services. @onpnt
    Social SitingsTwitterLinkedInLTD RSS Feed
    4632 views
    Instapaper

    13 comments

    Comment from: SQLDenis [Member] Email
    SQLDenis Good post but it needs more images :-)

    I prefer the script task myself...and I knew Action Jackson was you favorite movie...must be because of Vanity
    06/04/10 @ 08:45
    Comment from: Ted Krueger (onpnt) [Member]
    Ted Krueger (onpnt) LOL

    Was wondering how many would catch that
    06/04/10 @ 09:06
    Comment from: Todd McDermid [Visitor] · http://toddmcdermid.blogspot.com
    Todd McDermid Great walkthough - this question is asked tons of times in the forums. Now I'll just have to remember where this is to point people to.

    I think I'll also have to blog an alternative using just the File Properties Task and File System Task - might take longer to execute, but there'd be no code...
    06/04/10 @ 11:57
    Comment from: Ted Krueger (onpnt) [Member]
    Ted Krueger (onpnt) Thanks Todd!

    Let me know when you get a chance to blog on the File Properties Task and File System Task. I think many more would rather go that route over starting to jump into using .NET like these methods rely on.
    06/04/10 @ 12:12
    Comment from: chopstik [Member]
    chopstik This would have been more useful a few weeks ago. ;-)

    I like both examples and may well find a need for them in the future. The script task, in particular, was a problem for me recently but your example helps me to see what I probably should have done. When I have the opportunity, I'll have to play with the ForEach container... Thanks!
    06/05/10 @ 07:07
    Comment from: Christiaan Baes (chrissie1) [Member]
    Christiaan Baes (chrissie1) Your C# code could do with some linq. And you should name your methods better.
    06/05/10 @ 11:19
    Comment from: diana [Visitor]
    diana I am not exactly clear on the variables. What is the value for the folder value in the second method?
    06/10/10 @ 09:18
    Comment from: Ted Krueger (onpnt) [Member]
    Ted Krueger (onpnt) Apologies Diana, looks like I missed a step. The folder variable is used in the Foreach loop container so it knows where to look to start off the interation of the script task
    06/10/10 @ 10:04
    Comment from: Diana [Visitor]
    Diana Thank you for the fast response. Could you clarify filefocus variable? Thanks!
    06/10/10 @ 12:16
    Comment from: Ted Krueger (onpnt) [Member]
    Ted Krueger (onpnt) Sure!

    Every time the loop will cycle, the filefocus variable will be used to hold the actual file that is focused on. That is why the setting is required to set the index of 0 in the loop to the variable. This will ensure that each file as it walks through them will obtain the file name. Any object within the container can then access that variable to perform its own job

    In the second method, we simply use teh filefocus to keep the same type of retention of the name down and allow us to fill the FileInfo so we can access all of the things we need with that given file in the directory
    06/10/10 @ 12:38
    Comment from: Diana [Visitor]
    Diana I did the second method and it works fine but it deletes all of files. How do I prevent this from happening? I set the daysback variable but it doesn't seem to base it on that variable. It just deletes all of files in the folder. Help!

    Thank you.
    06/15/10 @ 10:10
    Comment from: Ted Krueger (onpnt) [Member]
    Ted Krueger (onpnt) I see I flipped the check the date in the second while testing. I corrected it and thank you for the catch :-)
    06/15/10 @ 16:30
    Comment from: Abhas [Visitor] Email
    Abhas I am trying using 2 method to remove old files but
    I want to delete files in subfolders.There is main folder and under the main folder there are 4 subfolders.And i want to delete files which are in subfolders.

    i am getting following error.

    Error: Failed to lock variable "user::filefocus" for read access with error 0xC0010001 "The variable cannot be found. This occurs when an attempt is made to retrieve a variable from the Variables collection on a container during execution of the package, and the variable is not there. The variable name may have changed or the variable is not being created.".
    12/17/10 @ 07:15

    Leave a comment


    Your email address will not be revealed on this site.

    Your URL will be displayed.
    (Line breaks become <br />)
    (Name, email & website)
    (Allow users to contact you through a message form (your email will not be revealed.)