Login or Sign Up to become a member!
LessThanDot Sit Logo

LessThanDot

Data Management

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot friendfeed Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Your profile

    Search

    XML Feeds

    Google Ads

    « Run SSIS Package from Stored ProcedureThe Productive DBA: Part 2 Document the Database »
    comments

    SQL Server provides database professionals with several paths in order to import and export data in and out of many data sources and file services. One main task, the Foreach Loop Container, allows for processing large (or small) groups of files with one contained and manageable process. Tasks that once were difficult and required larger scale development efforts have now moved into the realm of the rapid development methodology. Rapid Development is defined as requiring a much lower amount of resources in order to push a process into a production environment. Resources can contain any one of the main components in developing strategies such as management, developers, infrastructure changes and so on. The resources take time in planning stages, developing stages and testing stages. Accomplishing this methodology is done by placing each of these on top of each other in a sense. Planning is done alongside developing and testing.

    SSIS promotes this in a way just by the imagery that the designer, BIDS (Business Intelligence Studio), uses while you develop packages. From most SSIS packages, you can compile a flow chart in great detail. Jumping ahead on the topic and task at hand; take the example below of the finished package for importing a directory of images into a table in a SQL Server database.



    With the steps shown we can write a flowchart as shown below

    The process in both representations is readable in the same manner. Process starts, then checks and if required, creates an archive directory and then processes the images internally in a subprocess (container). This added benefit of the way BIDS and the visual aspects of the development studio helps promote rapid development and quick documentation. All of this means lower to mid-level complex processing initiatives can be developed faster.

    Remember, with any process, the complexity and depth of the requirements are considered if the longer planning process can be performed while developing them. Planning out a process before hand when a certain point is passed, should be performed. Combining the planning, development and testing at one time can cause the initiative to be longer if misused.

    Back to the process at hand...

    The outlined flow and package above has the goal of iterating through a folder and inserting the images found in it into a SQL Server database. Some of the dynamic needs in this are:

    1. Passing the type of images to insert
    2. Passing the directory to look in
    3. All connections to SQL Server

    For error handling, each insert will be allowed to error only those insert statement. This means the package should not exit with an error state simply because one image failed to insert. Event handlers can be used to accomplish this.
    The following statements will create the supporting database and tables. Setting up the main table to insert into and the primary error handling table initially will help speed the process of developing our package up.

    1. CREATE DATABASE [IMAGES] ON  PRIMARY
    2. ( NAME = N'IMAGES', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.XPS2008R2\MSSQL\DATA\IMAGES.mdf' , SIZE = 14336KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
    3.  LOG ON
    4. ( NAME = N'IMAGES_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.XPS2008R2\MSSQL\DATA\IMAGES_log.ldf' , SIZE = 1792KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
    5. GO
    6. CREATE TABLE WebImages
    7. (
    8.     ImageName varchar(50) not NULL PRIMARY KEY,
    9.     ImageSource varbinary(max) not null
    10. )
    11. CREATE TABLE WebImageImportErrorLog
    12. (
    13.     ImageName varchar(50),
    14.     ErrorDesc VARCHAR(2000),
    15.     PackageID VARCHAR(255),
    16.     ExeDateTime VARCHAR(50),
    17.     UserExeName VARCHAR(255)
    18. )

    To insert an image into the table WebImages, use OPENROWSET with SINGLE_BLOB. This method is shown as

    1. INSERT INTO WebImages (ImageName, ImageSource)
    2. SELECT
    3.     'C:\Users\onpnt\Documents\Images\0001.jpg' ImageName
    4.     ,BulkColumn
    5. FROM
    6. OPENROWSET (BULK 'C:\Users\onpnt\Documents\Images\0001.jpg', SINGLE_BLOB) AS Images

    Notice that we set the ImageName to a primary key. This is due to the table following the same constraints as directory services. No file may exist in the same location with the same name. We propagate this concept to the table.

    Create the package - Preparation

    Variables are a key to making this process reusable and portable. For this example, four variables will be utilized.

    • SqlInsertCmd: Used to build the INSERT statement based on the image currently being processed
    • ArchiveDirectory: Mentioned earlier to create the archive folder once images are processed
    • ImageFileFocus: The place holder of the image being processed
      SourceDirectory: Directory to work on that is holding the images

    Data types for these variables are all strings. The scope will primarily be set to the package level with a few set to the container level.
    Create the variables as shown above after moving the following into the control flow window.

    File System Task, Foreach Loop Container with an Execute SQL Task and File System
    Task within the container

    Archive Task

    The first task that needs to be accomplished is the archive directory. In order to prevent unwanted errors to be logged be reprocessing images on each execution of the package, we want to move the images to a folder that is designated as processed already. This can be accomplished with a File System Task and the create directory operation.

    Set the UseDirectoryIfExists to True so if the directory exists already, we do not unintentionally remove or overwrite the directory. This may cause unwanted loss of images that exist in the directory. The Archive directory for this example is set in the directory that is being processed so using the source directory while adding the “\Archive” additional path is sufficient. This is done in the ArchiveDirectory variable as an expression

    @[User::SourceDirectory] + "\\Archive"

    The container

    The Foreach Loop Container is truly the core processing mechanism in this package. With the Foreach Loop Container, iterating all files in a specified directory is made simple in a sense. It is simple due to the limited requirement of properties to get the container looping through the files.

    The key properties that are required to set the container are directory (Folder), file extensions, resulting return of string that will be the file currently focused on and a variable mapping. The variable mapping is used to set a variable to the file that is currently being processed by the container so it can be passed to other tasks within the container.

    To set the properties for the container to process the directory from the variable SourceDirectory, use expressions. This will set the property at runtime. This allows us to quickly change where the processing will occur if the folder is changing often, the moved or the package is going to be used for other tasks. Set the expression by going to the Collection window in the Foreach Loop Editor, click the ellipse to open the Property Expression Editor and select Directory for Property and set the Expression to the variable, @[User::SourceDirectory]. (shown below)



    Use a hardcoded folder value for the setup and enter in “.jpg” (or the image types you want to import) in the Files text box.

    Execute SQL Task – INSERT

    Error Handling

    Event Handlers are used to catch any issues that may arise when inserting the image into SQL Server. The most common error that will come up with the process is a primary key violation. This is due to the primary key being the file name (or image) in the table WebImages. In the case an image was placed in the folder again or the archive process failed, there is a chance for the image to be inserted again. This is not ideal in this processing and would be set as a secondary processing design to update the table. That design would be set with either child package or a variable setting to validate the tasks to begin processing in the package itself. (for a later discussion)
    To setup the event handler:
    While the Execute SQL Task is highlighted, click the Event Handlers tab. Click the link to create the OnError event handler for the SQL task. Bring an Execute SQL Task over into the window.

    The connection for the task will be to the IMAGES database created earlier, and an insert into the error table also created earlier. The system level information and image file name will be inserted into the error table so they can be analyzed later. To insert the system variables, use the insert statement shown below

    1. INSERT INTO WebImageImportErrorLog
    2. ([ImageName]
    3.       ,[ErrorDesc]
    4.       ,[PackageID]
    5.       ,[ExeDateTime]
    6.       ,[UserExeName])
    7. VALUES (?,?,?,?,?)

    Set ByPassPrepare to True so the statement is not parsed.
    Move to the to the Parameter Mapping section. Each ? will map based on the ordinal positions mapped. This is set by the Parameter Name as shown below.

    The last property to set is a system variable in order to manage the errors when they occur.



    Set the Propagate system variable to False so the errors do not flow back up to the container and stop the processing of the next image.

    Archive the processed image

    The File System Task to perform the archive task will utilize the Move file type. Using the ImageFileFocus variable for the source, the last setting required is the archive directory path. The file name is not required on the move.
    Set the DelayValidation to true or the variable setting will cause it to error given it is empty until the package is executed.

    Processing

    Putting this all together we arrive at the package outlined in the beginning of this discussion. Executing this package initially and then a second time will show the normal process of inserting all images into the WebImages table along with the error handling and logging put in place.

    Execute the package initially

    Once complete, validate the WebImages table by selecting the contents

    To test the event handler error logging, move the images back into the root directory from the archive directory and execute the package a second time. While processing, the Execute SQL Task will start to throw errors based on the primary key violation. This can be seen by selecting from the WebImageErrorLog table

    Closing

    SSIS and the Foreach Loop Container allow for quick setup and processing of entire directory contents. Within the container, the use of T-SQL, Data Flow Tasks and other processing can be used to fully validate, transform and import data while iterating through those files. This promotes rapid development methods and allows for shorter and less costly initiatives to transport from development to testing and into production.

    About the Author

    Ted Krueger is a SQL Server MVP and has been working in development and database administration for 13+ years. Specialties range from High Availability and Disaster / Recovery setup and testing methods down to custom assembly development for SQL Server Reporting Services. Ted blogs and is also one of the founders of LessThanDot.com technology community. Some of the articles focused on are Backup / Recovery, Security, SSIS and working on SQL Server and using all of the SQL Server features available to create stable and scalable database services. @onpnt
    Social SitingsTwitterLinkedInLTD RSS Feed
    2464 views
    Instapaper

    3 comments

    Comment from: chinmay [Visitor]
    chinmay Hi Gurus,

    Please conform weather DAC is supporting SSIS tool for integration

    thanks
    12/08/10 @ 11:17
    Comment from: chinmay [Visitor]
    chinmay I mean to say Oracle DAC

    thanks
    12/09/10 @ 01:45
    Comment from: Amy [Visitor] Email
    Amy What do you set the property to for the Expression in the Create Archive Folder Task
    Also
    Where do you set this option (Last step in this package)
    'Set the DelayValidation to true or the variable setting will cause it to error given it is empty until the package is executed'

    Thanks
    07/20/11 @ 08:28

    Leave a comment


    Your email address will not be revealed on this site.

    Your URL will be displayed.
    (Line breaks become <br />)
    (Name, email & website)
    (Allow users to contact you through a message form (your email will not be revealed.)