Login or Sign Up to become a member!
LessThanDot Sit Logo

LessThanDot

Data Management

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot friendfeed Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Your profile

    Search

    XML Feeds

    Google Ads

    « Why avoiding multiple code blocks in a Stored ProcedureMultidocument updates with MongoDB »
    comments

    Recently I was developing an SSIS package in BIDS 2008R2 which was part of a Oracle to SQL Server migration; my favorite kind of migration. This package had a very simple SQL select statement in the OLE DB source using the Oracle OLE DB provider (slightly altered to protect the innocent):

    1. SELECT
    2.      columnA
    3.     ,columnB
    4.     ,42 AS Code
    5.     ,DATE'1900-01-01'
    6. FROM user.myTable mt
    7. WHERE mt.STATUS IN (10,20,30) AND mt.TransactionDate > DATE'2012-01-01';

    The preview in the OLE DB worked flawlessly and I finished constructing my data flow. However, when I ran the package, I got all sorts of weird metadata errors indicating problems at the source. Basically the errors told me new columns needed to be added to the external columns collection (see the advanced editor of the source and check out the Input and Output properties tab) and that existing columns should be removed. Weird because the source and the SQL query hadn’t changed.

    Time to investigate the issue. To reproduce the issue, I created a very simple package with one data flow. In the data flow, I have this select statement in an OLE DB source retrieving data from the Oracle database:

    1. SELECT
    2.      DATE'2013-01-14' AS TestDate
    3.     ,'This is a test' AS TestString
    4.     ,42 AS TestInt
    5. FROM DUAL;

    The output of the source component is connected to a multicast component, which serves as a trash destination.

    As you can see in the following screenshot, the preview works without a hitch:

    However, when I run the package, I get the following warnings and errors during validation:

    SSIS package "TestOracle.dtsx" starting. Information: 0x4004300A at (DFT) Read from Oracle, SSIS.Pipeline: Validation phase is beginning.
    Warning: 0x800470C8 at (DFT) Read from Oracle, (OLE_SRC) Read from DUAL [1]: The external columns for component "(OLE_SRC) Read from DUAL" (1) are out of synchronization with the data source columns. The column "TEST" needs to be added to the external columns.
    The external column "TESTDATE" (191) needs to be removed from the external columns.
    Error: 0xC004706B at (DFT) Read from Oracle, SSIS.Pipeline: "component "(OLE_SRC) Read from DUAL" (1)" failed validation and returned validation status "VS_NEEDSNEWMETADATA".
    Error: 0xC004700C at (DFT) Read from Oracle, SSIS.Pipeline: One or more component failed validation.
    Error: 0xC0024107 at (DFT) Read from Oracle: There were errors during task validation.
    SSIS package " TestOracle.dtsx " finished: Failure.

    I marked the issues in bold. Apparently, SSIS insists there’s a new column called Test which isn’t included in the SQL statement but should be added to the metadata anyway. The TestDate column, which is real, should be removed. Somehow, SSIS interpretes the SQL statement wrong and truncates the column name TestDate to Test.

    Changing the DelayValidation property of the data flow to True didn’t have much effect. The package starts running but crashes at the source component:

    The output has also slightly changed:

    SSIS package " TestOracle.dtsx " starting.
    Information: 0x4004300A at (DFT) Read from Oracle, SSIS.Pipeline: Validation phase is beginning.

    Information: 0x40043006 at (DFT) Read from Oracle, SSIS.Pipeline: Prepare for Execute phase is beginning.
    Information: 0x40043007 at (DFT) Read from Oracle, SSIS.Pipeline: Pre-Execute phase is beginning.
    Error: 0xC0202005 at (DFT) Read from Oracle, (OLE_SRC) Read from DUAL [210]: Column "TESTSTRING" cannot be found at the datasource.
    Error: 0xC004701A at (DFT) Read from Oracle, SSIS.Pipeline: component "(OLE_SRC) Read from DUAL" (210) failed the pre-execute phase and returned error code 0xC0202005.
    Information: 0x40043008 at (DFT) Read from Oracle, SSIS.Pipeline: Post Execute phase is beginning.
    Information: 0x40043009 at (DFT) Read from Oracle, SSIS.Pipeline: Cleanup phase is beginning.
    Task failed: (DFT) Read from Oracle
    SSIS package " TestOracle.dtsx " finished: Failure.

    I highlighted the important issue in bold and I removed some warnings about my input columns not being used in the data flow. Now, SSIS complains about another column suddenly missing at the datasource. However, the dual table in Oracle hasn’t changed of course and the SQL statement is also still the same. It’s also remarkable the preview works every time, but the data flow still fails.

    So what’s causing the issue? After playing around with the original query, a colleague and I guessed it might had something to do with the date literals used in the query, because seemingly it affected columns around the column using the date literal.

    I used the ANSI date literal to specify a hard-coded date value. For example: DATE’1900-01-01’. Check out the following Oracle documentation about date literals: Literals. Let’s replace this literal with the Oracle date value TO_DATE(‘1900-01-01’,’yyyy-mm-dd’):

    Suddenly, the package  runs without any problem:

    What’s more intriguing is that when I use the Microsoft OLE DB provider for Oracle using the original query, the package also runs succesfully. So it seems the Oracle OLE DB provider, downloaded from the Oracle website, has some sort of issue with the ANSI date literal. I haven’t found any documentation why the provider messes up the columns like that, but if someone has better Google skills, please drop a note in the comments.

    Conclusion

    Avoid using the ANSI date literals in your Oracle queries, but rather use the TO_DATE function. If it is an option, use the Microsoft OLE DB provider for Oracle.

    About the Author

    User bio imageKoen Verbeeck is a Microsoft Business Intelligence consultant at Ordina Belgium, helping clients to get insight in their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.
    Social SitingsTwitterLinkedInLTD RSS Feed
    1626 views
    InstapaperVote on HN

    No feedback yet

    Leave a comment


    Your email address will not be revealed on this site.

    To mislead the spambots.

    Your URL will be displayed.
    (Line breaks become <br />)
    (Name, email & website)
    (Allow users to contact you through a message form (your email will not be revealed.)