Recently I was developing an SSIS package in BIDS 2008R2 which was part of a Oracle to SQL Server migration; my favorite kind of migration. This package had a very simple SQL select statement in the OLE DB source using the Oracle OLE DB provider (slightly altered to protect the innocent):
SELECT columnA ,columnB ,42 AS Code ,DATE'1900-01-01' FROM user.myTable mt WHERE mt.Status IN (10,20,30) AND mt.TransactionDate > DATE'2012-01-01';
The preview in the OLE DB worked flawlessly and I finished constructing my data flow. However, when I ran the package, I got all sorts of weird metadata errors indicating problems at the source. Basically the errors told me new columns needed to be added to the external columns collection (see the advanced editor of the source and check out the Input and Output properties tab) and that existing columns should be removed. Weird because the source and the SQL query hadn't changed.
Time to investigate the issue. To reproduce the issue, I created a very simple package with one data flow. In the data flow, I have this select statement in an OLE DB source retrieving data from the Oracle database:
SELECT DATE'2013-01-14' AS TestDate ,'This is a test' AS TestString ,42 AS TestInt FROM DUAL;
The output of the source component is connected to a multicast component, which serves as a trash destination.
As you can see in the following screenshot, the preview works without a hitch:
However, when I run the package, I get the following warnings and errors during validation:
_SSIS package “TestOracle.dtsx” starting. Information: 0x4004300A at (DFT) Read from Oracle, SSIS.Pipeline: Validation phase is beginning.
Warning: 0x800470C8 at (DFT) Read from Oracle, (OLE_SRC) Read from DUAL : The external columns for component “(OLE_SRC) Read from DUAL” (1) are out of synchronization with the data source columns. The column “TEST” needs to be added to the external columns.
The external column “TESTDATE” (191) needs to be removed from the external columns.
Error: 0xC004706B at (DFT) Read from Oracle, SSIS.Pipeline: “component “(OLE_SRC) Read from DUAL” (1)” failed validation and returned validation status “VS_NEEDSNEWMETADATA”.
Error: 0xC004700C at (DFT) Read from Oracle, SSIS.Pipeline: One or more component failed validation.
__ Error: 0xC0024107 at (DFT) Read from Oracle: There were errors during task validation.
_SSIS package “ TestOracle.dtsx ” finished: Failure.
I marked the issues in bold. Apparently, SSIS insists there's a new column called Test which isn't included in the SQL statement but should be added to the metadata anyway. The TestDate column, which is real, should be removed. Somehow, SSIS interpretes the SQL statement wrong and truncates the column name TestDate to Test.
Changing the DelayValidation property of the data flow to True didn't have much effect. The package starts running but crashes at the source component:
The output has also slightly changed:
_SSIS package “ TestOracle.dtsx ” starting.
Information: 0x4004300A at (DFT) Read from Oracle, SSIS.Pipeline: Validation phase is beginning.
Information: 0x40043006 at (DFT) Read from Oracle, SSIS.Pipeline: Prepare for Execute phase is beginning.
Information: 0x40043007 at (DFT) Read from Oracle, SSIS.Pipeline: Pre-Execute phase is beginning.
Error: 0xC0202005 at (DFT) Read from Oracle, (OLE_SRC) Read from DUAL : Column “TESTSTRING” cannot be found at the datasource.
Error: 0xC004701A at (DFT) Read from Oracle, SSIS.Pipeline: component “(OLE_SRC) Read from DUAL” (210) failed the pre-execute phase and returned error code 0xC0202005.
Information: 0x40043008 at (DFT) Read from Oracle, SSIS.Pipeline: Post Execute phase is beginning.
Information: 0x40043009 at (DFT) Read from Oracle, SSIS.Pipeline: Cleanup phase is beginning.
Task failed: (DFT) Read from Oracle
SSIS package “ TestOracle.dtsx ” finished: Failure._
I highlighted the important issue in bold and I removed some warnings about my input columns not being used in the data flow. Now, SSIS complains about another column suddenly missing at the datasource. However, the dual table in Oracle hasn't changed of course and the SQL statement is also still the same. It's also remarkable the preview works every time, but the data flow still fails.
So what's causing the issue? After playing around with the original query, a colleague and I guessed it might had something to do with the date literals used in the query, because seemingly it affected columns around the column using the date literal.
I used the ANSI date literal to specify a hard-coded date value. For example: DATE'1900-01-01'. Check out the following Oracle documentation about date literals: Literals. Let's replace this literal with the Oracle date value TO_DATE('1900-01-01','yyyy-mm-dd'):
Suddenly, the package runs without any problem:
What’s more intriguing is that when I use the Microsoft OLE DB provider for Oracle using the original query, the package also runs succesfully. So it seems the Oracle OLE DB provider, downloaded from the Oracle website, has some sort of issue with the ANSI date literal. I haven’t found any documentation why the provider messes up the columns like that, but if someone has better Google skills, please drop a note in the comments.
Avoid using the ANSI date literals in your Oracle queries, but rather use the TO_DATE function. If it is an option, use the Microsoft OLE DB provider for Oracle.