After quite a while it has finally returned: another Stupid Me™®©! A reminder:
Every time I do something "stupid", which happens from time to time, I'll do a little blog post on what happened and how I solved it. The reason for this is twofold: I'll have a solution online I can consult if it happens again and other people can benefit from my mistakes as well. Because remember the ancient Chinese proverb: "It's only stupid if you don't turn it into a learning experience". Okay, I might have made that last one up...
At a client, the test environment has the same daily SQL Server Agent job that runs the ETL just like in production. On a rainy morning, the job suddenly returned the following error:
Login Failed. The login is from an untrusted domain and cannot be used with Windows authentication.
Important to know is that each environment – Development, Test and Production – has its own Active Directory domain for security reasons. I immediately thought of an issue within Active Directory, but the AD admin assured me nothing had changed. The login itself – a regular domain account used as an execution account – was active and its password had not expired.
So where did this error come from?
Several packages of the same SSIS project had the same issues, but packages from another project did not. Time to investigate further, so I checked the versions of that particular project in the SSIS catalog.
Turned out I did a deployment of that project the day before the SQL Agent job started failing. The plot thickens. It means there was some change in the project that caused this strange error. But what?
I looked into the connection manager that was failing to connect. Connection managers in the project are project connection managers and they are configured using two parameters: one to set the database and one to set the server. Configuring project connection managers is easy: this can be done through any package in the project. Simply go to the properties of the connection manager and configure the expressions. These expressions will immediately apply in every package.
When the package is deployed to the SSIS catalog of another environment, the parameters there will have the corresponding values of that environment.
I took a look at the version control history of the project connection manager. And apparently I did make a change to the connection manager.
However, I did not recall making any change to the projection manager. And then it hit me. I generated all of the packages using BIML. In the BIML script, I also generate the connection managers as they are needed during compile time (as far as I know BIML doesn't realize they are actually already there in the project). I didn't include the expressions in the BIML script as I do not actually generate them. When the BIML script is finished, it asks you if you want to overwrite existing items. I simply deselect the checkboxes for the connection managers and everything is fine in the world.
Except when you forget to deselect the checkbox. In that case, the connection manager is overwritten and the expressions are gone. Which means that when you deploy the project to another server, the package tries to connect to the development server (which is on another domain). Hence the error. Whoops.
The obvious solution is to not forget to deselect the checkboxes for the connection managers, but a more durable solution is to expand my BIML script to include the expressions, just to make sure. Debugging this issue was made easy because of the automatic versioning in the SSIS catalog, but also because the SSIS projects in Visual Studio are checked into version control, making it straight forward to find any changes.