Coming from a database background, I should’ve known better. But I didn’t. Recently I found myself needing to write a GUI wrapper for a new Address Standardization component we got at work. It was purchased to clean addresses for a variety of data processing tasks we do on the way into the database, letting us automate those processes completely, so the DLL I wrote around it was designed with that in mind. With that mindset, it was a rough transition when I realized that for this GUI to work, I needed to read the data in from a database (in this case primarily MS Access), fix the addresses, and write them back.

Luckily, I had already added a class to the .dll whose sole purpose is to map between database (or flat-file) columns and the fields expected by the address standardization tool, so that part was partly done for me (good thinking, Alex!). However, I had also written the program to use a bare-bones .dll I rely on for data access, and I was sending the queries generated by the application through a method designed to execute a single query. The app worked, and performance was actually acceptable against a SQL Server database, but against an Access database it was S-L-O-W. Fortunately, I remembered a recent discussion on a forum that shall not be named, where someone brought up how slow opening and closing Access connections was for them, so I was able to identify that as the bottleneck right away.
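For reference, the original single-statement path looked roughly like this. This is a reconstruction for illustration, not the actual code from my data-access .dll; _connDb is the connection field the class keeps around (you'll see it again in the overload below):

// Rough sketch of the original single-query method (reconstructed for illustration only):
// every call opens and then closes the connection, which is exactly the per-statement
// overhead that made the Access case crawl.
public object RunSQL(string sql)
{
    try
    {
        _connDb.Open();
        using (var cmd = new OleDbCommand(sql, _connDb))
        {
            return cmd.ExecuteNonQuery();
        }
    }
    finally
    {
        _connDb.Close(); // closed again after every single statement
    }
}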

My solution was to add an overload to the function that runs a single SQL statement. In addition to a string, it now takes an enum indicating whether this is the first statement in a batch (so a connection needs to be opened and a transaction started), somewhere in the middle, or the last (so the transaction needs to be committed and the connection closed). It makes the calling code a bit more involved, since the first and last rows have to be handled differently (there's a usage sketch after the enum below), but after implementing this change the application ran an astounding 10x faster against an Access database. And, more impressively (to me anyway), 4-6x faster than the commercial product we had used for this in the past!

Here’s the code:

// Assumes class-level fields in the data-access class:
//   private OleDbConnection _connDb;   // created with the Access connection string
//   private OleDbTransaction _tran;
// (System.Data and System.Data.OleDb are the relevant namespaces.)
public object RunSQL(string sql, Enums.TransactionState tranState)
{
    IDbCommand cmd = null;

    try
    {
        if (tranState == Enums.TransactionState.Begin)
        {
            // First statement of the batch: open the connection once and start the transaction
            _connDb.Open();
            _tran = _connDb.BeginTransaction();
        }

        cmd = new OleDbCommand(sql, _connDb, _tran);

        return cmd.ExecuteNonQuery();
    }
    catch (Exception)
    {
        throw; // rethrow without resetting the stack trace
    }
    finally
    {
        if (tranState == Enums.TransactionState.Commit)
        {
            // Last statement of the batch: commit the work and close the connection
            _tran.Commit();
            _connDb.Close();
            _tran = null;
        }

        cmd = null;
    }
}

and the enum (pretty straightforward, but just in case):

public enum TransactionState
{
    /// <summary>
    /// Open a new transaction
    /// </summary>
    Begin,
    /// <summary>
    /// Continue using open transaction
    /// </summary>
    Continue,
    /// <summary>
    /// Commit transaction when finished
    /// </summary>
    Commit
}
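
To round things out, here’s a rough sketch of what the calling loop can look like. The updates list and dataAccess variable are placeholders I’ve made up for illustration; the point is just the Begin/Continue/Commit pattern:

// Hypothetical calling loop: all statements run on one open connection inside one transaction.
// Assumes "updates" holds the generated UPDATE statements and "dataAccess" exposes RunSQL;
// also assumes the batch has at least two statements (a single statement would need
// both the Begin and Commit behavior in one call).
for (int i = 0; i < updates.Count; i++)
{
    Enums.TransactionState state;

    if (i == 0)
        state = Enums.TransactionState.Begin;     // first row: open connection, start transaction
    else if (i == updates.Count - 1)
        state = Enums.TransactionState.Commit;    // last row: commit and close
    else
        state = Enums.TransactionState.Continue;  // middle rows: reuse the open connection

    dataAccess.RunSQL(updates[i], state);
}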

So now hopefully you can avoid the embarrassing mistake of failing to batch your queries properly. One would think a “database guy” would know better 😳

Well, he does now!