Passionate about data

Data and its implications on software design and development.

MSSQL JDBC Driver Behavior

My latest project involves talking to MS-SQL Server using the JDBC driver and Java. While doing this we setup the database connection and had a simple SQL to get the first_name and last_name for a unique user_id from the application_user table in the database.

1
2
3
SELECT first_name,last_name 
FROM application_user 
WHERE user_id = ?

Given the above SQL, we did not think too much about performance as the user_id was indexed. The java code as below was used to run the SQL.

1
2
3
4
5
6
7
8
9
try {
    PreparedStatement stmt = prepare(conn,
            "SELECT first_name, last_name FROM application_user " +
            "WHERE user_id = ?");
    stmt.setString(1, username);
    return execute(stmt);
} catch (SQLException e) {
    e.printStackTrace();
}

When writing integration tests we started noticing that the SQL was taking about 6 seconds to execute. The same SQL would execute inside 100 milliseconds on the MSSQL query analyzer. The friendly DBA’s on our team pointed out that the SQL was doing some data type conversion as the user_id field was of the type VARCHAR but the SQL sent by the the JDBC driver set the data type to NVARCHAR because of this the index was not being used and the SQL took more than 6 seconds to execute. Researching this topic further we decided to cast the variable to VARCHAR as shown below.

1
2
3
4
5
6
7
8
try {
    PreparedStatement stmt = prepare(conn,
            "SELECT first_name, last_name FROM application_user WHERE user_id = cast(? AS VARCHAR)");
    stmt.setString(1, username);
    return execute(stmt);
} catch (SQLException e) {
    e.printStackTrace();
}

The above code executed under 100 milliseconds and showed us that the data types being used did not match the datatype in the database. We later found out that the MS-SQL JDBC driver does this to properly deal with unicode characters. This behavior can be turned off using the sendStringParametersAsUnicode flag on the database connection. Once this flag is set to false on the connection, then all the SQL we issue do not need the cast

1
2
3
4
5
6
7
8
9
Connection conn = db.conn(DATABASE_URL + ";sendStringParametersAsUnicode=false");
try {
    PreparedStatement stmt = prepare(conn,
        "SELECT first_name, last_name FROM application_user WHERE user_id = ? ");
    stmt.setString(1, username);
    return execute(stmt)
} catch (SQLException e) {
    e.printStackTrace();
}

Off course this only works if there is no unicode data in your database if there is any unicode data in the database, we will have to revert to casting individual SQL statements.