SQL Server 2005's Snapshot Isolation
Now that SQL Server 2005 is almost here, it is time for you to read up on and apply some of its new features to your databases. Many of the new features are pretty straightforward (synonyms, for example), but some are less so.
In this article, I'll explain the new Snapshot Isolation feature. It may not have the most obvious payoff, but implementing Snapshot Isolation on a suitable database has the potential to eliminate data contention and reduce deadlocks, lock contentions, and session waits.
If you answer yes to one or more of the following questions, your database is a possible candidate for Snapshot Isolation:
Do you have long-running batch jobs that run while your user queries are running?
Do you run statistics queries fairly frequently?
NOTE
Hint: Typical statistic queries have Sum, Count, Average and similar keywords. They also touch a larger number of rows than do normal transactional queries.
Do you have stored procedures or application logic that hold transactions open for longer periods while they are working? "Longer periods" is in database terms; it could be anywhere from a few seconds to a few minutes, maybe more.
I will not tax your patience with more questions. The general characteristic of these database queries is that they need to hold onto locks for longer period of time. As such, they block other readers (select queries) and writers (insert, update, and delete queries).
Snapshot Isolation provides a mechanism that eliminates the blocking of other "readers." It is akin to optimistic locking, in which you make a copy of the data (typically at the front end) and make changes to your copy of the data. When you are ready to save the changes back to the database, you (typically) check to see whether the original data has changed while you were working with it and decide whether you want to save.
Snapshot Isolation (also called Row Versioning) is optimistic locking, but it is completely transparent to your users and is handled by the database. The database keeps a copy of the original data while you are changing it, and serves up the original data to anybody who wants to read it in the interim.
NOTE
Snapshot Isolation is also called Row Versioning because SQL Server keeps "versions" of rows that are being changed; that is, the original version and the version being changed.
SQL Server 2000 Behavior
To see what this feature accomplishes, let's first examine how we were affected by its lack in previous SQL Server versions. We use the omnipresent PUBS database and the authors table. The table has au_id as the primary key and an index on the au_lname, au_fname columns.
Let's now run some queries in SQL Server 2000.
Session 1
Open a Query Analyzer window and run these queries. The server needs the sample pubs database:
USE PUBS BEGIN TRANSACTION UPDATE authors SET phone = '111 111-1111' WHERE au_id = '172-32-1176'
Session 2
Open another Query Analyzer window and run these queries. The server needs the sample pubs database:
USE PUBS SELECT * FROM authors
The SELECT statement cannot read the row updated in Session 1, so it blocks (that is, it waits until Session 1 commits or aborts).
Note that there are some variations to this behavior. Let's look at it right away; you may be puzzled if you did something slightly different and got a different result.
Run this query in the second session window:
SELECT * FROM authors WHERE au_id <> '172-32-1176'
This statement succeeds because there is an index on the au_id column. The execution path completely avoids the row locked by Session 1.
Run this query, also in the second session window:
SELECT * FROM authors WHERE phone = '111 111-1111'
This statement fails. The execution plan includes a table or clustered index scan because there is no index on the phone column. A table scan requires that every row be read, and the query fails.
Another scenario, in which even queries with indexes will fail, is when you update a large number of rows (more than 3,000 in SQL Server 2000), and the engine escalates the lock to a table level lock.