In prior articles of our series on SQL Server optimization methods, we have mainly discussed indexes and related issues. But not all SQL Server performance issues are causing by inadequate indexing.
Some problems are caused by contention between processes holding a lock on some data resource while other processes are waiting for the lock to be released so they can proceed.
There is substantial amount of tsql code in this article. The format of an email newsletter forces me to break lines in the tsql code so that they fit in a narrow column. The code is difficult to copy and difficult to read in this form. So, if you prefer Email Me and I will send you the code in a more usable format.
Blocking
In any database there will be some level of blocking, but usually it is brief and goes unnoticed. In some databases it can be severe and slow the entire application down to a crawl.
Often the biggest obstacle to fixing the blocking issues is finding the code that is blocking and the code that is being blocked. In this article we will discuss a method for identifying the blocking code and the blocked code.
In the case of blocking, it takes two (or more) to tango. Sometimes you need to fix the blocking code, sometimes it is easier to solve the problem by fixing the blocked code. But first you have to find out what code is involved. We will give you a diagnostic query that will show you exactly what kind of blocking is going on and exactly what code is causing it.
This code works for all versions of SQL Server later than SQL 2000. It will not work on SQL 2000 or earlier editions.
Capturing Blocking Information
Here is the code. We will use it to demonstrate a typical blocking situation in the AdventureWorks sample database. Later editions of this sample database should also work. You can get them HERE if you want to follow along.
We will first open three query windows. In the first window we will paste this diagnostic code:
- SELECT t1.resource_type
- ,db_name(resource_database_id) as [database]
- ,t1.resource_associated_entity_id as [blk object]
- ,t1.request_mode
- ,t1.request_session_id — spid of waiter
- ,(SELECT text FROM sys.dm_exec_requests as r — get sql for waiter
- cross apply sys.dm_exec_sql_text(r.sql_handle)
- WHERE r.session_id = t1.request_session_id) as waiter_text
- ,t2.blocking_session_id — spid of blocker
- , (SELECT TOP 1 request_mode
- FROM sys.dm_tran_locks t1
- JOIN sys.dm_os_waiting_tasks t2
- ON t1.request_session_id = t2.blocking_session_id
- WHERE request_mode NOT LIKE ‘IX%’
- AND resource_type NOT LIKE ‘DATABASE’
- AND resource_type NOT LIKE ‘METADATA%’
- ORDER BY request_mode desc) AS blocking_lock
- ,(SELECT text FROM sys.sysprocesses AS p — get sql for blocker
- cross apply sys.dm_exec_sql_text(p.sql_handle)
- WHERE p.spid = t2.blocking_session_id) AS blocker_text
- FROM
- sys.dm_tran_locks as t1,
- sys.dm_os_waiting_tasks as t2
- WHERE
- t1.lock_owner_address = t2.resource_address
In the second window we will paste this update statement. Notice that a transaction is begun, but the ROLLBACK TRAN command is commented out. This will persist the lock taken by the update statement so we can examine the blocking situation it causes.
- USE AdventureWorks;
- BEGIN TRAN
- UPDATE Person.Contact
- SET NameStyle = 1
- WHERE ContactID = 14
- — ROLLBACK TRAN
Then we will paste this code into the third window:
- USE AdventureWorks;
- SELECT * from Person.Contact
- WHERE ContactID = 14
First execute the update statement. It will give you a message that one row has been updated. However the transaction is still open because a COMMIT or ROLLBACK has not yet been issued.
Then execute the select statement. It will give you no message because it hasn’t finished. It needs a shared lock on the same row and it has to wait for the exclusive lock on that row to be released by the update statement.
Now we can execute the diagnostic query and see the result. Word wrap will mangle this output so I will pivot the result for clarity. I will also truncate the code columns as necessary for the display:
Object | Item |
Resource_type | KEY |
Database | AdventureWorks |
Blk Object | 72057594043695104 |
request_mode | S |
request_session_id | 58 |
waiter_text | (@1 tinyint)SELECT * FROM [Person].[Contact] WHERE [ContactID]=@1 |
blocking_session_id | 56 |
blocking_lock | X |
blocker_text | BEGIN TRAN UPDATE Person.Contact WITH (updlock)… |
It is clear what is going on in this simple example. A writing process with an exclusive lock on the row is blocking a process that needs a shared lock to get a clean read of the data. You can see what code you have to fix. In this case, if you don’t need strict READCOMMITED isolation for this query, you might add the query hint to the select column as below.
- SELECT * from Person.Contact WITH (READUNCOMMITED)
- WHERE ContactID = 14
The older form of this hint was NOLOCK. Although NOLOCK is still widely used, it has been deprecated by Microsoft and will not work in some future edition of SQL Server. It is best to use READUNCOMMITTED.
With this hint, the select statement will not take a shared lock on the data and consequently it will allow what is called a “dirty read”. The data can be changed while the select statement is running. If you run the exercise again with the query hint, you will find that the select statement ignores the exclusive lock and reads the row immediately.
Now, rollback the transaction in the update statement. You will see that the update statement updated the column before the select statement read it. The select statement read the changed value.Then the update statement was rolled back, leaving the select statement with inaccurate data. The value of 1 that the select statement read was never committed to the database.
In this example it may not matter much, but there are situations where you do not want the query to read uncommitted data. You must consider the data consistency consequences when you use a READCOMMITTED hint.
Monitoring Blocking Events
Blocks are transitory things. The blocking situation can change within milliseconds. Executing our diagnostic query randomly may or may not capture the blocking situation in which we are interested. But there is an alternative to randomly executing the diagnostic query. The code below wraps the diagnostic script in a WAITFOR loop and writes the results to a temporary table. In this example it executes the query every five seconds. You can change that as you like.
Note that the table is created with a double hash mark prefix. This creates a temporary table that is accessible from any session running on the server. This allows you to examine the table results of the query from another connection without having to stop the loop. You could also achieve the same effect if you created it as a permanent table.
This code runs in an endless loop, so remember to stop it when you are done collecting data.
- CREATE TABLE ##BlockingCode
- (resource_type nvarchar(120)
- ,[database] nvarchar(128)
- ,[blk object] bigint
- ,request_mode nvarchar(120)
- ,request_session_id int
- ,waiter_text text
- ,blocking_session_id int
- ,blocking_lock varchar(20)
- ,blocker_text text
- ,batchdate datetime default getdate())
- WHILE 1=1 — endless loop, remember to close connection
- BEGIN
- INSERT INTO ##BlockingCode
- (resource_type
- ,[database]
- ,[blk object]
- ,request_mode
- ,request_session_id
- ,waiter_text
- ,blocking_session_id
- ,blocking_lock
- ,blocker_text)
- SELECT t1.resource_type
- ,db_name(resource_database_id) as [database]
- ,t1.resource_associated_entity_id as [blk object]
- ,t1.request_mode
- ,t1.request_session_id — spid of waiter
- ,(SELECT text FROM sys.dm_exec_requests as r — get sql for waiter
- cross apply sys.dm_exec_sql_text(r.sql_handle) WHERE r.session_id = t1.request_session_id) as waiter_text
- ,t2.blocking_session_id — spid of blocker
- , (SELECT TOP 1 request_mode
- FROM sys.dm_tran_locks t1
- JOIN sys.dm_os_waiting_tasks t2
- ON t1.request_session_id = t2.blocking_session_id
- WHERE request_mode NOT LIKE ‘IX%’
- AND resource_type NOT LIKE ‘DATABASE’
- AND resource_type NOT LIKE ‘METADATA%’
- ORDER BY request_mode desc) AS blocking_lock
- ,(SELECT text FROM sys.sysprocesses AS p
- cross apply sys.dm_exec_sql_text(p.sql_handle)
- WHERE p.spid = t2.blocking_session_id) AS blocker_text
- FROM
- sys.dm_tran_locks as t1,
- sys.dm_os_waiting_tasks as t2
- WHERE
- t1.lock_owner_address = t2.resource_address
- WAITFOR DELAY ’00:00:05′
- END
- GO
Using the Diagnostic Query
Normally, I would execute the diagnostic query and let the loop run for a reasonable period of time capturing blocking behavior at 5 second intervals. While it was running I would open up another query window in Management studio and execute the code below periodically to check progress:
- SELECT [database], [blk object], count(*)
- FROM ##BlockingCode
- GROUP BY [database], [blk object]
- SELECT *
- FROM ##BlockingCode
- ORDER BY batchdate DESC
The first query returns a count of the number of times that access was blocked to a specific resource at each 5 second snapshot.
Although tables are by far the objects you will see most frequently, other types of objects can appear in the output. For example If the column [blk object] resolves to a stored procedure name it is highly likely that it is a procedure that recompiles a lot and you are seeing compile locks.
Depending on a number of things, [blk object] can be either an object id or a HOBT id. If it is an object id it can be resolved to an object name using SELECT Object_Name([blk object]) You must run this statement in the database where the object resides.
If [blk object is a HOBT id, resolving the object name is more involved. Here is the code that will resolve a hobt id to an object name:
- select OBJECT_NAME(object_id)
- from sys.partitions
- where hobt_id = 72057594043695104
- (substitute your hobt id)
What is a HOBT ID?
In the later versions of SQL Server, every table is partioned, although they have only one partition unless you have explicitly created a partition scheme on them. In order to support table partitioning, Microsoft had to create a unique identifier for each partition.
Partition information is exposed in a new system view, sys.partitions. If you query that view you will find that the partition id and the hobt id are always the same in each partition (go figure). Also in the view is the object id of the object that owns that partition.
In the results of our monitoring, you can tell whether [blk object] is an object id from the size of the number. A hobt id is something like 17 digits long. For user-created objects it usually begins with 72 The much smaller numbers you see in this column are object ids and can be resolved as described above.
Coming Attractions
We have looked at blocking in relation to SQL Server’s default pessimistic isolation mode which uses locks to enforce data consistency. In a future article we will talk about changing SQL Server’s isolation model to a somewhat more optimistic isolation model called Read Committed Snapshot Isolation (RCSI). RCSI resolves the contention between reading and writing processes without using locks at all.
We will discuss Read Committed Snapshot Isolation (RCSI) in the next article.
If you are having serious blocking problems right now and can’t wait a month, contact me to discuss RCSI as a possible solution to your blocking problem. I have a policy of giving anyone who calls up to 30 minutes of my time without cost.
Footnote
There is substantial amount of tsql code in this article. The format of an email newsletter forces me to break lines in the tsql code so that they fit in a narrow column. They are difficult to copy and difficult to read in this form. So, if you prefer you can Email Me and I will send you the code in a more usable format.