Tempdb is used by the database engine and all user-created databases for a wide variety of tasks. The most visible task is the creation of temporary tables by user code, but that is often the least part of the work done by tempdb. It is used by the database engine for worktables supporting sorts, hash joins, etc. Cursors, table variables, table valued functions are all materialized in tempdb. For databases using snapshot isolation modes, tempdb stores the row versions.
It is easy to see why good tempdb performance is essential for good performance in the entire SQL Server instance. However it is common to find that a heavily used tempdb is suffering from a problem called allocation contention that can impair overall SQL Server performance. This problem is rooted in the tempdb data file, not in the tempdb database itself. So, different methods must be used to optimize tempdb.
What is Allocation Contention?
When processes need to use tempdb, they first have to get an allocation of space in the tempdb data file. They get this allocation by updating a specific page in the file called the Shared Global Allocation Map or SGAM for short.
By default, tempdb is created with only one data file. Consequently there is only one SGAM page to service all the processes attempting to access tempdb. When a lot of processes are trying to get a latch on the SGAM page at the same time, they have to wait in line. This contention for the SGAM page can cause a slowdown in performance across the SQL instance.
Does Your Tempdb Have a Contention problem?
Here is a query that can tell you if you your tempdb is experiencing this problem. You may notice that we include resource descriptions for two other types of tempdb allocation pages as well as the SGAM. These pages can similarly be involved in allocation contention but usually not to the same extent as SGAM. We omit a discussion of these other allocation pages for reasons of space.
- SELECT session_id AS [Session],
- wait_duration_ms AS [WaitTime(ms)],
- resource_description AS [Type]
- FROM sys.dm_os_waiting_tasks
- WHERE wait_type LIKE ‘PAGELATCH_%’
- AND (resource_description LIKE ‘2:%:1’
- OR resource_description LIKE ‘2:%:2’
- OR resource_description LIKE ‘2:%:3’)
The result set of this query is a snapshot of contention in your tempdb at the instant the query is executed. Therefore you should execute it several times in succession when the server is under a heavy load. If you consistently get no rows returned, then it is unlikely that you have a problem. If the query returns rows regularly, performance of your SQL Server is probably being affected by allocation contention, the more rows returned, the greater the problem.
If you find you have this problem, there are steps you must take to fix it.
Researching the Problem and the Solution
There is a lot of information on the internet about fixing this problem. However, there is also a lot of misinformation and obsolete information on the internet as well. In particular you might find a Microsoft whitepaper on this subject that looks very authoritative but was written in the era of SQL 2000 or perhaps earlier. You don’t want to follow those directions with later editions of SQL Server and modern server hardware.
This whitepaper has been rewritten for modern versions of SQL Server, but the obsolete paper still shows up in searches. As a consequence, I have found that some DBA’s who formed their opinions about this issue from the original whitepaper have not changed their opinions.
The fix is simple enough. You need to create enough files for tempdb so a requesting process will always be able to quickly find a file with a free SGAM page and will not have to wait. All of the information, misinformation and obsolete information agree on this point. The point of disagreement involves the number of files necessary.
For example, in the older Microsoft Whitepaper I refer to above, it is stated flatly that the number of files should equal the number of processors in use by SQL Server. But at the time the older whitepaper was written, having two processors was state of the art and four processors was almost science fiction. The earlier paper states the requirement in terms of processors rather than cores because multi-core processors did not exist at the time it was written.
Given inexpensive modern multi-core hardware, the number of cpu cores commonly provisioned for SQL Server has increased more than tenfold since the original whitepaper was written.
On modern hardware provisioned with so many logical processors it is unlikely that all the available cores would be occupied with tempdb allocation at the same time.
The current Microsoft best practice is, with up to 8 logical processors, you should create the same number of files as there are logical processors. For SQL Servers with more than 8 logical processors they recommend 8 files total, then if contention on the allocation pages continues, you should increase by 4 files at a time up to the number of logical processors. You can use the query we presented earlier in this article to check contention after creating the files.
The other requirement for these files is that they be exactly the same size without the ability to grow automatically. No article, whitepaper or blog that I have read disagrees with this, but as an item of implementation it is sometimes ignored.
The reason for this requirement is that tempdb implements a round-robin algorithm to access multiple data files. If one file has a busy SGAM page it will move to the next file and so on until it finds an open SGAM page. This algorithm depends on all tempdb files being exactly the same size.
There are reasons why someone might be reluctant to disallow automatic growth on the tempdb files. The first is the fear that under an atypically heavy load, tempdb might run out of space with unpleasant consequences. I have to confess that I have been guilty of that fear. However, I quickly learned that this process just does not work if you don’t disable automatic file growth. Just be sure that you have made the files large enough to have confidence that tempdb is large enough to handle the heaviest load.
How Big Should the Files Be?
This question is impossible to answer precisely. Every SQL instance will be different. You might take a clue from the existing size of the tempdb data file on disk after the SQL server has been on line for a significant amount of time. Since tempdb is, by default, created very small, with autogrowth turned on, we can assume that its current size has been adequate for the heaviest load imposed on tempdb since the last time the SQL instance was started. As a matter of best practice you might consider creating the files at a total size of 10% to 20% larger than the current size of the single file.
Disk IO Has Nothing to Do With It
The second reason someone would skip this step is that they might mistake the purpose of this exercise. Adding database files and spreading them across many physical disks has been a well-known method for optimizing performance for any database, not just tempdb. The distribution of files across different physical disks was a way to spread the physical disk io over multiple disks and thereby improve performance.
With the advent of the SAN, the idea of spreading disk io across multiple physical drives has become much less important than it was in years past. However, old habits die hard and someone might mistake the purpose of multiple disks and decide that disabling autogrowth is not absolutely necessary to optimize tempdb. While you certainly can create the tempdb files on different disks, spreading disk io across multiple data files is not the primary goal of this exercise
A Further Optimization Suggestion
Solid state disks are an awfully good option if you are having performance problems that you can trace to tempdb. The price has come down dramatically. The current price of an SSD large enough to hold just the tempdb is quite reasonable and the SSD is usually quite effective in improving performance.
Fixing Allocation Contention in All Databases
You might want to take a look at the newer version of the whitepaper we discussed above. In addition to discussing the optimization of tempdb as we do in this article, it discusses reducing allocation contention on all databases using the trace flag —T1118.
This trace flag affects all databases and there is a downside risk to using it. We do not have the space here to provide an adequate discussion of the risks, but the subject might interest you. You can find it at this URL