How Often does SQL Server look for Deadlocks

Although life would be much easier for every DBA if deadlocks never happened, we all realize that they do happen.  There is a great deal of information out there on the causes of and solutions for deadlocks.  The focus of this post is more about how often does SQL Server check to see if a deadlock is happening.

Prior to getting into how SQL Server detects deadlocks, let’s take a few minutes to quickly go over exactly what a deadlock it.

Microsoft states this on their website:

“A deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock”

Because deadlocks happen when two task permanently block each other, without a deadlock, both process will simply block forever. Of course this could never be good in a production system.  It is important that these situations be identified and dealt with in some manner.  This is where SQL Server database engine steps in, it is frequently searching the lock manager looking for deadlocks.

Initially the interval the deadlock check takes place is every 5 seconds. This interval may decrease depending on if it finds a deadlock.  If a deadlock is not found, the check will remain at every 5 seconds.  As mentioned, 5 seconds is the default interval, it is also the maximum number of seconds between checks. If SQL Server finds deadlock, then the interval will be decreased and checks will happen more frequently.  The minimum interval is 100 ms.  Finally, if deadlocks were found at one point and they longer exist the interval will return to 5 seconds.

While there are many options that can be changed in SQL Server, the default frequency for deadlock checks is not among the list.  This option is not configurable.  These checks are very light weight and should not cause any issues on the system.  If SQL Server is checking more frequently than 5 seconds, that means it is finding deadlocks and your time and energy should be focused on addressing that.

In addition to the interval for the check of deadlocks, there are a few other items related to deadlocks that I feel are important to know as a DBA.  The most obvious is what are some of the causes of deadlocks. If you Google what causes deadlocks you will find a ton of information. I also feel that it is important to know how to get information about the deadlock as well as how SQL Server determines which process well be the victim.  I have a blog post on one of the methods SQL Server uses to determine to victim, it can be found here.  This post is about the DeadLock_priority option.  SQL Server will also use the cost to rollback as one of the criteria to determine of the victim.

A logical question would be how do I capture information about deadlocks.  There are a number of ways to do this within SQL Server.  I like to use Extended Events.  I have a blog post on how to use Extended Events to capture the deadlock graph.  It can be found here. You can either create your own Extended Event session or use the System_health session that already exists.  You can also use traceflags 1222 and 1204.  In older versions of SQL Server, the trace flags were the only way to capture this information.

You can also use a third part monitoring tool such as the SQL Monitor by Red-Gate.  This is a great tool!  Here is a link to a demo.

Here are some really good links on deadlocks

 How to Fix SQL Server deadlocks – Grant Fritchey at Red-Gate

 Overview of Deadlocks – www.sqlshack.com

 Analyzing Deadlocks – Microsoft

I will also be doing future blog posts on deadlocks from time to time.  Thank you for visiting my blog!!!

 

 

Extended Events – Database Restore

Many of us have restored productions databases.  Of course no one wants to do it, but we all acknowledge that there are times in which that is exactly what we need to do, restore a database.  Have you ever wondered what steps SQL Server actually goes through to complete the restore?  Extended Events offers an event that will allow us to see inside the restore process.

When I did a search of what events are available for restore operations, I found two, backup_restore_progress_trace and databases_backup_restore_throughput. Backup_restore_progress_trace will be the focus of this post.

Backup_restore_progress_trace does exactly what you might think it would do, documents the internal steps of the restore process.

Below are all the steps that Backup_restore_progress_trace picked up during a restore of a very small database.  These are also in the order in which the steps took place.  These steps are for a restore to a database that does not exist yet.

RESTORE DATABASE started
Opening the backup set
Processing the leading metadata
Planning begins
Effective options: Checksum=0, Compression=0, Encryption=0, BufferCount=6, MaxTransferSize=64 KB
Planning is complete
Beginning OFFLINE restore
Attached database as DB_ID=5
Preparing containers
Containers are ready
Restoring the backup set
Estimated total size to transfer = 2838528 bytes
Transferring data
FileHandleCache: Initial CacheSize: 48
BackupStream(0): Processing MSDA of size 43 extents
6 percent (196608/2838528 bytes) processed
11 percent (327680/2838528 bytes) processed
16 percent (458752/2838528 bytes) processed
20 percent (589824/2838528 bytes) processed
25 percent (720896/2838528 bytes) processed
30 percent (851968/2838528 bytes) processed
36 percent (1048576/2838528 bytes) processed
41 percent (1179648/2838528 bytes) processed
46 percent (1310720/2838528 bytes) processed
50 percent (1441792/2838528 bytes) processed
55 percent (1572864/2838528 bytes) processed
60 percent (1703936/2838528 bytes) processed
66 percent (1900544/2838528 bytes) processed
71 percent (2031616/2838528 bytes) processed
76 percent (2162688/2838528 bytes) processed
80 percent (2293760/2838528 bytes) processed
85 percent (2424832/2838528 bytes) processed
90 percent (2555904/2838528 bytes) processed
96 percent (2752512/2838528 bytes) processed
BackupStream(0): Completed MSDA
Waiting for log zeroing to complete
Log zeroing is complete
BackupStream(0): Processing MSTL (FID=2, VLFID=36, size=65536 bytes)
100 percent (2838528/2838528 bytes) processed
Data transfer is complete
Backup set is restored
Offline roll-forward begins
Processing 8 VLF headers
Processing VLF headers is complete
First LSN: 36:4232:37, Last LSN: 36:4272:1
Stop LSN: 36:4272:1
Offline roll-forward is complete
Database fixup is complete
Transitioning database to ONLINE
Restarting database for ONLINE
PostRestoreContainerFixups begins
PostRestoreContainerFixups is complete
PostRestoreReplicationFixup begins
PostRestoreReplicationFixup is complete
Database is restarted
Resuming any halted Fulltext crawls
Writing history records
Writing history records is complete (elapsed = 58 ms)
MSDB maintenance is complete
RESTORE DATABASE finished

Many these steps are pretty straight forward.  They can also easily group the steps into phases.

The first few steps are simply to prepare for the restore.  Once SQL Server is done preparing, it then starts the restore.  You can see there are a number of steps in this process, including the assigning of a database ID.  Then you can see the increments in percent of where the restore stands.  Once the restore is complete, SQL Server moves on to the steps in which SQL Server a rolls forward transactions, brings the database online and starts the database.  After this, SQL Server the writes the history into the MSDB system tables.

While these steps are for a new database, many of the same steps will take place for the restore of an existing database.  The most noticeable additions to the steps are the acquiring of database locks.  As you can see below, these are Exclusive locks.

Acquiring X lock on the database

Acquired X lock on the database

SQL Server will spend varied amounts of time on each step.  As expected, the most time is spent on the actual transfer of data to the restored database. Below you will see a table with two columns, timestamp for each and the step name for the restored I completed for this test.  While this process only took a few seconds, a larger database would obviously take much more time.

Time Stamp At Start of Step Operation
00:11:38.457 RESTORE DATABASE started
00:11:38.457 Opening the backup set
00:11:38.473 Processing the leading metadata
00:11:38.473 Planning begins
00:11:38.490 Effective options: Checksum=0,

Compression=0, Encryption=0, BufferCount=6, MaxTransferSize=64 KB

00:11:38.490 Planning is complete
00:11:38.490 Beginning OFFLINE restore
00:11:38.490 Attached database as DB_ID=5
00:11:38.490 Preparing containers
00:11:38.577 Containers are ready
00:11:38.640 Restoring the backup set
00:11:38.640 Estimated total size

to transfer = 2838528 bytes

00:11:38.640 Transferring data
00:11:38.640 FileHandleCache: Initial CacheSize: 48
00:11:38.640 BackupStream(0): Processing

MSDA of size 43 extents

00:11:38.653 6 percent (196608/2838528 bytes) processed
00:11:38.657 11 percent (327680/2838528 bytes) processed
00:11:38.660 16 percent (458752/2838528 bytes) processed
00:11:38.667 20 percent (589824/2838528 bytes) processed
00:11:38.687 25 percent (720896/2838528 bytes) processed
00:11:38.687 30 percent (851968/2838528 bytes) processed
00:11:38.700 36 percent (1048576/2838528 bytes) processed
00:11:38.713 41 percent (1179648/2838528 bytes) processed
00:11:38.713 46 percent (1310720/2838528 bytes) processed
00:11:38.713 50 percent (1441792/2838528 bytes) processed
00:11:38.740 55 percent (1572864/2838528 bytes) processed
00:11:38.740 60 percent (1703936/2838528 bytes) processed
00:11:38.763 66 percent (1900544/2838528 bytes) processed
00:11:38.763 71 percent (2031616/2838528 bytes) processed
00:11:38.763 76 percent (2162688/2838528 bytes) processed
00:11:38.790 80 percent (2293760/2838528 bytes) processed
00:11:38.790 85 percent (2424832/2838528 bytes) processed
00:11:38.797 90 percent (2555904/2838528 bytes) processed
00:11:38.817 96 percent (2752512/2838528 bytes) processed
00:11:38.837 BackupStream(0): Completed MSDA
00:11:38.840 Waiting for log zeroing to complete
00:11:38.877 Log zeroing is complete
00:11:38.877 BackupStream(0): Processing MSTL (FID=2, VLFID=36, size=65536 bytes)
00:11:38.880 100 percent (2838528/2838528 bytes) processed
00:11:38.897 Data transfer is complete
00:11:38.910 Backup set is restored
00:11:38.957 Offline roll-forward begins
00:11:38.957 Processing 8 VLF headers
00:11:38.960 Processing VLF headers is complete
00:11:38.963 First LSN: 36:4232:37, Last LSN: 36:4272:1
00:11:38.963 Stop LSN: 36:4272:1
00:11:38.967 Offline roll-forward is complete
00:11:38.990 Database fixup is complete
00:11:38.993 Transitioning database to ONLINE
00:11:38.993 Restarting database for ONLINE
00:11:39.093 PostRestoreContainerFixups begins
00:11:39.097 PostRestoreContainerFixups is complete
00:11:39.100 PostRestoreReplicationFixup begins
00:11:39.267 PostRestoreReplicationFixup is complete
00:11:39.277 Database is restarted
00:11:39.297 Resuming any halted Fulltext crawls
00:11:39.307 Writing history records
00:11:39.367 Writing history records

is complete (elapsed = 58 ms)

00:11:39.367 MSDB maintenance is complete
00:11:39.370 RESTORE DATABASE finished

Hidden within the steps are some very useful tidbits of information.

    • Database ID
    • Number of Extents
    • Number of bytes restored
    • Beginning and Ending LSN for the roll forward
    • Total time to write history records

While the very small test database took about 60 steps to complete, I was wondering what if the database was bigger.  Using the AdventureWork2014 database, I completed a restore and there were many more steps, a total of 114 steps.  Below are the steps in order.

backup_restore_progress_trace 2019-06-18 11:17:01.5974587 +00:00 BACKUP LOG started

backup_restore_progress_trace 2019-06-18 11:17:01.5975148 +00:00 Acquiring U lock on the database

backup_restore_progress_trace 2019-06-18 11:17:01.5975471 +00:00 Synchronizing with other operations on the database is complete

backup_restore_progress_trace 2019-06-18 11:17:01.5988528 +00:00 Halting Fulltext crawls

backup_restore_progress_trace 2019-06-18 11:17:01.5994759 +00:00 Acquiring X lock on the database

backup_restore_progress_trace 2019-06-18 11:17:01.5995329 +00:00 Acquired X lock on the database

backup_restore_progress_trace 2019-06-18 11:17:01.6135276 +00:00 Writing a failover checkpoint

backup_restore_progress_trace 2019-06-18 11:17:01.6753875 +00:00 Failover checkpoint is complete

backup_restore_progress_trace 2019-06-18 11:17:01.6755993 +00:00 Failover FlushCache is complete

backup_restore_progress_trace 2019-06-18 11:17:01.6756137 +00:00 Opening the backup media set

backup_restore_progress_trace 2019-06-18 11:17:01.6801075 +00:00 The backup media set is open

backup_restore_progress_trace 2019-06-18 11:17:01.6801651 +00:00 Preparing the media set for writing

backup_restore_progress_trace 2019-06-18 11:17:01.6811968 +00:00 The media set is ready for backup

backup_restore_progress_trace 2019-06-18 11:17:01.6812025 +00:00 Effective options: Checksum=0, Compression=0, Encryption=0, BufferCount=7, MaxTransferSize=1024 KB

backup_restore_progress_trace 2019-06-18 11:17:01.6812297 +00:00 Start LSN: 500:27608:118, SERepl LSN: 0:0:0

backup_restore_progress_trace 2019-06-18 11:17:01.6812385 +00:00 First LSN: 630:28048:1

backup_restore_progress_trace 2019-06-18 11:17:01.6812477 +00:00 Estimated total size = 542920704 bytes (data size = 0 bytes, log size = 542920704 bytes)

backup_restore_progress_trace 2019-06-18 11:17:01.6812508 +00:00 Work estimation is complete

backup_restore_progress_trace 2019-06-18 11:17:01.6812559 +00:00 Last LSN: 663:7624:1

backup_restore_progress_trace 2019-06-18 11:17:01.6812606 +00:00 Scanning filestream data

backup_restore_progress_trace 2019-06-18 11:17:01.6819216 +00:00 Scanning filestream data is complete

backup_restore_progress_trace 2019-06-18 11:17:01.6819294 +00:00 Writing the leading metadata

backup_restore_progress_trace 2019-06-18 11:17:01.6819931 +00:00 BackupStream(0): Writing leading metadata to the device C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\Backup\AdventureWorks2014_LogBackup_2019-06-18_06-16-59.bak

backup_restore_progress_trace 2019-06-18 11:17:01.6821581 +00:00 Copying filestream data

backup_restore_progress_trace 2019-06-18 11:17:01.6905789 +00:00 Copying filestream data is complete

backup_restore_progress_trace 2019-06-18 11:17:01.6905984 +00:00 Copying transaction log

backup_restore_progress_trace 2019-06-18 11:17:01.6907865 +00:00 MediaFamily(0): FID=2, VLFID=630, DataStreamSize=2424832 bytes

backup_restore_progress_trace 2019-06-18 11:17:01.7037592 +00:00 MediaFamily(0): FID=2, VLFID=631, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:01.7704909 +00:00 MediaFamily(0): FID=2, VLFID=632, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:01.8050246 +00:00 5 percent (27590656/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:01.8396719 +00:00 MediaFamily(0): FID=2, VLFID=633, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:01.9147575 +00:00 MediaFamily(0): FID=2, VLFID=634, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:01.9238444 +00:00 10 percent (54853632/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:01.9810017 +00:00 MediaFamily(0): FID=2, VLFID=635, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.0323014 +00:00 15 percent (82116608/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:02.0498153 +00:00 MediaFamily(0): FID=2, VLFID=636, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.1195952 +00:00 MediaFamily(0): FID=2, VLFID=637, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.1464469 +00:00 20 percent (109379584/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:02.1877744 +00:00 MediaFamily(0): FID=2, VLFID=638, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.2570845 +00:00 25 percent (136642560/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:02.2573724 +00:00 MediaFamily(0): FID=2, VLFID=639, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.3263001 +00:00 MediaFamily(0): FID=2, VLFID=640, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.3702739 +00:00 30 percent (163905536/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:02.3941585 +00:00 MediaFamily(0): FID=2, VLFID=641, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.4635771 +00:00 MediaFamily(0): FID=2, VLFID=642, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.4768947 +00:00 35 percent (190119936/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:02.5315044 +00:00 MediaFamily(0): FID=2, VLFID=643, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.5872516 +00:00 40 percent (217382912/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:02.6020868 +00:00 MediaFamily(0): FID=2, VLFID=644, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.6732336 +00:00 MediaFamily(0): FID=2, VLFID=645, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.7043113 +00:00 45 percent (244645888/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:02.7416025 +00:00 MediaFamily(0): FID=2, VLFID=646, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.8128799 +00:00 MediaFamily(0): FID=2, VLFID=647, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.8199368 +00:00 50 percent (271908864/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:02.8834042 +00:00 MediaFamily(0): FID=2, VLFID=648, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:02.9338510 +00:00 55 percent (299171840/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:02.9552501 +00:00 MediaFamily(0): FID=2, VLFID=649, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.0292680 +00:00 MediaFamily(0): FID=2, VLFID=650, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.0524488 +00:00 60 percent (326434816/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:03.0999388 +00:00 MediaFamily(0): FID=2, VLFID=651, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.1645396 +00:00 65 percent (353697792/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:03.1683370 +00:00 MediaFamily(0): FID=2, VLFID=652, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.2374707 +00:00 MediaFamily(0): FID=2, VLFID=653, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.2742992 +00:00 70 percent (380960768/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:03.3044229 +00:00 MediaFamily(0): FID=2, VLFID=654, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.3723517 +00:00 MediaFamily(0): FID=2, VLFID=655, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.4062676 +00:00 75 percent (408223744/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:03.4598903 +00:00 MediaFamily(0): FID=2, VLFID=656, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.5105134 +00:00 80 percent (434438144/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:03.5290138 +00:00 MediaFamily(0): FID=2, VLFID=657, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.6010196 +00:00 MediaFamily(0): FID=2, VLFID=658, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.6293332 +00:00 85 percent (461701120/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:03.6878991 +00:00 MediaFamily(0): FID=2, VLFID=659, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.7633069 +00:00 90 percent (488964096/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:03.7636287 +00:00 MediaFamily(0): FID=2, VLFID=660, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.8362761 +00:00 MediaFamily(0): FID=2, VLFID=661, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.8837162 +00:00 95 percent (516227072/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:03.9154381 +00:00 MediaFamily(0): FID=2, VLFID=662, DataStreamSize=16777216 bytes

backup_restore_progress_trace 2019-06-18 11:17:03.9883764 +00:00 MediaFamily(0): FID=2, VLFID=663, DataStreamSize=3932160 bytes

backup_restore_progress_trace 2019-06-18 11:17:04.0073959 +00:00 100 percent (542920704/542920704 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:04.0187093 +00:00 Copying transaction log is complete

backup_restore_progress_trace 2019-06-18 11:17:04.0187231 +00:00 Writing the trailing metadata

backup_restore_progress_trace 2019-06-18 11:17:04.0188013 +00:00 BackupStream(0): Writing trailing metadata to the device C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\Backup\AdventureWorks2014_LogBackup_2019-06-18_06-16-59.bak

backup_restore_progress_trace 2019-06-18 11:17:04.0188753 +00:00 Writing the end of backup set

backup_restore_progress_trace 2019-06-18 11:17:05.1136445 +00:00 Resuming any halted Fulltext crawls

backup_restore_progress_trace 2019-06-18 11:17:05.1145570 +00:00 Writing history records

backup_restore_progress_trace 2019-06-18 11:17:05.1570406 +00:00 Writing history records is complete (elapsed = 43 ms)

backup_restore_progress_trace 2019-06-18 11:17:05.1571172 +00:00 BACKUP LOG finished

backup_restore_progress_trace 2019-06-18 11:17:05.1846587 +00:00 Acquiring U lock on the database

backup_restore_progress_trace 2019-06-18 11:17:05.1846855 +00:00 RESTORE DATABASE started

backup_restore_progress_trace 2019-06-18 11:17:05.1846983 +00:00 Acquiring X lock on the database

backup_restore_progress_trace 2019-06-18 11:17:05.1847096 +00:00 Acquired X lock on the database

backup_restore_progress_trace 2019-06-18 11:17:05.1848243 +00:00 Opening the backup set

backup_restore_progress_trace 2019-06-18 11:17:05.1929402 +00:00 Processing the leading metadata

backup_restore_progress_trace 2019-06-18 11:17:05.1941467 +00:00 Planning begins

backup_restore_progress_trace 2019-06-18 11:17:05.2423481 +00:00 Effective options: Checksum=0, Compression=0, Encryption=0, BufferCount=6, MaxTransferSize=1024 KB

backup_restore_progress_trace 2019-06-18 11:17:05.2423558 +00:00 Planning is complete

backup_restore_progress_trace 2019-06-18 11:17:05.2426545 +00:00 Beginning OFFLINE restore

backup_restore_progress_trace 2019-06-18 11:17:05.8613121 +00:00 Preparing containers

backup_restore_progress_trace 2019-06-18 11:17:05.9191974 +00:00 Preparing FileStream container: c:\data\imoltp_mod1

backup_restore_progress_trace 2019-06-18 11:17:06.1012537 +00:00 Prepared FileStream container: c:\data\imoltp_mod1

backup_restore_progress_trace 2019-06-18 11:17:06.1013026 +00:00 Containers are ready

backup_restore_progress_trace 2019-06-18 11:17:06.1842403 +00:00 Restoring the backup set

backup_restore_progress_trace 2019-06-18 11:17:06.1842624 +00:00 Estimated total size to transfer = 2909552640 bytes

backup_restore_progress_trace 2019-06-18 11:17:06.1855198 +00:00 Transferring data

backup_restore_progress_trace 2019-06-18 11:17:06.1856031 +00:00 FileHandleCache: Initial CacheSize: 48

backup_restore_progress_trace 2019-06-18 11:17:06.1883666 +00:00 BackupStream(0): Processing MSDA of size 8288 extents

backup_restore_progress_trace 2019-06-18 11:17:07.5410761 +00:00 5 percent (145752064/2909552640 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:08.9161089 +00:00 10 percent (291504128/2909552640 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:10.3224426 +00:00 15 percent (437256192/2909552640 bytes) processed

backup_restore_progress_trace 2019-06-18 11:17:11.4572357 +00:00 BackupStream(0): Completed MSDA

backup_restore_progress_trace 2019-06-18 11:17:11.4580438 +00:00 BackupStream(0): Processing PH6(filestream) data

backup_restore_progress_trace 2019-06-18 11:17:11.5794020 +00:00 BackupStream(0): Completed PH6(filestream) data

backup_restore_progress_trace 2019-06-18 11:17:11.5894867 +00:00 Waiting for log zeroing to complete

There are a lot more steps to a restore than I ever thought there would be.  If you are interested, here is the code I used to create the Extended Event session.  If you decide to use this, please make sure you change the name to something that works for you and change the path for the target files.

CREATE EVENT SESSION [RestoreDatabase] ON SERVER
ADD EVENT sqlserver.backup_restore_progress_trace(
ACTION(sqlos.worker_address,sqlserver.client_hostname,sqlserver.nt_username,sqlserver.sql_text))
ADD TARGET package0.event_file(SET filename=N‘RestoreDatabase’)
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
GO

To read the data, you can utilize SSMS or use TSQL.  If you use SSMS, once you right click on the target and view the data, you can then go to the Extended Events menu item and at the bottom, you will see an option to Export the data.  With this, you can export the data to a table so you can then use TSQL to view the data.

Or you can use the code below.  If you place the .xel file in a different location than the default, you will need to update the path to the files in the function, fn_xe_file_target_read_file.

SELECT event_data = convert(xml, event_data)
INTO #eeTable
FROM sys.fn_xe_file_target_read_file(N’RestoreDatabase*.xel’, null, null, null);

SELECT * FROM #eeTable

SELECT
ts = event_data.value(N'(event/@timestamp)[1]’, N’datetime’),
[StepDescription] = event_data.value(N'(event/date[@name=”trace_message”]/value)[1]’, N’nvarchar(max)’)
FROM #eeTable
ORDER BY ts

DROP TABLE #eeTable

 

Thanks for stopping by my blog!!!

 

Extended Events Session Filters in SSMS

Recently during a demo at a SQL Saturday the query to pull the Extended Event session data, didn’t return the expected results. The session I used for the demo was the create database statement.

Prior to the session, I deleted the Create Database session, however did not delete the target files because they are part of the demo.  Then I recreated the session, just as I had done before.  However, this time was there was a difference when I attempted to read the target data.  The entry for the newly created database was not showing up when I used the GUI, however was showing up when I read the XML.  During the session, I was not able to figure out why that was the case.

When I got home I did some more testing.  As I turns out, when I had done the same session the previous week at a different SQL Saturday, I created a filter for the data and that filter persisted, even though I had deleted the session and created it again.

So I decided to do some testing.  I made sure there were no longer any filters on the session data, then deleted the Create Database session.  Just to have a clean slate, I then restarted the SQL Server service.

Then I created two databases, test4 and test5.  As you can see below, they are both showing up when reading the event file using the GUI.

Then I added a filter to only show the test5 database by right clicking on the test5 value and then clicking, “Filter by this value”.

This resulted in only the test5 database showing in the grid.

Now that the session was up and running, I create another database, test6.   I restarted the Extended Event session to force the new database event to be written to the target.  Then I read the target again.  As you can see below, not only is test4 not showing up, but the newly created test6 is also not appearing.

I right clicked the test5 value and then to “Filter by this value”.  What I saw kind of surprised me a bit.  Below is a screenshot of what I saw, two entries for the test5 value.  This means that one had to exist before I tried to filter by the value again.

I clicked the Clear All button, and sure enough, all three databases now appear.  Just as in the image below.

What I found interesting is that the filter persists, even after deleting the session and creating it again.  Of course, deleting sessions is not a common task and recreating it with the same name and storing the target files in the same location is even less likely to happen.  However, I did find this interesting.

 

Thanks for stopping by my blog!!!!

Create Database Event – When does it fire?

Recently when presenting my Extended Event session at a SQL Saturday an attendee asked if the Create Database session I used as a demo would pick up the Tempdb that is recreated at start up. The answer is that it will not.  But this question got me thinking, what about a restore or attaching a database, would the event pick that up?

I have to admit, while I suspected that it would not pick up the TempDB creation, I was not 100% certain on whether or not the Create Database event fires on start up.  So I wanted to do a quick test and the answer is that the event does not fire when the SQL Server service is restarted.

What about when a database is attached?  When attaching a database, this event does indeed fire.  Which of course makes sense, at least to me, you are creating a new database when attaching the database files.

What about when a database is restored?  Although a restore is in reality a drop and recreate the database, it does NOT fire the Create Database event.  I tried both restoring the database with the same name and a different name and neither one was picked up by the Create Database event.

Thanks for visiting my blog!!!

 

Track Failed Logins with Extended Events

Many things in life have opposites.  There are stop and go, beginning and end and of course in and out.  SQL Server is not any different, you have BEGIN TRAN…COMMIT TRAN, successful login and failed login as well as a few others.  Using Extended Events there is just a login event, not a successful or failed login event.  If you look at the description of the login event this is what you will see:

Occurs when a successful connection is made to the Server. This event is fired for new connection or when connections are reused from a connection pool

Notice is states successful connections.  What if I want to use Extended Events to track failed logins. Logic would say that if there is an event for successful logins, there would be an event for failed logins.  In this case, logic would be wrong.

If you do a search for the work “fail” in the events a number of events will be returned.  Many of the are related to column store and none of them are related to failed logins, most are related to column store.

Although, when I think of Extended Events, I usually think about how I can use it to capture performance related information. However, there are somethings, like failed logins, that could be helpful as well.  Well how can I capture failed logins if there is not an event to do so.

To do this, you have to use the error_reported event and use predicates(filters) to return just failed logins.

As you can see in the above image, there are two filters we need to configure.

Severity is set to equal 14, this one is optional but might help eliminate unnecessary rows

error_number is set to equal 18456

state is set to greater than 8, this can be set to 1 as well

These combined allow you to capture failed logins. If you do a Google search on the error number you will find that the Microsoft documentation states this.

“When a connection attempt is rejected because of an authentication failure that involves a bad password or user name, a message similar to the following is returned to the client: “Login failed for user ‘<user_name>’. (Microsoft SQL Server, Error: 18456)”.”

This might have you wondering why the other predicates are needed.  Well, it is to help eliminate similar rows. If you look at the image below, the two rows that are highlighted are the same failed login event.  One has a severity of 1 and the other has a severity of 8.  The other difference is the error message. As you can see, there is just a bit more information with severity 8 than with severity 1.

Severity 1 message: Login failed for user ‘testgrant’.

Severity 8 message: “Login failed for user ‘testgrant’. Reason: Password did not match that for the login provided. [CLIENT: <local machine>]”

Just a little side note, to read the event file for Extended Events, simply right click on the target file in SQL Server Management Studio and click “View Target Data”.

 

This can be very useful information and in my opinion, might be a bit easier to query than reading the log file.  It is always nice to have options. This works for an on-prem instance of SQL Server, I have not been able to get it to work on an Azure SQL Database yet.  This could be due to the fact that in an Azure SQL Database, Extended Events are set up in the database.  I am still working on that so look for a future blog post on that.

Thanks for visiting my blog.

DEADLOCK_PRIORITY

When two processes are caught up in a deadlock, one will be chosen as the victim and the process will be terminated, while the other process will be successful.  SQL Server will, by default, pick the process that has the lowest cost to rollback.  This can present a potential issue if the terminated process is a business critical process.  The DEADLOCK_PRIORITY allows us to have more control over whether or not a process is chosen as the victim.

Before getting into how to set the DEADLOCK_PRIORITY, let’s quickly go over what the options are.  There are two ways to set the priority of a process. The first option is to use one of the keywords, LOW, NORMAL or HIGH. The second approach is to use a numeric value between -10 and 10.  The lowest value will be chosen as the victim.  For example, LOW will be the victim of the other process is HIGH and -1 will be the victim if the other process is greater than -1.

If you look at the chart below you will see the winners and the losers in the deadlock. These charts identify what process will be allowed to complete and not be the victim.  As you can see, when one process is LOW and the other is MEDIUM, the LOW process will be the victim.  If there is a tie, SQL Server will use the cost to rollback the transaction.

The chart below is somewhat incomplete.  The allowable ranges for the priority as a numeric value is the range between -10 and 10.  As you can see below, if one process is -9 and the other is -7, the -9 will be the victim.  Rather than take the chart all the way out to 10, I think this gives you the idea of how to figure it out beyond the numbers that are in the chart.

So now that we know how SQL Server determines the victim, let’s take a look at how we can determine what the priority was for the processes in a deadlock.

One way is to use the deadlock graph.  The two highlighted items are the deadlock priority for each of the processes.  As you can see, the priority 5 survived while the priority -5 was chosen as the victim.

You can also look at the XML to determine the priority of each of the processes.  In the XML you will look for “taskpriority” in the Process-List node.  As you can see below there are two of them, one for the process that succeeded and the victim. The “victimProcess id” identifies the process that was chosen as the victim.

To set the priority for a process, you would add the code below at the beginning for the batch. This setting is for the session.

SET DEADLOCK_PRIORITY 10

This is what the code would look like if using a numeric priority.  Remember, the value must between -10 and 10, inclusive.

This is what it would look like if you use the LOW, NORMAL, HIGH option.

Thanks for stopping by my blog!!!

 

Extended Events and Execution Plans

The ability to view execution plans is a critical part of any successful troubleshooting effort. It is in the execution plans that you can find a great deal of information.

Just a sample of what can be found in execution plans.

      • Data access type – Scan or a seek
      • What indexes were used
      • What type of physical join type – Hash Match, merge or nested loops
      • If key lookups are being used

Of course this is a very small sample.  Even in this short list, you can see the value in execution plans.  Given the value of execution plans, it is important to understand how these can be captured.  There are a number of methods that can be used.

SQL Server Management Studio is an effective method for capturing a single plan.  When you have a query you would like to see the plan on you have two options, estimated and actual.  The differences is pretty simple, estimated is what SQL Server thinks it will do while the actual execution plan is what it actually did to get the data you are requesting.

To get the estimated execution plan, click the “Display Estimated Execution Plan” under the Query menu.  At this point, SQL Server will get all the information it needs to create the plan, however will not actually execute the query. Statistics are a key part of this and should kept up to date.

When getting the estimated plan, SQL Server does not actually execute the query.  However, when the actual plan is used, SQL Server does execute the query. To get the actual execution plan you can do one of two things, click the toolbar icon highlighted below and execute the query. Or you can click the menu item under the Query menu and then execute the query.

As mentioned this method works great if you have the query and will capture the plan for a single query.  However, what if you want to capture the execution plans over a period of time.

Within SQL Server you have two methods that can be used to accomplish this.  One is Query Store and the other is Extended Events.  Since this is a post about Extended Events, here is some good documentation on Query Store, click here.

When using Extended Events to capture the plan you have two options.

query_pre_execution_showplan – this is the estimated plan

query_post_execution_showplan – this is the actual plan

Either one of these should work.  Once you have chosen the event, the XML for the plan can be found in the “Event Fields” tab and the showplan_XML column.

The easiest way to view this is to use SSMS.  If you navigate to the target and right click on the target.

In this case, right click on the event file and then click “view target data”.  When you do you will then you will see this.  If you look towards the bottom you will see the Query Plan tab. Once you click this, you will then see the plan.

What is missing?  If you look at the two plans below you will see a noticeable difference.  The SELECT operator is missing on the top one.  The top plan is from Extended Events and the bottom one is from SSMS.

Missing the SELECT operator is a big deal.  There is a great deal of information that can be found here. This alone would be a good reason to not use Extended Events to capture execution plans.

If that alone isn’t enough to avoid using Extended Events to capture execution plans, take a look at this warning that can be found in the description of the two events mentioned above.

“Using this event can have a significant performance overhead so it should only be used when troubleshooting or monitoring specific problems for brief periods of time.”

So if I shouldn’t use Extended Events to capture the execution plans, what should I use?  The answer is Query Store.

Thanks for stopping by my blog and I hope you learned something. More importantly, you learned what not to use!

All my Extended Event Options are Greyed out

Have you ever tried to change the options for an Extended Events session, only to see this.

There are a number of options that are greyed out and can’t be changed.  This is because the session is running.  There are some options that can be changed, just none on the Advanced tab.  This is an easy fix, if you need to change one of these options, simply stop the session and make your change.  Don’t forget to restart the session.

Something to keep in mind.  If you are using the file target, stopping and restarting the session could delete your oldest file and will create a new session file.  This could potentially result in the loss of key data.

Thanks for stopping by my blog and I hope you learned something.

Extended Events Max Dispatch Latency

When you create an Extended Event session, there are a number of properties that you can set.  These properties are spread through out the tabs in the Session Properties window.

Below you will see the four tabs,  General, Events, Data Storage and Advanced.

Just as a brief summary let’s go over a few of the key properties in each tab.

In the session tab, the obvious properties is the “Session Name”.  But there is also another key property, “Start the event session at server start up”.  This is an important property that can assure the session starts in the event of a server restart.

The event tab is exactly what you think it might be, it is where you will decide what events you want to capture and what data columns you would like to capture.

There are a number of options for storage and you will set this in the Data Storage tab.  Where the data collected is stored is referred to as targets  Two of the commonly used targets are ring_buffer, this is memory, and file target.

Which brings to the last table, Advanced.  This is where the Max_Dispatch_Latency property can be found.  The max_dispatch_latency property is the maximum duration that an event, once captured, would reside in the buffer before written to the target.  This doesn’t mean it will always take this long, just the maximum amount of time it will take.  The default for this property is 30 seconds.  For the most part, this is a good number and should avoid changing it.

You would logically think that the minimum setting would be zero seconds.  If you think that way, you are correct.  However, 0 does not mean 0 seconds.  When this is set to 0 it means the event will stay in the buffer until the buffer becomes full.  This is the same set setting the “Unlimited” option you see below. Given this, the true minimum is one second.

When I present my Extended Event session I usually use the Create Database event in my demo.  Once, I create the database, I go immediately to the target to see if the event has been written to the target.  Usually that takes just a few seconds.  I only see the event written to the target about 50% of the time.  This is because of the max_dispatch_latency.

Here is a really nice article by Jonathan Kehayias on the properties of an extended event session, click here.

Thanks for visiting my blog and I hope you learned something!

Capture Deadlock Graph with Extended Events

Over the past few years, I have presented a number of sessions on Extended Events.  As I prepare to present the topic a few more times in the coming weeks, I thought I would do a blog post on one particular event that has been very helpful to me when working with issues in a production environment.  That event is related to the Deadlock graph.

There are a number of ways you can capture this.  Profiler has been a very good and very effective way of capturing the Deadlock graph. However, starting with SQL Server 2008, there is now another option, Extended Events.  RedGate also puts out a great tool named SQL Monitor that can be used.

I am not going to get into the ins and outs of Extended Events in the post, I am simply going to cover how to use Extended Events to capture the deadlock graph.  There are a total of five events that have the word “deadlock” in the event name.

Let’s take a moment and quickly go over each of these five events.

database_xml_deadlock_report – This will be the same as the xml_deadlock_report, with one noticeable exception.  The xml_deadlock_report event has an additional tab, while the database_xml_deadlock_report does not have this.  This tab is where we will find the graph.

xml_deadlock_report – This is very similar to the database_xml_deadlock_report with the addition of the deadlock graph.  To see the deadlock graph, simply click on the Deadlock tab.

lock_deadlock – “Occurs when an attempt to acquire a lock is canceled for the victim of a deadlock.”  This definition is from SQL Server. Of the events we have discussed so far, this event has the most data points.

 

lock_deadlock_chain – “Occurs when an attempt to acquire a lock generates a deadlock. This event is raised for each participant in the deadlock.” This definition is from SQL Server.

scheduler_monitor_deadlock_ring_buffer_recorded – “The session tracks other useful events which makes it easy to trace back the series of events which led to the deadlocked schedulers condition!” – From www.troubleshootingsql.com

Since this post is about the deadlock graph, we will focus on the xml_deadlock_report event. As mentioned before, this is the event you will need to capture the deadlock graph.  There are two points to look at, the column is “XML_report” and the other is the graph itself. In the image below you will see that there are two places to look at the actual XML.  If you do not see this column in the grid, you can right click on it in the details at the bottom and then click “Show column in table”.  It might be very challenging to read due to formatting.  If you are having a difficult time reading the XML, this online formatting tool can be very helpful to get good formatting.

This is the actual XML that is used to build the graph.  This is quite a bit of useful information in the XML.  You can see the victim query as well as many other good data points.  The victim process can be found in the “victim-list” node.  In the image below, you will then see two processes, the first one listed matches the victim process id.  The UPDATE just below it is the T-SQL that was chosen as the victim.

 

The deadlock graph, while initially looks very useful, it does have a significant limitation.  The query with the blue X on it is the victim.  If you float your cursor above it, the query will then appear in a context pop up.

If you left click and hold one shapes you can move it around. This can be useful when there are more shapes than you see above.  Now on to the limitation, you can’t zoom it.  Without this, it can be difficult to see some of the details.

Here are just a few interesting deadlock graphs I have been able to capture.


While these are kind of interesting to look at, the scary part is what caused these.  These were all caused by a single stored procedure.  Now you have the graph.  I think now the fun is just beginning trying to figure out what is causing this and a solution.

Thank you for stopping by my blog.