Auditors are at your desk, Now what? – Job Failures

Having worked for both a casino and a publicly traded company, audits have been a big part of my job for quite a few years.  These audits come in many different shapes and sizes, however there are some things that they all have in common, the auditors want evidence!!!  And lots of it!

Surviving an audit and passing an audit are two distinctly different things.  While this blog post is more about surviving, many of these same tools can also be used to help you pass an audit. To me surviving an audit is more about keeping your sanity and being as relaxed as possible.

Please keep this in mind as your read this post.  This is solely based on my experience of working with auditors.  You may have different experiences that might be a bit outside what I am going to talk about here and in future posts.

Before getting into these useful tools, I want to cover a few basics that I have followed for a number of years.

Get auditors out ASAP

The longer the auditors are in your environment the more opportunity there is for them to find something.  While we all work very hard to do things the right way, we are human and we do make mistakes.  We just don’t want the auditors to find our mistakes. You might be wondering, how can to accomplish this?   The best way to do this is to have the evidence readily available.

Don’t volunteer information

Only provide the evidence they are asking for, nothing more.  You want to avoid giving them things that they do not ask for.  By doing so, you might expose something that you don’t want them to see.  The last thing you want to do, is give them a reason for documenting a finding. Another thing to remember is the audits have a defined scope.

Be honest

Don’t attempt to hide something from the auditors.  They will find what ever the issue is eventually.  Most importantly, we don’t want them thinking we are hiding something from them.  If the auditor doesn’t trust your work, it is NEVER a good thing.

Auditors don’t want to hear what you are going to do, they want to see what you have done

If you find something that you feel might be an issue during an audit, start the steps to rectify the situation.  Sometimes if you can show the auditors that you have already taken steps to address the issue, they may be more understanding and things might work out a bit better for you and your company.

Do the right thing every day.

Sadly, doing this might make your job a bit more difficult on a daily basis, but it will make audits much easier to get through.  When the auditors walk in the door, it is too late.  Any thing that might need to be fixed will more than likely not be able to be addressed before they find the issue.  However, if we do the right thing daily, there won’t be much for them to find.

Address Issues in a timely manner

Things fail, that is a reality that we must except. Many times the audits I have been part of, the auditors not only look for documentation that the issue was addressed, but also the it is addressed in a timely manner.  It is important to document what you do, including timestamps.

Remember they are just doing their jobs

Just like you, auditors are held accountable by their management. While audits can be challenging, it is important to remember that the auditors are just simply doing their jobs.

Over the years auditors have asked for lots of information. Sometimes this information is very easy to get, sometimes it might be a bit more challenging.  What I have found that works for me is to have a process already defined and the code ready to go.

So what types of things have auditors asked for?  Below are a few of the common items I have been asked to present to the auditors over the years..  Of course there are a few more items, but these are the main ones that seem to be part of most of the audits I have participated in.  I will go over each of these a bit more in depth in this post and in future posts.

      • Key SQL Server job failures
      • Backup history
      • Orphaned Logins
      • Orphaned database users
      • Who has sysadmin rights
      • Who has db_owner, DDLAdmin and DML rights on key databases
      • Separation of duties

There are a number of tools that you can utilize to capture the required information. However the focus of this series of blog posts is to utilize native SQL Server functionality.  Below are some of the tools that are available in the SQL Server environment.

      • SQL Server Management Studio
      • Azure Data Studio
      • Azure Data Studio Notebooks
      • Transact SQL
      • Data Management Views
      • SQL Server Audit

Key SQL Server job failures

So exactly what is a key job?  In most of the audits I have been a part of, a key job is defined as any job that could impact the financial reporting or financial decision making of the business.  As a DBA, there are a number of jobs that we might have as part of our daily maintenance routine.  For the most part, these usually are not considered key jobs from an audit perspective, with the sole exception of database backups.  We will cover backups a bit more in depth in a later post.

On numerous occasions the auditors have not only asked for job failures, they have also looked for automated ways for notifications that the job failed.  This is where operators can help.  Having the job email the DBA team that it failed has been very helpful.  We have a PowerShell script that automatically creates an incident in Service Now.  However, sometimes that will fail because of something beyond our control.  In these rare cases, having the email that was automatically generated when the job failed was critical to meeting the requests of the auditors.  It is important to also be aware of the company email retention policy.

We can use Transact SQL to gather this information. However, before getting to the TSQL, we need to take a moment to go over the job history retention settings.

Another method you could use is to query the registry.  To do this you will need to use the xp_instance_regread Extended Stored procedure.  The maximum for the job history log size is 99,999 with the minimum for the job history rows per job is 2.  Here is a very good post on this topic.

DECLARE @JobHistMaxTotalRows INT = NULL,
@JobHistMaxTotalRowsPerJob INT = NULL

EXECUTE master.dbo.xp_instance_regread N’HKEY_LOCAL_MACHINE’,
N’SOFTWARE\Microsoft\MSSQLServer\SQLServerAgent’,
N’JobHistoryMaxRows’,
@JobHistMaxTotalRows OUTPUT

EXECUTE master.dbo.xp_instance_regread N’HKEY_LOCAL_MACHINE’,
N’SOFTWARE\Microsoft\MSSQLServer\SQLServerAgent’,
N’JobHistoryMaxRowsPerJob’,
@JobHistMaxTotalRowsPerJob OUTPUT

SELECT @JobHistMaxTotalRows AS ‘Max Rows’
, @JobHistMaxTotalRowsPerJob AS ‘Max Rows Per Job’

This code will return a data set similar to what is below.

Just a couple of gotchas on this.  If you have a maintenance plan that cleans just job history, that may delete what could be essential data for an audit.  Also, if you make it too big and you have a lot of job executions, looking at the history using the GUI could time out.  You will need to adjust these settings to values that are appropriate for your environment.  My suggestion is to establish a standard and use that on all the servers.

Now let’s get to the actual job failures.  In order to pull this information using TSQL you will need to use the sysjobs table, sysjobhistory table and the agent_datetime function. All three of these objects can be found in the MSDB database.

Most of the audits I have been involved in have been for a specific quarter and we have only had to pull job failures for that quarter. Since the date and time of the job execution are in separate columns and are INT data type, we can use the agent_datetime function to convert it to a date. Since this is an undocumented function, it is recommended to not use this in production code in case in changes in future versions of SQL Server. If you look at the data types of the run_date and run_time columns you will see that that are the INT datatype.  This function will convert them from INT to Datetime. Pinal Dave has a nice alternative you might want to check out as well, click here.  It is always good to have more than one option.

Below is the code you can use to return job failures for a quarter.

SELECT dbo.agent_datetime(h.run_date,h.run_time) AS ‘Date Of Failure’
, j.name AS ‘Job Name’
, h.message AS ‘Error’
FROM msdb.dbo.sysjobs j
INNER JOIN msdb.dbo.sysjobhistory h
ON h.job_id = j.job_id
WHERE h.run_status = 0
AND h.step_id = 0
AND dbo.agent_datetime(h.run_date,h.run_time) >= ’04/01/2020′
AND dbo.agent_datetime(h.run_date,h.run_time) <= ’06/30/2020′
ORDER BY dbo.agent_datetime(h.run_date,h.run_time) DESC

In the WHERE clause you will see both the run_status and step_id are used as part of the criteria.  The run_status column is in the sysjobhistory table. Notice that we are looking for a value of 0.  This is the value for a failure.  Below are the other options and more information can be found here.

0 = Failed

1 = Succeeded

2 = Retry

3 = Canceled

4 = In Progress

This brings us to the step_id column.  Like the run_status column, this column can also be found in the sysjobhistory column.  The values in this column will range from 0 to the number of steps in any job.  If you have a job that has 10 steps, there will be 11 rows entered into the sysjobhistory table every time the job executes. When a job executes, SQL Server will place a record for each step of the job and one record for the over all execution of the job.  It is this over all execution that we are looking for.  That record will have a step_id value of 0.  If you look below you will see three job failures.  You also see step ID 0 for each of the jobs.

Notice that the ClassDemo job also has two rows, but only has the step ID’s of 0 and 2.  This is because the job actually has two steps, the first step was skipped during the execution of the job.  Remember that we are only looking for the jobs that have failed, not necessarily what step it failed on.  Because of this, we only need the over all status, step_id 0.

While I prefer to use TSQL to get the job failures, some auditors may still want a screenshot in SQL Server Management Studio.  To get this simply right click on the job and go to Job History.  You will probably need to apply a filter to the results.  When you do, it will look similar to what is below.

Occasionally auditors don’t just look for the job failures, they also looked for incidents in our ticketing system that showed that the failure was properly addressed. So you should be prepared to show them those as well.

This is the first of several posts on “How to Survive an Audit using Native SQL Server Tools”.  I will cover these topics in future posts.

            • Backup history
            • Orphaned Logins
            • Orphaned database users
            • Who has sysadmin rights
            • Who has db_owner, DDLAdmin and DML rights on key databases
            • Separation of duties

Audits are not as painful if we are prepared!

Thanks for visiting my blog!!!