SQL Agent Job Repeat Failures

SQL Agent job failure can be easy to fix but the problem many production DBA’s encounter is the repeat failures of SQL agent job failures.

Identifying that there’s a problem would be your first step to fix the problem permanently. Many customers have asked if there’s a way to find out how many times a SQL Agent job failure has occurred in the last few months.

This information can easily be obtained and can be extremely helpful to identify and permanently fix repeat SQL Agent job failures.

SQLOPS Risk and Health Audit

Below SQLOPS script can be used to identify if you have repeat job failures and if so which are those jobs and how many times have they failed and since when have they been failing.

SQL Agent Job Failure

SQL Agent Job Failure

This is a very lightweight script that can be used by DBAs on production servers. You can also automate this script to run at a given frequency and alert the DBA team when it finds a policy violation.


--This SQLOPS script helps you identify multiple job failures.
--Very useful script for DBAs to help fix problems permanently.
--Credits: https://sqlops.com
--The Most Advanced Risk and Health Audit for Production Databases
--Only $1999 per Production Server
SET NOCOUNT ON
SELECT
@@Servername as 'SERVERNAME',
'SQL AGENT JOBS' as CATEGORY,
'['+UPPER(SysJOBS.Name)+'] Agent Job Multiple Failures Detected'
AS 'JOBS WITH MULTIPLE FAILURES',
'SQLOPS Policy Violation: This job has failed ['+Ltrim(str(Count(SysJobHIST.run_Status)))+'] times since '+ISNULL(Convert(Varchar(12),Cast(cast(min(SysJobHIST.run_date)
as varchar(8)) as Datetime)),'0') AS 'VIOLATION DETECTED'
from MSDB.dbo.sysJobHistory SysJobHIST
right join MSDB.dbo.SysJobs SysJOBS
on SysJOBS.Job_ID=SysJobHIST.Job_ID and SysJobHIST.Run_Status=0
Group by SysJOBS.Name Having Count(SysJobHIST.run_Status) > 1;

Once you identify the list of jobs that are failing repeatedly, you can identify the root cause and fix them permanently.

Credits: https://sqlops.com/sql-server-best-practices/