Which brings me to this post. I really enjoyed the concept of Tom’s Apr02 post on the sysjobhistory table in msdb. He had a query that found the SQL Agent Jobs that are outliers… in other words, jobs that are atypical in terms of their run duration… to put it still another way, this query found jobs that ran longer than what would seem to be statistically “normal”.
Tom’s query looked at the previous 24 hours’ worth of job history and found those job steps that ran longer than the average duration plus two standard deviations (i.e. the mean plus two sigmas). In the picture of the bell curve above, that is in the green and gray zone at the right.
I thought this was a great concept, but I wanted to expand on it and introduce some additional features.
First of all, as Tom mentioned, the sysjobhistory table stores dates and times as integers, which is a real pain in the posterior. The dates are stored as YYYYMMDD integers and the times (and durations) are stored as HHMMSS integers. So, for example, if I look at a random entry in my sysjobhistory table…
select top 1 run_date, run_time, run_durationthe run_time of 74856 means that the item started at 07:48:56 and the run_duration of 106 means that the item ran for a total of 00:01:06, or 1 minute and 6 seconds. So my first job was to translate these goofy integer representations into something meaningful.
from msdb.dbo.sysjobhistory
/*
run_date run_time run_duration
-------- -------- ------------
20110302 74856 106
*/
The run_date is easy, since we are beyond the Middle Ages and all years that we deal with are 4 digits, so I could just directly translate a date by converting the integer to a VARCHAR(8) and then converting that to a DATETIME:
select top 1 convert(datetime,convert(varchar(8),run_date))The run_time takes a little more work, because the CONVERT function insists on times being in HH:MM:SS format, with 2 digits for each element, meaning leading zeroes had to be there. So first I would introduce the leading zeroes I need by adding 1000000 to the number and then taking the RIGHT-most 6 characters (T-SQL will implicitly convert the integer to a VARCHAR before doing the RIGHT function):
from msdb.dbo.sysjobhistory
/* 2011-03-02 00:00:00.000 */
select top 1 right(1000000+run_time,6)Then it’s a matter of introducing the colons to separate the elements:
from msdb.dbo.sysjobhistory
/* 074856 */
select top 1 stuff(stuff(right(1000000+run_time,6),3,0,':'),6,0,':')Now, when I combine that with my VARCHAR representation of the run_date and put them together in YYYYMMDD HH:MM:SS format, I can CONVERT it into a complete date and time of the item:
from msdb.dbo.sysjobhistory
/* 07:48:56 */
select top 1 convert(datetime,convert(varchar(8),run_date)The run_duration requires a little more manipulation to translate the HHMMSS integer representation into just a number of seconds (so the value of 106, representing 00:01:06, is translated into 66 seconds):
+' '
+stuff(stuff(right(1000000+run_time,6),3,0,':'),6,0,':'))
from msdb.dbo.sysjobhistory
/* 2011-03-02 07:48:56.000 */
select top 1 run_duration/10000*3600 --Hours*3600 = SecondsSo now that I have those formulas down, I can put together a query that would give me more meaningful information. This query will give me information for successfully-completed jobs… Entries with a Step_ID of 0 indicate a date/time/duration of the job as a whole:
+run_duration%10000/100*60 --Minutes*60 = Seconds
+run_duration%100 --Seconds
from msdb.dbo.sysjobhistory
/* 66 */
select job_idNow that I have that information, I can put that query into a CTE and make good use of it.
,date_executed=convert(datetime,convert(varchar(8),run_date))
+' '
+stuff(stuff(right(1000000+run_time
,6)
,3,0,':')
,6,0,':')
,secs_duration=run_duration/10000*3600
+run_duration%10000/100*60
+run_duration%100
from msdb.dbo.sysjobhistory
where step_id=0 --Job Outcome
and run_status=1 --Succeeded
/*
job_id date_executed secs_duration
------------------------------------ ----------------------- -------------
ECA51518-7144-49DE-9153-EB12D42AE0D9 2011-03-02 07:59:17.000 66
ECA51518-7144-49DE-9153-EB12D42AE0D9 2011-03-02 08:23:49.000 74
ECA51518-7144-49DE-9153-EB12D42AE0D9 2011-03-02 13:07:24.000 61
etc, etc, etc
*/
In order to find the outliers, I wanted the ability to define a certain “history window” or “baseline window” of a range of dates of job runs from which I could calculate the Average and the Standard Deviation. So the query would accept a @HistoryStartDate and @HistoryEndDate date range (I would convert the dates to full days starting/ending at midnight). Also, for a halfway meaningful sample, I wanted an option to calculate the statistics for jobs that had run a minimum number of times (represented by @MinHistExecutions). And, since we may only want to look at outliers of jobs that are kind of “beefy”, I wanted to introduce the option to only pay attention to jobs that had a minimum average duration of some kind (@MinAvgSecsDuration).
So this query would give us our history/baseline calculations:
declare @HistoryStartDate datetime = '19000101'And now that we have that, we can look for jobs in an “analysis window” date range (@AnalysisStartDate and @AnalysisEndDate)… perhaps the last 24 or 48 hours… to see which ones have a duration that exceeds the AvgPlus2StDev of the baseline.
,@HistoryEndDate datetime = getdate()
,@MinHistExecutions int = 10
,@MinAvgSecsDuration int = 30
;
with JobHistData as
(
select job_id
,date_executed=convert(datetime,convert(varchar(8),run_date))
+' '
+stuff(stuff(right(1000000+run_time
,6)
,3,0,':')
,6,0,':')
,secs_duration=run_duration/10000*3600
+run_duration%10000/100*60
+run_duration%100
from msdb.dbo.sysjobhistory
where step_id=0 --Job Outcome
and run_status=1 --Succeeded
)
select job_id
,AvgDuration=avg(secs_duration*1.)
,AvgPlus2StDev=avg(secs_duration*1.)+2*stdevp(secs_duration)
from JobHistData
where date_executed>=dateadd(day,datediff(day,'19000101',@HistoryStartDate),'19000101')
and date_executed<dateadd(day,1+datediff(day,'19000101',@HistoryEndDate),'19000101')
group by job_id
having count(*)>=@MinHistExecutions
and avg(secs_duration*1.)>=@MinAvgSecsDuration
/*
job_id AvgDuration AvgPlus2StDev
------------------------------------ ----------- ----------------
ECA51518-7144-49DE-9153-EB12D42AE0D9 69.373333 108.739927181991
0BC1CB96-602F-417A-AB42-CAF98E1E39A1 47.266666 82.860423255389
etc, etc, etc
*/
Here is a stored procedure to do just that.
create procedure usp_SQLAgentJobOutliersIf you don’t specify any parameters, it will use the ENTIRE sysjobhistory table for both the “history/baseline window” and the “analysis window”. Try calling the procedure with various parameters and see what you come up with. See if any jobs of yours have run with an atypically long duration.
@HistoryStartDate datetime = null
,@HistoryEndDate datetime = null
,@AnalysisStartDate datetime = null
,@AnalysisEndDate datetime = null
,@MinHistExecutions int = 10
,@MinAvgSecsDuration int = 30
as
if @HistoryStartDate is null set @HistoryStartDate='19000101' ;
if @HistoryEndDate is null set @HistoryEndDate=getdate() ;
if @AnalysisStartDate is null set @AnalysisStartDate='19000101' ;
if @AnalysisEndDate is null set @AnalysisEndDate=getdate() ;
with JobHistData as
(
select job_id
,date_executed=convert(datetime,convert(varchar(8),run_date))
+' '
+stuff(stuff(right(1000000+run_time
,6)
,3,0,':')
,6,0,':')
,secs_duration=run_duration/10000*3600
+run_duration%10000/100*60
+run_duration%100
from msdb.dbo.sysjobhistory
where step_id=0 --Job Outcome
and run_status=1 --Succeeded
)
,JobHistStats as
(
select job_id
,AvgDuration=avg(secs_duration*1.)
,AvgPlus2StDev=avg(secs_duration*1.)+2*stdevp(secs_duration)
from JobHistData
where date_executed>=dateadd(day,datediff(day,'19000101',@HistoryStartDate),'19000101')
and date_executed<dateadd(day,1+datediff(day,'19000101',@HistoryEndDate),'19000101')
group by job_id
having count(*)>=@MinHistExecutions
and avg(secs_duration*1.)>=@MinAvgSecsDuration
)
select jd.job_id
,JobName=j.name
,ExecutionDate=jd.date_executed
,[Duration (secs)]=jd.secs_duration
,[Historical Avg Duration (secs)]=AvgDuration
,[Min Threshhold (secs)]=AvgPlus2StDev
from JobHistData jd
join JobHistStats jhs on jd.job_id=jhs.job_id
join msdb.dbo.sysjobs j on jd.job_id=j.job_id
where date_executed>=dateadd(day,datediff(day,'19000101',@AnalysisStartDate),'19000101')
and date_executed<dateadd(day,1+datediff(day,'19000101',@AnalysisEndDate),'19000101')
and secs_duration>AvgPlus2StDev
If you want to get more granular and analyze individual Job Steps as opposed to Jobs as a whole, the following procedure will do that for you.
create procedure usp_SQLAgentJobStepOutliersHowever, a word of warning: I was disappointed to find that the sysjobhistory table stores only the Step_ID and not the Step_UID unique identifier. So a Job’s Step_ID #1 today may not have been its Step_ID #1 a month ago. If you move steps up/down in the step list of the SQL Agent Job dialog, it changes their Step_ID value from that point forward, but it doesn’t change anything in history. So you may be comparing apples and oranges.
@HistoryStartDate datetime = null
,@HistoryEndDate datetime = null
,@AnalysisStartDate datetime = null
,@AnalysisEndDate datetime = null
,@MinHistExecutions int = 10
,@MinAvgSecsDuration int = 30
as
if @HistoryStartDate is null set @HistoryStartDate='19000101' ;
if @HistoryEndDate is null set @HistoryEndDate=getdate() ;
if @AnalysisStartDate is null set @AnalysisStartDate='19000101' ;
if @AnalysisEndDate is null set @AnalysisEndDate=getdate() ;
with JobHistData as
(
select job_id
,step_id
,date_executed=convert(datetime,convert(varchar(8),run_date))
+' '
+stuff(stuff(right(1000000+run_time
,6)
,3,0,':')
,6,0,':')
,secs_duration=run_duration/10000*3600
+run_duration%10000/100*60
+run_duration%100
from msdb.dbo.sysjobhistory
where step_id<>0
and run_status=1 --Succeeded
)
,JobHistStats as
(
select job_id
,step_id
,AvgDuration=avg(secs_duration*1.)
,AvgPlus2StDev=avg(secs_duration*1.)+2*stdevp(secs_duration)
from JobHistData
where date_executed>=dateadd(day,datediff(day,'19000101',@HistoryStartDate),'19000101')
and date_executed<dateadd(day,1+datediff(day,'19000101',@HistoryEndDate),'19000101')
group by job_id
,step_id
having count(*)>=@MinHistExecutions
and avg(secs_duration*1.)>=@MinAvgSecsDuration
)
select jd.job_id
,JobName=j.name
,jd.step_id
,s.step_name
,ExecutionDate=jd.date_executed
,[Duration (secs)]=jd.secs_duration
,[Historical Avg Duration (secs)]=AvgDuration
,[Min Threshhold (secs)]=AvgPlus2StDev
from JobHistData jd
join JobHistStats jhs on jd.job_id=jhs.job_id and jd.step_id=jhs.step_id
join msdb.dbo.sysjobs j on jd.job_id=j.job_id
join msdb.dbo.sysjobsteps s on jd.job_id=s.job_id and jd.step_id=s.step_id
where date_executed>=dateadd(day,datediff(day,'19000101',@AnalysisStartDate),'19000101')
and date_executed<dateadd(day,1+datediff(day,'19000101',@AnalysisEndDate),'19000101')
and secs_duration>AvgPlus2StDev
Anyway, I hope you find these procedures useful for finding those outliers. Thanks again to Thomas LaRock for the idea.
My next blog post will also be dealing with SQL Agent Jobs… I think you’ll really like that one… Stay tuned!
Very nice, Brad! This is really useful stuff build on top of @SQLRockStar's original work.
ReplyDeleteKudos,
-Kev
Great post. As always, I'm impressed by what you put into these posts :-)
ReplyDeleteI have to admit, though, I'm too lazy for some of the roll your own tools like this. I let SQL Sentry do the math and tell me when a job runs over or under a % of the average run time now. :-)
Looking forward to the next one. Thanks for the comments on my post last week, sir.
Keep up the AwesomeSauce
@Kevin and @Mike:
ReplyDeleteThanks so much for the great feedback!
--Brad
Great stuff, thank you and Tom for putting some of this together so I can stop rolling my own solutions ;) One little tidbit I found while cursing the whole integer values for date and time is MS ships a function that reconstitutes the date and time into a datetime object. msdb.dbo.agent_datetime(@date, @time)
ReplyDeleteselect job_id
, date_executed = msdb.dbo.agent_datetime(run_date, run_time), run_duration
,secs_duration=run_duration/10000*3600
+run_duration%10000/100*60
+run_duration%100
from msdb.dbo.sysjobhistory
where step_id=0 --Job Outcome
and run_status=1 --Succeeded
Posted a blog about it but all credit goes to Lori & Tim Edwards
@Bill:
ReplyDeleteAhhh... It makes sense that there would be a built-in function for that... Thanks for the info!
--Brad
Hi Brad, I came across your post while looking into usage of the sysjobhistory table. I made a post on my (newly created) blog and thought you may be interested.
ReplyDelete- Ben
@BringMeData:
ReplyDeleteWelcome to the blogging world... I'm looking forward to reading your posts.
--Brad
Brad, I found a missing join condition between JobHistData and JobHistStats for the JobSteps script:
ReplyDeleteFROM JobHistData jd
JOIN JobHistStats jhs ON jd.job_id=jhs.job_id AND jd.step_id=jhs.step_id
Thanks for publishing this, I really like the use of the CTE!
Oh, and one more bug:
ReplyDeleteIF @AnalysisStartDate IS NULL SET @AnalysisStartDate='19000101' ;
IF @AnalysisEndDate IS NULL SET @AnalysisEndDate=GETDATE() ;
Thanks for finding the missing JOIN condition, Chuck. I corrected the post.
ReplyDeleteAs I was changing thigs, I also happened to notice the @AnalysisStartDate and @AnalysisEndDate initialization problem, and when I went back to enter this comment, I see that you caught it too!
Thanks for catching those, and thanks for the feedback!
According to Stanford Medical, It's really the SINGLE reason this country's women get to live 10 years longer and weigh 19 kilos less than we do.
ReplyDelete(And by the way, it has totally NOTHING to do with genetics or some secret-exercise and EVERYTHING to do with "HOW" they are eating.)
BTW, I said "HOW", and not "WHAT"...
TAP on this link to reveal if this brief test can help you decipher your true weight loss possibility
Thanks for the great content Sir, I wil also share with my friends & onece again thanks a lot.
ReplyDeleteSeo optimization (Seo Ready) Best Responsive Blogger Template FlatMag
how to increase blog traffic fast
remove realme c3 frp bypass 2021
How To optimize Twitter Meta Tags in Blogger Update -2021
trabzon
ReplyDeleteedirne
van
bingöl
yalova
C4NNK
I wan’t going to comment as this posts a bit old now, but just wanted to say thanks.
ReplyDeleteGreetings! Very helpful advice within this post!
ReplyDeleteYour article has answered the question I was wondering about!
ReplyDeleteThis website is very useful and beautiful. thank you.
ReplyDeletewoooow, amazing blog to read, thanks for sharing, and move on. finally
ReplyDeleteThe information related to biology assignment help is available here.
ReplyDeleteWell I truly enjoyed reading it. This subject offered by you is very helpful and accurate.
ReplyDelete