BookmarkSubscribeRSS Feed
calger
Fluorite | Level 6

Hello SAS Community,

 

I have a fairly substantial program (in terms of length, size of data) that I've scheduled to run daily at 8am. My company uses SAS Enterprise Guide. A few months ago, the run time started to be extremely variable, even without any changes to the code. My company added additional cores about a month ago, but the variation continues. What could be the culprit? It seems like a system limitation, but how do I diagnose/determine that? Is there anything I can do from a code perspective to reduce this variability? We have a couple of "problem queries" aka ones that take long to run, either due to complexity or size. These queries used to run regularly in 2-3 minutes. So, not fast, but fast enough for our purposes. Now they can take anywhere from 3 minutes to 30 minutes to run. I am an analyst with limited IT background - wondering what options I have to explore? Thanks in advance for any guidance!

6 REPLIES 6
SASKiwi
PROC Star

First you should review your SAS log. Compare a normal run time log with a slow one. What steps are taking longer than they used to? If all steps are running slower, then a likely culprit could be the load from other jobs on the SAS server. Talk to your SAS administrator regarding SAS server performance when your job is running. They may be able to provide insights that explain the variation in run times.

calger
Fluorite | Level 6
Thanks. Everything runs slower, but most steps take seconds or fractions of seconds to run, so even though they run more slowly, they still run quickly. The longest queries obviously take much, much longer to run. Would breaking them down into simpler queries solve the issue?
Kurt_Bremser
Super User

Complicated, multi-table joins in SQL can develop into massive time-wasters. If these joins are in fact lookups into tables with unique keys, consider doing the lookups in a DATA step with hash objects.

For more detailed help, we would need to see the log of one such long-running step, including information about the involved tables (key relationships, variables).

calger
Fluorite | Level 6

I am attaching the problem query below & some of the log output. I've changed some of the variable and table names. This is the worst runtime I've ever seen for it, but then the next day it ran in ~7 minutes. 

A unique identifier for call_data & problem_table would be employee_date_key + call_skill, but the employee data won't have call_skill, it only has employee_date_key (this is unique for that table, it will only have 1 row per employee_date_key).

 

PROC SQL;
CREATE TABLE gridwork.problem_table as
SELECT DISTINCT
t1.employee_date_key
, t1.EMP_NNUMBER
, t1.department
, t1.job_name
, t1.targets_key
, t1.employee_tenure
, t2.call_date
, intnx('month',t2.call_date,0,"BEGINNING") as begofmonth
, t2.call_skill
, t2.Skill_JoinFmtCSO
 , t2.TRANSFER_LOCATION
, . as RESOLVED_NUM
, sum(t2.ACD_CALL_COUNT)
, sum(case when exc.exclude_AHT_YN = 'Y' then . else t2.REP_TALK_SEC end) as TALK_TIME_SEC
, sum(case when exc.exclude_AHT_YN = 'Y' then .
 when t2.ACW_SEC > 3599 then 3600 + t2.REP_TALK_SEC
else t2.REP_TALK_SEC + t2.ACW_SEC end) as KPI_NUM
, sum(case when exc.exclude_aht_yn = 'Y'
 or t2.ACD_CALL_COUNT = 0 then .
 else t2.ACD_CALL_COUNT end) as KPI_DENOM
, sum(case when exc.exclude_acw_yn = 'Y' then .
when t2.ACW_SEC > 3599 then 3600
 else t2.ACW_SEC end) as KPI2_NUM
, sum(case when t2.ACD_CALL_COUNT = 0
 or exc.exclude_acw_yn = 'Y' then .
else t2.ACD_CALL_COUNT end) as KPI2_DENOM
, sum(t2.TRANSFER_CALL_COUNT1) as TRANSFER_NUM
, sum(t2.ACD_CALL_COUNT) as TRANSFER_DENOM
, sum(t2.LEAD_COUNT)
, sum(t2.ACCEPTABLE_CALL_COUNT)
, count(t2.ACCEPTABLE_CALL_COUNT)
, sum(t2.ANSWER_TIME_SEC) as KPI3_NUM
, sum(t2.LEADS_ELIG_COUNT)
, t1.Blended_tenure_bucket
, t1.Blended_StartDate
from CUBE.employee_data t1
 join GRIDWORK.call_data t2
on (t1.EMP_NNUMBER = t2.EMP_PIN
and t1.CLNDR_DT = t2.call_date
 and t1.l1_status = 'Active')
left join dat.bl_exceptions exc
 on (t1.CLNDR_DT between exc.FROM_DATE and exc.TO_DATE
and t1.department = exc.TEAM
and (exc.exclude_aht_yn = 'Y' or exc.exclude_acw_yn = 'Y'))
 where t1.CLNDR_DT >= /*&begofCurrentYear.*/ mdy(1,1,&CurrentYear)
2023-06-29T08:06:07,943 - SYMBOLGEN: Macro variable CURRENTYEAR resolves to 2023
 and t1.CHANNEL = 'Business Lines'
group by 1,2,3,4,5,6,7,8,9,10,11
;
2023-06-29T08:06:08,364 - Libref=CUBE Engine=V9 Member=employee_data MemberType=DATA Openmode=INPUT Path={file path}
2023-06-29T08:06:08,372 - Libref=GRIDWORK Engine=V9 Member=call_data MemberType=DATA Openmode=INPUT Path={file path}
2023-06-29T08:06:08,676 - Libref=DAT Engine=V9 Member=BL_EXCEPTIONS MemberType=DATA Openmode=INPUT Path={path}
2023-06-29T08:06:08,678 - NOTE: The query requires remerging summary statistics back with the original data.
2023-06-29T08:06:08,680 - Libref=GRIDWORK Engine=V9 Member=problem_table MemberType=DATA Openmode=OUTPUT Path={path}
2023-06-29T08:06:08,688 - NOTE: SAS threaded sort was used.
2023-06-29T08:48:32,032 - NOTE: Compressing data set GRIDWORK.problem_table decreased size by 96.89 percent.
2023-06-29T08:48:32,090 - Compressed is 134 pages; un-compressed would require 4305 pages.
2023-06-29T08:48:32,122 - NOTE: Table GRIDWORK.problem_table created, with 43042 rows and 27 columns.
2023-06-29T08:48:32,122 -
2023-06-29T08:48:35,166 - 1593 +QUIT;
2023-06-29T08:48:35,196 - NOTE: PROCEDURE SQL used (Total process time):
2023-06-29T08:48:35,196 - real time 42:27.25
2023-06-29T08:48:35,196 - user cpu time 56.07 seconds
2023-06-29T08:48:35,196 - system cpu time 2:33.86
2023-06-29T08:48:35,196 - memory 6533820.03k
2023-06-29T08:48:35,196 - OS Memory 6555652.00k
2023-06-29T08:48:35,196 - Timestamp 06/29/2023 08:48:35 AM
2023-06-29T08:48:35,196 - Step Count 20 Switch Count 0
2023-06-29T08:48:35,196 - Page Faults 0
2023-06-29T08:48:35,196 - Page Reclaims 3245409
2023-06-29T08:48:35,196 - Page Swaps 0
2023-06-29T08:48:35,196 - Voluntary Context Switches 107048
2023-06-29T08:48:35,196 - Involuntary Context Switches 37457
2023-06-29T08:48:35,196 - Block Input Operations 0
2023-06-29T08:48:35,196 - Block Output Operations 0

SASKiwi
PROC Star

If everything is running slower then that suggests that there is increased load on the the SAS Compute server which is impacting your job. I do have to ask the question - does it really matter that it takes 30 minutes on some days versus a few minutes on others? You could spend a lot of time trying to tune your slow steps, but at the end of the day if SAS is heavily loaded your job is still going to run slow. In my experience efficiently-written code tends to be less impacted by heavy server loading.

calger
Fluorite | Level 6
Thank you very much! We have forwarded our issue to an IT team.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1071 views
  • 5 likes
  • 3 in conversation