shoin
Lapis Lazuli | Level 10

I have a question about the system CPU statistic reported by SAS FULLSTIMER. I am investigating a particular SAS step where user CPU time is 1 hour, system CPU time is 1 hour 45 minutes, and the difference between total CPU time and real time is more than 15%. No other step has this issue. The same code on a Linux server takes less time: user CPU is 35 minutes, system CPU is 47 minutes, and the difference between CPU time and real time is only 2%.

 

I have researched extensively why the system CPU time would be so much higher and longer on this server. I am inclined to look at IO, the network (NIC and firmware), and excessive auditing, and to pursue ETW tracing. I am also asking here in case any of the good folks have thoughts or ideas on what to investigate further.

 

Thank you in advance.

 

 

 

1 ACCEPTED SOLUTION

MargaretC
SAS Employee
When you are comparing real time (wall clock time) to CPU time you have to compare it to the sum of system CPU time (amount of time used by the operating system on behalf of SAS) and user CPU time (amount of time SAS used). You compared the combined CPU times to the real time.
System CPU time is the time used by the operating system on behalf of SAS. For the most part, this time is spent on reads and writes of data, so you are correct that it is associated with IO.
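As a rough illustration of the comparison described above, here is a minimal Python sketch that adds user and system CPU time from one step's FULLSTIMER notes and compares the sum with real time. The line layout it expects ("real time", "user cpu time", "system cpu time", followed by either a seconds value or an h:mm:ss.ss value) is an assumption about how the log looks, and `step_summary` and `to_seconds` are hypothetical helper names. The user and system figures in the sample come from the question; the real time value is made up for the example.

```python
import re

# Assumed line shapes: "real time   1:02:03.45" or "real time   0.06 seconds".
_TIMING = re.compile(
    r"^\s*(real time|user cpu time|system cpu time)\s+([\d:.]+)", re.IGNORECASE
)


def to_seconds(text):
    """Convert 'ss.ss', 'm:ss.ss' or 'h:mm:ss.ss' to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def step_summary(log_text):
    """Summarise the timing notes of a single step."""
    times = {}
    for line in log_text.splitlines():
        match = _TIMING.match(line)
        if match:
            times[match.group(1).lower()] = to_seconds(match.group(2))
    real = times["real time"]
    user = times["user cpu time"]
    system = times["system cpu time"]
    cpu = user + system
    return {
        "real": real,
        "cpu": cpu,
        # Ratio of combined CPU time to wall clock time.
        "cpu_to_real": cpu / real if real else float("nan"),
        # Fraction of CPU time spent in the operating system.
        "system_share": system / cpu if cpu else float("nan"),
    }


# Hypothetical excerpt: the real time value is invented for illustration.
sample = """
NOTE: DATA statement used (Total process time):
      real time           3:10:00.00
      user cpu time       1:00:00.00
      system cpu time     1:45:00.00
"""

if __name__ == "__main__":
    for name, value in step_summary(sample).items():
        print(f"{name}: {value:.3f}")
```

A system share well above one half, as in this sample, points at time spent in the operating system (typically IO) rather than in SAS itself.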
My team is happy to help you review your SAS job in question. Please just send the SAS log to my attention.
Margaret
Margaret.Crevar@sas.com


5 REPLIES
SASKiwi
PROC Star

I suggest you post the SAS log of the step (including FULLSTIMER notes) on the community as well to get feedback from community experts.

boemskats
Lapis Lazuli | Level 10

Hi,

 

At Boemska we offer a product called Enterprise Session Monitor for SAS. It's a piece of software that plugs into your SAS Environment and profiles the resource utilisation of individual SAS jobs, producing timeseries data which our customers use to optimise job performance, often focusing on single problem steps like the one you describe.

 

ESM records and visualises the CPU/memory utilisation, temp directory size and IO throughput of each individual job, allowing you to contrast it with the resource profile of the node it's executing on, showing metrics like iowait, per-device throughput (for both storage and network devices), disk queue lengths and cache/swap size. The data is very granular (2s intervals) and the interactive investigative workflow makes root cause analysis a relatively pleasant experience.

 

We're a SAS partner organisation & this is a separate proprietary product, but we offer a free 60 day trial, meaning you could take it for a spin for a couple of months with a view to resolving your immediate issue, no strings attached. Feel free to contact me privately if you're interested.

 

Nik

jklaverstijn
Rhodochrosite | Level 12

There is a third CPU timing metric, wait-for-IO. You should include that in your evaluation as well. Depending on your topology (any flavour of NFS or CIFS storage will be detrimental) this may give a more accurate interpretation of what you are observing. Actually it is not clear if what you are seeing is in fact a problem.

 

The elapsed (wall clock) time is dependent on much more than just your code. If you are running on a highly loaded system the ratio will be higher. The same job running at a different time of day may show vastly different results. So look at the system activity next to your job. As you already suspected, IO and network can be at play. Run vmstat, nmon or whatever is at your disposal alongside your job to see what's going on. The tooling from @boemskats can do this even better. Your metrics depend heavily not only on your job but on others as well.
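If no monitoring tool is at hand on a Linux host, the same kind of picture can be approximated by hand. The sketch below is only an illustration: it takes two snapshots of the aggregate `cpu` line from `/proc/stat` and reports what share of the interval went to user, system, iowait and so on. The function names are hypothetical, the figures are system-wide rather than per job, and this does not apply to a Windows server, where other tooling (such as the ETW tracing mentioned in the question) would be needed.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat (Linux only).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn 'cpu  u n s i w ...' into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Percentage of elapsed ticks per category between two snapshots."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: second[name] - first[name] for name in first if name in second}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("snapshots are identical or out of order")
    return {name: 100.0 * ticks / total for name, ticks in delta.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate cpu line."""
    with open(path) as handle:
        return handle.readline()


if __name__ == "__main__":
    start = read_cpu_line()
    time.sleep(2)  # sample interval in seconds, chosen arbitrarily
    for name, pct in cpu_shares(start, read_cpu_line()).items():
        print(f"{name:8s} {pct:5.1f}%")
```

Running this in a loop while the slow step executes, and again while the system is idle, gives a crude view of whether system and iowait time rise with the job or are already high because of other workloads.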

 

If you have a challenge (which, again, is not entirely clear from what info you provided) I would work with @MargaretC and her team. They are excellent. If they haven't seen it before it probably doesn't exist.

 

Regards,

-- Jan.

shoin
Lapis Lazuli | Level 10
Thank you for the suggestion. We are comparing the storage setups and also setting up the SAS IO test. Once I have the results, I will post them here.
