Architecting, installing and maintaining your SAS environment

System Performance – user CPU time vs system CPU time and real time

Contributor
Posts: 33

My question is about the system CPU figure in the SAS FULLSTIMER statistics. I am investigating one particular SAS step where user CPU time is 1 hour, system CPU time is 1 hour 45 minutes, and the difference between total CPU time and real time is greater than 15%. No other step has this issue. The same code on a Linux server takes less time: user CPU time is 35 minutes, system CPU time is 47 minutes, and the difference between CPU time and real time is only 2%.

 

I have researched extensively why the system CPU time would be higher than the user CPU time and so much longer than on Linux. I am inclined to look at I/O, the network (NIC and firmware), and excessive auditing, and to pursue ETW tracing. I am also asking whether any of the good folks here have thoughts or ideas on what to investigate further.

 

Thank you in advance.

 

 

 


Accepted Solutions
Solution
‎01-19-2018 04:47 PM
SAS Employee
Posts: 58

Re: System Performance – The conversation you really want to have

When you compare real time (wall-clock time) to CPU time, you have to compare it to the sum of system CPU time (the time used by the operating system on behalf of SAS) and user CPU time (the time SAS itself used). That is what you did: you compared the combined CPU times to the real time.
For the most part, system CPU time is associated with reads and writes of data. So you are correct that it is associated with I/O.
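As a rough illustration of that comparison, here is a minimal Python sketch that adds user and system CPU time and reports how much of the total is system time, using the figures quoted in the question. The function name `cpu_breakdown` and its parameters are made up for this example. The real time of the step was not posted, so it is left as an optional argument.

```python
def cpu_breakdown(user_min, system_min, real_min=None):
    """Summarise FULLSTIMER-style timings given in minutes.

    user_min   -- user CPU time (time SAS itself used)
    system_min -- system CPU time (time the OS used on behalf of SAS)
    real_min   -- real (wall-clock) time, if known
    """
    total_cpu = user_min + system_min
    result = {
        "total_cpu_min": total_cpu,
        # Share of all CPU time that was spent in the operating system.
        "system_share": system_min / total_cpu,
    }
    if real_min is not None:
        # Total CPU time relative to wall-clock time for the step.
        result["cpu_to_real_ratio"] = total_cpu / real_min
    return result


# Figures from the question: 1 h user + 1 h 45 min system on the slow
# server, 35 min user + 47 min system on the Linux server.
slow_server = cpu_breakdown(user_min=60, system_min=105)
linux_server = cpu_breakdown(user_min=35, system_min=47)

for name, stats in (("slow server", slow_server), ("Linux server", linux_server)):
    print(f"{name}: total CPU {stats['total_cpu_min']} min, "
          f"system share {stats['system_share']:.0%}")
```

On both servers more than half of the CPU time is system time, which fits the suspicion that the step is dominated by I/O. Passing the real time from the log as `real_min` would add the CPU-to-real-time ratio to the result.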
My team is happy to help you review your SAS job in question. Please just send the SAS log to my attention.
Margaret
Margaret.Crevar@sas.com



All Replies
Super User
Posts: 3,866

Re: System Performance – The conversation you really want to have

I suggest you post the SAS log of the step (including FULLSTIMER notes) on the community as well to get feedback from community experts.

Frequent Contributor
Posts: 133

Re: System Performance – The conversation you really want to have

Hi,

 

At Boemska we offer a product called Enterprise Session Monitor for SAS. It's a piece of software that plugs into your SAS Environment and profiles the resource utilisation of individual SAS jobs, producing timeseries data which our customers use to optimise job performance, often focusing on single problem steps like the one you describe.

 

ESM records and visualises the CPU/memory utilisation, temp directory size and IO throughput of each individual job, allowing you to contrast it with the resource profile of the node it's executing on, showing metrics like iowait, per-device throughput (for both storage and network devices), disk queue lengths and cache/swap size. The data is very granular (2s intervals) and the interactive investigative workflow makes root cause analysis a relatively pleasant experience.

 

We're a SAS partner organisation and this is a separate proprietary product, but we offer a free 60-day trial, meaning you could take it for a spin for a couple of months with a view to resolving your immediate issue, no strings attached. Feel free to contact me privately if you're interested.

 

Nik

Valued Guide
Posts: 531

Re: System Performance – The conversation you really want to have

There is a third CPU timing metric, wait-for-IO. You should include that in your evaluation as well. Depending on your topology (any flavour of NFS or CIFS storage will be detrimental) this may give a more accurate interpretation of what you are observing. Actually it is not clear if what you are seeing is in fact a problem.

 

The elapsed (wall-clock) time depends on much more than just your code. If you are running on a highly loaded system, the ratio of real time to CPU time will be higher. The same job running at a different time of day may show vastly different results. So look at the system activity alongside your job. As you already suspected, I/O and network can be at play. Run vmstat, nmon, or whatever tool is at your disposal alongside your job to see what is going on. The tooling from @boemskats can do this even better. Your metrics depend heavily not only on your job but on others as well.
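To make the idea of watching system activity next to the job concrete, here is a minimal Python sketch, assuming a Linux host. It takes two samples of the aggregate `cpu` line from `/proc/stat` and works out what share of the interval went to user, system, idle and iowait time. The sample values and the function names `parse_cpu_line` and `cpu_shares` are hypothetical and only illustrate the arithmetic that tools like vmstat do for you.

```python
# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Fraction of the interval spent in each state between two samples."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: second[name] - first[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: delta[name] / total for name in FIELDS}


# Hypothetical samples, taken before and after an interval of the job.
# On a real host each would be the first line of /proc/stat.
before = "cpu  1000 0 2000 6000 1000 0 0 0 0 0"
after = "cpu  1300 0 2700 6500 1500 0 0 0 0 0"

shares = cpu_shares(before, after)
for name in ("user", "system", "idle", "iowait"):
    print(f"{name}: {shares[name]:.0%}")
```

These counters cover the whole machine, not just the SAS session, which is why they are useful for seeing what other workloads are doing while the job runs.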

 

If you have a challenge (which, again, is not entirely clear from what info you provided) I would work with @MargaretC and her team. They are excellent. If they haven't seen it before it probably doesn't exist.

 

Regards,

-- Jan.

Contributor
Posts: 33

Re: System Performance – The conversation you really want to have

Posted in reply to jklaverstijn
Thank you for the suggestion. We are comparing the storage setups and are also setting up the SAS I/O test. Once I have the results, I will post them here.
☑ This topic is solved.
