MariaD
Barite | Level 11

Hi folks,

 

A user has a process, a union of SAS tables, that performs differently depending on the time of day it runs. Here is the extended log from the slower run:

 

NOTE: PROCEDURE SQL used (Total process time):
real time           35:54.97
user cpu time       4:18.97
system cpu time     6:30.08
memory              1072251.18k
OS Memory           1095100.00k
Timestamp           05/17/2021 12:22:17 PM
Step Count                        41  Switch Count  469
Page Faults                       940
Page Reclaims                     33418374
Page Swaps                        0
Voluntary Context Switches        5126600
Involuntary Context Switches      28234
Block Input Operations            434723184
Block Output Operations           473226072

Same process, executed at a different time of the day:

NOTE: PROCEDURE SQL used (Total process time):
real time           6:11.66
user cpu time       3:43.92
system cpu time     4:07.77
memory              1086770.84k
OS Memory           1110844.00k
Timestamp           05/14/2021 10:14:16 PM
Step Count                        125  Switch Count  131
Page Faults                       482
Page Reclaims                     9405378
Page Swaps                        0
Voluntary Context Switches        4417528
Involuntary Context Switches      1862
Block Input Operations            17269416
Block Output Operations           473248224

Our server runs SAS 9.4 on Linux. Could you please give me some advice on how to interpret the details of the log?

Tom
Super User

Are those queries actually the same?

The first one takes longer, but it also does far more input I/O (about 435 million block input operations versus about 17 million), which is probably why.
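
To put numbers on that comparison, here is a minimal Python sketch (not part of the original posts) that computes the slow-to-fast ratio for the block I/O counters. The values are copied from the two logs in the question; the labels `slow_run` and `fast_run` and the helper `ratio` are illustrative names only.

```python
# Counters copied from the two PROC SQL logs in the original post.
# "slow_run" and "fast_run" are illustrative labels, not SAS terms.
runs = {
    "slow_run": {"block_input": 434_723_184, "block_output": 473_226_072},
    "fast_run": {"block_input": 17_269_416, "block_output": 473_248_224},
}


def ratio(counter: str) -> float:
    """Return slow_run / fast_run for the given counter."""
    return runs["slow_run"][counter] / runs["fast_run"][counter]


input_ratio = ratio("block_input")
output_ratio = ratio("block_output")

print(f"Block input ratio (slow/fast):  {input_ratio:.1f}x")
print(f"Block output ratio (slow/fast): {output_ratio:.3f}x")
```

The output counters are nearly identical between the two runs, while the input counter is roughly 25 times higher in the slow run, so the difference is on the read side.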

MariaD
Barite | Level 11

Hi @Tom,

 

They are exactly the same process, with almost the same number of rows (between 28 and 29 million) in the final SAS table.

 

The user combines five tables, all SAS tables. Four of them are in the WORK area and one is in the user area. The result is saved to WORK as well. The area user is almost full.

 

Regards, 

Kurt_Bremser
Super User
If that is all one SQL query, you may gain a lot by dissecting it into several SORT/DATA steps.
Especially if some of the joins are lookups that can be done with hash objects.
MariaD
Barite | Level 11

Hi,

 

I agree with you but, considering the process is exactly the same, I'm concerned about the difference in performance.

 

Regards,

MariaD
Barite | Level 11

Thanks! I'll try to verify it. 

nathan_dye
Calcite | Level 5

MariaD,

 

You said, "The area user is almost full," and I'll assume you mean what I've heard called SASUSER (not SASWORK, not UTILLOC).

Most filesystems suffer degraded performance as they fill. Some slow down at 60% full; others are still happy above 95% full.

 

Since I saw no one else mention it . . . .

 

(The other replies are correct to call out "shared" storage, network, CPU, RAM, and so on.  "Noisy neighbors" and so on.)

 

G'luck!

MariaD
Barite | Level 11

Thanks @nathan_dye. That's correct: the almost-full area is the user area (not SASUSER, but another custom libname). As of today, that area is at more than 99% usage. The table in that area is only read during the process, not written.
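
For anyone who wants to track how full that area is over time, a minimal Python sketch using the standard library's `shutil.disk_usage` is shown here. The path `/` is only a placeholder (the real directory behind the custom libname is not given in this thread), and `percent_used` is a hypothetical helper name.

```python
import shutil


def percent_used(path: str) -> float:
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


# "/" is a placeholder; substitute the directory behind the custom libname.
library_path = "/"
print(f"{library_path}: {percent_used(library_path):.1f}% used")
```

The figure can differ slightly from what `df` reports, because `df` accounts for blocks reserved for root differently.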

Kurt_Bremser
Super User
At high usage rates, filesystems experience a lot of fragmentation, which slows down all accesses, especially when you still use spinning metal instead of SSDs.
SASKiwi
PROC Star

The job performance statistics indicate that it performs adequately at off-peak times - user cpu time is less than double elapsed time. However, at peak times your system appears to be severely I/O constrained and performs very poorly. Reviewing your Environment Manager performance dashboards may provide further insights. It would be worth reviewing % CPU usage and % memory usage as well as I/O at different time periods.

Kurt_Bremser
Super User
The more-or-less same CPU times combined with the very different real times lead me to suspect that your I/O subsystem can't keep up with the load of concurrent requests at certain times.
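
The CPU-versus-real-time comparison can be made explicit with a small Python sketch (an illustration, not something posted in the thread). The timings are copied from the two logs in the question; `to_seconds`, `cpu_share`, and the run labels are hypothetical names.

```python
def to_seconds(stamp: str) -> float:
    """Convert a SAS log duration such as '35:54.97' or '1:02:03.50' to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Timings copied from the two logs in the original post.
runs = {
    "noon run (05/17)": {"real": "35:54.97", "user": "4:18.97", "system": "6:30.08"},
    "evening run (05/14)": {"real": "6:11.66", "user": "3:43.92", "system": "4:07.77"},
}

cpu_share = {}
for name, t in runs.items():
    cpu = to_seconds(t["user"]) + to_seconds(t["system"])
    real = to_seconds(t["real"])
    cpu_share[name] = cpu / real  # fraction of elapsed time accounted for by CPU
    print(f"{name}: cpu={cpu:.2f}s real={real:.2f}s cpu/real={cpu_share[name]:.2f}")
```

Total CPU time is about 0.30 of elapsed time in the noon run and about 1.27 in the evening run. In other words, the noon run spent roughly 70% of its elapsed time off the CPU, which fits the suspicion that it was waiting on I/O.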
MargaretC
SAS Employee

I agree that it seems to be an I/O throughput issue to storage.

What OS are you running on?  There are tools to let you know if you are having IO throughput issues.

 

MargaretC
SAS Employee

Changes in real time happen when you run in a virtual environment and other applications are sharing resources with your SAS jobs. For instance, shared storage can be affected if a backup happens while you are running your SAS job. Or, if your SAS WORK is on a shared storage device, other SAS jobs doing lots of writes to SAS WORK will impact your SAS job.

 

Sajid01
Meteorite | Level 14

Hello @MariaD 
The first process ran around noon, during peak business hours, and the other ran late at night.
This is to be expected. Data access (I/O) appears to be the limiting factor.
CPU and memory appear to be OK.
Please see the analysis below.  

process_comp.PNG
