bebess
Quartz | Level 8
I tested PROC SORT with the table in my WORK library (local) and did not notice any performance improvement. There is still a big difference between CPU time and real time; the sort takes around 8 hours!
Do you think the SPDESORTSIZE option could improve performance? I can see that its current value is very low. What do you think?

SPDESORTSIZE=33554432
Specifies the memory size that is used for sorting by the SPD Engine.
Kurt_Bremser
Super User

@bebess wrote:
I tested PROC SORT with the table in my WORK library (local) and did not notice any performance improvement. There is still a big difference between CPU time and real time; the sort takes around 8 hours!


Which means that your storage is not up to the task. Get in touch with your SAS administrators, they need to improve the performance if you need to work with such data sizes regularly.

bebess
Quartz | Level 8

Another test in a better environment seems to show better performance:

 

NOTE: PROCEDURE SORT used (Total process time):
      real time           3:56:21.17
      user cpu time       1:13:29.12
      system cpu time     19:33.06
      memory              1059895.26k
      OS Memory           1080224.00k
      Timestamp           01/24/2022 01:56:51 PM
      Step Count                        3  Switch Count  22047

LinusH
Tourmaline | Level 20

Sorting: you usually need around 2-3 times the size of the table in your UTILLOC. I assume it has the same location as your SASWORK?

2 TB should be enough. Have you monitored it during execution?

 

I can see that SQL "only" uses 1 GB of RAM. If you have more available to you, maximize the MEMSIZE and SORTSIZE options.
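A minimal sketch of how to check and raise these settings, assuming your site allows it. The 4G value is only an illustration, not a recommendation. MEMSIZE can only be set when SAS starts (command line or configuration file), so here it is just displayed.

```sas
/* Show the current values of the options that matter for sorting */
proc options option=(memsize sortsize utilloc) value;
run;

/* SORTSIZE can be changed within a session; 4G is an illustrative value. */
/* FULLSTIMER adds memory and CPU details to the log of each step.        */
options sortsize=4G fullstimer;
```

Keep SORTSIZE below MEMSIZE, so the sort does not claim more memory than the session is allowed to use.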

 

TAGSORT is resource-efficient, but it will take substantially longer than the default sort.

 

And I guess that you really need all input columns as output?

And the data is in a Base SAS library?

Consider using SPDE. It might not solve your current issue, but for large data sets it is faster for many use cases. Then you should take a look at the SPDESORTSIZE option.
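A rough sketch of what that could look like. The library name, the paths, the table name work.mytable and the 1G value are all hypothetical; use whatever fast storage and memory your administrators can give you.

```sas
/* Hypothetical paths: metadata in the primary path, data spread over two locations */
libname big spde '/data/spde/meta'
  datapath=('/data/spde/d1' '/data/spde/d2');

/* Raise the memory the SPD Engine may use for sorting; 1G is an illustrative value */
options spdesortsize=1G;

/* Copy the Base SAS table into the SPDE library */
data big.mytable;
  set work.mytable;
run;
```

Whether this helps depends on the storage behind those paths; if all of them sit on the same slow disks, the gain will be small.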

Data never sleeps
bebess
Quartz | Level 8

[Screenshot: bebess_0-1642596997952.png]

Only this job was running on the server, and I saw the error during that time.

I need 90% of the columns from the table, but since I am also creating new calculated variables, I end up with about the same number of columns but fewer observations.

 

Yes, it's a Base SAS library, a basic SAS table.

Kurt_Bremser
Super User

In an "ideal sort", you need three times the size of the table

  • original table
  • utility file
  • resulting table

If the original table is compressed (dataset or system option COMPRESS=YES), the utility file will be larger, sometimes MUCH larger: with a compression ratio of 95%, the compressed table occupies only 5% of its uncompressed size, so the uncompressed utility file will be 20 times as large. In this case, use the TAGSORT option of PROC SORT. Since this option is not available in SQL, you are better off running your summation in two steps:

  • Use PROC SORT with BY id list_of_var
  • Use PROC SUMMARY with the same BY

If there are more variables in your source dataset than those used in BY and for the summation, drop all other variables when sorting, and create an intermediary table:

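/* Step 1: sort only the variables that are needed */
proc sort
  data=xxx (keep=list_of_var var_n)
  out=yyy
  /* (compress=yes)  tagsort if long character variables exist */
;
  by list_of_var;
run;

/* Step 2: sum var_n within each BY group of the sorted table */
proc summary data=yyy;
  by list_of_var;
  var var_n;
  output out=zzz sum()=;
run;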

You also seem to have inadequate storage or other CPU-consuming processes, recognizable here:

real time 1:08:47.47
user cpu time 11:37.53
system cpu time 2:51.71

Your real time is more than 4 times the CPU time, which points either to wait states caused by the storage, or contention for CPU resources.
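To make the arithmetic explicit, here is a small sketch that computes the ratio from the three log values quoted above. The variable names are just for illustration.

```sas
data _null_;
  /* Convert the log values to seconds with HMS(hours, minutes, seconds) */
  real = hms(1, 8, 47.47);                    /* real time                */
  cpu  = hms(0, 11, 37.53) + hms(0, 2, 51.71); /* user + system CPU time   */
  ratio = real / cpu;
  put real= cpu= ratio=;
run;
```

The ratio comes out at about 4.7. A sort that is mostly waiting on I/O or on a busy CPU shows a ratio well above 1 like this.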

 
