DATA Step, Macro, Functions and more

How to make SAS faster?

Reply
New Contributor
Posts: 2

How to make SAS faster?

Hi, 

 

I am running simple data step with WHERE statement. 

Since the source data is enormous(10million rows), it takes time forever.

However, I realized that my computer use only 30% of RAM (out of 16GB), and 7% of CPU.

SASV9.cfg has been modifed but still use the same capacity. 

 

Is there anyway to increase the computer resource use for SAS so that make the process faster?

 

Thank you. 

 

 

Super User
Posts: 3,260

Re: How to make SAS faster?

Since your processing is neither memory or CPU constrained, your conclusion should be that it is IO constrained - your hard drive is maxing out. If you are running this on a PC then installing an SSD (Solid State Drive) and putting your SAS data on that as well as your SAS WORK folder should speed things up.

Super User
Posts: 5,437

Re: How to make SAS faster?

10 million roes doesn't sound that large, so if it's simple data step processing, I think that your problems are HW related.

If you access a fairly small subset, consider creating an index on your data set.

You could try increase the MEMSIZE SAS system option, but that will probably not make any difference for sequential processing such as data steps.

To look into more detail of your resource consuption, use

options msglevel=i fullstimer;
Data never sleeps
Super User
Posts: 11,343

Re: How to make SAS faster?

What does your where clause look like? Separate where statement or data set option?

How many variables are in the data set? Are they all needed? Dropping variables may reduce disk IO somewhat.

PROC Star
Posts: 1,760

Re: How to make SAS faster?

There many ways to speed up I/Os.

 

Storage format (SPDE engine vs V9 engine, compression type) has a huge influence.

An SPDE library with binary compression could speed things up dramatically.

 

Other issues since you have a where clause are the table sort order and the indexes.

 

Also, maybe a IF clause would be faster than a WHERE clause. See here:

https://communities.sas.com/t5/Base-SAS-Programming/Using-MERGE-with-LIKE-command/m-p/301654#M63878

 

Another critical factor is the way you retrieve your data in terms of memory utilisation.

The data set's buffer size, the number of buffers used, whether or not Windows's Direct-IO is used all have a huge effect.

To illustrate this point, here is a benchmark example taken from this book about making SAS faster:

 

https://www.amazon.com/High-Performance-SAS-Coding-Christian-Graffeuille/dp/1512397490

 

 

Table 5.5: Run times for various values of BUFSIZE, BUFNO and SGIO for a 10,000 MB data set

10,000,000 kB

 

BUFNO

1

5

25

100

500

BUFSIZE

SGIO

195.45

.

.

.

.

0

no

4k

no

235.25

236.25

235.56

239.78

235.39

yes

1177.55

572.32

157.28

93.60

93.96

8k

no

163.77

162.05

164.22

164.05

161.34

yes

542.85

268.86

95.45

81.07

80.84

16k

no

138.03

140.26

136.16

137.49

136.11

yes

265.46

142.55

75.28

74.61

81.70

32k

no

143.70

148.85

137.96

147.67

143.98

yes

140.56

93.84

73.35

73.33

75.48

64k

no

180.47

171.39

166.63

151.69

173.16

yes

88.62

70.14

71.62

72.00

.

128k

no

235.25

236.25

235.56

239.78

235.39

yes

70.72

71.30

71.34

70.33

.

 

 The default run time is 195 seconds, which can be reduced to 70s by simply changing the memory usage settings.

Super User
Posts: 10,044

Re: How to make SAS faster?

It is best for multi-thread PROC DS2.



data test; 
do x=1 to 12316;
 j=x*x; 
output; 
end; 
run; 
 
proc ds2; 
  thread newton/overwrite=yes; 
    dcl double y count; 
    dcl bigint thisThread;
    drop count; 
    method run(); 
      set test; 
      /*  if  x=1 ; */
      thisThread=_threadid_; 
      count+1;
    end; 
    method term();
      put '**Thread' _threadid_ 'processed'  count 'rows:';
    end;
  endthread; 
  run; 
quit; 
 
proc ds2; 
  data want; 
    dcl thread newton frac; 
    method run(); 
         set from frac threads=3; 
    end; 
  enddata;
  run; 
quit;
PROC Star
Posts: 1,760

Re: How to make SAS faster?

[ Edited ]

@Ksharp I am pretty sure this would slow down the process considering the data is not spread onto different nodes.

I can't benchmark atm, but if you can...

Super User
Posts: 10,044

Re: How to make SAS faster?


Chris,
No. I can't. This is just intuitive to me . Multiple sessions is best for WHERE / IF situation.

PROC Star
Posts: 1,760

Re: How to make SAS faster?

OK, I had time for a quick run.

 

DS2 improves things slightly because my storage is fast enough (or my CPU cores slow enough) to be CPU-bound for single-table processing.

This doesn't change whether there is a IF/WHERE subset or not, the pattern is the same.

The 10% elapse time reduction happens at a serious CPU cost (in this case, CPU usage doubles).

 

I still suspect that throwing more CPU at a process that is not CPU-bound would not help, and that DS2 threads would seriously degrade performance on slower storage as the I/Os would be more spread.

 

My numbers:

 


options nofullstimer compress=no;

data HAVE; 
  length A $200;
  do I=1 to 1e8;
    output; 
  end; 
run; 
 
proc ds2; 
  thread NEWTON/overwrite=yes; 
    method run(); 
      set HAVE;
    end; 
  endthread; 
  run; 
quit; 
 
proc ds2; 
  data WANT; 
    dcl thread NEWTON FRAC; 
    method run(); 
      set from FRAC threads=3; 
    end; 
  enddata;
  run; 
quit;
%*      real time           27.38 seconds ;
%*      cpu time            51.44 seconds ;

data WANT2; 
  set HAVE;
run;
%*      real time           29.31 seconds ;
%*      cpu time            29.31 seconds ;
PROC Star
Posts: 1,760

Re: How to make SAS faster?

Different (slower, LIN64 instead of WIN64) server, (very) different results, but still no reason to use DS2 for this process.

 

NOTE: PROCEDURE DS2 used (Total process time):
      real time           1:03.27
      cpu time            1:05.44
    
NOTE: DATA statement used (Total process time):
      real time           1:05.45
      cpu time            34.15 seconds

 

Super User
Posts: 10,044

Re: How to make SAS faster?

Chris,
You are right. Sad for DS2 .


New Contributor
Posts: 2

Re: How to make SAS faster?

Hi All, 

 

First of all, thank you so much for your tips and advises. 

It was my first time to post my question here. 

 

I am a die-hard DATA step user. 

It took me an hour to process very simple Data Step with 1 WHERE statement w several ANDs.

I could not figure it out why the memory usage is so low. 

NOTE: DATA statement used (Total process time):
real time 1:01:47.84
user cpu time 27:13.95
system cpu time 1:49.85
memory 1540.78k
OS Memory 31988.00k

 

 

This is the time record when I used Proc SQL. 

This takes only about 6min which is way quicker than 1 hour with high Memory usage.

NOTE: PROCEDURE SQL used (Total process time):
real time 6:19.32
user cpu time 1:37.65
system cpu time 7.61 seconds
memory 637487.35k
OS Memory 668740.00k

 

I am not sure what makes this huge real time difference.....

 

Again, I appreciate for all the replies.

Thank you.  

Super User
Posts: 10,044

Re: How to make SAS faster?


PROC SQL is multi-session oriented PROC, maybe that is reason.
Also you can use OPTIONS MSGLEVEL=I ; to check if SQL used index too.

PROC Star
Posts: 1,760

Re: How to make SAS faster?

SQL has a slightly different profile than the data step (on the faster windows server).

10% longer elapse and 10% less cpu used.

 

NOTE: PROCEDURE SQL used (Total process time):
      real time           31.58 seconds
      cpu time            25.55 seconds

 

It would be interesting to know more.

Index usage is indeed the first thing to come to mind for such a dramatic improvement, though there is no reason only SQL should use the index.

 

Super User
Posts: 5,437

Re: How to make SAS faster?

If we were to make anything out of that you need to attach the code...And Perhaps a proc contents.
Again, data steps usually process one observation at a time and doesn't utilize memory much.
SQL on the other hand doesn't need to preserve observations order so I guess it can procees many observations at the same time.
Data never sleeps
Ask a Question
Discussion stats
  • 14 replies
  • 923 views
  • 0 likes
  • 6 in conversation