Twenty Ways to Run Your SAS® Program Faster and Use Less Space


When SAS® programs process large amounts of data or run complicated algorithms, we are often frustrated by how long they take and by how much space they need to run to completion. In this 20-minute tutorial, you'll learn 20 techniques that reduce a program's time and space requirements without requiring lengthy modifications. Some save space, some save time, and many do both. They do not require advanced knowledge of SAS, only a reasonable familiarity with Base SAS and a willingness to delve into the details of the programs.

Video highlights

02:18 - 4 efficiency approaches

03:06 - How to reduce the number of observations and variables

09:18 - Helpful sorting efficiencies

11:39 - How to reduce variable size

15:51 - How to reuse system space

18:35 - Advantage of using LIBNAME

Read the Paper

Related resources

Techniques for optimizing I/O (documentation)

Efficient Use of Disk Space in SAS Application Programs (2018 SESUG paper)

mkeintz · 07-17-2020

It's good to have all these techniques listed in a single paper. Thanks to Steven Sloan for putting them together.

I can think of a couple of others that could be added:

Hash objects as speedier alternatives for some formats and for some ordered IF THEN/ELSE IF sequences.
"Faster Index for Sorted SAS® Dataset", which demonstrates compressed indexes; this might be appropriate for rather specific situations.
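
A minimal sketch of the hash-object idea, using a hash lookup where a format or an IF THEN/ELSE IF chain might otherwise map a code to a description. The data set names (work.lookup, work.detail, work.want) and variables (code, descr, id) are hypothetical, and whether this runs faster than a format depends on the data, so treat it as something to benchmark rather than a guaranteed gain.

```sas
/* Hypothetical lookup table: code -> description. */
data work.lookup;
  input code $ descr :$20.;
  datalines;
A Alpha
B Beta
C Gamma
;
run;

/* Hypothetical detail data to be decoded. */
data work.detail;
  input id code $;
  datalines;
1 A
2 C
3 X
;
run;

data work.want;
  /* Define code and descr in the program data vector without reading rows. */
  if 0 then set work.lookup;

  /* Load the lookup table into memory once, on the first iteration. */
  if _n_ = 1 then do;
    declare hash h(dataset: 'work.lookup');
    h.defineKey('code');
    h.defineData('descr');
    h.defineDone();
  end;

  set work.detail;

  /* find() returns 0 on a match and copies descr from the hash.      */
  /* On a miss, clear descr so the previous row's value is not kept.  */
  if h.find() ne 0 then call missing(descr);
run;
```

The lookup table is read once and then searched in memory for each detail row, so no sort or merge of the detail data is needed.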

But I'm actually writing this comment to suggest refraining from adopting advice #8:

8. Switch variables from numeric to character if they are integers and range
in value from -9 to 99. The minimum length for numeric variables is 3, so
you can save space if the variable can fit into one or two characters.

True, you can save space vs storing the variable as a numeric with minimum length 3. But it's not very much space (one byte per obs per qualifying variable).
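
A rough way to see the size of the saving is to store the same values both ways and compare the observation length that PROC CONTENTS reports. The data set and variable names below are hypothetical, and the sketch assumes a Windows or UNIX host, where 3 is the shortest numeric length; page overhead and compression settings can change the file sizes actually observed.

```sas
/* Values 0-99 stored as a length-3 numeric variable. */
data work.scores_num;
  length score 3;
  do i = 1 to 1000;
    score = mod(i, 100);
    output;
  end;
  drop i;
run;

/* The same values stored as a two-byte character variable. */
data work.scores_chr;
  length score $2;
  do i = 1 to 1000;
    score = put(mod(i, 100), z2.);   /* zero-padded: "00" .. "99" */
    output;
  end;
  drop i;
run;

/* Compare the "Observation Length" line in the two reports. */
proc contents data=work.scores_num;
run;

proc contents data=work.scores_chr;
run;
```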

More importantly, sort order is lost when there are negative values (e.g. "-1" is less than "-8"). And the same is true with irregular use of leading zeroes or blanks ("02" is less than "1 ", " 2" is less than "-1", and "20" is less than "3").

So an annoying amount of care (easily overlooked) needs to be taken in justification and formatting of the character-ized integers, just to make the lexicographic ordering of the character values replicate the order of the original numerics.
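
The ordering problem can be reproduced with a short sketch. The data set and variable names are hypothetical, and the character values are left-justified on purpose to show one plausible way the conversion might be done.

```sas
/* The same integers stored as a numeric and as a $2 character variable. */
data work.demo;
  input num;
  length chr $2;
  chr = left(put(num, 2.));   /* left-justified: "-8", "-1", "3 ", "20" */
  datalines;
-8
-1
3
20
;
run;

/* Sorting by the numeric variable gives the numeric order. */
proc sort data=work.demo out=work.by_num;
  by num;
run;

/* Sorting by the character variable compares the values byte by byte, */
/* so the rows no longer come out in numeric order.                    */
proc sort data=work.demo out=work.by_chr;
  by chr;
run;

proc print data=work.by_chr noobs;
run;
```

On an ASCII host with the default collating sequence, the second sort is expected to place "-1" before "-8" and "20" before "3 ", matching the comparisons described above.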

True, I imagine there are situations in which $2 conversions of integers are OK, like when the integers are nothing more than labels or IDs, but I'm not convinced this is common enough to make this technique part of a list of generally useful efficiencies.

Ksharp · 07-18-2020

Another option is the SPDE engine:

libname x spde 'c:\temp\';

Or you could buy SPD Server.