hhchenfx
Rhodochrosite | Level 12

Hi Everyone,

I am just curious which method is faster:

1 - Putting multiple IF/THEN/ELSE blocks in one DATA step

2 - Using multiple DATA steps, each containing only one IF/THEN/ELSE block

Or, more generally, is combining multiple operations in one DATA step faster than breaking them into multiple DATA steps?

Thank you for your advice.

HHC

 

Accepted Solution
ballardw
Super User

In addition to execution speed there are several things to consider.

If you create a newly named data set for each "if" block, then you are duplicating data. With largish data sets in some work environments, that means you might run out of your allotted work space. If you keep reusing the same data set name, then you risk corrupting values or losing records if one of the steps has a logic problem, which means you have to go back to a prior step, because each step completely replaces the existing data. Another issue with reusing the same name is that even a minor code change forces you to rerun multiple steps.

 

At one time I inherited code that had a number of blocks of IF/THEN/ELSE separated out as you describe. With some minor changes, such as using arrays and SELECT/WHEN instead of IF/THEN, I reduced a program from about 10,000 lines of code (which frankly was a bit hard to follow for logic) to about 800 lines.
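
As a rough illustration of how an array can replace a run of near-identical IF/THEN statements, here is a minimal sketch. The data set work.survey, the variables q1-q5, and the 99 "no answer" code are all made up for the example; they are not from the program described above.

```sas
/* Hypothetical survey data: q1-q5 use 99 as a "no answer" code */
data work.survey;
    input id q1-q5;
    datalines;
1 3 99 4 5 2
2 99 1 2 99 4
;
run;

/* One loop over the array instead of five separate IF/THEN statements */
data work.survey_clean;
    set work.survey;
    array q{*} q1-q5;
    do i = 1 to dim(q);
        if q{i} = 99 then q{i} = .;   /* recode 99 to missing */
    end;
    drop i;
run;
```

The same loop works unchanged if the list grows to q1-q50, which is where most of the line savings come from.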

 

If you find a need for a chain of IF/THEN/ELSE IF/THEN/ELSE statements that all test a single variable or expression, you may want to research the SELECT/WHEN coding block.
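
A minimal sketch of what that can look like, using the SASHELP.CLASS sample data set that ships with SAS. The output data set name, the new variable age_group, and the cut-offs are arbitrary choices for illustration.

```sas
data work.class_groups;
    set sashelp.class;
    length age_group $ 8;
    /* One SELECT on age replaces a chain of IF / ELSE IF tests */
    select (age);
        when (11, 12) age_group = 'Younger';
        when (13, 14) age_group = 'Middle';
        otherwise     age_group = 'Older';
    end;
run;
```

Including an OTHERWISE branch is worth the extra line: without one, SAS stops the step with an error when a value matches none of the WHEN clauses.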

Astounding
PROC Star
Using a single data step will always be faster. No comparison! Why not test it yourself?

Here's a program that lets you create a large data set for testing purposes. It's easy to adjust the size of the data set as needed:

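data want;
    /* 10,000,000 rows; lower the upper bound for a smaller test set */
    do var1 = 1 to 10000000;
        output;
    end;
    stop;
    /* Adds numeric variables var2-var500, each initialized to 0 */
    retain var2-var500 0;
run;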
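
One way to run the comparison is sketched below. It assumes a much smaller test set (1,000,000 rows, 20 variables) so it finishes quickly, and the data set names and the two flag variables are invented for the example. FULLSTIMER makes the log report real time and CPU time for each step, so the two approaches can be compared by adding up the step times.

```sas
options fullstimer;   /* detailed timing for each step in the log */

/* Smaller test data set, built the same way as above */
data work.test;
    do var1 = 1 to 1000000;
        output;
    end;
    retain var2-var20 0;
run;

/* Approach 1: both IF/THEN/ELSE blocks in a single pass */
data work.one_step;
    set work.test;
    if var1 > 500000 then flag_a = 1; else flag_a = 0;
    if mod(var1, 2) = 0 then flag_b = 1; else flag_b = 0;
run;

/* Approach 2: one IF/THEN/ELSE block per DATA step (two passes) */
data work.two_step;
    set work.test;
    if var1 > 500000 then flag_a = 1; else flag_a = 0;
run;

data work.two_step;
    set work.two_step;
    if mod(var1, 2) = 0 then flag_b = 1; else flag_b = 0;
run;
```

Both approaches produce the same variables; the difference to look for in the log is the total time spent reading and writing the data twice in the second approach.
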
SASKiwi
PROC Star

One of the golden rules for efficient SAS programming is to minimise the number of passes through your data, and that applies to both DATA and PROC steps. So the fewer DATA and PROC steps you have reading the same data, the more efficient your program will be.

hhchenfx
Rhodochrosite | Level 12

Thank you all for the information.

HHC
