Hi Everyone,
I am just curious which method is faster:
1 - Putting multiple IF/THEN/ELSE blocks in one data step
2 - Using multiple data steps, each containing only one IF/THEN/ELSE block
Or, more generally, is combining multiple operations in one data step faster than breaking them into multiple data steps?
Thank you for your advice.
HHC
In addition to execution speed there are several things to consider.
If you create a newly named data set for each "if" block, then you are duplicating data. With largish sets in some work environments, that means you might run out of your allotted work space. If you keep using the same data set name, you risk corrupting values or losing records when one of the steps has a logic problem, because each step completely replaces the existing data and you then have to go back to a prior step. Another issue with same-named sets is that if a minor code change is needed, you have to rerun multiple blocks of code.
At one time I inherited code that had a number of IF/THEN/ELSE blocks separated out as you describe. With some minor changes, such as using arrays and SELECT/WHEN instead of IF/THEN, I reduced a program from about 10,000 lines of code (which frankly was a bit hard to follow) to about 800 lines.
If you find you need a chain of IF/THEN/ELSE IF/THEN/ELSE statements all testing a single variable or expression, you may want to look into the SELECT/WHEN block.
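As a rough illustration of the SELECT/WHEN and array ideas mentioned above, here is a minimal sketch. It assumes the SASHELP.CLASS sample data set is available; the output data set name, the age_group variable, and the rounding operation are made up for the example and are not from the original program.

```sas
/* Hypothetical example on SASHELP.CLASS; work.class_grouped and
   age_group are illustrative names only. */
data work.class_grouped;
    set sashelp.class;
    length age_group $8;

    /* SELECT/WHEN replaces a chain of IF / ELSE IF tests on one variable */
    select (age);
        when (11, 12) age_group = 'Younger';
        when (13, 14) age_group = 'Middle';
        otherwise     age_group = 'Older';
    end;

    /* An array applies the same operation to several variables in a loop */
    array measures {*} height weight;
    do i = 1 to dim(measures);
        measures{i} = round(measures{i}, 1);
    end;
    drop i;
run;
```

Both constructs live in the same data step, so the data is still read only once.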
One of the golden rules for efficient SAS programming is to minimise the number of passes through your data, and that applies to both DATA and PROC steps. So the fewer DATA and PROC steps you have, the more efficient your program will be.
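One way to save a pass when a PROC step is involved is to filter inside the PROC rather than in a separate DATA step. The sketch below assumes the SASHELP.CLASS sample data set; the subset condition and the work.teens name are only for illustration.

```sas
/* Two passes: a DATA step to subset, then a PROC step to summarise */
data work.teens;
    set sashelp.class;
    where age >= 13;
run;

proc means data=work.teens mean;
    var height;
run;

/* One pass: filter with the WHERE= data set option inside the PROC step */
proc means data=sashelp.class(where=(age >= 13)) mean;
    var height;
run;
```

The second form also avoids writing an intermediate data set to the WORK library.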
A resounding yes for 1).
Create a sufficiently large (~ a million obs) dataset and apply Maxim 4.
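Taking "Maxim 4" to mean "if in doubt, do a test run and look at the results" (my reading of the poster's reference), a test along these lines could be used. The data set names, variables, and conditions are invented for the test; compare the real time and CPU time reported in the log for each step.

```sas
options fullstimer;  /* write detailed timing for each step to the log */

/* Build a test data set of about one million observations */
data work.test;
    call streaminit(12345);
    do id = 1 to 1000000;
        x = rand('uniform');
        y = rand('uniform');
        output;
    end;
run;

/* Method 1: all IF/THEN/ELSE blocks in one DATA step (one pass) */
data work.one_step;
    set work.test;
    length x_flag y_flag $4;
    if x < 0.5 then x_flag = 'low';
    else x_flag = 'high';
    if y < 0.5 then y_flag = 'low';
    else y_flag = 'high';
run;

/* Method 2: one IF/THEN/ELSE per DATA step (two passes) */
data work.step_a;
    set work.test;
    length x_flag $4;
    if x < 0.5 then x_flag = 'low';
    else x_flag = 'high';
run;

data work.step_b;
    set work.step_a;
    length y_flag $4;
    if y < 0.5 then y_flag = 'low';
    else y_flag = 'high';
run;
```

For method 2, add up the times of both steps before comparing with method 1. Results will vary with disk speed and data set width, so it is worth running the test more than once.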
Thank you all for the information.
HHC