Is there any difference between coding something like:
If PAY in (1,2) then PAY_CAT = 1;
If PAY in (3) then PAY_CAT = 2;
If PAY in (.) then PAY_CAT = .;
Versus:
If PAY in (1,2) then PAY_CAT = 1;
Else if PAY in (3) then PAY_CAT = 2;
Else PAY_CAT = .;
I've seen code go back and forth (seemingly) between these two approaches, and I'm wondering whether there's any logical reason to use one rather than the other in a given situation. Or is that all it is, a coding style preference that depends on the coder's own "taste"?
I'm not talking about sub-setting a dataset, necessarily. Just when categorizing things into groups for further analysis.
With simple, relatively clean data and equality comparisons, there might not be any difference.
An IF/THEN/ELSE chain stops processing at the first condition in the sequence that evaluates to true. With separate IF statements, every condition is evaluated for every observation.
But when inequalities are involved you can get drastically different results.
Consider:
data have;
   do x=1 to 6;
      output;
   end;
run;

data example;
   set have;
   if x<3 then y=5;
   else if x<4 then y=6;
   else if x<6 then y=7;

   if x<3 then z=5;
   if x<4 then z=6;
   if x<6 then z=7;
run;
The first data step just builds some values to work with.
The second uses the two different types of logic. See the difference between Y and Z.
The second set of "if" statements evaluates every single "if", so the last one overwrites the first two Z assignments: every observation with x<6 ends up with Z=7.
The if/then/else performs better, as it reduces the number of conditions that need to be evaluated: once one condition is true, the remaining ELSE branches are skipped.
Evaluation of conditions is among the most costly operations, CPU-wise.
In addition to its performance advantage, the second approach (the IF/ELSE chain) would give a different result if PAY_CAT already had a value other than . (ordinary missing) before the IF/THEN statements are executed and PAY is not in (1, 2, 3, .). In that case the final ELSE would overwrite the previous value of PAY_CAT with a missing value, whereas with the first approach (separate IF statements) the previous value would persist.
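Here is a minimal sketch of that situation. The dataset names HAVE and WANT, the variables CAT_SEPARATE and CAT_ELSE, the starting value 9, and the uncovered value PAY=4 are all made up for illustration; they are not from the original post.

```sas
/* Hypothetical test data: PAY=4 is a value that none of the conditions cover. */
data have;
   input pay;
   datalines;
1
2
3
.
4
;
run;

data want;
   set have;

   /* Assumption: both variables already hold a value (9) before the logic runs. */
   cat_separate = 9;
   cat_else     = 9;

   /* Separate IFs: no condition matches PAY=4, so the 9 is left in place. */
   if pay in (1,2) then cat_separate = 1;
   if pay in (3)   then cat_separate = 2;
   if pay in (.)   then cat_separate = .;

   /* IF/ELSE chain: the final ELSE catches PAY=4 and sets CAT_ELSE to missing. */
   if pay in (1,2)    then cat_else = 1;
   else if pay in (3) then cat_else = 2;
   else                    cat_else = .;
run;

proc print data=want noobs;
run;
```

For PAY values 1, 2, 3 and missing, the two variables should agree. Only the PAY=4 row should differ, with CAT_SEPARATE still 9 and CAT_ELSE missing.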