I have the following code, and it produces a warning in the log:
data test;
c='';
a = .;
b = 7;
if NOT MISSING(c) and (a+b)>6 then l = 'A';
run;
But when I use this code, with one small change, there is no warning in the log file:
data test;
c='';
a = .;
b = 7;
if c^='' and (a+b)>6 then l = 'A';
run;
I guess you don't get a WARNING, but a NOTE about the operation on missing values?
Yes, you are right.
I have a guess, but can't confirm it. You would need someone from a development team at SAS to confirm or deny ...
When SAS sees two conditions joined by AND, it has options as to which one it evaluates first.
In the top code: since the first condition uses a function, SAS "figures" that it might be speedier to start by evaluating the second condition. It tries to add a+b, and generates the missing value with the accompanying note.
In the bottom code: since the first condition is a simple comparison, SAS evaluates that first. Since it is false, there is no need to try to add a+b. Once the first condition is false, the entire compound condition must be false.
So does SAS ignore the remaining conditions joined by AND when the first condition is FALSE?
That's right. Doesn't it make sense? Once a false condition is found, there is no need to check the other conditions.
Here's a program you can run to verify this. Note how long it takes to run each DATA step:
data _null_;
do i=1 to 100000000;
if 1=1 and 5=4 then x=1;
end;
run;
data _null_;
do i=1 to 100000000;
if 1=2 and 5=4 then x=1;
end;
run;
Since the first DATA step has to check both conditions, it will take longer. Since the second DATA step only needs to check one condition, it will be faster.
@Raffik wrote:
So does SAS ignore the remaining conditions joined by AND when the first condition is FALSE?
This type of optimization is used in all programming environments. The decision which condition is evaluated first will differ of course between, say, SAS and a C compiler.
You can even help with this optimization by explicitly forcing the sequence of evaluation by the way you code the conditions, as has already been stated in another post.
It is due to the missing value in either A or B. Try SUM() to avoid it; note that SUM() ignores missing arguments, so sum(a,b) is 7 here rather than missing:
if NOT MISSING(c) and sum(a,b)>6 then l = 'A';
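To illustrate how the + operator and SUM() differ when one operand is missing, here is a minimal sketch using the same values of a and b as in the question. The variable names plus_result and sum_result are made up for this example.

```sas
data _null_;
a = .;
b = 7;
/* + propagates the missing value and writes the missing-values NOTE to the log */
plus_result = a + b;
/* SUM() skips missing arguments, so only b contributes */
sum_result = sum(a, b);
/* Writes: plus_result=. sum_result=7 */
put plus_result= sum_result=;
run;
```

Keep in mind that switching to SUM() changes the result of the test as well as the log: (a+b)>6 is false when a is missing, while sum(a,b)>6 is true for these values.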
Looks like SAS is smart enough to know how to "short-circuit" logic evaluation when using normal operators like = or ^=, but it does not try to do it when you are using the MISSING() function. If you want it to always short-circuit the logic, then code it that way yourself.
IF not missing(c) THEN
IF (a+b)>6 THEN put 'FOUND';
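Applied to the DATA step from the original question, the nested form might look like the sketch below. It assumes the goal is still to assign l = 'A'; the DO/END block is only there to make the nesting explicit. Because the addition sits inside the outer IF, it should only run when c is non-missing, so no missing-values NOTE would be expected for this data.

```sas
data test;
c='';
a = .;
b = 7;
/* Outer test runs first; a+b is only computed when c is not missing */
if not missing(c) then do;
if (a+b)>6 then l = 'A';
end;
run;
```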
What about this one? In this case I get a NOTE in the log:
data test;
c='';
a = .;
b = 7;
if NOT MISSING(c) and (a+b)>6 then l = 'A';
run;
With this code I get no NOTE:
data test;
c='';
a = .;
b = 7;
if MISSING(c)=0 and (a+b)>6 then l = 'A';
run;
MISSING(c) is true, so it returns 1. The comparison 1=0 is false, so evaluation stops there and the remainder of the condition is not executed.
You also need to consider the behavior of (a+b). If either a or b is missing, the result is missing, and a missing value is never > 6.
Look at this for a slightly larger example:
data test;
input a b c $;
if MISSING(c)=0 and (a+b)>6 then l = 'A';
if MISSING(c)=0 and sum(a,b)>6 then l2 = 'A';
datalines;
. 7 .
. 7 b
1 1 1
8 1 1
8 1 .1
8 . 1
1 2 .
. 2 .
;
run;