Missing Values Generated Note in LOG file

Occasional Contributor
Posts: 10

Missing Values Generated Note in LOG file

I have the following code, and it produces a warning in the log:

 

data test;
c='';
a = .;
b = 7;
if NOT MISSING(c) and (a+b)>6 then l = 'A';
run;

 

But when I use a slightly different version of the code, I get no warning in the log file:

 

data test;
c='';
a = .;
b = 7;
if c^='' and (a+b)>6 then l = 'A';
run;

 

Super User
Posts: 7,782

Re: Missing Values Generated Note in LOG file

I guess you don't get a WARNING, but a NOTE about the operation on missing values?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 10

Re: Missing Values Generated Note in LOG file

Posted in reply to KurtBremser

Yes, you are right

Super User
Posts: 5,505

Re: Missing Values Generated Note in LOG file

I have a guess, but can't confirm it.  You would need someone from a development team at SAS to confirm or deny ...

 

When SAS sees two conditions joined by AND, it has options as to which one it evaluates first.

 

In the top code:  since the first condition uses a function, SAS "figures" that it might be speedier to start by evaluating the second condition.  It tries to add a+b, and generates the missing value with the accompanying note.

 

In the bottom code:  since the first condition is a simple comparison, SAS evaluates that first.  Since it is false, there is no need to try to add a+b.  Once the first condition is false, the entire compound condition must be false.

Occasional Contributor
Posts: 10

Re: Missing Values Generated Note in LOG file

Posted in reply to Astounding

So does SAS skip the remaining conditions joined by AND when the first condition is FALSE?

Super User
Posts: 5,505

Re: Missing Values Generated Note in LOG file

That's right.  Doesn't it make sense?  Once a false condition is found, there is no need to check the other conditions.

 

Here's a program you can run to verify this.  Note how long it takes to run each DATA step:

 

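data _null_;
   /* First condition is true, so the second one must be checked too */
   do i=1 to 100000000;
      if 1=1 and 5=4 then x=1;
   end;
run;

data _null_;
   /* First condition is false, so the second one can be skipped */
   /* Same iteration count as above, so the run times are comparable */
   do i=1 to 100000000;
      if 1=2 and 5=4 then x=1;
   end;
run;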

 

Since the first DATA step has to check both conditions, it will take longer.  Since the second DATA step only needs to check one condition, it will be faster.

Super User
Posts: 7,782

Re: Missing Values Generated Note in LOG file


Raffik wrote:

So does SAS skip the remaining conditions joined by AND when the first condition is FALSE?


This type of optimization is used in most programming environments. Which condition is evaluated first can, of course, differ between, say, SAS and a C compiler.

You can even help with this optimization by explicitly forcing the sequence of evaluation by the way you code the conditions, as has already been stated in another post.

Super User
Posts: 10,028

Re: Missing Values Generated Note in LOG file

It is due to a missing value in either A or B. Try SUM() to avoid it.

if NOT MISSING(c) and sum(a,b)>6 then l = 'A';
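
To see the difference between the two forms side by side, here is a minimal sketch using the same values as in the original question. The variable names plus_result and sum_result are made up for illustration only.

```sas
data _null_;
   a = .;
   b = 7;
   /* The + operator propagates the missing value and writes the */
   /* "Missing values were generated" NOTE to the log            */
   plus_result = a + b;
   /* SUM() ignores missing arguments and adds up the rest */
   sum_result = sum(a, b);
   /* Write both results to the log for comparison */
   put plus_result= sum_result=;
run;
```

Keep in mind that this changes the logic as well as the log: with a missing and b = 7, sum(a,b)>6 is true, while (a+b)>6 is false. Only switch to SUM() if treating a missing value as "nothing to add" is what you actually want.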


Super User
Posts: 7,050

Re: Missing Values Generated Note in LOG file

Looks like SAS is smart enough to know how to "short-circuit" logic evaluation when using normal operators like = or ^=, but it does not try to do it when you are using the MISSING() function.  If you want it to always short-circuit the logic, then code it that way yourself.

IF not missing(c) THEN
  IF (a+b)>6 THEN put 'FOUND';
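
Applied to the DATA step from the original question, the same idea might look like the sketch below. It assumes the goal is still to set l = 'A', and the DO/END block is only there to make the nesting explicit.

```sas
data test;
   c = '';
   a = .;
   b = 7;
   /* The inner IF is executed only when c is non-missing. */
   /* Here c is blank, so a+b is never evaluated.          */
   if not missing(c) then do;
      if (a + b) > 6 then l = 'A';
   end;
run;
```

Because the addition sits inside the THEN branch, it only runs for observations where c has a value, so the missing-values NOTE should appear only when a or b is actually missing on one of those observations.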

 

Occasional Contributor
Posts: 10

Re: Missing Values Generated Note in LOG file

What about this one? In this case I get a NOTE in the log:

data test;
c='';
a = .;
b = 7;
if NOT MISSING(c) and (a+b)>6 then l = 'A';
run;

 

With this code I get no NOTE:

 

data test;
c='';
a = .;
b = 7;
if MISSING(c)=0 and (a+b)>6 then l = 'A';
run;

Super User
Posts: 11,343

Re: Missing Values Generated Note in LOG file

data test;
c='';
a = .;
b = 7;
if MISSING(c)=0 and (a+b)>6 then l = 'A';
run;

 

MISSING(c) is true, so it returns 1. Comparing that 1 to 0 gives FALSE, so the evaluation stops and the remainder of the condition is not executed.

You also need to consider the behavior of (a+b). If either a or b is missing, the result is missing, and a missing value is never > 6.

Look at this for a slightly larger example:

data test;
   input a b c $;
   if MISSING(c)=0 and (a+b)>6 then l = 'A';
   if MISSING(c)=0 and sum(a,b)>6 then l2 = 'A';
datalines;
. 7 .
. 7 b
1 1 1
8 1 1
8 1 .1
8 . 1
1 2 .
. 2 .
;
run;