Re: Understand logic in where clause

Abishekaa · Posted 01-21-2022 12:06 PM

Hi all,

Could you please help me understand the logic of these 'where' clauses in SAS?

proc print data=test;
where (test1="No") ne (. ne test2<=2);
run;

proc print data=test;
where (test1="No") ne (test2<=2);
run;

Thanks in advance!

PaigeMiller · Posted 01-21-2022 12:12 PM

You don't tell us what you want. We can only say that SAS is doing exactly what you told it to do.

I will guess that you want either AND or OR instead of NE (but really, we should have to guess).

where (test1="No") and (test2<=2);

or

where (test1="No") or (test2<=2);

--
Paige Miller

Abishekaa · Posted 01-21-2022 12:27 PM

Hi Paige. Thank you! I couldn't wrap my head around the logic of using . and ne.

Eg: what would be subset if you used the statement: where (. ne test<2)? Does it subset records only where 'test<2' or where 'test2' and missing data?

Hope this helps!

PaigeMiller · Posted 01-21-2022 12:31 PM

@Abishekaa wrote:

Hi Paige. Thank you! I couldn't wrap my head around the logic of using . and ne.

Eg: what would be subset if you used the statement: where (. ne test<2)? Does it subset records only where 'test<2' or where 'test2' and missing data?

where (. ne test<2) tests to see if a missing value is not equal to (test<2) ,and test<2 is either 0 if test>=2 and 1 if test<2. So the test is if the missing value is not equal to 0 or 1. I'm sure that's not what you are trying to do, and if it is what you are trying to do, more intuitive and less complicated code would probably work.

--
Paige Miller

Abishekaa · Posted 01-21-2022 12:35 PM

Thanks for explaining 🙂 Could you please explain what would be the less complicated equivalent for these where clauses?

proc print data=test;
where (test1="No") ne (. ne test2<=2);
run;

proc print data=test;
where (test1="No") ne (test2<=2);
run;

Reeza · Posted 01-21-2022 12:50 PM

where (test1="No") ne (. ne test2<=2);

Two conditions:

Test1=No	. ne test2 <=2	(test1="No") ne (. ne test2<=2)
1	0	0
1	1	1
0	0	1
0	1	0

I think we'd have to know more about the opposite values (ie the 0/0 conditions) to be able to simplify this logic.

Tom · Posted 01-21-2022 03:13 PM

The SAS log will show you how it interpreted the WHERE statement.

716  data check;
1717    set test;
1718    where (test1="No") ne (. ne test2<=2) ;
1719  run;

NOTE: There were 3 observations read from the data set WORK.TEST.
      WHERE (test1='No') not = ((test2 not = .) and (test2<=2));

SAS will evaluate boolean expressions to 0 or 1.

So testing if two boolean values are not equal is like doing an exclusive or. (A or B but not both).

Now let's look at the meaning of this expression that SAS had decided the last clause means:

 ((test2 not = .) and (test2<=2))

SAS treats missing values as smaller than any actual numeric value. There are 28 possible missing values. The regular missing value represented by a period in SAS code. Plus special missing values ._ and .A thru .Z.

So basically that is true when TEST2 is smaller than 2 or has any special missing value.

Most likely they really didn't mean to include the special missings, but you would need to ask the author their intent.

So if you want to exclude all of the missing values also:

(test2<=2 and not missing(test2))

It might be easier to code as one of these

(.Z < test2 <=2)
(not (test2>2))

Understand logic in where clause