DATA Step, Macro, Functions and more

"Where" and "If" statements gave different results?

Reply
Contributor RVA
Contributor
Posts: 21

"Where" and "If" statements gave different results?

I wanted to subset data from a data set in an Oracle data base.  "Where" and "If" clauses then gave me different results.  Please see below:

data a; set odsstu.registration_course;

WHERE grade_type not in ('A','X');

run;

data b; set odsstu.registration_course;

IF grade_type not in ('A','X');

run;

In data set a, I got only records whose grade_type was not blank and whose grade_type was not A/X.  In data set b, I got records whose grade_type not A/X, including blank grade_type.  That means data set b had more records than data set a.

I know that WHERE applies before the data get to the PDV, and IF applies after the data get to the PDV.  However, I still cannot figure out why WHERE and IF gave me different results as described above.

The next step, I figured out that I needed to bring the Oracle data set into SAS first if I wanted to get the same results using WHERE.  See below:

data test; set odsstu.registration_course; run; /* bring "odsstu.registration_course" from Oracle to SAS */

data c; set test;

WHERE grade_type not in ('A','X');

run;

Now my data set c would have the same number of records as data set b above.  Does anyone have any ideas why this happened?  Many thanks for any ideas you can share.

Trusted Advisor
Posts: 1,231

Re: "Where" and "If" statements gave different results?

SAS and Oracle treat missing data differently. Null is not evaluated in comparison and Oracle ignores observations where field has Null values. While SAS interprets both Nulls and blanks coming from Oracle as missing values.


Trusted Advisor
Posts: 3,215

Re: "Where" and "If" statements gave different results?

@ stat@sas The difference is a bit more complicated DBMS systems are using a three value logic not binary true/false. The concept or that reason is not existing). http://en.wikipedia.org/wiki/Null_(SQL).   Within statistics it can be very logical to have different reasons a values is not applicable (multiple missing codings) eg not applicable/ refused to answer / failed to get to this question. 
Stasticians and SQL DBMS technical issues is often not the best combination.

---->-- ja karman --<-----
Super User
Posts: 10,047

Re: "Where" and "If" statements gave different results?

use the following options to see what SQL statement SAS have passed to ORACLE .

options sastrace=',,d,' sastraceloc=saslog nostsuffix;

options sastrace=' ,,d,' sastraceloc=saslog nostsuffix;

Super User
Super User
Posts: 7,083

Re: "Where" and "If" statements gave different results?

To get the same results from Oracle (or really almost any non-SAS database system) you need to be explicit about how to handle missing values.

To SAS  grade_type not in ('A','X') will include cases where GRADE_TYPE is null (missing).

Change your WHERE clause to

grade_type not in ('A','X') or grade_type is null

Ask a Question
Discussion stats
  • 4 replies
  • 280 views
  • 12 likes
  • 5 in conversation