BookmarkSubscribeRSS Feed
RVA
Fluorite | Level 6 RVA
Fluorite | Level 6

I wanted to subset data from a data set in an Oracle data base.  "Where" and "If" clauses then gave me different results.  Please see below:

data a; set odsstu.registration_course;

WHERE grade_type not in ('A','X');

run;

data b; set odsstu.registration_course;

IF grade_type not in ('A','X');

run;

In data set a, I got only records whose grade_type was not blank and whose grade_type was not A/X.  In data set b, I got records whose grade_type not A/X, including blank grade_type.  That means data set b had more records than data set a.

I know that WHERE applies before the data get to the PDV, and IF applies after the data get to the PDV.  However, I still cannot figure out why WHERE and IF gave me different results as described above.

The next step, I figured out that I needed to bring the Oracle data set into SAS first if I wanted to get the same results using WHERE.  See below:

data test; set odsstu.registration_course; run; /* bring "odsstu.registration_course" from Oracle to SAS */

data c; set test;

WHERE grade_type not in ('A','X');

run;

Now my data set c would have the same number of records as data set b above.  Does anyone have any ideas why this happened?  Many thanks for any ideas you can share.

4 REPLIES 4
stat_sas
Ammonite | Level 13

SAS and Oracle treat missing data differently. Null is not evaluated in comparison and Oracle ignores observations where field has Null values. While SAS interprets both Nulls and blanks coming from Oracle as missing values.


jakarman
Barite | Level 11

@ stat@sas The difference is a bit more complicated DBMS systems are using a three value logic not binary true/false. The concept or that reason is not existing). http://en.wikipedia.org/wiki/Null_(SQL).   Within statistics it can be very logical to have different reasons a values is not applicable (multiple missing codings) eg not applicable/ refused to answer / failed to get to this question. 
Stasticians and SQL DBMS technical issues is often not the best combination.

---->-- ja karman --<-----
Ksharp
Super User

use the following options to see what SQL statement SAS have passed to ORACLE .

options sastrace=',,d,' sastraceloc=saslog nostsuffix;

options sastrace=' ,,d,' sastraceloc=saslog nostsuffix;

Tom
Super User Tom
Super User

To get the same results from Oracle (or really almost any non-SAS database system) you need to be explicit about how to handle missing values.

To SAS  grade_type not in ('A','X') will include cases where GRADE_TYPE is null (missing).

Change your WHERE clause to

grade_type not in ('A','X') or grade_type is null

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1072 views
  • 14 likes
  • 5 in conversation