BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
DUBS
Fluorite | Level 6

I got question 13 from the practice exam because I filtered the data with the statement "where group in ('A','B');"  instead of "if group in ('A','B');" .  Can you please help me to understand the difference between the IF and WHERE statements and when each one should be used? 

1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

There are many cases where either WHERE or IF could be used, and the results would be the same.  There are some cases where the results would be different.  If you can share the text of the question, it might help us explain why using a WHERE might have been wrong for this example.

The Boston Area SAS Users Group is hosting free webinars!
Next up: Troy Martin Hughes presents Calling Open-Source Python Functions within SAS PROC FCMP: A Google Maps API Geocoding Adventure on Wednesday April 23.
Register now at https://www.basug.org/events.

View solution in original post

5 REPLIES 5
antonbcristina
SAS Super FREQ

I like to think of the subsetting WHERE statement like a pre-processing of the data, and the subsetting IF statement as a post-processing of the data. You can only filter using WHERE on variables that already exist in the Program Data Vector (PDV) at the beginning of the DATA step. To filter on any variables created in the DATA step, you must use IF. The other important thing to note is that the IF statement can only be used inside of the DATA step, whereas the WHERE statement can be used in both DATA and PROC steps. 

 

Tom
Super User Tom
Super User

Let's fix this misstatement:

You can only filter using WHERE on variables that already exist in the Program Data Vector (PDV)

You can only filter using WHERE on variables that already exist in the DATASET being read.

 

That is because the WHERE filters the data BEFORE it is read into the PDV.

 

antonbcristina
SAS Super FREQ
That's right @Tom, thanks for clarifying!
Kurt_Bremser
Super User

WHERE is handled by the part of SAS reading the dataset(s), that's why you can (and have to) use SQL syntax, and can only use variables present in incoming datasets. It is also the reason why WHERE can be used in any procedure that reads data from datasets, without involvement of the DATA step compiler.

The subsetting IF is part of the DATA step, can use all variables in the PDV (incoming and newly created), but is not available anywhere else.

Quentin
Super User

There are many cases where either WHERE or IF could be used, and the results would be the same.  There are some cases where the results would be different.  If you can share the text of the question, it might help us explain why using a WHERE might have been wrong for this example.

The Boston Area SAS Users Group is hosting free webinars!
Next up: Troy Martin Hughes presents Calling Open-Source Python Functions within SAS PROC FCMP: A Google Maps API Geocoding Adventure on Wednesday April 23.
Register now at https://www.basug.org/events.

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

LIBNAME 101

Follow along as SAS technical trainer Dominique Weatherspoon expertly answers all your questions about SAS Libraries.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 5786 views
  • 15 likes
  • 5 in conversation