BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
DUBS
Fluorite | Level 6

I got question 13 from the practice exam because I filtered the data with the statement "where group in ('A','B');"  instead of "if group in ('A','B');" .  Can you please help me to understand the difference between the IF and WHERE statements and when each one should be used? 

1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

There are many cases where either WHERE or IF could be used, and the results would be the same.  There are some cases where the results would be different.  If you can share the text of the question, it might help us explain why using a WHERE might have been wrong for this example.

The Boston Area SAS Users Group (BASUG) is hosting our in person SAS Blowout on Oct 18!
This full-day event in Cambridge, Mass features four presenters from SAS, presenting on a range of SAS 9 programming topics. Pre-registration by Oct 15 is required.
Full details and registration info at https://www.basug.org/events.

View solution in original post

5 REPLIES 5
antonbcristina
SAS Employee

I like to think of the subsetting WHERE statement like a pre-processing of the data, and the subsetting IF statement as a post-processing of the data. You can only filter using WHERE on variables that already exist in the Program Data Vector (PDV) at the beginning of the DATA step. To filter on any variables created in the DATA step, you must use IF. The other important thing to note is that the IF statement can only be used inside of the DATA step, whereas the WHERE statement can be used in both DATA and PROC steps. 

 

Tom
Super User Tom
Super User

Let's fix this misstatement:

You can only filter using WHERE on variables that already exist in the Program Data Vector (PDV)

You can only filter using WHERE on variables that already exist in the DATASET being read.

 

That is because the WHERE filters the data BEFORE it is read into the PDV.

 

antonbcristina
SAS Employee
That's right @Tom, thanks for clarifying!
Kurt_Bremser
Super User

WHERE is handled by the part of SAS reading the dataset(s), that's why you can (and have to) use SQL syntax, and can only use variables present in incoming datasets. It is also the reason why WHERE can be used in any procedure that reads data from datasets, without involvement of the DATA step compiler.

The subsetting IF is part of the DATA step, can use all variables in the PDV (incoming and newly created), but is not available anywhere else.

Quentin
Super User

There are many cases where either WHERE or IF could be used, and the results would be the same.  There are some cases where the results would be different.  If you can share the text of the question, it might help us explain why using a WHERE might have been wrong for this example.

The Boston Area SAS Users Group (BASUG) is hosting our in person SAS Blowout on Oct 18!
This full-day event in Cambridge, Mass features four presenters from SAS, presenting on a range of SAS 9 programming topics. Pre-registration by Oct 15 is required.
Full details and registration info at https://www.basug.org/events.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

LIBNAME 101

Follow along as SAS technical trainer Dominique Weatherspoon expertly answers all your questions about SAS Libraries.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 4577 views
  • 15 likes
  • 5 in conversation