I got question 13 from the practice exam because I filtered the data with the statement "where group in ('A','B');" instead of "if group in ('A','B');" . Can you please help me to understand the difference between the IF and WHERE statements and when each one should be used?
There are many cases where either WHERE or IF could be used, and the results would be the same. There are some cases where the results would be different. If you can share the text of the question, it might help us explain why using a WHERE might have been wrong for this example.
I like to think of the subsetting WHERE statement like a pre-processing of the data, and the subsetting IF statement as a post-processing of the data. You can only filter using WHERE on variables that already exist in the Program Data Vector (PDV) at the beginning of the DATA step. To filter on any variables created in the DATA step, you must use IF. The other important thing to note is that the IF statement can only be used inside of the DATA step, whereas the WHERE statement can be used in both DATA and PROC steps.
Let's fix this misstatement:
You can only filter using WHERE on variables that already exist in the Program Data Vector (PDV)
You can only filter using WHERE on variables that already exist in the DATASET being read.
That is because the WHERE filters the data BEFORE it is read into the PDV.
WHERE is handled by the part of SAS reading the dataset(s), that's why you can (and have to) use SQL syntax, and can only use variables present in incoming datasets. It is also the reason why WHERE can be used in any procedure that reads data from datasets, without involvement of the DATA step compiler.
The subsetting IF is part of the DATA step, can use all variables in the PDV (incoming and newly created), but is not available anywhere else.
There are many cases where either WHERE or IF could be used, and the results would be the same. There are some cases where the results would be different. If you can share the text of the question, it might help us explain why using a WHERE might have been wrong for this example.
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Follow along as SAS technical trainer Dominique Weatherspoon expertly answers all your questions about SAS Libraries.
Find more tutorials on the SAS Users YouTube channel.