I want to know: what is the difference between these two programs?
1. data c;
set a;
where city = "london";
run;
2. data c;
set a;
if city = "london";
run;
In your example I would use option 1, because a WHERE statement can use an index on the city variable in dataset a (if such an index exists and SAS judges it beneficial). There are several articles on the differences between WHERE and IF, including more complex scenarios. A nice short one to get you started is SAS Usage Note 24286: When do I use a WHERE statement instead of an IF statement to subset a data se...
Personally, I use WHERE where I can and IF if I have to 🙂
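To illustrate the index point, here is a minimal sketch. It assumes a small made-up dataset a with a city variable; the sample values are hypothetical. With MSGLEVEL=I, the log reports whether an index was chosen for the WHERE clause. On a tiny table like this one SAS may well decide a sequential read is cheaper, so treat it as a way to check, not as a guaranteed result.

```sas
/* Hypothetical sample data standing in for dataset a */
data a;
   input city :$10.;
   datalines;
paris
london
rome
london
;
run;

/* Create a simple index on city */
proc datasets library=work nolist;
   modify a;
   index create city;
quit;

/* Ask SAS to write index-usage information to the log */
options msglevel=i;

data c;
   set a;
   where city = "london";   /* eligible for index use */
run;
```

A subsetting IF on the same condition cannot use the index, because it is evaluated only after each observation has been read.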
If both programs run, then why do we need both statements?
Let's assume that city is a variable that you have calculated yourself further up in your data step. Then example 1 will not work, since city does not exist in the dataset named in your SET statement. Example 2, however, will work, since it uses a subsetting IF statement.
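A small sketch of that situation, assuming a made-up input dataset a that holds only a hypothetical code variable, with city derived inside the step:

```sas
/* Hypothetical input: a has code but no city variable */
data a;
   input code :$3.;
   datalines;
LON
PAR
LON
;
run;

data c;
   set a;
   length city $10;
   /* city is created in this step, so it is not on dataset a */
   if code = "LON" then city = "london";
   else city = "other";

   /* Subsetting IF: evaluated after city has been assigned */
   if city = "london";

   /* A WHERE statement here (where city = "london";) would fail,
      because WHERE is applied to the input dataset, which has no city */
run;
```

With this sample data, c keeps the two observations whose code is LON.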
Example 1 relies on dataset a containing the variable city.
Example 2 does not depend on that, because it uses a so-called "subsetting IF statement". This means that the step reaches the implied OUTPUT and RETURN statements at the end of your data step only if the condition is true 🙂
So if we use a WHERE statement, SAS reads only the observations that satisfy the condition (where city = "london";), and if we use a subsetting IF statement, it reads all the observations and then keeps the ones that satisfy the condition?
If I understand your post correctly, you want to use BOTH the where statement and the subsetting if statement. This is not a good idea.
My rule of thumb:
If the variable city exists in your input dataset: Use where statement
If you created the variable city yourself: Use the Subsetting if statement.
When in doubt: Use the subsetting if statement 🙂
A developer shouldn't be in doubt.
WHERE is generally preferred because it usually performs better.
IF should only be used for subsetting when WHERE cannot be used.
WHERE: filters observations before they enter the PDV (program data vector).
IF: filters observations after they have entered the PDV.
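One way to see the before/after PDV difference is to record the automatic variable _N_, which counts data step iterations. This is a sketch on made-up data; the dataset names c_where and c_if and the variable iteration are hypothetical.

```sas
/* Hypothetical sample data: four observations, two for london */
data a;
   input city :$10.;
   datalines;
paris
london
rome
london
;
run;

/* WHERE: only matching observations are loaded into the PDV,
   so the step iterates once per matching observation */
data c_where;
   set a;
   where city = "london";
   iteration = _n_;
run;

/* Subsetting IF: every observation is loaded into the PDV,
   and non-matching ones are discarded afterwards */
data c_if;
   set a;
   if city = "london";
   iteration = _n_;
run;

proc print data=c_where; run;
proc print data=c_if; run;
```

Both outputs contain the same two london observations, but iteration is 1 and 2 in c_where and 2 and 4 in c_if. The log notes also differ: the WHERE step reports 2 observations read from WORK.A, while the IF step reports 4.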