BookmarkSubscribeRSS Feed
clq
Calcite | Level 5 clq
Calcite | Level 5

I know that where can be used in both DATA and PROC step, whereas if can only be used in DATA step. However, what is the difference between where and if? 

6 REPLIES 6
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

The where statement can be used in procedures to subset data.  The if statement cannot be used in procedures.

The where statement is more efficient than the if statement. It tells SAS not to read all observations from the data set.

ChrisNZ
Tourmaline | Level 20

@VDD The where statement is more efficient than the if statement. It tells SAS not to read all observations from the data set.

 

Not always. When functions are used, IF is faster than WHERE.

 

This has to do with functions being optimised to access PDV data, but not so fast when reading from the read buffer.

If someone has more details I'd be curious to know more actually.

 

So   if A=1;   is slower than where A=1;

 

But   if round(A)=1;   is faster than  where round(A)=1;

 

data TEST;
  do I=1 to 1e8;
    output;
  end;
run;
   
sasfile TEST load;
                                    
data _null_; set TEST; if          I =0; run;  *  4 seconds;
data _null_; set TEST; where       I =0; run;  *  2 seconds;
data _null_; set TEST; if    round(I)=0; run;  *  5 seconds;
data _null_; set TEST; where round(I)=0; run;  * 11 seconds;

sasfile TEST close;

 

Reeza
Super User

http://www2.sas.com/proceedings/sugi31/238-31.pdf

 

And many, many more:

https://www.lexjansen.com/search/searchresults.php?q=where%20vs%20if

 


@clq wrote:

I know that where can be used in both DATA and PROC step, whereas if can only be used in DATA step. However, what is the difference between where and if? 


 

Kurt_Bremser
Super User

Also, if you have multiple datasets coming in (merge), a where statement may not be possible:

data sexes;
set sashelp.class;
keep name sex;
run;

data others;
set sashelp.class (drop=sex);
run;

data want;
merge
  others
  sexes
;
by name;
where age = 16;
run;

but the if will work:

data want;
merge
  others
  sexes
;
by name;
if age = 16;
run;

If you need to build a combined condition using variables from multiple datasets that are not present in all, then the subsetting if statement is the only option.

ballardw
Super User

WHERE statements or data set option can only reference variables that appear in the incoming data set(s), If can use newly created variables.

 

Example: The first data step generates an error, the second doesn't.

data junk;
   set sashelp.class;
   ns= catt(name,sex);
   where ns in ('AlfredM' 'CarolF');
run;

data junk2;
   set sashelp.class;
   ns= catt(name,sex);
   if ns in ('AlfredM' 'CarolF');
run;
hak18
Calcite | Level 5

When SAS reads the row of a data into PDV, two variables will be created i.e. the FIRST.variable and LAST.variable.
These variables are not in the input table . Therefore, you can't use the WHERE statement to subset rows of variables which are not yet
in the input data. Instead, you can use the subsetting IF statement.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 5883 views
  • 5 likes
  • 7 in conversation