HeatherNewton
Quartz | Level 8
data samp;
set samp;
if samp_ex="" then exclude='N';
else exclude='Y';
if samp_ex="";
run;

Can you let me know whether the statement "if samp_ex="";" means the data step only runs if samp_ex="" is true, i.e. samp_ex is null?

Is the data step the same if this statement, "if samp_ex="";", is placed directly under "set samp;"?

 

I am confused because:

1. I was thinking the statements after "if samp_ex="";" only run when samp_ex="" is true. The statement after it is "run". Does that mean the whole data step only runs when samp_ex="" is true, i.e. the statement "if samp_ex="" then exclude='N'; else exclude='Y';" wouldn't run either if samp_ex is not null?

 

Or does "if samp_ex="" then exclude='N'; else exclude='Y';" run anyway, as it comes before "if samp_ex="";"?

 

2. I was told that "if samp_ex="";" or a WHERE statement could be put anywhere in the data step, and the data step only runs if the IF condition is true or the WHERE condition is fulfilled.

 

Please kindly assist, thanks.

Tom
Super User

You are using two different types of IF statements there.

First is the IF/THEN/ELSE form.

Second is the subsetting IF form.

 

If the condition in a subsetting IF is TRUE, the data step iteration continues to the rest of the statements in the step. If it is FALSE, the data step stops the current iteration: the rest of the statements in the step are skipped, no observation is written by the implicit output at the end of the step, and control goes back to the start of the data step. In your code that means the SET statement executes again and the next observation is processed.

 

That is the same thing that happens when you execute the DELETE statement.

 

So your subsetting IF statement:

if samp_ex=" ";

is the same thing as this IF/THEN statement:

if not (samp_ex=" ") then delete;

 

So none of the observations where EXCLUDE was set to Y by the IF/THEN/ELSE series of statements will be written to the new version of the SAMP dataset.
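
For anyone who wants to see this in the log, here is a minimal sketch. The two-row test data, the id variable and the output dataset names (result, result2) are made up for illustration; the original code overwrites samp in place.

```sas
/* Hypothetical test data: one row with a blank samp_ex, one without */
data samp;
  length id 8 samp_ex $8;
  id = 1; samp_ex = "";  output;
  id = 2; samp_ex = "X"; output;
run;

data result;
  set samp;
  /* IF/THEN/ELSE: executes for every observation that SET reads */
  if samp_ex = "" then exclude = 'N';
  else exclude = 'Y';
  put "Before subsetting IF: " id= exclude=;
  /* Subsetting IF: when false, the iteration ends here and nothing is output */
  if samp_ex = "";
  /* Reached only when samp_ex is blank */
  put "After subsetting IF: " id= exclude=;
run;

/* Same selection written with DELETE instead of a subsetting IF */
data result2;
  set samp;
  if samp_ex = "" then exclude = 'N';
  else exclude = 'Y';
  if not (samp_ex = "") then delete;
run;
```

The log should show the "Before" line for both rows and the "After" line only for id=1, and both result and result2 should contain just that one row, with exclude='N'.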

 

Kurt_Bremser
Super User

Your code is equivalent to this:

data samp;
set samp;
where samp_ex="";
exclude='N';
run;

Since only observations that meet the condition will make it, there is no need to create EXCLUDE conditionally.

And since the subsetting IF depends only on variables from the incoming dataset, a WHERE is easier to understand and performs better, because observations that fail the condition are filtered out before they are read into the data step.
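
On the "anywhere in the data step" part of the question, a rough sketch of the difference: a WHERE statement applies to the input dataset no matter where in the step it is written, but it can only reference variables that already exist in that dataset. A subsetting IF is an executable statement, so its position matters, and it can test variables created earlier in the same step. The test data and the dataset names by_where and by_if below are made up for illustration.

```sas
/* Hypothetical test data */
data samp;
  length id 8 samp_ex $8;
  id = 1; samp_ex = "";  output;
  id = 2; samp_ex = "X"; output;
run;

/* WHERE filters rows as SET reads them, so its position in the step does not matter */
data by_where;
  set samp;
  exclude = 'N';
  where samp_ex = "";
run;

/* A condition on a variable created in the step needs a subsetting IF,
   placed after the variable has been assigned */
data by_if;
  set samp;
  if samp_ex = "" then exclude = 'N';
  else exclude = 'Y';
  if exclude = 'N';
run;
```

Both steps should keep only the row with id=1.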
