BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
a_k93
Fluorite | Level 6

Hi,

 

Can anybody explain the reason for, when the code colored orange is executed it returns 5 records. Whereas code colored blue is executed it returns all records i.e. 7 records (Name and TotalMarks are missing)

 

Data ds_grade;
    infile cards;
    input grades $6. @;
    *BELOW IF RETURNS 5 RECORDS;
    if grades = 'Grade1' or grades = 'Grade2';
        input name : $6. Total_Marks;
    
    *BELOW IF THEN RETURNS ALL 7 RECORDS BUT NAME AND MARKS FOR GRADE3 AND GRADE4 MISSING;
    /*
    if grades = 'Grade1' or grades = 'Grade2' then
        input name : $6. Total_Marks;
    */
    cards;
Grade1 abcdef 500
Grade3 pqrstu 375
Grade1 xyz 510
Grade2 abc123 430
Grade2 eeeee 450
Grade4 hhhh 270
Grade1 jjjjjj 505
;
run;

 

Thanks in advance.

Anand

1 ACCEPTED SOLUTION

Accepted Solutions
Cynthia_sas
SAS Super FREQ
Hi:
Imagine that you have slightly different data and each row is for a different country. When the value for country is US, then you need to read the data line one way and when the value for country is UK, then you need to read the data line another way.

Imagine data like this:
US,John,Doe,$35000,yearly
UK,Smythe,George,annual,40000
US,Jane,Smith,$36000,yearly
US,Elle,Vator,$39000,yearly
UK,Windsor,Marge,annual,50000

and you need to read each record, but clearly, you need a different INPUT statement to be used for each country. That is when you would use IF...THEN, as in:
data readit;
infile 'c:\temp\diffdata.txt' dlm=',' dsd;
input country $ @;
if country = 'US' then
input first $ last $ sal : comma. type $;
else if country = 'UK' then
input last $ first $ type $ sal;
run;

proc print data=readit;
var country last first sal type;
run;

cynthia

View solution in original post

10 REPLIES 10
RW9
Diamond | Level 26 RW9
Diamond | Level 26

The first if statement you have operates like a where clause:

 if grades = 'Grade1' or grades = 'Grade2';
        input name : $6. Total_Marks;

It is a bit like saying: only keep records where grades is Grade1 or Grade2.  Hence you get fewer records.  

The second if statement:

 if grades = 'Grade1' or grades = 'Grade2' then 
        input name : $6. Total_Marks;

Is considered to be a conditional read.  I.e. every row is read in, but the input only reads the data elements into the variables if the grades is Grade1 or Grade2.  

So the first only records rows where the if is true, the second reads all rows, but populates variables only when the condition is true.

data_null__
Jade | Level 19

@RW9 wrote:

The first if statement you have operates like a where clause:

 if grades = 'Grade1' or grades = 'Grade2';
        input name : $6. Total_Marks;

It is a bit like saying: only keep records where grades is Grade1 or Grade2.  Hence you get fewer records.  

 


That might be considered true in this example but subsetting IF and WHERE should not be considered synonymous.

data_null__
Jade | Level 19

The subsetting IF the one with NO THEN is like a gate.  When true the gate is closed and execution is "returned to the top of the implied data step loop".

Astounding
PROC Star

It's possible that I'm saying nothing new here ... just saying it differently.

 

As you have observed:

 

  • IF without THEN is totally different from IF/THEN.
  • IF without THEN deletes observations. 

Using IF without THEN, you would expect to find only observations having GRADES of "Grade1" or "Grade2".  All others would be deleted.

 

Using IF/THEN, you would expect to find all the observations that appear in the incoming data.  However, the extra variables (NAME and TOTAL_MARKS) would be populated only for the "Grade1" and "Grade2" observations.

 

It's perhaps unfortunate that both statements contain the word IF.  They have entirely different meanings.

 

Good luck.

a_k93
Fluorite | Level 6

Thanks Guys,

 

But what is the practical use of 2nd if statment (IF/THEN) as it is returning 7 observation in current case? In what situation you will go for 2nd if condition.

Cynthia_sas
SAS Super FREQ
Hi:
Imagine that you have slightly different data and each row is for a different country. When the value for country is US, then you need to read the data line one way and when the value for country is UK, then you need to read the data line another way.

Imagine data like this:
US,John,Doe,$35000,yearly
UK,Smythe,George,annual,40000
US,Jane,Smith,$36000,yearly
US,Elle,Vator,$39000,yearly
UK,Windsor,Marge,annual,50000

and you need to read each record, but clearly, you need a different INPUT statement to be used for each country. That is when you would use IF...THEN, as in:
data readit;
infile 'c:\temp\diffdata.txt' dlm=',' dsd;
input country $ @;
if country = 'US' then
input first $ last $ sal : comma. type $;
else if country = 'UK' then
input last $ first $ type $ sal;
run;

proc print data=readit;
var country last first sal type;
run;

cynthia
Astounding
PROC Star

Typical use of the IF/THEN statement would be where:

 

  • You want all observations in the data set
  • But the structure of the data is different for some.

For example, how would you read in data that looked like this:


    cards;
Grade1 abcdef 500
Grade3 No Data Available
Grade1 xyz 510
Grade2 abc123 430
Grade2 eeeee 450
Grade4 No Data Available
Grade1 jjjjjj 505
;

 

You woud want to read in NAME and TOTAL_MARKS, but that should only apply when GRADE is "Grade1" or "Grade2".

a_k93
Fluorite | Level 6

Thanks everybody for reply.

 

In short if we want to exclude few observation then we have to use if with semicolan,  but if we do not want to exclude any observation and take certain action basis certain condition than we use IF/THEN.

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

In this instance think of:

if <condition>;

 

as shorthand for:

if <condition> then output;

 

To be honest I would recommend the full text as it makes it clearer what it is doing, but thats just my preference.

data_null__
Jade | Level 19

I think your example would only be true if 

   IF <condition> THEN OUTPUT; 

is the very last statement in the dataset as subsetting IF has implied RETURN.

 

I think it would be more accurate to say 

  IF <condition>;

is similar in function to

  IF not(<condition>) then delete;

 

Both DELETE and the subsetting IF have the implied RETURN.

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 6157 views
  • 3 likes
  • 5 in conversation