Reading raw files

Contributor
Posts: 31

How many observations will be there in output dataset temp?
data temp;
x=4;
if x=5 then do;
y=2;
output;
end;
run;
a) 1
b) 2
c) 0
d) 3
Ans: c

Is the reason for this answer that, because an explicit OUTPUT statement is used, an observation is written only when the condition is true and never otherwise?

Respected Advisor
Posts: 3,124

Re: Reading raw files

Correct. An explicit 'output' disables the implicit 'output' at the end of the data step. And when the explicit 'output' is associated with a condition, whether anything is output depends on that condition.

Occasional Contributor
Posts: 17

Re: Reading raw files

Here, output means "if x=5 then output the whole data set; else, do not output anything".

If you take away output, like this:

data temp;
      x=4;
      if x=5 then do;
            y=2;
      end;
run;

Then the new data set contains one observation (x=4, with y missing)~

Contributor
Posts: 31

Re: Reading raw files

Great! Thanks for adding that. I was thinking that the output would only be for that observation, but now I know that the condition is for the entire dataset.

Also, when you say (in the case where we take away output) that there is a new dataset, that new dataset is temp, right?
Respected Advisor
Posts: 3,124

Re: Reading raw files

Quote: "Here, output means 'if x=5 then output the whole data set; else, do not output anything'."

This statement is a little confusing. The 'output' statement only works on the ONE record that is currently residing in the PDV (program data vector). It does not work on the 'whole data set', per se. I know you may mean that it has a systematic effect; this is just to clarify.
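
A small sketch may make the one-record-at-a-time behaviour concrete. The data set name pdv_demo and the values are made up for illustration. Each 'output' writes whatever is in the PDV at that moment, so two 'output' statements in one iteration write two observations. Because an explicit 'output' is present, there is no implicit output at the end of the step.

```sas
* hypothetical example: pdv_demo is not part of the original question;
data pdv_demo;
   x = 4;
   y = 1;
   output;   /* writes the current PDV contents: x=4, y=1 */
   y = 2;
   output;   /* writes the PDV again after the change: x=4, y=2 */
run;         /* no implicit output here, since OUTPUT was used explicitly */
```

pdv_demo ends up with 2 observations, (x=4, y=1) and (x=4, y=2).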

Contributor
Posts: 31

Re: Reading raw files

Thanks again for the clarification.
Occasional Contributor
Posts: 17

Re: Reading raw files

Got it! Thank you so much for the clarification~~

SAS Super FREQ
Posts: 8,743

Re: Reading raw files

Hi,

  The title of the original post was "Reading raw files", which is not what the sample program shows. As an additional example, the program below has 3 different steps (1a, 1b and 2), and the values being read via DATALINES vary. In step 1a, the values read in for X are never 5, so the OUTPUT statement never executes. The first dataset therefore has 0 obs (your original question) and there is nothing for PROC PRINT to print.

  Then, step 1b is the same program, but this time, the data being read in via DATALINES has values for X that range from 3 to 6. So for 1 observation and ONLY 1 observation the condition X=5 is true, so Y is assigned a value of 2 and ONLY the observation for X=5 is output-- so now, the output dataset has 1 obs.

  Finally, in step 2, the values for X via DATALINES are the same as in the previous step, but, the OUTPUT statement is removed. As you can see in the results, only the row where X=5 has a value for Y. All the other rows have missing values for Y. But you can see that with the "explicit" OUTPUT statement removed, the implicit output at the bottom of the DATA step takes over. And so 4 datalines were read and 4 observations were written out.

cynthia

** here is the code;

** 1a) no values meet x=5 condition;
data temp_explicit_output_0rec;
   infile datalines;
   input x;
   if x=5 then do;
      y=2;
      output;
   end;
return;
datalines;
1
2
3
4
run;
 
proc print data=temp_explicit_output_0rec;
  title '1a) value of X goes from 1 to 4';
  title2 'No obs have x=5 so no obs are output and PROC PRINT prints nothing';
run;
 
** 1b) have a value where x=5 condition will be true;
data temp_explicit_output;
   infile datalines;
   input x;
   if x=5 then do;
      y=2;
      output;
   end;
return;
datalines;
3
4
5
6
run;
 
proc print data=temp_explicit_output;
  title '1b) value of X goes from 3 to 6';
  title2 'explicit output if x=5 so only 1 obs is output';
run;
 
** 2) remove output statement and all 4 obs will be written;
** but only x=5 obs will have y=2;
data temp_implied_output;
   infile datalines;
   input x;
   if x=5 then do;
      y=2;
   end;
return;
datalines;
3
4
5
6
run;
 
proc print data=temp_implied_output;
  title '2) value of X goes from 3 to 6';
  title2 'every obs is output, but obs for 5 has y=2';
run;
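
To confirm the observation counts described above without reading the log, one option is to query the DICTIONARY.TABLES metadata view. This is only a sketch, and it assumes the three DATA steps above have just been run in the same session, so that the data sets exist in the WORK library.

```sas
* assumes the three example data sets above were just created in WORK;
proc sql;
   title 'Observation counts for the three example data sets';
   select memname, nobs
      from dictionary.tables
      where libname = 'WORK'
        and memname in ('TEMP_EXPLICIT_OUTPUT_0REC',
                        'TEMP_EXPLICIT_OUTPUT',
                        'TEMP_IMPLIED_OUTPUT');   /* names are stored in upper case */
quit;
title;   /* clear the title so it does not carry over to later steps */
```

Given the results described above, NOBS should be 0 for step 1a, 1 for step 1b and 4 for step 2.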
