How many observations will there be in the output data set temp?
data temp;
  x=4;
  if x=5 then do;
    y=2;
    output;
  end;
run;
a) 1
b) 2
c) 0
d) 3
Ans: c
Is the reason for this answer that, because an explicit OUTPUT statement appears in the step, an observation is written only if the condition is true and never otherwise?
Correct. An explicit OUTPUT statement disables the implicit output at the end of the DATA step. And when the explicit OUTPUT is associated with a condition, whether an observation is written depends on that condition.
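One way to check this yourself is to rerun the quiz step and then write the observation count to the log. This is only a minimal sketch: the variable name n is my own choice, and it relies on the NOBS= option of the SET statement, which is filled in at compile time, so it works even when the data set is empty.

```sas
/* The step from the question: the only OUTPUT is inside a condition that is never true */
data temp;
  x=4;
  if x=5 then do;
    y=2;
    output;
  end;
run;

/* Report the observation count in the log without reading any rows.         */
/* n is a hypothetical variable name; NOBS= is assigned during compilation,  */
/* so the SET statement itself never has to execute.                         */
data _null_;
  if 0 then set temp nobs=n;
  put 'Observations in temp: ' n;
  stop;
run;
```

For the step above the log line should report 0, which matches answer c.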
Here, output means "if x=5 then output the whole data set; else, do not output anything".
If you take away output, like this:
data temp;
  x=4;
  if x=5 then do;
    y=2;
  end;
run;
Then there is a new data set: the implicit output at the end of the DATA step writes one observation (x=4, with y missing).
Quote: "Here, output means "if x=5 then output the whole data set; else, do not output anything"."
This statement is a little confusing. The OUTPUT statement writes only the ONE observation that is currently in the program data vector (PDV). It does not work on the 'whole data set', per se. I know you may have meant its overall effect; this is just to clarify.
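To illustrate the "one record in the PDV" point, here is a small sketch (the data set name pdv_snapshots and the variables i and y are made up for the example). Each time OUTPUT executes, it writes whatever is in the PDV at that moment, so a single pass through the DATA step can write several observations.

```sas
/* Hypothetical example: OUTPUT inside a loop writes one observation per execution */
data pdv_snapshots;
  x=4;
  do i=1 to 3;
    y=i*2;
    output;   /* writes the current PDV contents (x, i, y) as one observation */
  end;
  /* no implicit output here, because the step contains an explicit OUTPUT */
run;

proc print data=pdv_snapshots;
  title 'One observation per executed OUTPUT statement';
run;
```

The step runs only once (there is no input data), yet the data set ends up with 3 observations, one for each time OUTPUT executed.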
Got it! Thank you so much for the clarification~~
Hi,
The title of the original post was "Reading raw files", which is not what the sample program shows. As an additional example, this program has 3 different steps (1a, 1b and 2), and the values being read via DATALINES vary. In step 1a, the values being read in for X are never 5, so the OUTPUT statement never gets executed. The first dataset therefore has 0 obs (your original question) and there is nothing for the PROC PRINT to print.
Then, step 1b is the same program, but this time, the data being read in via DATALINES has values for X that range from 3 to 6. So for 1 observation and ONLY 1 observation the condition X=5 is true, so Y is assigned a value of 2 and ONLY the observation for X=5 is output-- so now, the output dataset has 1 obs.
Finally, in step 2, the values for X via DATALINES are the same as in the previous step, but, the OUTPUT statement is removed. As you can see in the results, only the row where X=5 has a value for Y. All the other rows have missing values for Y. But you can see that with the "explicit" OUTPUT statement removed, the implicit output at the bottom of the DATA step takes over. And so 4 datalines were read and 4 observations were written out.
cynthia
** here is the code;
** 1a) no values meet x=5 condition;
data temp_explicit_output_0rec;
  infile datalines;
  input x;
  if x=5 then do;
    y=2;
    output;
  end;
  return;
datalines;
1
2
3
4
run;
proc print data=temp_explicit_output_0rec;
title '1a) value of X goes from 1 to 4';
title2 'No obs have x=5 so no obs are output and PROC PRINT has nothing to print';
run;
** 1b) have a value where x=5 condition will be true;
data temp_explicit_output;
  infile datalines;
  input x;
  if x=5 then do;
    y=2;
    output;
  end;
  return;
datalines;
3
4
5
6
run;
proc print data=temp_explicit_output;
title '1b) value of X goes from 3 to 6';
title2 'explicit output if x=5 so only 1 obs is output';
run;
** 2) remove output statement and all 4 obs will be written;
** but only x=5 obs will have y=2;
data temp_implied_output;
  infile datalines;
  input x;
  if x=5 then do;
    y=2;
  end;
  return;
datalines;
3
4
5
6
run;
proc print data=temp_implied_output;
title '2) value of X goes from 3 to 6';
title2 'every obs is output, but obs for 5 has y=2';
run;
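As one more variation (not part of the original posts, and the data set name temp_explicit_both is made up), here is a sketch that keeps the explicit OUTPUT but adds an ELSE branch with its own OUTPUT. Because the implicit output is turned off as soon as any OUTPUT statement appears in the step, every row you want to keep needs to reach an OUTPUT statement.

```sas
** 3) explicit output in both branches, so all 4 obs are written;
data temp_explicit_both;
  infile datalines;
  input x;
  if x=5 then do;
    y=2;
    output;      /* row with x=5 is written with y=2 */
  end;
  else output;   /* every other row is written with y missing */
datalines;
3
4
5
6
;
run;

proc print data=temp_explicit_both;
  title '3) value of X goes from 3 to 6';
  title2 'explicit output on both branches, so every obs is output';
run;
```

The result should look the same as step 2: 4 observations, with y=2 only on the row where x=5.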