What's the difference between the results of the two DATA steps below?
data s;
do n=1 to 2;
set sashelp.class;
end;
run;
data s;
set sashelp.class;
set sashelp.class;
output;
run;
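This is because the SAS compiler sets up one data stream for each instance of a SET statement. In the first program there is one SET statement, and therefore one stream. It is executed twice per iteration of the data step, and the implicit OUTPUT runs only at the end of each iteration, giving you observation numbers 2, 4, 6, 8, 10, 12, 14, 16, and 18 (9 obs). In the tenth iteration the second read finds no observation 20, so the step stops before observation 19 is written.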
In the second program there are two streams, each executed once per iteration of the data step. In your example, in which both streams come from the same data source (and therefore have common variables), the second stream values overwrite the values obtained from the first stream.
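A minimal sketch of the two-stream behavior, using only SASHELP.CLASS. The data set name s2 and the variable name name_a are made up for illustration. Renaming NAME in the first stream keeps it from being overwritten, so the log shows what each stream delivered:

```sas
data s2;
  /* stream 1: NAME is renamed so the second SET cannot overwrite it */
  set sashelp.class(keep=name rename=(name=name_a));
  /* stream 2: an independent read pointer into the same data set */
  set sashelp.class(keep=name age);
  /* write both values to the log once per iteration of the step */
  put _n_= name_a= name=;
run;
```

Because each stream advances by one observation per iteration, name_a and name should match on every log line, and the step should end after 19 iterations.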
BTW, while two SET statements invoke two data streams, two INPUT statements read from the same raw data stream.
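To illustrate the INPUT side of that remark, here is a small sketch with invented in-stream data (the data set name two_inputs and the variables x and y are hypothetical). Both INPUT statements pull from the same DATALINES stream, so the second one continues where the first one stopped:

```sas
data two_inputs;
  input x;   /* reads the current record, then releases it */
  input y;   /* reads the NEXT record of the same raw data stream */
  datalines;
1
2
3
4
;
run;
```

With these four records the step should produce two observations, (x=1, y=2) and (x=3, y=4), which is the opposite of what two SET statements do with a data set.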
The difference is as follows:
Why?
Because the code is incorrect.
Do this if you want to duplicate the table rows:
data s;
set sashelp.class sashelp.class;
output;
run;
117
118 data s;
119 set sashelp.class sashelp.class;
120 output;
121 run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.S has 38 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.01 seconds
Try using /debug and you will see that in the first data step the output is done only after exiting the DO loop, so you are getting records 2, 4, 6, ...
In your second data step you have two SET statements. The code here, set your_table your_table, works based on your example. The question is: why do you want two rows with the same information in the same table?
My intention is to understand how the data step reads the observations in the input data set. I just wonder why different observations are read sequentially in the DO loop within one data step iteration in the first code, but not in the second code. Thanks a lot!
/debug can be of assistance when you have a "why" question.
Step the records through the process with the debugger and see what is happening.
data s / debug;
do n=1 to 2;
set sashelp.class;
end;
run;
data s / debug;
set sashelp.class;
set sashelp.class;
output;
run;
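If the interactive debugger is not available in your environment, PUT statements give a similar view without it. A sketch for the first program (the data set name s1 and the log labels are arbitrary choices, not part of the original code):

```sas
data s1;
  do n = 1 to 2;
    set sashelp.class;
    /* one log line per observation read; _N_ is the data step iteration */
    put 'read:   ' _n_= n= name=;
  end;
  /* reached only when both reads succeeded; the implicit OUTPUT follows */
  put 'output: ' _n_= name=;
run;
```

The log should show two "read" lines for each "output" line, and a final "read" line for observation 19 with no matching "output", because the step stops as soon as SET runs out of observations.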
@ShufeGuoding: If you're not familiar with the data step debugger (and its somewhat cryptic commands), here is a brief instruction: https://communities.sas.com/t5/SAS-Programming/use-of-index/m-p/264460#M51865