ShufeGuoding
Obsidian | Level 7

What's the difference between the results of the following two DATA steps?

data s;
do n=1 to 2;
set sashelp.class;
end;
run;

 

 

data s;
set sashelp.class;
set sashelp.class;

output;
run;

 

 

Accepted Solution

mkeintz
PROC Star

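This is because the SAS compiler sets up one data stream for each instance of a SET statement. In the first program there is one SET statement, therefore one stream. It is executed twice per iteration of the DATA step, and the implicit OUTPUT at the end of each iteration writes only the second of the two observations read, giving you observation numbers 2, 4, 6, 8, 10, 12, 14, 16, and 18 (9 obs). In the tenth iteration the first pass reads observation 19, the second pass finds no more data, and the step stops before anything is output.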
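
A minimal sketch for tracing the single stream, assuming the standard SASHELP.CLASS with 19 observations. The data set name s_trace is made up for illustration. PUTLOG writes one log line per read, so you can see which DATA step iteration (_N_) and which loop pass (n) picked up each observation.

```sas
data s_trace;
  do n = 1 to 2;
    set sashelp.class;        /* one SET statement = one stream, read twice per iteration */
    putlog _n_= n= name=;     /* log the iteration, the loop pass and the name just read */
  end;
  /* implicit OUTPUT happens here, so only the second read of each iteration is kept */
run;
```

The log should show two lines for each of the first nine iterations and a single line (n=1) for the tenth, after which the second read hits the end of the data and the step stops.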

 

In the second program there are two streams, each executed once per iteration of the data step.  In your example, in which both streams come from the same data source (and therefore have common variables), the second stream values overwrite the values obtained from the first stream.
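
One way to see that the two streams are independent is to let each one read a different variable and start the second stream one observation later. This is only a sketch: the names two_streams, name_a and name_b are made up for illustration, and FIRSTOBS=2 is used purely to offset the second stream.

```sas
data two_streams;
  /* stream 1: observation i, name kept as name_a */
  set sashelp.class(keep=name rename=(name=name_a));
  /* stream 2: has its own pointer, starts at observation 2, name kept as name_b */
  set sashelp.class(keep=name rename=(name=name_b) firstobs=2);
run;
```

Each output row should pair a name with the name from the following observation. Because the variables no longer share a name, nothing is overwritten, and the step should stop as soon as the second stream runs out of data.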

 

 

BTW, while two SET statements invoke two data streams, two INPUT statements read from the same raw data stream.
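
A small sketch of the INPUT case, with made-up in-stream data and a hypothetical data set name pairs. Both INPUT statements pull from the same DATALINES stream, so each one moves on to the next line.

```sas
data pairs;
  input a;     /* reads the next line of the raw stream */
  input b;     /* reads the line after that, from the same stream */
  datalines;
1
2
3
4
;
run;
```

With four input lines this should give two observations, (a=1, b=2) and (a=3, b=4), rather than four observations with a and b equal.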


7 Replies
VDD
Ammonite | Level 13

The difference is as follows:

 

105  data s;
106  do n=1 to 2;
107  set sashelp.class;
108  end;
109  run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.S has 9 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.00 seconds

110
111
112  data s;
113  set sashelp.class;
114  set sashelp.class;
115  output;
116  run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.S has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
 
The second DATA step may give you what you want, but you did not describe what you want.
Why are you reading the same data set twice in the same DATA step?
 
VDD
Ammonite | Level 13

The reason is that the code is incorrect.

Do this if you want to duplicate the table rows:

data s;
set sashelp.class sashelp.class;
output;
run;

117
118 data s;
119 set sashelp.class sashelp.class;
120 output;
121 run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.S has 38 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.01 seconds

Try using / debug and you will see that in the first DATA step the output is only done after exiting the DO loop, so you are getting records 2, 4, 6, ...
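
If the goal were to keep every observation read inside the loop, one option (a sketch, not necessarily what the original poster wants) is an explicit OUTPUT inside the DO loop. The data set name s_all is made up for illustration.

```sas
data s_all;
  do n = 1 to 2;
    set sashelp.class;
    output;    /* write each observation as soon as it is read, not only after the loop */
  end;
run;
```

An explicit OUTPUT statement turns off the implicit output at the end of the step, so this should write all 19 observations, with n alternating between 1 and 2.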

In your second DATA step you have two SET statements. The code here, with set your_table your_table, works based on your example. The question is: why do you want two rows with the same information in the same table?

ShufeGuoding
Obsidian | Level 7

My intention is to understand how the DATA step reads the observations of an input data set. I just wonder why different observations are read sequentially in the DO loop within one DATA step iteration in the first program, but not in the second program. Thanks a lot!

VDD
Ammonite | Level 13

The / debug option can be of assistance when you want to know why something happens.

 

Step through the records with the debugger and see what is happening.

 

data s / debug;
do n=1 to 2;
set sashelp.class;
end;
run;
 
 
data s / debug;
set sashelp.class;
set sashelp.class;
output;
run;
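
If the interactive debugger is not available in your SAS session, a rough non-interactive alternative is to trace the reads with PUTLOG. This is only a sketch; the data set name s_trace2 and the label text are made up for illustration.

```sas
data s_trace2;
  set sashelp.class;                        /* first stream */
  putlog 'after 1st SET: ' _n_= name=;
  set sashelp.class;                        /* second stream, with its own pointer */
  putlog 'after 2nd SET: ' _n_= name=;
run;
```

For each value of _N_ both log lines should show the same name, because each stream advances by exactly one observation per iteration and the second read overwrites the first.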
FreelanceReinh
Jade | Level 19

@ShufeGuoding: If you're not familiar with the data step debugger (and its somewhat cryptic commands), here is a brief introduction: https://communities.sas.com/t5/SAS-Programming/use-of-index/m-p/264460#M51865
