kisumsam
Quartz | Level 8

Hi there, I have a question about multiple output statements within a data step:

 

 

data TWO;
input x y;
datalines;
5 2
3 1
5 6
;
RUN;

data ONE TWO OTHER;
set TWO;
if X eq 5 then output ONE;
if Y lt 5 then output TWO;
output;
run;

 

The code above seems fairly straightforward. However, when I'm looking at the ONE data set, the order of the observations seems to be a little strange:

 

image 1.png

 

 

From the first output statement, the first two observations should be:

 

5 2

5 6

 

From the last output statement, the last three observations should be:

 

5 2
3 1
5 6

 

I am expecting the combined data set to be:

 

5 2
5 6

5 2
3 1
5 6

 

However, the order of the output data set is:

 

5 2
5 2
3 1
5 6
5 6

 

The order of the data set is changed but it wasn't exactly sorted by X and Y.

 

Does anyone know how the data set is sorted when we have the two output statements in the data step?

2 REPLIES
PeterClemmensen
Tourmaline | Level 20

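First, the data step reads the observation 5 2. It checks whether X eq 5, which is true, so it writes the observation to the data set ONE. Next it checks whether Y lt 5, which is also true (2 is less than 5), so the observation is written to TWO. Finally, you have a plain output statement with no data set name, which means that the 5 2 observation is written again, this time to all created data sets, including ONE. That is why the first two observations in ONE are 5 2. Nothing is sorted: the data step runs every output statement for one observation before it reads the next, so the rows appear in the order they were written.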
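One way to see this for yourself is to log each write as it happens. The following is only a sketch: it repeats the logic from the question, adds PUT statements, and reads from a data set I have called HAVE (a hypothetical name, not in the original post) so that the step does not replace its own input.

```sas
/* Input copied from the question; HAVE is a hypothetical name
   chosen so the step below does not overwrite its own input. */
data have;
   input x y;
   datalines;
5 2
3 1
5 6
;
run;

data one two other;
   set have;
   /* Statement 1: writes only to ONE */
   if x eq 5 then do;
      put _n_= x= y= 'written to ONE by statement 1';
      output one;
   end;
   /* Statement 2: writes only to TWO */
   if y lt 5 then do;
      put _n_= x= y= 'written to TWO by statement 2';
      output two;
   end;
   /* Statement 3: no data set named, so it writes to ONE, TWO and OTHER */
   put _n_= x= y= 'written to ONE, TWO and OTHER by statement 3';
   output;
run;
```

The log lines come out grouped by _N_, one input observation at a time, and ONE ends up with 5 2, 5 2, 3 1, 5 6, 5 6, which is the order shown in the question.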

ballardw
Super User

To get the specific order you were expecting, one way would be:

data three;
   set TWO (where=(x ge 5))
       two
   ;
run;

This reads TWO twice. The first reference keeps only the records with x ge 5, and since it is listed first in the SET statement, those records are output first. Then TWO is read again with all records.
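If you want to check which pass each row came from, the IN= data set option can flag it. This is a sketch under a few assumptions of mine: the input is the original three rows, stored as HAVE (a hypothetical name, used because the data step in the question replaces TWO with its own output); the filter is x eq 5 to match the condition in the question; and PASS, FROM_FILTER and FROM_ALL are names made up for illustration.

```sas
/* Original three rows from the question, stored under a
   hypothetical name so they are not overwritten. */
data have;
   input x y;
   datalines;
5 2
3 1
5 6
;
run;

data three;
   /* SET reads the listed data sets one after the other:
      first the filtered rows, then all rows. */
   set have(where=(x eq 5) in=from_filter)
       have(in=from_all);
   /* IN= variables are temporary, so copy the flag into a kept variable */
   pass = ifn(from_filter, 1, 2);
run;

proc print data=three;
run;
```

With this input, THREE holds 5 2 and 5 6 with pass=1, followed by 5 2, 3 1 and 5 6 with pass=2.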
