
Multiple Output Statements within a Single Data Step


Hi there, I have a question about multiple output statements within a data step:

 

 

data TWO;
input x y;
datalines;
5 2
3 1
5 6
;
RUN;

data ONE TWO OTHER;
set TWO;
if X eq 5 then output ONE;
if Y lt 5 then output TWO;
output;
run;

 

The code above seems fairly straightforward. However, when I'm looking at the ONE data set, the order of the observations seems to be a little strange:

 

[image 1.png: screenshot of the ONE data set]

 

 

From the first output statement, the first two observations should be:

 

5 2

5 6

 

From the last output statement, the last three observations should be:

 

5 2
3 1
5 6

 

I am expecting the combined data set to be:

 

5 2
5 6

5 2
3 1
5 6

 

However, the order of the output data set is:

 

5 2
5 2
3 1
5 6
5 6

 

The order has changed, but the result isn't simply sorted by X and Y either.

 

Does anyone know how the data set is sorted when we have the two output statements in the data step?


Re: Multiple Output Statements within a Single Data Step


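First the data step reads the observation 5 2. It checks whether X eq 5, which is true, so it outputs the observation to the data set ONE. Next it checks whether Y lt 5, which is also true (Y is 2), so the observation is output to TWO. Finally you have a plain output statement, which means the 5 2 observation is output again to every data set named in the DATA statement, including ONE. That is why the first two observations in ONE are 5 2. Nothing is sorted: observations are written in the order the output statements execute, one input observation at a time.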
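The walk-through above can be checked by having the step announce each OUTPUT in the log. This is only a sketch: SOURCE is a hypothetical name for a copy of the input, used so that the step does not overwrite the data set it reads, and the PUT messages are illustrative.

```sas
/* Recreate the input from the question under a hypothetical name (SOURCE) */
data SOURCE;
   input x y;
   datalines;
5 2
3 1
5 6
;
run;

/* Same logic as the original step, with a log message before each OUTPUT */
data ONE TWO OTHER;
   set SOURCE;
   if x eq 5 then do;
      put 'Iteration ' _N_ 'writes to ONE only: ' x= y=;
      output ONE;
   end;
   if y lt 5 then do;
      put 'Iteration ' _N_ 'writes to TWO only: ' x= y=;
      output TWO;
   end;
   /* OUTPUT with no data set name writes to ONE, TWO and OTHER */
   put 'Iteration ' _N_ 'writes to all data sets: ' x= y=;
   output;
run;

proc print data=ONE;
run;
```

Reading the log top to bottom shows the writes to ONE interleaved per input observation, which matches the order reported in the question (5 2, 5 2, 3 1, 5 6, 5 6).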


Re: Multiple Output Statements within a Single Data Step


kisumsam wrote:

I am expecting the combined data set to be:

 

5 2
5 6

5 2
3 1
5 6

 

 

 


To get that specific result one way would be:

data three;
   set TWO (where=(x ge 5))
       two
   ;
run;

This reads TWO twice. The first reference keeps only the records with x ge 5 and, because it is listed first in the SET statement, those records are output first. The second reference then reads TWO again with all records.
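As a rough sketch of the same concatenation idea, the step below uses the question's original condition (x eq 5, which selects the same rows as x ge 5 for this data) and adds an IN= flag so each row records which pass produced it. SOURCE, from_subset and pass are hypothetical names introduced only for illustration.

```sas
/* Recreate the input from the question under a hypothetical name (SOURCE) */
data SOURCE;
   input x y;
   datalines;
5 2
3 1
5 6
;
run;

data THREE;
   /* First pass: only rows with x eq 5; second pass: every row */
   set SOURCE (where=(x eq 5) in=from_subset)
       SOURCE;
   /* pass = 1 for rows from the filtered read, 2 for rows from the full read */
   pass = ifn(from_subset, 1, 2);
run;

proc print data=THREE;
run;
```

Because SET reads the listed data sets one after another, the filtered rows come first and the full copy follows, which is the order the question expected.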
