
It would be nice to have a way to set the order of the variables in the output data set based on the order in which they are listed in a KEEP= data set option.  I apologize if this already exists; I did a quick Google search, and the only ways I see to achieve this are to use RETAIN, FORMAT, etc. before the SET statement, as in the first DATA step below.  It is not a huge lift to list the variables twice; it just seems redundant.  The second DATA step below demonstrates one way this could be done.

 

Thanks in advance for your attention to this matter.

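/* Current workaround: list the variables twice, with RETAIN before SET to fix the order */
data WANT (keep= ID DATE ORDERNUM AMT);
	retain ID DATE ORDERNUM AMT;
	set HAVE;
run;

/* Proposed syntax (not valid SAS today): INORDER would make KEEP= also set the order */
data WANT (keep= ID DATE ORDERNUM AMT / inorder);
	set HAVE;
run;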
3 Comments
Quentin
Super User

What would you think if /inorder was only allowed on the input dataset, e.g.:

 

data WANT ;
set HAVE (keep= ID DATE ORDERNUM AMT / inorder);
run;

 

I'm just guessing at how SAS works, but that might be slightly easier to implement, because the order of variables in the PDV would be the same as the order in the output dataset.  But even for this, there would need to be a change to the DATA step compiler.

 

The request for this, or a REORDER statement, is so common, I think it's worth some thought. 

ballardw
Super User

A recurring response to this question is also: what are you doing in terms of analysis or reporting that requires the variables to be in any specific order in the data set?

 

In the very limited cases where I might care, because I spend so much time cleaning garbage data that I need some related variables together in the data set, I use an INFORMAT or ATTRIB statement before the INPUT statement to place the variables "in order" when the data is read from an external file.
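
As a rough illustration of that approach, here is a minimal sketch. It assumes comma-delimited input whose fields arrive in a different order from the one wanted; the variable names are borrowed from the original post, and the informats and data lines are invented for the example.

```sas
data want;
   /* INFORMAT is the first statement to mention the variables, */
   /* so they enter the PDV (and the output data set) in this order. */
   informat ID $8. DATE yymmdd10. ORDERNUM 8. AMT 8.;
   format DATE yymmdd10.;
   infile datalines dsd truncover;
   /* The raw records list the fields in a different order. */
   input AMT ORDERNUM ID DATE;
   datalines;
19.99,1001,A17,2024-01-15
5.00,1002,B42,2024-01-16
;
run;

/* VARNUM lists the variables by position rather than alphabetically. */
proc contents data=want varnum;
run;
```

The order on the INPUT statement has to match the raw file, but it no longer drives the order in the data set, because each variable's position was already fixed when the INFORMAT statement was compiled.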

Astounding
PROC Star

As a workaround to reduce typing in the meantime, you can use:

%let keeplist = ID DATE ORDERNUM AMT;
 
data want;
   retain &keeplist;
   set have;
   keep &keeplist;
run;

Even better, apply KEEP= on the SET statement to reduce the amount of data being read in.
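
A minimal sketch of that variation, assuming HAVE contains all four variables named in the macro variable:

```sas
%let keeplist = ID DATE ORDERNUM AMT;

data want;
   /* RETAIN before SET fixes the variable order in the PDV. */
   retain &keeplist;
   /* KEEP= on the input data set means only these variables are read. */
   set have (keep=&keeplist);
run;
```

Because nothing else is read from HAVE and no new variables are created, the separate KEEP statement is no longer needed. Variables read with SET are retained automatically, so the RETAIN statement affects only their order, not their values.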