Help using Base SAS procedures

Use RETAIN to re-order variables

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 109
Accepted Solution

Use RETAIN to re-order variables

Hello,

I’ve a question about using RETAIN to reorder the variable order in a dataset. I know I can use retain, such as the following

     DATA want;

     retain Var3 Var4 Var2 Var1;

     set old;

     run;

(I also just learned I could us attrib to do it, such as the following)

     DATA   want;

     attrib Var3 Var4 Var2 Var1;

     set old;

     run;

But I also know from my experience that I could mess up the output data if I’ve data processing work after the SET statement. I don't exactly remember the kind of data processing I used when I noticed the problem. Now that I'm using RETAIN a lot to reorder the variables, I thought I'd better know what kind of the data processing after SET will mess up or won’t mess up the data.


Say instead of the two steps below,

     data combined;

merge data1 data2; by id;

     data combined;

retain var3 var4 var1 var2;

set combined;

run;

Can I do in one step without causing any problem?

     data combined;

retain var3 var4 var1 var2;

merge data1 data2;

by id;

run;


How about it's more than just merging but involve creating derived variables, processing missing values, say the following?

     data new; retain var3 var4 var1 var2;

          set old;

          If var3=. then var3=var2;

          run;


Sorry I can't find an example when I noticed the problem of using RETAIN. Any general comments about potential problems when using retain are appreciated.


Accepted Solutions
Solution
‎12-16-2016 01:30 PM
SAS Employee kmw
SAS Employee
Posts: 11

Re: Use RETAIN to re-order variables

Editor's Note: This is a popular topic.  The following SAS Note shows how to reorder variables in a SAS data set:

Usage Note 8395

View solution in original post


All Replies
Trusted Advisor
Posts: 2,125

Re: Use RETAIN to re-order variables

Don't use RETAIN just to reorder the variables.  Use ATTRIB, LENGTH, or even FORMAT.

RETAIN has a significant side effect here in that the input data buffer is not cleared between observations.  It is valuable when used as designed, but can be nasty when doing SETs with different variables.

Doc Muhlbaier

Duke

Frequent Contributor
Posts: 109

Re: Use RETAIN to re-order variables

I learned the issues about RETAIN much later on.

But with ATTRIB, LENGTH, FORMAT, it's quite inconvenient to use them because I have to specify the format or length of each of the variables involved, such as the following. Is there an easy way out?

data attrib_method; attrib a b c d format=1.; set sample;

data attrib_method;attrib a b c d informat=1.;set sample;

data attrib_method;attrib a b c d label=’example’; set sample;

data attrib_method; attrib a b c d length=3.; set sample;

Respected Advisor
Posts: 3,852

Re: Use RETAIN to re-order variables

Retain is fine as long as you know what it does and does NOT do.

For variables coming from SET MERGE or UPDATE is does nothing except to change the order when the RETAIN statement is placed properly.  Variables coming from SET MERGE or UPDATE are NOT initialized to missing like variables that are created in a data step or read from INFILE.  That's why something like "if _N_ eq 1 then set ...;"  works to put the values read from this SET statement on every observation.

If you use RETAIN on variables that ARE created in a data step then you will need to understand the effect that will have on those variables and "program" accordinly.

Trusted Advisor
Posts: 2,125

Re: Use RETAIN to re-order variables

You could also use a KEEP statement, but then you have to specify EVERY variable.

Respected Advisor
Posts: 3,852

Re: Use RETAIN to re-order variables

I don't believe KEEP will work to reorder variables.  Can you make example?

Frequent Contributor
Posts: 109

Re: Use RETAIN to re-order variables

Posted in reply to data_null__

I don't think KEEP works (unless there are tricks in specifying it). I guess it's a trade off for using RETAIN. Do I mind the data processing time for using RETAIN just for reordering variable? Do I want to spend time specifying the format/length of each of the variables involved in reordering when using attrib, length, format, etc.

Respected Advisor
Posts: 3,852

Re: Use RETAIN to re-order variables

If you don't want to "move" them around after the fact create the variables in the proper order to begin with.  I showed you how to get TRANSPOSE to do that in your TRANSPOSE thread.

Super User
Super User
Posts: 8,090

Re: Use RETAIN to re-order variables

You can use the KEEP and/or DROP dataset option.

One way to get variables defined is to copy the definition from an existing dataset.  You can use a SET statement that is not executed.

data combined;

  if 0 then set data2 (keep=var3 var4) data1(keep= var1 var2);

  merge data1 data2;

  by id;

run;

Super Contributor
Posts: 398

Re: Use RETAIN to re-order variables

Hi,

While I usually agree with Tom, and he's a heck of a lot smarter than me ;-), on this occasion I'll disagree.

IMO, the keep statement will set the variable attributes, but doesn't give you much, if any, control over the variable order in the PDV.

For example, if in data2, the (kept) variables are in this order:  var4 var3, and in data1 the variables are in this order:  var2 var1, then in the final dataset the variable order would be var4 var3 var2 var1 (then all the other variables contributed by data1 and data2 on the merge statement), NOT the order you'd think from the keep statement. 

Plus, as stated above, any variables that are defined via a set or merge statement have an implied retain, which has occasionally bit me in the past.  That's also why you'll see "if 0 then set somedataset (keep=whatever)" to set the variable attributes when using a data step hash object, but with call missing somewhere in the mix (often call missing (of _all_) to code around the implied retain issue.

But I digress...

My favourite approach when I want to reorder the variables in the PDV is a "naked format" statement.  For example:

data whatever;

  format height weight age sex name;

  set sashelp.class;

run;

During data step compilation, the compiler will set the variable PDV order based on the order of the statements in the data step.  In this example, the variable order is set by the format statement.  Then, the set statement kicks in, defining the rest of the variable attributes (length, label, format, informat).  If the variables have a permanent format associated with them in the dataset, "last one wins", and the format is associated with the variable at that point.

Occasionally this approach doesn't work, for example if I need to reference a variable before the set statement.  Since the variable attributes aren't set by the format statement, and numeric variables are the default, occasionally I get a character variable cast as numeric.  In this scenario, I need to revert to a length statement.  But this is rare.

As a first approach, try the "naked format" statement, and see if that works for you.

HTH...

Scott

Regular Contributor
Posts: 184

Re: Use RETAIN to re-order variables

Posted in reply to ScottBass

Too bad that a "naked" ATTRIBUTE statement is not allowed.

Scott Bass wrote:

[snip]

My favourite approach when I want to reorder the variables in the PDV is a "naked format" statement.  For example:

data whatever;

  format height weight age sex name;

  set sashelp.class;

run;

During data step compilation, the compiler will set the variable PDV order based on the order of the statements in the data step.  In this example, the variable order is set by the format statement.  Then, the set statement kicks in, defining the rest of the variable attributes (length, label, format, informat).  If the variables have a permanent format associated with them in the dataset, "last one wins", and the format is associated with the variable at that point.

Occasionally this approach doesn't work, for example if I need to reference a variable before the set statement.  Since the variable attributes aren't set by the format statement, and numeric variables are the default, occasionally I get a character variable cast as numeric.  In this scenario, I need to revert to a length statement.  But this is rare.

As a first approach, try the "naked format" statement, and see if that works for you.

HTH...

Scott

Respected Advisor
Posts: 3,852

Re: Use RETAIN to re-order variables

That's no different from RETAIN (not as good actually) and you could end up needing a set statement for every variable.

Solution
‎12-16-2016 01:30 PM
SAS Employee kmw
SAS Employee
Posts: 11

Re: Use RETAIN to re-order variables

Editor's Note: This is a popular topic.  The following SAS Note shows how to reorder variables in a SAS data set:

Usage Note 8395

Super Contributor
Posts: 398

Re: Use RETAIN to re-order variables

From the usage note:

 

For example, there are six variables in data set ONE in the positional order of A, B, C, D, E, and F. If the new order needs to be A, E, C, D, B, and F, then the following DATA step reorders the variables in that order:

data two;
   retain a e c d b;
   set one;
run;

As noted, the variable F is left off the RETAIN statement since the order of variable F should not change. After running the above DATA step, the variables in data set ONE are in the order of A, E, C, D, B and F.

 

So expanding on the Usage Note:

 

data one;
   set sashelp.class;
   a=name;
   b=age;
   c=sex;
   d=weight;
   e=height;
   f=height*weight;
   keep a--f;
run;

data two;
   retain g a e c d b;
   set one;
   if c='M' then g='FOO';
run;

data three;
   format g a e c d b;
   set one;
   if c='M' then g='FOO';
run;

Compare the output between two and three.  IMO the output from two is possibly undesired, and the output from three is as expected.

 

The usage note is good, but IMO should go into more detail of the repercussions of using retain to set the PDV order for a data step (as opposed to data set) variable.  Obviously an experienced SAS programmer would know this, but this could catch a newbie.

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 48215 views
  • 8 likes
  • 7 in conversation