Help using Base SAS procedures

Use RETAIN to re-order variables

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 101
Accepted Solution

Use RETAIN to re-order variables

Hello,

I’ve a question about using RETAIN to reorder the variable order in a dataset. I know I can use retain, such as the following

     DATA want;

     retain Var3 Var4 Var2 Var1;

     set old;

     run;

(I also just learned I could us attrib to do it, such as the following)

     DATA   want;

     attrib Var3 Var4 Var2 Var1;

     set old;

     run;

But I also know from my experience that I could mess up the output data if I’ve data processing work after the SET statement. I don't exactly remember the kind of data processing I used when I noticed the problem. Now that I'm using RETAIN a lot to reorder the variables, I thought I'd better know what kind of the data processing after SET will mess up or won’t mess up the data.


Say instead of the two steps below,

     data combined;

merge data1 data2; by id;

     data combined;

retain var3 var4 var1 var2;

set combined;

run;

Can I do in one step without causing any problem?

     data combined;

retain var3 var4 var1 var2;

merge data1 data2;

by id;

run;


How about it's more than just merging but involve creating derived variables, processing missing values, say the following?

     data new; retain var3 var4 var1 var2;

          set old;

          If var3=. then var3=var2;

          run;


Sorry I can't find an example when I noticed the problem of using RETAIN. Any general comments about potential problems when using retain are appreciated.


Accepted Solutions
Solution
‎12-16-2016 01:30 PM
SAS Employee kmw
SAS Employee
Posts: 7

Re: Use RETAIN to re-order variables

Editor's Note: This is a popular topic.  The following SAS Note shows how to reorder variables in a SAS data set:

Usage Note 8395

View solution in original post


All Replies
Trusted Advisor
Posts: 2,113

Re: Use RETAIN to re-order variables

Don't use RETAIN just to reorder the variables.  Use ATTRIB, LENGTH, or even FORMAT.

RETAIN has a significant side effect here in that the input data buffer is not cleared between observations.  It is valuable when used as designed, but can be nasty when doing SETs with different variables.

Doc Muhlbaier

Duke

Frequent Contributor
Posts: 101

Re: Use RETAIN to re-order variables

I learned the issues about RETAIN much later on.

But with ATTRIB, LENGTH, FORMAT, it's quite inconvenient to use them because I have to specify the format or length of each of the variables involved, such as the following. Is there an easy way out?

data attrib_method; attrib a b c d format=1.; set sample;

data attrib_method;attrib a b c d informat=1.;set sample;

data attrib_method;attrib a b c d label=’example’; set sample;

data attrib_method; attrib a b c d length=3.; set sample;

Respected Advisor
Posts: 3,777

Re: Use RETAIN to re-order variables

Retain is fine as long as you know what it does and does NOT do.

For variables coming from SET MERGE or UPDATE is does nothing except to change the order when the RETAIN statement is placed properly.  Variables coming from SET MERGE or UPDATE are NOT initialized to missing like variables that are created in a data step or read from INFILE.  That's why something like "if _N_ eq 1 then set ...;"  works to put the values read from this SET statement on every observation.

If you use RETAIN on variables that ARE created in a data step then you will need to understand the effect that will have on those variables and "program" accordinly.

Trusted Advisor
Posts: 2,113

Re: Use RETAIN to re-order variables

You could also use a KEEP statement, but then you have to specify EVERY variable.

Respected Advisor
Posts: 3,777

Re: Use RETAIN to re-order variables

I don't believe KEEP will work to reorder variables.  Can you make example?

Frequent Contributor
Posts: 101

Re: Use RETAIN to re-order variables

I don't think KEEP works (unless there are tricks in specifying it). I guess it's a trade off for using RETAIN. Do I mind the data processing time for using RETAIN just for reordering variable? Do I want to spend time specifying the format/length of each of the variables involved in reordering when using attrib, length, format, etc.

Respected Advisor
Posts: 3,777

Re: Use RETAIN to re-order variables

If you don't want to "move" them around after the fact create the variables in the proper order to begin with.  I showed you how to get TRANSPOSE to do that in your TRANSPOSE thread.

Super User
Super User
Posts: 6,495

Re: Use RETAIN to re-order variables

You can use the KEEP and/or DROP dataset option.

One way to get variables defined is to copy the definition from an existing dataset.  You can use a SET statement that is not executed.

data combined;

  if 0 then set data2 (keep=var3 var4) data1(keep= var1 var2);

  merge data1 data2;

  by id;

run;

Super Contributor
Posts: 376

Re: Use RETAIN to re-order variables

Hi,

While I usually agree with Tom, and he's a heck of a lot smarter than me ;-), on this occasion I'll disagree.

IMO, the keep statement will set the variable attributes, but doesn't give you much, if any, control over the variable order in the PDV.

For example, if in data2, the (kept) variables are in this order:  var4 var3, and in data1 the variables are in this order:  var2 var1, then in the final dataset the variable order would be var4 var3 var2 var1 (then all the other variables contributed by data1 and data2 on the merge statement), NOT the order you'd think from the keep statement. 

Plus, as stated above, any variables that are defined via a set or merge statement have an implied retain, which has occasionally bit me in the past.  That's also why you'll see "if 0 then set somedataset (keep=whatever)" to set the variable attributes when using a data step hash object, but with call missing somewhere in the mix (often call missing (of _all_) to code around the implied retain issue.

But I digress...

My favourite approach when I want to reorder the variables in the PDV is a "naked format" statement.  For example:

data whatever;

  format height weight age sex name;

  set sashelp.class;

run;

During data step compilation, the compiler will set the variable PDV order based on the order of the statements in the data step.  In this example, the variable order is set by the format statement.  Then, the set statement kicks in, defining the rest of the variable attributes (length, label, format, informat).  If the variables have a permanent format associated with them in the dataset, "last one wins", and the format is associated with the variable at that point.

Occasionally this approach doesn't work, for example if I need to reference a variable before the set statement.  Since the variable attributes aren't set by the format statement, and numeric variables are the default, occasionally I get a character variable cast as numeric.  In this scenario, I need to revert to a length statement.  But this is rare.

As a first approach, try the "naked format" statement, and see if that works for you.

HTH...

Scott

Regular Contributor
Posts: 184

Re: Use RETAIN to re-order variables

Too bad that a "naked" ATTRIBUTE statement is not allowed.

Scott Bass wrote:

[snip]

My favourite approach when I want to reorder the variables in the PDV is a "naked format" statement.  For example:

data whatever;

  format height weight age sex name;

  set sashelp.class;

run;

During data step compilation, the compiler will set the variable PDV order based on the order of the statements in the data step.  In this example, the variable order is set by the format statement.  Then, the set statement kicks in, defining the rest of the variable attributes (length, label, format, informat).  If the variables have a permanent format associated with them in the dataset, "last one wins", and the format is associated with the variable at that point.

Occasionally this approach doesn't work, for example if I need to reference a variable before the set statement.  Since the variable attributes aren't set by the format statement, and numeric variables are the default, occasionally I get a character variable cast as numeric.  In this scenario, I need to revert to a length statement.  But this is rare.

As a first approach, try the "naked format" statement, and see if that works for you.

HTH...

Scott

Respected Advisor
Posts: 3,777

Re: Use RETAIN to re-order variables

That's no different from RETAIN (not as good actually) and you could end up needing a set statement for every variable.

Solution
‎12-16-2016 01:30 PM
SAS Employee kmw
SAS Employee
Posts: 7

Re: Use RETAIN to re-order variables

Editor's Note: This is a popular topic.  The following SAS Note shows how to reorder variables in a SAS data set:

Usage Note 8395

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 30607 views
  • 2 likes
  • 7 in conversation