cosmid
Lapis Lazuli | Level 10

Hi,

 

1. What's the best way to rename a variable that's in the SAME dataset? Is it PROC DATASETS?

2. The RENAME statement examples I found online all used a different output dataset name. Is there any harm in using the same dataset name for both the input and the output?

Example from online:

data qtr1 qtr2 ytd(drop=qtrtot);
   set ytdsales;
   if qtr=1 then output qtr1;
   else if qtr=2 then output qtr2;
   else output ytd;
   rename total=qtrtot;
run;

Example if I want to rename the variable in the SAME dataset:

data ytdsales;
  set ytdsales;
  rename total=qtrtot;
run;

I want to use the same dataset because that's the dataset where I want the variable name to be renamed.

Accepted Solution

Tom
Super User

If you have to do something else that requires you to make a NEW dataset (like your first example), then using the RENAME statement (or a properly placed RENAME= dataset option) is best.  The code is clearer since it does not use a proc that most users have never learned about.

 

But if making a new version of the dataset would take a long time, or too much disk space, or lose some other attributes of the dataset (the member label comes to mind for your second example) then using PROC DATASETS makes sense.  It will run quickly and only make the changes you wanted.

 

proc datasets nolist lib=WORK;
  modify ytdsales;
    rename total=qtrtot;
  run;
quit;
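
For comparison, here is a rough sketch of what a "properly placed" RENAME= dataset option could look like in a step that makes a new dataset anyway. The names ytd_in, ytd_out and doubled are made up for illustration and are not from the original question. The point is that placement decides which name the step itself has to use.

```sas
/* RENAME= on the INPUT dataset: the rename happens as the data is read, */
/* so code inside the step must use the NEW name (qtrtot).               */
data ytd_in;                      /* hypothetical output name */
  set ytdsales(rename=(total=qtrtot));
  doubled = qtrtot * 2;           /* hypothetical derived variable */
run;

/* RENAME= on the OUTPUT dataset: the rename happens as the data is written, */
/* so code inside the step still uses the OLD name (total).                  */
data ytd_out(rename=(total=qtrtot));   /* hypothetical output name */
  set ytdsales;
  doubled = total * 2;
run;
```

Both steps should end up with a variable called qtrtot. The RENAME statement behaves like the output-side option: inside the step you keep using the old name.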

 

cosmid
Lapis Lazuli | Level 10
Thanks for the info and the quick reply!

Just to summarize to see if I understand everything correctly:
It's best to use PROC DATASETS when just renaming a variable because it is more efficient. The RENAME statement in the 2nd example has to re-create the dataset to replace the old variable name with the new one. Other than that inefficiency, the RENAME statement from the 2nd example can be used.
Tom
Super User

Yes.  You can do a simple data step to rename a variable.

data have;
  set have;
  rename old=new;
run;

And for most simple analysis programs that is fine and appropriate.

But it will

  • Make a NEW dataset and delete the old one.
  • Take longer than using PROC DATASETS, since that proc just changes the name in the file's header instead of copying the file.
  • Lose some other attributes of the dataset.
    • Member label.
    • SORTED status.  You could try adding a BY statement, but I don't think SAS really trusts a SORTED status that was not generated by PROC SORT.
    • Indexes.
    • Constraints (does anyone actually use these on SAS datasets?).
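
If you want to see the difference for yourself, something like the following could be used to compare the two approaches. This is only a sketch: it assumes the sample dataset sashelp.class is available, and the label text and the names have, old, new and newer are placeholders.

```sas
/* Build a small test dataset with a member label and a simple index. */
/* Assumption: sashelp.class exists in your SAS installation.         */
data have(label='Demo member label' index=(name));
  set sashelp.class(rename=(age=old));
run;

/* Approach A: rename in place with PROC DATASETS (no copy of the data). */
proc datasets nolist lib=work;
  modify have;
    rename old=new;
  run;
quit;

/* Check the member label and index information after approach A. */
proc contents data=have;
run;

/* Approach B: rename by rewriting the dataset with a DATA step. */
data have;
  set have;
  rename new=newer;
run;

/* Compare this PROC CONTENTS output with the one above. */
proc contents data=have;
run;
```

Going by the list above, the label and index should still be reported after approach A and should be gone after approach B.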

 

cosmid
Lapis Lazuli | Level 10
Wow! Thank you! That's good to know. I wasn't aware of the things you listed. I'll use PROC DATASETS for renaming variables. Thanks again!
Tom
Super User

@cosmid wrote:
Wow! Thank you! That's good to know. I wasn't aware of the things you listed. I'll use PROC DATASETS for renaming variables. Thanks again!

Like I said, for most simple analysis programs it does not matter.

 

But the best thing is to modify the previous (or next) step so that renaming is not needed.  Then you don't need to add an extra step at all.

 

So give the variables the names you want when you make the dataset the first time.

data step2;
  set step1;
  /* ... other stuff ... */
  rename old=new;
run;
proc means data=step2;
  var new;
run;

Or change the name at the point where you first read the dataset.

proc means data=step2(rename=(old=new));
  var new;
run;
