I need to recode a long list of variables that are named this way (numbered list by month, from first to last):
Variable Variable label
MBA_1_VALUE "Monthly Benefit Amount, 1/1962"
Hi @gurre003
I know all too well how it is to deal with take-as-is-and-deliver-as-wanted data. Here is another (tested) solution working after similar lines. It processes all numeric variables with a given name pattern, and if you have more than one group of numeric variables, the if prxmatch ... end section can be repeated and ajusted for each group.
If you have one or more groups of character variables, you must declare a character variable array also and repeat the whole do loop to deal with the char array.
Proc datasets changes names in situ, so I strongly adwise to make a copy in work and then replace input when everything works OK.
* Make test input data;
data have;
attrib
ID length=8
Department length=$30
MBA_1_VALUE length=8 label="Monthly Benefit Amount, 1/1962"
MBA_2_VALUE length=8 label="Monthly Benefit Amount, 2/1962"
MBA_3_VALUE length=8 label="Monthly Benefit Amount, 3/1962"
;
ID = 1; Department = 'A'; MBA_1_VALUE = 1; MBA_2_VALUE = 2; MBA_3_VALUE = 3; output;
ID = 2; Department = 'B'; MBA_1_VALUE = 11; MBA_2_VALUE = 22; MBA_3_VALUE = 33; output;
run;
* Rename variables in data set;
data _null_; set have (obs=1) end=end;
array nums _numeric_;
if _N_ = 1 then call execute('proc datasets nolist lib=work; modify have;');
do i = 1 to dim(nums);
vname = vname(nums{i});
if prxmatch("/MBA_\d+_VALUE/",vname) then do;
d = input(scan(vlabel(nums{i}),-1,' '),anydtdte.);
ren = 'rename ' || trim(vname) || ' = mba_' || put(d,monname3.) || '_' || put(d,year2.)||';';
call execute(ren);
end;
end;
if end then call execute('run; quit;');
run;
Well, the idea of having variable names with calendar information in the variable name is usually discouraged, as it is a poor practice that usually winds up making your coding more difficult. In addition, for creating tables and reports and such, the LABEL ought to be sufficient.
So, I'm going to say YES there is an efficient way to rename all of these variables, but don't do it. You are making the rest of your coding more difficult.
I'll clarify, this formatting is not something I want to do, but something I have to do in order to get other programs to run since those programs are set up to read month/year variable names. Either I do this manually (variable by variable) or I create some efficient code to do this.
UNTESTED CODE
Replace LIBRARY with the actual name of the library where your data set exists
proc sql;
select cats(name,'=mba_',put(input(scan(label,-2,'/, '),2.),monname3.),'_',
substr(scan(label,-1,'/, '),3,2))
into :renames separated by ' ' from dictionary.columns
where libname='LIBRARY' and memname='YourDataSetName' and name eqt 'MBA_';
quit;
proc datasets library=LIBRARY nolist;
modify YourDataSetName;
rename &renames;
run;
quit;
I would suggest changing the plan to use names that look like this:
mba_1962_01
mba_1962_02
...
mba_2017_12
Those names are just as descriptive as what you are planning on now, but they have the advantages of using four-digit years and sorting in the proper order.
If that sounds like a good idea to you, I could propose a way to get there from where you are now.
Can you rename ALL of them? If you are forced to use date strings in names it is better to use Y-M-D order so that they will sort alphabetically in chronological order.
Anyway here is an easy way to detect the date represented by either type of name and use that date to generate a new name.
First let's make some test data. In real life you would get the list from your dataset's metadata (via PROC CONTENTS or one of the DICTIONARY metadata tables/views).
data have ;
input name :$32.;
cards;
MBA_1_VALUE "Monthly Benefit Amount, 1/1962"
MBA_2_VALUE "Monthly Benefit Amount, 2/1962"
MBA_672_VALUE "Monthly Benefits Amount, 12/2017"
mba_jan_62 "Monthly Benefit Amount, 1/1962"
mba_feb_62 "Monthly Benefit Amount, 2/1962"
mba_dec_17 "Monthly Benefits Amount, 12/2017"
;
Now let's run a simple data step to generate the new name we want.
data want;
set have ;
length date 8 new_name $32.;
format date yymmdd10.;
name=upcase(name);
if scan(name,-1,'_')='VALUE' then
date = intnx('month','01JAN1962'd,input(scan(name,-2,'_'),32.)-1,'b')
;
else date=input(cats('01',scan(name,-2,'_'),scan(name,-1,'_')),date9.);
new_name=catx('_',scan(name,1,'_'),put(date,yymm7.));
run;
Results:
Obs name date new_name 1 MBA_1_VALUE 1962-01-01 MBA_1962M01 2 MBA_2_VALUE 1962-02-01 MBA_1962M02 3 MBA_672_VALUE 2017-12-01 MBA_2017M12 4 MBA_JAN_62 1962-01-01 MBA_1962M01 5 MBA_FEB_62 1962-02-01 MBA_1962M02 6 MBA_DEC_17 2017-12-01 MBA_2017M12
Now you can use those NAME/NEW_NAME pairs to generate OLDNAME=NEWNAME pairs into a macro variable. You can then use the macro variable in a RENAME statement or RENAME= dataset option. To change the names without copying the data use PROC DATASETS to modify the dataset and rename the variables.
proc sql noprint;
select catx('=',name,new_name) into :rename separated by ' '
from have
where upcase(name) ne upcase(new_name)
;
quit;
proc datasets nolist lib=work;
modify DS_TO_CHANGE;
rename &rename ;
run;
quit;
Hi @gurre003 Here is a way to approach to RENAME reading dicitonary columns dynamically
Assuming you have a dataset like the below sample named DSN, try to replicate the below to your needs and let us know
/*Sample DSN*/
data dsn;
MBA_1_VALUE=25;
MBA_2_VALUE=35;
label MBA_1_VALUE='Monthly Benefit Amount, 1/1962'
MBA_2_VALUE='Monthly Benefit Amount, 2/1962';
run;
/*Generate dynamic RENAME statements using call execute from dictionary tables*/
data _null_;
set sashelp.vcolumn(keep=libname memname name label) end=z;
where libname='WORK' and memname='DSN';
w=put(input(scan(label,-1,','),ANYDTDTE10.),monyy7. -l);
want=catx('_',scan(name,1,'_'),w);
if _n_=1 then call execute('proc datasets lib=work nolist;modify dsn;rename ');
call execute(cats(' ',name,'=',want,' '));
if z;
call execute(cats(';','quit;'));
run;
/*Verify the change*/
proc contents data=dsn;
run;
Hi @gurre003,
Here's another (similar) approach:
filename rencode temp;
data _null_;
file rencode;
put 'proc datasets lib=YOUR_LIBREF nolist;';
put 'modify YOUR_DATASET;';
put 'rename';
do i=1 to 672;
d=intnx('month','01JAN1962'd,i-1);
m=lowcase(put(d,monname3.));
y=put(d,year2.);
put 'MBA_' i +(-1) '_VALUE=mba_' m +(-1) '_' y;
end;
put '; quit;';
run;
%inc rencode;
Just replace "YOUR_LIBREF" and "YOUR_DATASET" by the actual libref and dataset name, respectively.
Hi @gurre003
I know all too well how it is to deal with take-as-is-and-deliver-as-wanted data. Here is another (tested) solution working after similar lines. It processes all numeric variables with a given name pattern, and if you have more than one group of numeric variables, the if prxmatch ... end section can be repeated and ajusted for each group.
If you have one or more groups of character variables, you must declare a character variable array also and repeat the whole do loop to deal with the char array.
Proc datasets changes names in situ, so I strongly adwise to make a copy in work and then replace input when everything works OK.
* Make test input data;
data have;
attrib
ID length=8
Department length=$30
MBA_1_VALUE length=8 label="Monthly Benefit Amount, 1/1962"
MBA_2_VALUE length=8 label="Monthly Benefit Amount, 2/1962"
MBA_3_VALUE length=8 label="Monthly Benefit Amount, 3/1962"
;
ID = 1; Department = 'A'; MBA_1_VALUE = 1; MBA_2_VALUE = 2; MBA_3_VALUE = 3; output;
ID = 2; Department = 'B'; MBA_1_VALUE = 11; MBA_2_VALUE = 22; MBA_3_VALUE = 33; output;
run;
* Rename variables in data set;
data _null_; set have (obs=1) end=end;
array nums _numeric_;
if _N_ = 1 then call execute('proc datasets nolist lib=work; modify have;');
do i = 1 to dim(nums);
vname = vname(nums{i});
if prxmatch("/MBA_\d+_VALUE/",vname) then do;
d = input(scan(vlabel(nums{i}),-1,' '),anydtdte.);
ren = 'rename ' || trim(vname) || ' = mba_' || put(d,monname3.) || '_' || put(d,year2.)||';';
call execute(ren);
end;
end;
if end then call execute('run; quit;');
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.