Hello all,
I have a problem that I think can be solved with code. I have to go through each variable in a dataset and rename them according to a couple rules. They are as follows:
For example, if a variable is named CNT_FIRST_SECOND_THIRD_FOURTH it would have to be changed to CNT_FIRSTSECOND_THIRD_FOURTH. We're not picky on which underscore has to go as long as the number of underscores is 3 or less. An example of the second rule would be if we had a variable named CNT_FIRST_OBS_SECOND it would have to be rearranged where OBS is placed at the end of the variable name as in CNT_FIRST_SECOND_OBS. Let me know if there's a way to solve this with code. I'll be watching closely. Thanks ahead of time for your help!
i think countw() and scan() will solve both problems.
the countw() function allows you to determine if issue 1 is a problem for a given variable. in a loop, i'd use something like if ge 4 (with delimitter = "_"), then [define new variable name using concatenation of a few scan() instances]
for issue 2, i'd use some form of if scan() = "OBS" then [define new variable name]
Can you show us what you have tried?
Suggestion, query the dataset using Proc sql re contents as below and search for the patterns you need and then do appropriate rename
proc contents
data = sashelp.class
noprint
out = var_list
(keep = name );
run;
i think countw() and scan() will solve both problems.
the countw() function allows you to determine if issue 1 is a problem for a given variable. in a loop, i'd use something like if ge 4 (with delimitter = "_"), then [define new variable name using concatenation of a few scan() instances]
for issue 2, i'd use some form of if scan() = "OBS" then [define new variable name]
sorry, I meant to say first that I'd output the variable names to a dataset to work with them separately from the data
Use SASHELP.VCOLUMN table to scan your variable names
Subset the desired variables and create the new names according to the rules
Submit proc datasets with modify option and do the rename
data vars;
set SASHELP.VCOLUMN
(where=(libname=<library> and memname=<dataset name>));
N_ = countw(varname,'_') ;
pos1 = index(varname,'OBS');
pos2 = index(varname.'PER');
if N_ > 3 then do;
do i= 1 to (N_ - 3);
ix = index(varname,'_',-1); /* seek from the end */
varname = substr(varname,1,ix-1) || substr(varname,ix+1);
end;
newname = varname;
flagout=1;
end;
if pos1 >0 or pos2>0 then do;
pos = max(pos1 pos2);
suffix = substr(varname,pos,3);
newname = trim(substr(varname,1,pos-1) ||
substr(varname,pos+3)) || suffix;
flagout =1;
end;
if flagout = 1 then output;
keep varname newname;
run;
filename program temp 'generated_program.sas';
data _null_;
set vars (end=eof);
file program;
if _N_= 1 then do;
put 'proc datasets lib=<library> nolist; ';
put 'modify <dataset-name>; ';
put 'rename';
end;
put varname ' = ' nename ';' ;
if eof then put '; run;' ;
run;
%include program;
/* in case of issues just repair the program before running */
Hey thanks! When I fill in the libname in your 3rd line it says "error: Variable yourlibrary is not on file SASHELP.VCOLUMN. How can I fix this?
The value of LIBRARY is the name of the libref defined that points to where the dataset lives. So it cannot be longer than 8 characters and must be specified as a character string in all uppercase.
library='MYLIB'
Similarly the MEMNAME is the name of the dataset in uppercase. So it also must be a string and cannot be longer then 32 bytes.
data vars;
set SASHELP.VCOLUMN ;
where library='MYLIB' and memname='MYDATASET_NAME' ;
Library and member names have to be all uppercase when searching in the dictionary tables (and the respective sashelp views). Library names are also limited to 8 characters, so "yourlibrary" is impossible.
And you need to place those literals in quotes, or the data step compiler will misunderstand them for variable names within the view.
I ended up using the countw() function as my workhorse. Here is my final solution:
data temp; set var_list; if index(lowcase(name), "__c") > 0 then ix="true"; count = countw(name, "_"); run; data temp2; set temp; newstring=tranwrd(name, "__C", ''); run; data temp4; set temp2; count = countw(newstring, "_"); len = length(name); IF index(newstring, "OBS") then obs="true"; if index(newstring, "PER") then per="true"; if (length(newstring) - index(newstring, "OBS")) > 2 then breakOBS="true"; if (length(newstring) - index(newstring, "PER")) > 2 then breakPER="true"; IF obs = "true" and breakobs="true" then make_list1 = "true"; if per = "true" and breakper="true" then make_list2 = "true"; if count > 4 then make_list3 = "true"; if make_list1 = "true" or make_list2 = "true" or make_list3 = "true" then made_list = "true"; run; data list(keep=name); set temp4; if made_list = "true"; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.