BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Alexander2
Fluorite | Level 6

Hello all,

 

I have a problem that I think can be solved with code. I have to go through each variable in a dataset and rename them according to a couple rules. They are as follows:

 

  • Issue: More than 3 underscores “_”
  • Resolution: Rename the variable with only 3 underscores.
  • Issue: Time period location on a variable name
  • Resolution: Rename the variables where PER or OBS is at the end of the variable/ suffix.

For example, if a variable is named CNT_FIRST_SECOND_THIRD_FOURTH it would have to be changed to CNT_FIRSTSECOND_THIRD_FOURTH. We're not picky on which underscore has to go as long as the number of underscores is 3 or less. An example of the second rule would be if we had a variable named CNT_FIRST_OBS_SECOND it would have to be rearranged where OBS is placed at the end of the variable name as in CNT_FIRST_SECOND_OBS. Let me know if there's a way to solve this with code. I'll be watching closely. Thanks ahead of time for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
seeff
Obsidian | Level 7

i think countw() and scan() will solve both problems.

 

the countw() function allows you to determine if issue 1 is a problem for a given variable. in a loop, i'd use something like if ge 4 (with delimitter = "_"), then [define new variable name using concatenation of a few scan() instances]

 

for issue 2, i'd use some form of if scan() = "OBS" then [define new variable name]

View solution in original post

8 REPLIES 8
ghosh
Barite | Level 11

Can you show us what you have tried?

 

Suggestion, query the dataset using Proc sql re contents as below and search for the patterns you need and then do appropriate rename

 

proc contents
     data = sashelp.class
          noprint
          out = var_list
               (keep = name );
run;
seeff
Obsidian | Level 7

i think countw() and scan() will solve both problems.

 

the countw() function allows you to determine if issue 1 is a problem for a given variable. in a loop, i'd use something like if ge 4 (with delimitter = "_"), then [define new variable name using concatenation of a few scan() instances]

 

for issue 2, i'd use some form of if scan() = "OBS" then [define new variable name]

seeff
Obsidian | Level 7

sorry, I meant to say first that I'd output the variable names to a dataset to work with them separately from the data

Shmuel
Garnet | Level 18

Use SASHELP.VCOLUMN table to scan your variable names

Subset the desired variables and create the new names according to the rules

Submit proc datasets with modify option and do the rename

data vars;
 set SASHELP.VCOLUMN 
       (where=(libname=<library> and memname=<dataset name>));
        N_ =  countw(varname,'_') ;
        pos1 = index(varname,'OBS');
        pos2 = index(varname.'PER');

         if N_ > 3 then do;
             do i= 1 to (N_ - 3);
                 ix = index(varname,'_',-1); /* seek from the end */
                 varname = substr(varname,1,ix-1) || substr(varname,ix+1);
             end;
            newname = varname;
            flagout=1;
        end;

       if pos1 >0 or pos2>0 then do;
          pos = max(pos1 pos2);
          suffix = substr(varname,pos,3);
          newname = trim(substr(varname,1,pos-1) ||
                              substr(varname,pos+3)) || suffix;
          flagout =1;
      end;
      if flagout = 1 then output;
      keep varname newname;
run;

filename program temp 'generated_program.sas';
data _null_;
   set vars (end=eof);
         file program;
         if _N_= 1 then do;
            put 'proc datasets lib=<library>  nolist; ';
            put 'modify <dataset-name>; ';
            put 'rename';
         end;
         put varname ' = ' nename ';' ;
       
         if eof then put '; run;' ;
run;

%include program;

/* in case of issues just repair the program before running */
    
Alexander2
Fluorite | Level 6

Hey thanks! When I fill in the libname in your 3rd line it says "error: Variable yourlibrary is not on file SASHELP.VCOLUMN. How can I fix this?

Tom
Super User Tom
Super User

The value of LIBRARY is the name of the libref defined that points to where the dataset lives.  So it cannot be longer than 8 characters and must be specified as a character string in all uppercase.

library='MYLIB' 

Similarly the MEMNAME is the name of the dataset in uppercase.  So it also must be a string and cannot be longer then 32 bytes.

data vars;
 set SASHELP.VCOLUMN ;
 where library='MYLIB' and memname='MYDATASET_NAME' ;
Kurt_Bremser
Super User

Library and member names have to be all uppercase when searching in the dictionary tables (and the respective sashelp views). Library names are also limited to 8 characters, so "yourlibrary" is impossible.

And you need to place those literals in quotes, or the data step compiler will misunderstand them for variable names within the view.

Alexander2
Fluorite | Level 6

I ended up using the countw() function as my workhorse. Here is my final solution:

 

data temp;
set var_list;
if index(lowcase(name), "__c") > 0 then ix="true";
count = countw(name, "_");
run;
data temp2;
set temp;
newstring=tranwrd(name, "__C", '');
run;
data temp4;
set temp2;
count = countw(newstring, "_");
len = length(name);
IF index(newstring, "OBS") then obs="true";
if index(newstring, "PER") then per="true";
if (length(newstring) - index(newstring, "OBS")) > 2 then breakOBS="true";
if (length(newstring) - index(newstring, "PER")) > 2 then breakPER="true";
IF obs = "true" and breakobs="true" then make_list1 = "true";
if per = "true" and breakper="true" then make_list2 = "true";
if count > 4 then make_list3 = "true";
if make_list1 = "true" or make_list2 = "true" or make_list3 = "true" then made_list = "true";
run;
data list(keep=name);
set temp4;
if made_list = "true";
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2230 views
  • 0 likes
  • 6 in conversation