Replace prefix of variable

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Replace prefix of variable

Hi, 

 

I have created a list of variables ending with months and years in the format costMMYYYY (cost12017 as an example). I need to rename the set of variables to cost_MMYYYY (cost_012017), however, there are a lot of them to do using a simple RENAME (4 years worth). I tried an array and do loop and my code is creating the correct var names, but the values are inaccurate. I was trying to set them equal to the original and was planning to then drop the original: 

 

DATA want 
SET have;
array old(12) cost12012 cost22012 cost32012 cost42012 cost52012 cost62012
72012 cost82012 cost92012 cost102012 cost112012 cost122012;
array new (12) cost_12012 cost_22012 cost_32012 cost_42012 cost_52012 cost_62012
cost_72012 cost_82012 cost_92012 cost_102012 cost_112012 cost_122012;
do i=1 to dim(old);
do j=1 to dim(new);
new(j)=old_hci(i);
end;
end;
RUN;

 

Any suggestions on how to either correct this array/loop OR another way to rename my vars? 

 

Thank you!


Accepted Solutions
Solution
‎01-05-2017 07:55 PM
Super User
Posts: 5,093

Re: Replace prefix of variable

You have received a lot of good advice here.  However, just in case it is too complex given your current experience with SAS ...

 

Your program is close to working.  You just need only a single DO loop not nested DO loops.

 

do i=1 to dim(new);

  new{i} = old{i};

end;

 

That still leaves you with an extra set of variables in your data set.  You might want to drop the old names, once you are certain that the program is working.

 

Also, if you insist on using months and years as part of the variable name (and you have been advised against it by now), it would be much easier if you put the year first:  cost_201201

 

That way, the variable names will always be in order even if you have more than one year included in your data.

View solution in original post


All Replies
Super Contributor
Posts: 251

Re: Replace prefix of variable

[ Edited ]

Let me introduce you to the magic of dictionary tables. They first appeared in SAS 6.07, referenced in my favourite manual: Technical Report P-222.

 

You can reference them in SQL, or as their SASHELP analogues (dictionary.columns becomes the view sashelp.vcolumn).

 

Here's how I made your code work - and it doesn't need arrays at all. The first step is just populating the source dataset - you won't need this, of course.

 

data have;
retain cost12012 cost22012 cost32012 cost42012 cost52012 cost62012
       cost72012 cost82012 cost92012 cost102012 cost112012 cost122012;
call missing(of _numeric_);
stop;
run;

proc sql noprint;
select strip(name) || '=' || catx('_', substr(name, 1, 4), substr(name, 5))
  into :new_vars separated by ' '
  from dictionary.columns
 where libname = 'WORK'
   and memname = 'HAVE';
quit;

data want;
set have;
rename &new_vars;
run;

The SQL clause is creating a long macro variable (new_vars) with the variable renaming. The dataset does the actual renaming with the - well - rename command.

Super User
Posts: 17,912

Re: Replace prefix of variable

And if you don't want to replace your data, only rename the variables use PROC DATASETS instead of the final data step.

 

proc datasets library=LIBRARY nodetails nolist;

modify DATASETNAME;

rename &new_vars;

run;quit;

Super User
Posts: 10,538

Re: Replace prefix of variable

Any you have encountered one of the many reasons not to incorporate data in the variable name. For most analysis and reporting data works much better in the form of

Date and Value

with one record per date and value.

instead of ValueatDate1 ValueAtDate2 ValueAtDate3.

Your number of variables will continue to grow requiring any program doing analyis to keep adding variables and complexity of code. With the Date and Value approach the code stays the same and selection of time frames is done using the Date not variables.

 

For Reports then the date, often with appropriate format to group by year, month and year, quarter and year or even week and year, can form appropriate column headings, and without adding any additional variables.

Solution
‎01-05-2017 07:55 PM
Super User
Posts: 5,093

Re: Replace prefix of variable

You have received a lot of good advice here.  However, just in case it is too complex given your current experience with SAS ...

 

Your program is close to working.  You just need only a single DO loop not nested DO loops.

 

do i=1 to dim(new);

  new{i} = old{i};

end;

 

That still leaves you with an extra set of variables in your data set.  You might want to drop the old names, once you are certain that the program is working.

 

Also, if you insist on using months and years as part of the variable name (and you have been advised against it by now), it would be much easier if you put the year first:  cost_201201

 

That way, the variable names will always be in order even if you have more than one year included in your data.

Respected Advisor
Posts: 3,901

Re: Replace prefix of variable

Don't use a data step but use proc datasets for this task as proposed by @Reeza

 

Reasons:

- renaming only requires processing of the descriptor part (the "header") of your SAS file. No need to iterate through all the data.

- renaming instead of replacing preserves all other variable attributes like formats and labels.

data work.have;
  retain cost12012 cost22012 cost32012 cost42012 cost52012 cost62012
    cost72012 cost82012 cost92012 cost102012 cost112012 cost122012;
  call missing(of _numeric_);
  stop;
run;

proc sql noprint;
select strip(name) || '=' || catx('_', substr(name, 1, 4), substr(name, 5))
  into :new_vars separated by ' '
  from dictionary.columns
 where libname = 'WORK'
   and memname = 'HAVE';
quit;

proc datasets library=work nodetails nolist;
  modify have;
  rename &new_vars;
  run;
quit;
New Contributor
Posts: 2

Re: Replace prefix of variable

Thank you! I am quickly learning the consequences of variables with dates. Appreciate your suggestions and will experiment with dictionary tables as well!
Super Contributor
Posts: 251

Re: Replace prefix of variable

I endorse @Patrick's response. In my solution, I allowed for wanting a separate dataset with different variable names. But if you don't need that, proc datasets is far superior. It results in virtually no I/O: instantaneous, as it only modifies the dataset's metadata.

 

But reformatting your data to use by groups would be a better idea. It requires a little pre-parsing of the data beforehand, but it will reap efficiency and reliability rewards.

Respected Advisor
Posts: 3,901

Re: Replace prefix of variable

Below some sample code demonstrating how you could transpose your data from a wide to a long structure.

You can then always use Proc Transpose should you ever need a wide structure again.

data work.have;
  retain cost12012 cost22012 cost32012 cost42012 cost52012 cost62012
    cost72012 cost82012 cost92012 cost102012 cost112012 cost122012;
  call missing(of _numeric_);
  output;
  stop;
run;

data want(drop=_i cost: rename=(_cost=cost));
  set have;
  attrib
    _cost length=8
    date length=8 format=mmyyn6.
    ;

  array costs {*} cost:;
  do _i=1 to dim(costs);
    _cost=costs[_i];
    if length(vname(costs[_i]))=9 then
      do;
        date=mdy(substrn(vname(costs[_i]),5,1),'01',substrn(vname(costs[_i]),6,4));
      end;
    else
      do;
        date=mdy(substrn(vname(costs[_i]),5,2),'01',substrn(vname(costs[_i]),7,4));
      end;
    output;
  end;
run;
Respected Advisor
Posts: 3,777

Re: Replace prefix of variable

You could use the features of the RENAME data set option but you will need help from a somewhat obscure SAS system option.

 

 DKRICOND=ERROR
Option Definition Information for SAS Option DKRICOND
    Group= ERRORHANDLING
    Group Description: Error messages and error conditions settings
    Group= SASFILES
    Group Description: Library and member file information
    Description: Specifies the error level to report when a variable is missing from an input data set during the processing of a 
                 DROP=, KEEP=, or RENAME= data set option.

 

data have;
   stop;
   retain cost12012 cost22012 cost32012 cost42012 cost52012 cost62012
      cost72012 cost82012 cost92012 cost102012 cost112012 cost122012;
   call missing(of _numeric_);
   run;
proc options option=dkricond define;
   run;
options dkricond=nowarning;
data need;
   set have(rename=(cost12012-cost122012=cost_12012-cost_122012));
   run;
options dkricond=error;
proc contents;
   run;

Capture.PNG 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 466 views
  • 9 likes
  • 7 in conversation