Desktop productivity for business analysts and programmers

Rename columns based on a reference table

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 80
Accepted Solution

Rename columns based on a reference table

[ Edited ]

Have:

 

column reference table (named 'ColTable'), that looks like -

Standard Col_Names
Col_1      a1
Col_2      a2

one dataset (named 'sample'), that looks like -

a1  a2 
3   char

Want:
rename the columns in the dataset 'sample' to:

Col_1 Col_2
3     char

Questions:

How to achieve this?

 

Thanks.


Accepted Solutions
Solution
‎04-20-2017 03:14 AM
Frequent Contributor
Posts: 80

Re: Rename columns based on a reference table

Found one solution:

%macro test;

proc sql noprint;

select compress (Col_Names) into :list1 separated by " "
from ColTable;

select compress (Standard) into :list2 separated by " "
from sample;

quit;

%let Num_Col = %sysfunc(countw(&list1));
%do _j=1 %to &Num_Col;
%let ColName = %scan(&list1,&_j);
%let RefName = %scan(&list2,&_j);

data sample;
set sample;

rename &ColName = &RefName;
run;

%end;


%mend test;
%test;

However, if the column names contain space, then this method might not work.

View solution in original post


All Replies
Grand Advisor
Posts: 17,411

Re: Rename columns based on pre-defined reference tables (advanced)

So what exactly is your question?

Solution
‎04-20-2017 03:14 AM
Frequent Contributor
Posts: 80

Re: Rename columns based on a reference table

Found one solution:

%macro test;

proc sql noprint;

select compress (Col_Names) into :list1 separated by " "
from ColTable;

select compress (Standard) into :list2 separated by " "
from sample;

quit;

%let Num_Col = %sysfunc(countw(&list1));
%do _j=1 %to &Num_Col;
%let ColName = %scan(&list1,&_j);
%let RefName = %scan(&list2,&_j);

data sample;
set sample;

rename &ColName = &RefName;
run;

%end;


%mend test;
%test;

However, if the column names contain space, then this method might not work.

Respected Advisor
Posts: 3,837

Re: Rename columns based on a reference table

Hi @ayin

 

Even though you've got already a solution I'd like to propose yet another solution.

 

I believe @RW9's code only works when all columns in your ref table match with a column in your have table. Where I fully agree with @RW9 is, that you should be using Proc Datasets and not a data step.

 

Reason:

Renaming table names and changing variable attributes (with exception of the length) is only a change to the descriptor part of a SAS table (the "header"). You don't need to process the data for such a change. That's what Proc Datasets does for you. A datastep on the other hand fully re-creates the table and iterates through all the data. That's may be not that important for your real use case but it's going to make a huge performance difference when dealing with large volume tables.

 

And yes, post in the future please sample data created via SAS data steps (as I've done now in below sample). That makes it easier for us to propose unit tested code.

data col_ref;
  input (standard col_names) ($);
  datalines;
Col_1 a1
Col_2 a2
Col_3 a3
;
run;

data have;
  a1=3;
  a2='char';
  a99=55;
  output;
  stop;
run;


%macro StandardizeTable(sourceDS,targetDS,refTbl);

  /* generate column rename statements */
  %local renameList;
  proc sql noprint;
    select cats(col_names,'=',standard) into :renameList separated by ' '
    from &refTbl
    where upcase(col_names) in
      (
        select upcase(name)
        from dictionary.columns
        where libname="%upcase(%scan(work.&sourceDS,-2))" and memname="%upcase(%scan(&sourceDS,-1))"
      )
    ;
  quit;

  /* rename table and columns */
  proc datasets lib=%scan(work.&sourceDS,-2) nolist nowarn;

    /* rename table if targetDS name provided */
    %if %bquote(&targetDS) ne %bquote() %then
      %do;
        delete want;
        run;
        change %scan(&sourceDS,-1)=&targetDS;
        run;
      %end;
    %else %let targetDS=%scan(&sourceDS,-1);

    /* rename columns if any match with cols in refTbl */
    %if %bquote(&renameList) ne %bquote() %then
      %do;
        modify &targetDS;
          rename
            &renameList
          ;
        run;
      %end;
  quit;

%mend;

%StandardizeTable(have,,col_ref);

Thanks,

Patrick

Esteemed Advisor
Esteemed Advisor
Posts: 7,249

Re: Rename columns based on a reference table

Yuk.  That macro code is a bit overcomplicated and re-creates the dataset (using resources).  Bettter off doing a proc datasets dynamically like below.  However you say there may be spaces?  If the names are not valida SAS names then it will fail - this is why you messing around with keeping metadata in datasets - presumable from the bad datasource Excel - needs lots of quality control;

data _null_;
  set col_ref end=last;
  if _n_=1 then call execute('proc datasets data=one noprint nolist;');
  call execute(cat('rename ',strip(col_names),'=',strip(standard),';'));
  if last then call execute(';run;quit;');
run;

Do note you can re-label, format etc. at the same time with the above and it doesn't re-write the dataset (so on bigger data would be a saving).  Also Post test data in the form of a datastep!!!!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 261 views
  • 3 likes
  • 4 in conversation