Rename variables with the same suffix

Reply
Contributor Kc2
Contributor
Posts: 42

Rename variables with the same suffix

Hello,

 

I am using a macro to:

 1)  read through all the datasets which are in my external  library.

 2) pick only the ID and dates variables from one dataset at a time

Some datasets have 1 date variable , others have 2. All the date variables have the same structure datanameDAT, datanameDAT2.

 

In the last step of my macro I want to rename the datanameDAT, datanameDAT2 to date1 and date2.

The issue I am running into is that each dataset has a different prefix. Only the DAT suffix matches.

 

I am not able to rename the variables inside the  macro.

I also want to set all the external datasets into one dataset  by appending them after the steps in the macro, using the ID and the new date1 and date2 variables.

 

data have1;

  set ext.have;

  keep &keepvar.;

run;

 

data have2;

  set have1;

  how do I rename the variables haveDAT, haveDAT2 to date1, date2 in this step. When the macro runs it goes  through all the datasets in the external library so I don't know the dataset it is working on.

run;

 

data final;

 set have2;

I want to append all the datasets into one as they get processed by the macro.

run;

Any suggestions?

 

thanks.

KC

 

PROC Star
Posts: 7,791

Re: Rename variables with the same suffix

Here is one way:

data have;
  input havedat havedat2;
  cards;
1 2
;

data _null_;
  set have (obs=1) indsname=ds;
  call execute('data need;set have; rename ');
  call execute(catt(scan(ds,2,'.'),'dat=date1;'));
  call execute('rename ');
  call execute(catt(scan(ds,2,'.'),'dat2=date2;'));
  call execute('run;');
run;

Art, CEO, AnalystFinder.com

 

Contributor Kc2
Contributor
Posts: 42

Re: Rename variables with the same suffix

Hello Art,

 

thank you for your response but it is not working. I get an error message stating the variables cannot be found. The piece of code you shared assumes that the variables have the same  prefix as the entry dataset but I have some special cases:

 

I have a dataset named substance, and it has the variables  ID, DRUGDAT, SMOKDAT

 

I have a dataset named endurance, and it has the variables ID,  ENDFDAT, ENDLDAT

 

Is it possible to pick from the list of variables available only those which end in DAT and rename those?

 

Thank you ,

KC.

 

 

Super User
Posts: 5,987

Re: Rename variables with the same suffix

It sounds like you are creating a headache instead of solving a problem.  Your current variable names are fine.  If you encounter (in the future) names like DATE1 and DATE2, how will you know which date refers to which variable?  At a minimum, don't rename them.  Copy them to DATE1 and DATE2, which you can use for sorting purposes.

 

Do you really intend to append a bunch of data sets that contain different sets of variables?  I'm not sure you appreciate what that result would look like.  If you spell out where you are headed (if you succeed in this endeavor), you might be able to get some suggestions about ways to obtain your final objective.

 

At any rate, your current macro can probably do the job with minor modification.  Do you understand how it works, or is it code that was given to you without explanation?

Contributor Kc2
Contributor
Posts: 42

Re: Rename variables with the same suffix

Posted in reply to Astounding

Yes I do understand the macro code I am using.

I have around 25 datasets in my external library. These will increase to a few more. I don't want to have to update my program each time a new dataset is added.

In the first step of the macro, using proc sql I create a table DATLST that lists the datasets and the variables from dictionary.columns, Since I only want the IDs and the date I specify a where condition where name  in "ID" or name like "%DAT " .

Then with another proc sql using count I count the number of memname to get the number of dataset and I assign them to a macro variable using into:.TOTCNT

 

I do a loop from 1 to totcnt;

 first I grab from DATLST the memane at positon i

by using a data _null_ statement with a call symput  to assign the memname to a macro variable

 

Then in a second data step I use the macro variable to call from  the external directory the dataset = to memname and do some processing.

Because the number of datasets is not fixed I am trying to automate the program so I don't have to manually to the changes when they happen.

Right now I am able to append all my datasets into one large one. The issue is that the date variable are different  and thus I cannot do the sorting by date .

If I can rename them dat variable then I can sort the large datasets by ID and dates and continue the processing I need.

In  a data step . I know that I can specify each dataset and do the renaming in a simple way. I don't want to have to do it manually for each dataset

 

 

 

Super User
Posts: 5,987

Re: Rename variables with the same suffix

OK, let's back it up to the point where you extract from dictionary.columns.  If I understand correctly, you now have a SAS data set with MEMNAME and NAME, and a limited number of observations representing a limited number of variables (ID, and any date variables).  This might not be the fastest way, but it might be reasonably simple and effective.

 

proc sort data=extract;

by memname;

run;

 

data _null_;

set extract (where=(upcase(name) ne 'ID'));

by memname;

if first.memname then do;

   datecount=1;

   call execute ('data ' || memname || '; set ' || memname || ';');

end;

else datecount + 1;

call execute ('rename ' || name || '= date' || put(datecount, 1.) || ';');

if last.memname then call execute('run;');

run;

 

This generates a series of DATA steps, one for each in your extract.  It renames every date variable selected by the extract.  It keeps all variables originally in the data set, however.  Is this at least moving in the right direction?  

Contributor Kc2
Contributor
Posts: 42

Re: Rename variables with the same suffix

Posted in reply to Astounding

I tried the solution you suggest before posting my question on this forum. it does not work. If you change the variable name  in the first step , then when you go to call the actual data from the external library the variables which are renamed cannot be found.

 

This is my code:

proc sql;

create table DATLST as

select memname,name

from dictionary.columns

where upcase(libname)="EXT" and (upcase(name)in("ID") or upcase(name) like ("%DAT") );

quit;

Proc sort data= datlst ; by memname name ; run;

%macro loop();

%local TOTCNT rpt dsin keepvar;

 

proc sql noprint;

select count(memname)

into :TOTCNT

from DATLST;

quit;

%put datasets= &TOTCNT;

 

%let rpt=1;

 

%do %while (&rpt.<= 2/*&TOTCNT.*/);

 

data _null_;

set DATLST (firstobs=&rpt. obs=&rpt.);

call symput("dsin",strip(memname));

run;

 

data ds0;

set DATLST;

where memname="&dsin.";

run;

 

proc transpose data=ds0 out=tr_ds0(drop=_name_ _label_);

by memname ;

var name;

run;

 

 

proc sql noprint ;

select count(distinct name )into :varnum

from ds0 ;

%put varnum = &varnum;

 

data tr_ds ;

length keepvar_ $600;

 

set tr_ds0 ;

retain keepvar_;

%do i=1 %to &varnum.;

keepvar_=catx(" " ,keepvar_,col&i.);

call symput("keepvar",keepvar_);

i+1;

%end;

%put vars to keep= &keepvar;

run;

 

data ds1;

length fromdat $200;

set idraw.&dsin.;

fromdat="&dsin.";

keep fromdat &keepvar.;

run;

 

 

/*At this stage my ds1 dataset has 4 variables: ID, FROMDAT, XXXDAT, XXXDAT. I don't know what the XXX is and I want to do the renaming automatically.SAS does allow to manipulate variable who have the same prefix with the colon e.g SAME:.  I am wondering if there is a way to do it with variables with the same suffix. */

 

 

data ds2;

set ds1;

/*renaming DAT variables*/

run;

 

 

%let rpt=%eval(&rpt.+1);

%put rpt= &rpt.;

 

 

%end;

%mend;

 

%loop()

Contributor Kc2
Contributor
Posts: 42

Re: Rename variables with the same suffix

[ Edited ]

Using array has solved the problem.

data ds2 ;

set ds1 ;

array vars[*] _numeric_;

 date1 = vars[2];

 date2 = vars[3];

 

keep subject date1 date2 ;

format date1 date2 date9.;

run;

 

 

Respected Advisor
Posts: 2,161

Re: Rename variables with the same suffix

[ Edited ]

Kc2 wrote:

Using array has solved the problem.

data ds2 ;

set ds1 ;

array vars[*] _numeric_;

 date1 = vars[2];

 date2 = vars[3];

 

keep subject edate1 edate2 ;

format svdate1 svdate2 date9.;

run;

 

 


Even though you have marked this solution as correct, it can't possibly be correct because there is no renaming of variables to add the same suffix here. Thus, I am asking you to explain how this is the proper solution; or un-mark it as correct.

 

Using PROC SQL, it is relatively easy to rename variables to have the same suffix.

 

UNTESTED CODE

data cars;
	set sashelp.cars;
run;
proc contents data=cars noprint out=_cont_;
run;
proc sql noprint;
	select distinct cats(name,'=',name,'_mysuffix') into :renames separated by ' ' from _cont_ where type=1;
quit;
%put &=renames;
proc datasets library=work;
	modify cars;
	rename &renames;
run;
quit;

 

--
Paige Miller
Contributor Kc2
Contributor
Posts: 42

Re: Rename variables with the same suffix

Posted in reply to PaigeMiller

It work because I know that in my dataset I have 3 numeric variables. the first one is the ID the second and third ones are the dates.

Because the date variable name are not consistent instead of renaming the variable I am assigning the names I want , date1 and date2 , to the variables in located in position 2 and 3 of my array. it solved my problem.

I am not renaming them using the rename function but doing it by assigning them through an array.

PROC Star
Posts: 7,791

Re: Rename variables with the same suffix

Although this is the first time you mentioned the fact that the second and third numeric variables are ALWAYS the two dates you're interested in, your code (as posted) won't work because:

 

You create date1 and date2, but don't keep them (you are keeping subject, edate1 and edate2)

You format svdate1 and svdate2, but never mention them in your code. Do those two variables even exist?

 

Art, CEO, AnalystFinder.com

 

Contributor Kc2
Contributor
Posts: 42

Re: Rename variables with the same suffix

I updated the code I posted, the one which solved my problem.

Thank you for response.

Respected Advisor
Posts: 2,161

Re: Rename variables with the same suffix


Kc2 wrote:

It work because I know that in my dataset I have 3 numeric variables. the first one is the ID the second and third ones are the dates.

Because the date variable name are not consistent instead of renaming the variable I am assigning the names I want , date1 and date2 , to the variables in located in position 2 and 3 of my array. it solved my problem.

I am not renaming them using the rename function but doing it by assigning them through an array.


You still need to UNMARK your solution as the correct answer, because it has nothing to do with renaming variables now, which was the original question.

--
Paige Miller
Super User
Super User
Posts: 7,393

Re: Rename variables with the same suffix

[ Edited ]

I am not sure I understand what your goal is. 

Are you numbering the new "DATEnn" variables restarting from 1 for each separate input dataset? Or do you want to number them as one long sequence?

It is simple to get the list of variables and generate a new name.  Here is example that gives each one a distinct name.

proc sql noprint;
  create table DATLST as
    select libname,memname,varnum,name
    from dictionary.columns
    where libname="EXT"
      and (upcase(name) = "ID" or upcase(name) like '%DAT')
    order by 1,2,3
  ;
quit;

data datlst ;
  set datlst ;
  by libname memname ;
  length newname $8 ;
  if upcase(name) ne 'ID' then do;
     n+1;
     newname=cats('DATE',n);
  end;
run;

Now what are you going to do with this list?

Are you just trying to rename the "DAT" variables?

Are you trying to merge a group of datasets by ID and keep only the "DAT" variables?

You could generate code to do that easily from the dataset above.  You could even generate code to assign labels to the new variables to indicate where they came from.

filename code temp ;
data _null_;
  file code lrecl=80 ;
  set datlst end=eof;
  by libname memname ;
  if _n_ = 1 then put 'merge' ;
  if first.memname then put @3 libname +(-1) '.' memname '(' @ ;
  put 'keep=' name @ ;
  if upcase(name) ne 'ID' then put 'rename=(' name '=' newname ') ' @ ;
  if last.memname then put ')';
  if eof then put ';' ;
run;

filename labels temp ;
data _null_;
  file labels ;
  set datlst end=eof;
  where upcase(name) ne 'ID' ;
  if _n_ = 1 then put 'label' ;
  put @3 newname "='" libname memname name "'" ;
  if eof then put ';' ;
run;

data all ;
%include code / source2 ;
  by id;
%include labels / source2 ;
run;

 Let's make up a little test situation to see if that code generation works. First let's make two copies of SASHELP.CLASS and rename NAME as ID. 

data class1 class2; set sashelp.class ; rename name=id; run ;

Then let's make an example DATLST dataset and see if the code generation works.

data datlst ;
  length libname $8 memname $32 varnum 8 name $32 newname $8 ;
  input libname -- newname ;
cards;
work class1 1 id .
work class1 2 height ht
work class2 1 id .
work class2 2 weight wt
work class2 5 sex gender
;

Results

148  data all ;
149  %include code / source2 ;
NOTE: %INCLUDE (level 1) file CODE is file
      /hpc/grid/sas-scratcht/SAS_workE46700072DF7/#LN00023.
150 +merge
151 +  work.class1 (keep=id keep=height rename=(height =ht ) )
152 +  work.class2 (keep=id keep=weight rename=(weight =wt ) keep=sex rename=(sex =
153 +gender ) )
154 +;
NOTE: %INCLUDE (level 1) ending.
155    by id;
156  %include labels / source2 ;
NOTE: %INCLUDE (level 1) file LABELS is file
      /hpc/grid/sas-scratcht/SAS_workE46700072DF7/#LN00024.
157 +label
158 +  ht ='work class1 height '
159 +  wt ='work class2 weight '
160 +  gender ='work class2 sex '
161 +;
NOTE: %INCLUDE (level 1) ending.
162  run;

NOTE: There were 19 observations read from the data set WORK.CLASS1.
NOTE: There were 19 observations read from the data set WORK.CLASS2.
NOTE: The data set WORK.ALL has 19 observations and 4 variables.

 

Contributor Kc2
Contributor
Posts: 42

Re: Rename variables with the same suffix

Thank you everyone for your help. I have solved my problem by reassigning my variables instead of renaming them.

KC

 

Ask a Question
Discussion stats
  • 14 replies
  • 209 views
  • 0 likes
  • 5 in conversation