BookmarkSubscribeRSS Feed
braam
Quartz | Level 8

Can I use array to "rename" many variables? I know there are some other ways using SQL, but don't know if it's possible with array. The first code works of course, but the second doesn't. I'm also curious why the second code with array doesn't work. Perhaps I don't understand the array's characteristics.

 


	data want; set sashelp.cars;
		rename Model= Mo;
		rename Type= Ty;
		rename Origin= Or;
		run;

 

 

	data want; set sashelp.cars;
		array VAR 	Model Type Origin;
		array TEMP	Mo Ty Or;

		do i=1 to dim(VAR);
			rename VAR[i]= TEMP[i];
		end;
		run;
10 REPLIES 10
PaigeMiller
Diamond | Level 26

You can't use an array to rename variables.

 

If the variable names all have a pattern (such as x1-x27) and the renamed variables have a pattern (such as paige1-paige27) — which by the way are highly recommended variable names 😉  — then you can resort to some PROC SQL trickery. If there is no pattern, as in your example, you'd have to type the rename statement yourself.

--
Paige Miller
FreelanceReinh
Jade | Level 19

@PaigeMiller wrote:

If the variable names all have a pattern (such as x1-x27) and the renamed variables have a pattern (such as paige1-paige27) — which by the way are highly recommended variable names 😉  — then you can resort to some PROC SQL trickery. 


@PaigeMiller: I think you mean the RENAME statement in PROC DATASETS, which indeed doesn't accept variable lists -- unlike the RENAME statement of the DATA step where numbered variable lists are allowed:

rename x1-x27 = paige1-paige27;

Even if the ranges differ (but have the same length):

rename x1-x27 = paige20-paige46;

If the pattern is not of this special form, then I would start thinking about "PROC SQL trickery".

 

Replicating @braam's example:

proc sql noprint;
select cats(name,'=',substr(name,1,2)) into :renlist separated by ' ' 
from dictionary.columns
where libname='SASHELP' & memname='CARS' & 2<=varnum<=4;
quit;

data want;
set sashelp.cars;
rename &renlist;
run;
PaigeMiller
Diamond | Level 26

 


@FreelanceReinh wrote:

@PaigeMiller wrote:

If the variable names all have a pattern (such as x1-x27) and the renamed variables have a pattern (such as paige1-paige27) — which by the way are highly recommended variable names 😉  — then you can resort to some PROC SQL trickery. 


@PaigeMiller: I think you mean the RENAME statement in PROC DATASETS, which indeed doesn't accept variable lists -- unlike the RENAME statement of the DATA step where numbered variable lists are allowed:

rename x1-x27 = paige1-paige27;

Even if the ranges differ (but have the same length):

rename x1-x27 = paige20-paige46;

If the pattern is not of this special form, then I would start thinking about "PROC SQL trickery".

 

The problem using the DATA step rename is that the entire data set has to be read and written, and if this is a large data set then this gets time consuming. Using the PROC SQL trickery, combined with PROC DATASETS, only the metadata for a data set is changed, and no records are read or written. So generally, I do not like the DATA step rename, despite its programming simplicity.

--
Paige Miller
FreelanceReinh
Jade | Level 19

@PaigeMiller wrote:

So generally, I do not like the DATA step rename, despite its programming simplicity.


Same with me. Luckily (for my suggested code), the OP's example (using sashelp.cars) was not amenable to the metadata approach anyway.

Tom
Super User Tom
Super User

The problem using the DATA step rename is that the entire data set has to be read and written, and if this is a large data set then this gets time consuming. 

That is normally not a problem. The reason to rename the variables is normally to use the data in some way.  There generally no good reason to go back and rename the variable in an existing dataset. Instead fix the process that creates the dataset or the process that uses it.

 

PaigeMiller
Diamond | Level 26

@Tom wrote:

The problem using the DATA step rename is that the entire data set has to be read and written, and if this is a large data set then this gets time consuming. 

That is normally not a problem. The reason to rename the variables is normally to use the data in some way.  There generally no good reason to go back and rename the variable in an existing dataset. Instead fix the process that creates the dataset or the process that uses it.

 


I can't agree that this is not normally a problem. It is normally a problem. In most applications, we have no control over the naming of the variables in the databases we have to use, or the layout/structure of how the data arrives at our doorstep. In many cases, the data has to be re-arranged and/or renamed before we can then go ahead and do SAS analyses on them.

--
Paige Miller
Tom
Super User Tom
Super User

The RENAME statement needs the names of the variables.  

You don't need so many RENAME statements.  One statement can include as many pairs of old=new names as you want.

rename
  Model= Mo 
 Type= Ty
 Origin= Or
;
andreas_lds
Jade | Level 19

I have to agree to @Tom comment: renaming many variables is hardly ever a good idea. So why do have to change the names of many variables? Creating a view with the new names using the original dataset could be a solution, because you can use the lists mentioned by @FreelanceReinh and still avoid reading the dataset completely.

braam
Quartz | Level 8

@andreas_lds @Tom @FreelanceReinh @PaigeMiller 

Thanks all for your replies. The reason why I'm trying to do so is that I have many "character" variables that actually have numeric values. Let's say A, B, and C. Using array, I generated A2, B2, and C2(, which are numeric now) and dropped A, B, and C, (characters). Thereafter, I wanted to rename A2, B2, and C2 to A, B, and C, respectively. Many thanks for having informed me.

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @braam 

 

You can rename variables with Proc Datasets, and you can use arrays to automate the process with a data _null_ step that generates statements and executes them with call execute as shown below.

 

You cannot mix variable types in the same array, so you neew two arrays, one for numeric variables and another for character variables. The automatic arrays _character_ and _numeric_ takes all variables, that's why the length statement in the following code is placed after the array statements, otherwise they would be part of the array too.

 

The code handles all variables with names consisting of one or more letters followed by one or more digits. The digits are removed with prxchange, and if the resulting name is different, the variable is renamed.

 

I prefer to avoid changing things "in place", like renaming variables in an existing data set or using the same data set for input and output in a data step, because it destroys input data if anything goes wrong, so I added another step with a few changes to do the same with data step rename. It makes a copy, so it is ineffecient in comparison, byt I consider it "best practice".

 

data have;
	ID = 123;
	A1 = 'A';
	B2 = 'B';
	C2 = 1;
	D2D = 'D2D';
run;

data _null_; set have end=end;
	array char _character_;
	array num _numeric_;
	if _N_ = 1 then call execute('proc datasets library=work nolist; modify have;');

	length name newname $32;
	do i = 1 to dim(char);
		name = vname(char[i]);
		newname = prxchange('s/(\D+)(\d+)$/$1/',-1,trim(name));
		if newname ne name then do;
			cmd = catx(' ','rename',name,'=',newname,';');
			call execute(cmd);
		end;
	end;

	length name newname $32;
	do i = 1 to dim(num);
		name = vname(num[i]);
		newname = prxchange('s/(\D+)(\d+)/$1/',-1,name);
		if newname ne name then do;
			cmd = catx(' ','rename',name,'=',newname,';');
			call execute(cmd);
		end;
	end;
	if end then call execute('run; quit;'); 
run;

data _null_; set have end=end;
	array char _character_;
	array num _numeric_;
	if _N_ = 1 then call execute('data want; set have;');

	length name newname $32;
	do i = 1 to dim(char);
		name = vname(char[i]);
		newname = prxchange('s/(\D+)(\d+)$/$1/',-1,trim(name));
		if newname ne name then do;
			cmd = catx(' ','rename',name,'=',newname,';');
			call execute(cmd);
		end;
	end;

	length name newname $32;
	do i = 1 to dim(num);
		name = vname(num[i]);
		newname = prxchange('s/(\D+)(\d+)/$1/',-1,name);
		if newname ne name then do;
			cmd = catx(' ','rename',name,'=',newname,';');
			call execute(cmd);
		end;
	end;
	if end then call execute('run;'); 
run;

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 10928 views
  • 4 likes
  • 6 in conversation