SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Change all variable names into an indexed sequence

Accepted Solution Solved
Reply
Contributor
Posts: 56
Accepted Solution

Change all variable names into an indexed sequence

Hello everyone,

 

I have a very wide dataset with 229 variables. 

I am trying to change thoese variables' names into an enumerated sequence such that my final result will look like so:

sq_1 sq_2 sq_3.... sq_227.

 

I tried this macro but it failed:

%macro rename(lib,dsn,newname);
proc contents data=&lib..&dsn;
title 'before renaming';
run;
proc sql noprint;
%rename(WORK,call_qs1);
from dictionary.tables
where libname="&LIB" and memname="&DSN";
select distinct(name) into :var1-:var%trim(%left(&num_vars))
from dictionary.columns
where libname="&LIB" and memname="&DSN";
quit;
run;
proc datasets library = &LIB;
modify &DSN;
rename
%do i = 1 %to &num_vars.;
&&var&i = &sq._&i.
%end;
;
quit;
run;
proc contents data=&lib..&dsn.;
title 'after renaming';
run;
%mend rename;

%rename(WORK,call_qs1,call_qs2);

 

 

As always, Thank you for your help.

 


Accepted Solutions
Solution
‎01-20-2016 08:15 AM
Super User
Posts: 9,687

Re: Change all variable names into an indexed sequence

Here is an example :

 

 

data have;
 set sashelp.class;
run;





proc transpose data=have(obs=0) out=temp;
 var _all_;
run;
data temp;
 set temp;
 length name $ 20;
 name=cats('sq_',_n_);
run;
proc sql;
 select cats(_name_,'=',name) into : rename separated by  ' '
  from temp;
quit;
proc datasets library=work nodetails nolist;
 modify have;
 rename &rename ;
quit;

View solution in original post


All Replies
Super User
Posts: 6,963

Re: Change all variable names into an indexed sequence

What was your problem, and what does your log look like?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 56

Re: Change all variable names into an indexed sequence

Sas just crashes after I run the macro.

I am unable to see the log, unfortunately.

Super User
Posts: 6,963

Re: Change all variable names into an indexed sequence

Did you notice that you have a recursion in your macro?

It calls itself right after the proc sql noprint;

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 56

Re: Change all variable names into an indexed sequence

Oh yes, I can see that there is a log output repeating itself. I couldn't not quite read what it says as it goes fast and SAS crashes.
Solution
‎01-20-2016 08:15 AM
Super User
Posts: 9,687

Re: Change all variable names into an indexed sequence

Here is an example :

 

 

data have;
 set sashelp.class;
run;





proc transpose data=have(obs=0) out=temp;
 var _all_;
run;
data temp;
 set temp;
 length name $ 20;
 name=cats('sq_',_n_);
run;
proc sql;
 select cats(_name_,'=',name) into : rename separated by  ' '
  from temp;
quit;
proc datasets library=work nodetails nolist;
 modify have;
 rename &rename ;
quit;
Super User
Super User
Posts: 7,413

Re: Change all variable names into an indexed sequence

Well, you can do it with arrays:

data have;
  vara=1; varxyz=2; abc=4;
run;

data want (keep=sq_:);
  set have;
  array aq_{*} vara--abc;
  array sq_{3} 8.;
  do i=1 to 3;
    sq_{i}=aq_{i};
  end;
run;

What I don't understand however is why this does not work, I assume the array is defined as _temporary_ but can't seem to find an option to make it permanent:

data have;
  vara=1; varxyz=2; abc=4;
run;

data want (keep=sq_:);
  set have;
  array sq_{*} vara--abc;
run;

To my mind the above should work?

Trusted Advisor
Posts: 1,115

Re: Change all variable names into an indexed sequence

@RW9: Isn't this just the usual behavior of arrays in SAS ("temporarily identifying a group of variables" [online help])? That is, in your example you can refer to the existing variables vara, ..., abc as sq_[1], ..., sq_[3] within the data step where the array is defined, but this does not change the variables' names and does not create new variables sq_1, ..., sq_3.

Super User
Super User
Posts: 7,413

Re: Change all variable names into an indexed sequence

Well, this is what is confusing me.  In the first example, I have the temporary reference to the variables, and I have another array, and basically copy to the sq_ array.  This works, and we get the variables sq_1-sq_3 - note I have not put any other options on that.  In the second instance, aq_1-aq_3 is created as I can reference it in the code, but it is removed before the end of the datastep as its considered temporary.  So the question is whys does:

array sq_{3} 8.;

Create the variables sq_1-sq3 and keep them, but:

array aq_{*} vara--abc;

Creates aq_1-aq_3, but creates a temporary so get dropped.

 

And is there an options I can pubt on that line:

array aq_{*} _nottemporary_ vara--abc;

To avoid this, as with that optin the OP's request is simply a one line statement then.

Trusted Advisor
Posts: 1,115

Re: Change all variable names into an indexed sequence

@RW9: As I see it, an array is a list of variables which can (individually) be referenced by the array name followed by a subscript in brackets (or parentheses or curly brackets ...) after the array has been defined, but only within the same data step (and not in DROP/KEEP/RENAME statements or the corresponding dataset options). In a _TEMPORARY_ array, these variables (!) exist only during the data step where the array is defined (and can only be referenced using the subscript notation, are automatically retained, but not written to the output dataset).

 

In an array definition like array sq_{3} 8; (no need for a period after the 8 -- it's the length, not a format) the list of variables is implicitly defined by default as sq_1, sq_2, sq_3. In a definition like array aq_{*} vara varxyz abc; the list of variables is explicitly defined. Special variable lists such as vara--abc or c: or _numeric_ require the existence of corresponding variables in the PDV. In these cases the order of ARRAY and, e.g., SET statement is important.

 

In all cases where the list of variables does not require the existence of the variables in the PDV, one of the following two cases applies to each variable in the list:

  1. the variable exists in the PDV: then it can henceforth (but only within the same data step, and see further restrictions above) be referenced as the respective array element, i.e. with the subscript notation like aq_{2}
  2. the variable does not exist in the PDV: then it is created as a new variable (and can be referenced ... as above)

So, in your first example, array aq_{*} vara--abc; just enables you to refer, e.g., to the existing variable abc as aq_{3} (but not as aq_3, there is no such variable in the PDV!), this is case 1. Since dataset HAVE does not contain variables sq_1, sq_2, sq_3, the array definition array sq_{3} 8; creates new variables (i.e. adds them to the PDV) with those default names, this is case 2. The new variables are by default kept in the output dataset, like any other new variables.

 

I think, the above explains already why the second example (with keep=sq_:) does not work: There is no variable whose name starts with sq_ in the PDV and the KEEP= option would not even accept a reference like sq_{1} (with the intention to keep variable vara).

 

There seems to be no option to kind of duplicate a list of existing variables, so that array aq_{*} option vara--abc; would create new variables aq_1, ... with the same values as vara, ... Instead, it's either case 1 or case 2 that applies to each individual variable in the list. In the definition array aq_{*} vara newvar abc; case 1 would apply to vara and abc and case 2 to newvar (i.e., a new variable newvar would be created). If dataset HAVE happened to contain variables sq_1 and sq_3, but not sq_2, the statement array sq_{3}; would create only one new variable, sq_2.

 

 

 

 

Contributor
Posts: 56

Re: Change all variable names into an indexed sequence

Thank you very much this suggestion worked just fine.

 

Thank you ever so much. A quick explanation for the following line would be great if you would:

 

proc sql;
 select cats(_name_,'=',name) into : rename separated by  ' '
  from temp;
quit;

 Is it storing all variables names in a vector called rename? 

 

Thank you.

Super User
Posts: 6,963

Re: Change all variable names into an indexed sequence

"Is it storing all variables names in a vector called rename?"

 

Yes, but in SAS speak this is not a vector, it's a macro variable.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 56

Re: Change all variable names into an indexed sequence

Oh okay. I got it!

Thank you ever so much.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 865 views
  • 2 likes
  • 5 in conversation