Solved: Re: Naming variables in sequential order

Job04 · Posted 03-09-2022 10:47 AM

data have1;

My data have 240 variables that are named sequentially VAR_1 to VAR_240. I want to change the name on the variables to read like this: VAR 001 VAR 002 .....VAR 069..... VAR125.......VAR 240. Below is how the data looks like and I import it using proc import.

Thank you

infile datalines dlm=',';
input id $ VAR_1 VAR_2 VAR_3 VAR_4;
datalines;
1,4.19855,5.74574,33.46678,6.85391
3,3.48004,6.69138,31.85662,11.73753
4,3.33851,5.74293,36.09064,10.9801
9,3.2966,8.15718,30.27008,7.62836
;

run;

Kurt_Bremser · Posted 03-09-2022 11:09 AM

proc transpose data=have out=long;
by id;
var var:;
run;

data want;
set long;
sequence = input(scan(_name_,2,"_"),best.);
drop _name_;
rename col1=value;
run;

You will probably be much better off keeping the long structure, but if you absolutely want the unwieldy wide layout, you can do this:

data long2;
set want;
idvar = 'VAR_' !! put(sequence,z3.);
run;

proc transpose data=long2 out=want_wide (drop=_name_);
by id;
var value;
id idvar;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

mkeintz · Posted 03-09-2022 11:05 AM

If you really want to change names to a structure like VAR 001 (i.e. with an internal blank), you are committing yourself to using name literals to refer to those variables.

I.e to rename VAR_001 to VAR 001, you would need the expression

rename   VAR_001='VAR 001'n

and all subsequent references to that varible would require you to use 'VAR 001'n instead of the far more easily typed VAR_001.

Doing a rename of all the variables in a compact form would be relatively easy. But do you really want to risk forgetting an apostrophe?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sbxkoenk · Posted 03-09-2022 11:06 AM

PROC DATASETS LIBRARY=MYLIB NoList memtype=DATA;
 modify MyDataSet;
  rename 
VAR_1   = 'VAR 001'n
VAR_2   = 'VAR 002'n
...
VAR_69  = 'VAR 069'n
...

VAR_125 = 'VAR 125'n
...
VAR_240 = 'VAR 240'n ; run;
QUIT;

Job04 · Posted 03-09-2022 11:15 AM

Yes I needed the change to be consistent with other datasets and documentation.

PaigeMiller · Posted 03-09-2022 11:08 AM

May I ask why you want to do this? It seems to me that you are committing to doing a fair amount of work, for (in my opinion) no benefit.

--
Paige Miller

Job04 · Posted 03-09-2022 11:21 AM

I agree. But some work involves biochemists and other branches of medical field. When they generate data usually they put alot of description. Example, the sample 's id is too long that contains location of hospitals-phase of trial-patient's number-buffer : all this together. I'm doing this as matter of convenience and communication between people.

Reeza · Posted 03-09-2022 11:34 AM

@Job04 wrote:

I agree. But some work involves biochemists and other branches of medical field. When they generate data usually they put alot of description. Example, the sample 's id is too long that contains location of hospitals-phase of trial-patient's number-buffer : all this together. I'm doing this as matter of convenience and communication between people.

I would recommend using labels instead. When you export the data you can specify that the labels be output which are display friendly. Then the variable names are easier to program.

Kurt_Bremser · Posted 03-09-2022 11:09 AM

proc transpose data=have out=long;
by id;
var var:;
run;

data want;
set long;
sequence = input(scan(_name_,2,"_"),best.);
drop _name_;
rename col1=value;
run;

You will probably be much better off keeping the long structure, but if you absolutely want the unwieldy wide layout, you can do this:

data long2;
set want;
idvar = 'VAR_' !! put(sequence,z3.);
run;

proc transpose data=long2 out=want_wide (drop=_name_);
by id;
var value;
id idvar;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Job04 · Posted 03-09-2022 11:36 AM

Perfect. Thank you so much!

FreelanceReinh · Posted 03-09-2022 11:33 AM

@Job04 wrote:

My data have 240 variables that are named sequentially VAR_1 to VAR_240. (...) I import it using proc import.

You can use variable lists in a RENAME= dataset option of the dataset specified in the OUT= option of PROC IMPORT:

out=want(rename=(var_1-var_240 = var001-var240))

The same works with 'var 001'n-'var 240'n, but I wouldn't recommend such names.

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away