I have an existing dataset which I will use for secondary data analysis. The dataset contains responses to a questionnaire. I will be using the SET statement to combine two datasets together so I would like to ensure that the variable names are identical. One of my datasets has the question numbers incorrect for some of the questions so I would like to correct them. I tried the RENAME function but I received the following error messages.
ERROR: Variable Q15 already exists on file xxx.QOL.
ERROR: Invalid DROP, KEEP, or RENAME option on file xxx.QOL.
An example of the code I am using is below:
xxx.qol (rename=(Q14=Q15 Q15=Q16 ...))
Other than creating a new unique variable name, is there a way for me to rename the variables so they have the correct numbering? Please let me know if you need any additional information
Thanks.
As you cannot rename variable to other existing variable name, you need use a temporary name, like in:
data want(rename=(var1=q16 var2=q17 ...));
set have(rename=(q15=var1 q16=var2 ...));
run;
You might want to investigate Proc datasets as it allows for many different types of modifications to an existing data set.
Do you have a data set that has something like OldName and NewName, or can build one then you can build a script that uses call execute to do the rename.
Here is a brief example:
data work.dummy;
input q2 q3 q4;
datalines;
1 2 3
;
run;
proc datasets library=work;
modify dummy;
rename
q2=q1
q3=q2
q4=q3
;
label q1='Old q2';
run;
quit;
Note the use of Quit to end the procedure. Proc Datasets supports multiple run groups, such as processing many different datasets within the library.
As you cannot rename variable to other existing variable name, you need use a temporary name, like in:
data want(rename=(var1=q16 var2=q17 ...));
set have(rename=(q15=var1 q16=var2 ...));
run;
Before you try anything fancy, I would try something simple. I can't test it now, but you may be able to get away with changing the order of the renames. Assuming that Q17 does not already exist:
rename=(Q16=Q17 Q15=Q16))
SAS may be applying the renames in order, so you could start with the largest number first to avoid conflicts.
Another thing to try is to rename the entire list with a short statement. Assuming that you want to rename Q14-Q90, but Q91 does not yet exist:
rename=(Q14-Q90=Q15-Q91)
Something to consider if you think you may have future sets that need to be aligned similar to this is HOW the data is read into SAS.
I work with some data sources that have changed slowly over the 15 years of the project. With monthly or quarterly text files to read when the layout changes I modify the data step code used to read the text file by dropping or adding variables to the input statement and adjusting informat to meet the changes (and add drop label statements). Even if the Order of the variable changes a little as long as it represents the same data I keep the same variable name and when data sets are combined across time then I don't run into renaming issues.
I also can keep consistent lengths for character variables to avoid potential data problems associated with mismatched lengths.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.