Hi all!
I am wondering how to create and replace a variable totally?
For example, I want to generate the variable x1 based on x, and I want to drop x, change the name of x1 to x
My code is:
data want (rename=(x1=x));
set have;
x1=x+2;
drop x;
run;
I am wondering if there is any other way or else to do so?
Warm regards.
If the variable is already numeric then just modify the VALUES and leave the name alone.
data want;
set have;
x=x*1000;
run;
If the variable is character and you want to make a new numeric variable that uses the same name then you will have to use rename (and perhaps also drop). But there is no need to use dataset options on either the input or output dataset referece.
Just use the normal data step commands.
data want;
set have;
new_x = input(x,32.)*1000;
rename new_x=x;
drop x;
run;
Rename x to something else when you do the SET command.
data want;
set have (rename=(x=old_x));
x=old_x+2;
drop old_x;
run;
Hi @PaigeMiller
Thank you for your answer,
But I do not why we need to rename on SET command rather than DATA statement. I did a test with data sashelp.class as below:
data class1;
set sashelp.class;
run;
data class2 (rename=(Age=birth));
set class1;
run;
data class3;
set class1 (rename=(Age=birth));
run;
The datasets class2 and class3 all have the column birth instead of Age.
Warm and thanks,
Phil.
@Phil_NZ wrote:
Hi @PaigeMiller
Thank you for your answer,
But I do not why we need to rename on SET command rather than DATA statement. I did a test with data sashelp.class as below:
data class1; set sashelp.class; run; data class2 (rename=(Age=birth)); set class1; run; data class3; set class1 (rename=(Age=birth)); run;
The datasets class2 and class3 all have the column birth instead of Age.
Warm and thanks,
Phil.
This isn't the same example. You are not replacing a variable with a given name with a new variable of the same name.
But anyway, @ballardw and @Tom have better solutions.
@Phil_NZ wrote:
Hi @PaigeMiller
Thank you for your answer,
But I do not why we need to rename on SET command rather than DATA statement. I did a test with data sashelp.class as below:
data class1; set sashelp.class; run; data class2 (rename=(Age=birth)); set class1; run; data class3; set class1 (rename=(Age=birth)); run;
The datasets class2 and class3 all have the column birth instead of Age.
Warm and thanks,
Phil.
Where you place the rename, data set option on set statement, data set option on data statement , or a simple rename statement not as an option depends on why you are renaming at all.
This also does the same as your simple examples:
data class4;
set class1;
rename Age=Birth;
run;
Sometimes you rename on a set statement because you are "setting" multiple data sets and two variables that need to be treated the same are set to the same name to make the coding "nicer"
data want; set dataone (rename=(value2020 = value) ) datatwo (rename=(value2021 = value) ) datathr (rename=(value2019 = value) ) ; <code that uses VALUE instead of bodging 3 variable names> ;
This happens quite often when you get data from multiple contributors or someone has lousy variable naming practices and places data, such as the year into a variable name.
You may also be creating two or more data sets and for some reason need the names in the output data sets to be different. That would be a case where the rename needs to be a data set option for each output set.
Sometimes it is just style choice if you don't manipulate the renamed variable.
Some people will use a data step to rename a variable without realizing that SAS has a procedure, DATASETS, that will change properties (name, format, label, informat) in place (not values). Which can be much quicker than a data step because when you use a data step every record is processed, which means moved through the data vector, and can take a long time for large data sets. Proc Datasets manipulated the values in the data set header.
If you are not changing the type just use the same variable on both sides of the =.
x=x+2;
for example will replace a non-missing value of X with the value of X with 2 added to it.
I specifically say "non-missing" because the result of adding 2 to a missing value is always 2. It is up to you to make sure the logic of the assignment meets the desired result.
Also, this is one of the reasons we strongly suggest not using the way to0 common structure;
data somename; set somename; <code here> ;
If you decide you need to do something else, possibly to a different variable or only for some values of x, all of the x's already had 2 added. So you would end up adding 2 again. And again. Every time you test a bit of logic or assignment code.
Hi @ballardw
Thank you for your comment. In my case specifically, I just want to generate a new variable x1=x*1000. Is there any notice with multiple if using your approach, please? The example above I created to know the logical way to do it.
It is really nice that your way does not need a new variable being generated. I am still ambiguous about this part
Also, this is one of the reasons we strongly suggest not using the way to0 common structure;
data somename; set somename; <code here> ;If you decide you need to do something else, possibly to a different variable or only for some values of x, all of the x's already had 2 added. So you would end up adding 2 again. And again. Every time you test a bit of logic or assignment code.
Many thanks and warmest regards,
Phil.
If the variable is already numeric then just modify the VALUES and leave the name alone.
data want;
set have;
x=x*1000;
run;
If the variable is character and you want to make a new numeric variable that uses the same name then you will have to use rename (and perhaps also drop). But there is no need to use dataset options on either the input or output dataset referece.
Just use the normal data step commands.
data want;
set have;
new_x = input(x,32.)*1000;
rename new_x=x;
drop x;
run;
Since you do not change the type, do this:
data want;
set have;
x1 = x1 + 2;
run;
Now, if x1 was accidentally stored as character, then you would need the RENAME= dataset option, and INPUT function and a DROP:
data want (rename=(x1=x));
set have;
x1 = input(x,32.) + 2;
drop x;
run;
As long as the type is numeric, you can do it "in place", even if you want to convert value types:
data have;
x1 = datetime();
format x1 e8601dt19.;
run;
data want;
set have;
x1 = datepart(x1);
format x1 yymmdd10.;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.