Hi SAS community,
I had a question about the data set step when I tried to re-assign new value to an existing variable. Below are 2 example codes. The second example codes work good and generate the data what we want. Somehow the first code cannot assign the value properly. And I would like to know the reason for why code 1 not working.
Example code 1: ( I want to assign the all the non missing var2 from data d to missing value variable var1 from data b. However, in the test data, only the first non missing var2 from data d assigned to all the var1 missing value records.)
data test;
set a b c d;
if b.var1='' and d.var2^='' then var1=var2; (variable 1 come from data b, while variable 2 come from data d)
IF var1^='';
run;
Example code 2: (I separate the same code into 2 data step and the value can be assigned properly)
data test;
set a b c d;
run;
data test;
set test;
if var1='' and var2^='' then var1=var2; (variable 1 come from data b, while variable 2 come from data d)
IF var1^='';
run;
@annie0 wrote:
Hi SAS community,
I had a question about the data set step when I tried to re-assign new value to an existing variable. Below are 2 example codes. The second example codes work good and generate the data what we want. Somehow the first code cannot assign the value properly. And I would like to know the reason for why code 1 not working.
Example code 1: ( I want to assign the all the non missing var2 from data d to missing value variable var1 from data b. However, in the test data, only the first non missing var2 from data d assigned to all the var1 missing value records.)
data test;
set a b c d;
if b.var1='' and d.var2^='' then var1=var2; (variable 1 come from data b, while variable 2 come from data d)
IF var1^='';
run;
Using b.var1 and d.var2 is is not valid data step code. Variable names can not have a dot in the name. Probably you want:
if var1='' and var2^='' then var1=var2;
TableName.VariableName (in red below) method of referring to variables is valid in SQL only, it is not valid in SAS data steps. In a data step you can only reference variables with their actual names. You can determine which records come from which data set using the IN data set option if desired.
@annie0 wrote:
Hi SAS community,
I had a question about the data set step when I tried to re-assign new value to an existing variable. Below are 2 example codes. The second example codes work good and generate the data what we want. Somehow the first code cannot assign the value properly. And I would like to know the reason for why code 1 not working.
Example code 1: ( I want to assign the all the non missing var2 from data d to missing value variable var1 from data b. However, in the test data, only the first non missing var2 from data d assigned to all the var1 missing value records.)
data test;
set a b c d;
if b.var1='' and d.var2^='' then var1=var2; (variable 1 come from data b, while variable 2 come from data d)
IF var1^='';
run;
Example code 2: (I separate the same code into 2 data step and the value can be assigned properly)
data test;
set a b c d;run;
data test;
set test;
if var1='' and var2^='' then var1=var2; (variable 1 come from data b, while variable 2 come from data d)
IF var1^='';
run;
do you have a variable in the a b c and d datasets like an ID that the datasets share?
otherwise your just stacking the data in examples 1 and 2.
You may need to clear up your concept of what happens when you list multiple data sets in a SET statement:
set a b c d;
This takes the observations from each data set, separately (all from a, then all from b, then all from c, then all from d).
There are no observations that come from both data set b and data set d.
If you spell out a few observations from each data set, plus the result you are hoping to obtain, you can get suggestions about what the program should look like. I suspect you need:
set a b c d (rename=(var2=var1));
Note that the RENAME here applies only to data set d observations.
The result your program produces actually requires fairly complex logic to understand. Variables that come from a SAS data set are automatically retained. So when data set D contains VAR2 (but not VAR1), your program changes VAR1 on the first observation from data set D. Because data set B contains VAR1, the value is retained. It just sits there when SAS encounters the second observation from data set D.
proc sql;
select a.*, b.*, c.*, d.*
from a as a,
b as b,
c as c,
d as d
case when b.var1 = null and d.var2 ^= null then b.var1 = d.var2
end
order by ID;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.