Hello everyone,
I am trying to create a new binary variable based on the conditions of multiple other variables, see below for example dataset. Var1-3 values can be 0-3, -99, or missing and these are 3 non-overlapping variables (i.e. if there is a value for Var1, there will not be a value for Var2-3). I'd like my NEW_VAR to code for the presence of a "legit" value (which I am defining as either a 1-3 in any Var1-3.). basically if any of the values for Var1-3 = 0 or -99 then NEW_VAR=0, else NEW_VAR=1. The problem with the code I below is that everything gets coded as 0 in the NEW_VAR. Any help is much appreciated.
Attempted code:
data want;
set have;
if (var1>=1 or var2>=1 or var3>=1) then new_var=1;
if (var1<=0 or var2<=0 or var3<=0)then new_var=0;
run;
Desired output:
ID | Var1 | Var2 | Var3 | NEW_VAR |
1 | 3 | . | . | 1 |
2 | . | . | 0 | 0 |
3 | . | -99 | . | 0 |
4 | . | . | 2 |
1 |
data want;
set have;
new_var = coalesce(var1, var2, var3) in (1, 2, 3);
run;
Coalesce will get the single value from the 3 so you don't need all three comparisons.
IN will check if the value is between 1 and 3, and if so, returns a 1, otherwise it returns 0.
@monsterpie wrote:
Hello everyone,
I am trying to create a new binary variable based on the conditions of multiple other variables, see below for example dataset. Var1-3 values can be 0-3, -99, or missing and these are 3 non-overlapping variables (i.e. if there is a value for Var1, there will not be a value for Var2-3). I'd like my NEW_VAR to code for the presence of a "legit" value (which I am defining as either a 1-3 in any Var1-3.). basically if any of the values for Var1-3 = 0 or -99 then NEW_VAR=0, else NEW_VAR=1. The problem with the code I below is that everything gets coded as 0 in the NEW_VAR. Any help is much appreciated.
Attempted code:
data want; set have; if (var1>=1 or var2>=1 or var3>=1) then new_var=1; if (var1<=0 or var2<=0 or var3<=0)then new_var=0; run;
Desired output:
ID Var1 Var2 Var3 NEW_VAR 1 3 . . 1 2 . . 0 0 3 . -99 . 0 4 . . 2 1
data want;
set have;
new_var = coalesce(var1, var2, var3) in (1, 2, 3);
run;
Coalesce will get the single value from the 3 so you don't need all three comparisons.
IN will check if the value is between 1 and 3, and if so, returns a 1, otherwise it returns 0.
@monsterpie wrote:
Hello everyone,
I am trying to create a new binary variable based on the conditions of multiple other variables, see below for example dataset. Var1-3 values can be 0-3, -99, or missing and these are 3 non-overlapping variables (i.e. if there is a value for Var1, there will not be a value for Var2-3). I'd like my NEW_VAR to code for the presence of a "legit" value (which I am defining as either a 1-3 in any Var1-3.). basically if any of the values for Var1-3 = 0 or -99 then NEW_VAR=0, else NEW_VAR=1. The problem with the code I below is that everything gets coded as 0 in the NEW_VAR. Any help is much appreciated.
Attempted code:
data want; set have; if (var1>=1 or var2>=1 or var3>=1) then new_var=1; if (var1<=0 or var2<=0 or var3<=0)then new_var=0; run;
Desired output:
ID Var1 Var2 Var3 NEW_VAR 1 3 . . 1 2 . . 0 0 3 . -99 . 0 4 . . 2 1
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.