Hello everyone,
I am trying to create a new binary variable based on the conditions of multiple other variables, see below for example dataset. Var1-3 values can be 0-3, -99, or missing and these are 3 non-overlapping variables (i.e. if there is a value for Var1, there will not be a value for Var2-3). I'd like my NEW_VAR to code for the presence of a "legit" value (which I am defining as either a 1-3 in any Var1-3.). basically if any of the values for Var1-3 = 0 or -99 then NEW_VAR=0, else NEW_VAR=1. The problem with the code I below is that everything gets coded as 0 in the NEW_VAR. Any help is much appreciated.
Attempted code:
data want;
set have;
if (var1>=1 or var2>=1 or var3>=1) then new_var=1;
if (var1<=0 or var2<=0 or var3<=0)then new_var=0;
run;
Desired output:
ID | Var1 | Var2 | Var3 | NEW_VAR |
1 | 3 | . | . | 1 |
2 | . | . | 0 | 0 |
3 | . | -99 | . | 0 |
4 | . | . | 2 |
1 |
data want;
set have;
new_var = coalesce(var1, var2, var3) in (1, 2, 3);
run;
Coalesce will get the single value from the 3 so you don't need all three comparisons.
IN will check if the value is between 1 and 3, and if so, returns a 1, otherwise it returns 0.
@monsterpie wrote:
Hello everyone,
I am trying to create a new binary variable based on the conditions of multiple other variables, see below for example dataset. Var1-3 values can be 0-3, -99, or missing and these are 3 non-overlapping variables (i.e. if there is a value for Var1, there will not be a value for Var2-3). I'd like my NEW_VAR to code for the presence of a "legit" value (which I am defining as either a 1-3 in any Var1-3.). basically if any of the values for Var1-3 = 0 or -99 then NEW_VAR=0, else NEW_VAR=1. The problem with the code I below is that everything gets coded as 0 in the NEW_VAR. Any help is much appreciated.
Attempted code:
data want; set have; if (var1>=1 or var2>=1 or var3>=1) then new_var=1; if (var1<=0 or var2<=0 or var3<=0)then new_var=0; run;
Desired output:
ID Var1 Var2 Var3 NEW_VAR 1 3 . . 1 2 . . 0 0 3 . -99 . 0 4 . . 2 1
data want;
set have;
new_var = coalesce(var1, var2, var3) in (1, 2, 3);
run;
Coalesce will get the single value from the 3 so you don't need all three comparisons.
IN will check if the value is between 1 and 3, and if so, returns a 1, otherwise it returns 0.
@monsterpie wrote:
Hello everyone,
I am trying to create a new binary variable based on the conditions of multiple other variables, see below for example dataset. Var1-3 values can be 0-3, -99, or missing and these are 3 non-overlapping variables (i.e. if there is a value for Var1, there will not be a value for Var2-3). I'd like my NEW_VAR to code for the presence of a "legit" value (which I am defining as either a 1-3 in any Var1-3.). basically if any of the values for Var1-3 = 0 or -99 then NEW_VAR=0, else NEW_VAR=1. The problem with the code I below is that everything gets coded as 0 in the NEW_VAR. Any help is much appreciated.
Attempted code:
data want; set have; if (var1>=1 or var2>=1 or var3>=1) then new_var=1; if (var1<=0 or var2<=0 or var3<=0)then new_var=0; run;
Desired output:
ID Var1 Var2 Var3 NEW_VAR 1 3 . . 1 2 . . 0 0 3 . -99 . 0 4 . . 2 1
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.