Hi,
I would like to create a new variable "type" based on conditions being true across multiple variables, but I have too many variables (~100) to type. I am using SAS Studio v 9.4.
My data is set up similar to this:
DATA have;
INPUT id
a_var_a a_var_b a_var_c a_var_d a_var_e
b_var_a b_var_b b_var_c b_var_d
c_var_a c_var_b c_var_c d_var_d;
DATALINES;
01 1 0 0 0 0 0 0 0 0 0 0 0 0
02 0 1 0 0 0 0 0 0 0 0 0 0 0
03 0 0 1 0 0 0 0 0 0 0 0 0 0
04 0 0 0 1 0 0 0 0 0 0 0 0 0
05 0 0 0 0 1 0 0 0 0 0 0 0 0
06 0 0 0 0 0 1 0 0 0 0 0 0 0
07 0 0 0 0 0 0 1 0 0 0 0 0 0
08 0 0 0 0 0 0 0 1 0 0 0 0 0
09 0 0 0 0 0 0 0 0 1 0 0 0 0
10 0 0 0 0 0 0 0 0 0 1 0 0 0
11 0 0 0 0 0 0 0 0 0 0 1 0 0
12 0 0 0 0 0 0 0 0 0 0 0 1 0
13 0 0 0 0 0 0 0 0 0 0 0 0 1
;
Run;
I thought it would be as simple as:
Data want;
Set have;
If a_var: = 1 then type = 1;
Else If b_var: = 1 then type = 2;
Else If c_var: = 1 then type = 3;
Else type = 0;
Run;
However I keep getting an error code because I am not allowed to group the variables.
Data want;
Set have;
Array a (*) a_var:;
Array other (2,4) b_var: c_var:;
do i = 1 to dim(a);
If a(i) = 1 then type=1;
end;
do i = 1 to 4;
If other (1,i) = 1 then type=2;
If other (2,i) = 1 then type=3;
Else type=0;
end;
drop i;
Run;
Thank you!
Some slight modifications but this is the code that eventually worked.
DATA have;
INPUT id
a_var_a a_var_b a_var_c a_var_d a_var_e
b_var_a b_var_b b_var_c b_var_d
c_var_a c_var_b c_var_c c_var_d;
if whichn (1, of a_var:) =>1 then type=1;
else if whichn (1, of b_var:) =>1 then type=2;
else if whichn(1, of c_var:) =>1 then type=3;
else type = 0;
DATALINES;
01 1 0 0 0 0 0 0 0 0 0 0 0 0
02 0 1 0 0 0 0 0 0 0 0 0 0 0
03 0 0 1 0 0 0 0 0 0 0 0 0 0
04 0 0 0 1 0 0 0 0 0 0 0 0 0
05 0 0 0 0 1 0 0 0 0 0 0 0 0
06 0 0 0 0 0 1 0 0 0 0 0 0 0
07 0 0 0 0 0 0 1 0 0 0 0 0 0
08 0 0 0 0 0 0 0 1 0 0 0 0 0
09 0 0 0 0 0 0 0 0 1 0 0 0 0
10 0 0 0 0 0 0 0 0 0 1 0 0 0
11 0 0 0 0 0 0 0 0 0 0 1 0 0
12 0 0 0 0 0 0 0 0 0 0 0 1 0
13 0 0 0 0 0 0 0 0 0 0 0 0 1
14 0 0 0 0 0 0 0 0 0 0 0 0 0
;
Run;
Your example data implies that each of your rows may only have one value of '1' and you want the position in the list of variables returned as the value of TYPE.
If that is the case the perhaps the WHICHN function will solve the problem:
DATA have; INPUT id a_var_a a_var_b a_var_c a_var_d a_var_e b_var_a b_var_b b_var_c b_var_d c_var_a c_var_b c_var_c d_var_d; array values a_var_a a_var_b a_var_c a_var_d a_var_e b_var_a b_var_b b_var_c b_var_d c_var_a c_var_b c_var_c d_var_d; type = whichn(1,of values(*)); DATALINES; 01 1 0 0 0 0 0 0 0 0 0 0 0 0 02 0 1 0 0 0 0 0 0 0 0 0 0 0 03 0 0 1 0 0 0 0 0 0 0 0 0 0 04 0 0 0 1 0 0 0 0 0 0 0 0 0 05 0 0 0 0 1 0 0 0 0 0 0 0 0 06 0 0 0 0 0 1 0 0 0 0 0 0 0 07 0 0 0 0 0 0 1 0 0 0 0 0 0 08 0 0 0 0 0 0 0 1 0 0 0 0 0 09 0 0 0 0 0 0 0 0 1 0 0 0 0 10 0 0 0 0 0 0 0 0 0 1 0 0 0 11 0 0 0 0 0 0 0 0 0 0 1 0 0 12 0 0 0 0 0 0 0 0 0 0 0 1 0 13 0 0 0 0 0 0 0 0 0 0 0 0 1 ; Run;
Whichn looks for the value of the first parameter, in this case 1, and returns the position number of the value found in the list of variables or 0 if not found. The " of arrayname(*)" provides a list of variables and the function would return the list position in regards to the order they are listed in the array definition statement.
If you need to consider multiple variables with values then you need to provide a clearer example of that data and the rules for assigning the type based on two or more values.
Thanks, but I am less concerned with the position of the 1. Instead, I am trying to create 3 categories of the "type" variable (0,1,2, and 3) based on how the conditions are met.
@Doyinsola wrote:
Thanks, but I am less concerned with the position of the 1. Instead, I am trying to create 3 categories of the "type" variable (0,1,2, and 3) based on how the conditions are met.
Then you need to explicitly give us the rules for creating the type variable.
Code that does not do what you want does not provide details on how to do what is needed.
@Doyinsola wrote:
Hi,
I would like to create a new variable "type" based on conditions being true across multiple variables, but I have too many variables (~100) to type. I am using SAS Studio v 9.4.
My data is set up similar to this:
D
proc sgplot data=sashelp.class;
series x=height y=weight;
xaxis labelattrs=(family='Arial');
run;I thought it would be as simple as:
Data want; Set have;
If a_var: = 1 then type = 1; Else If b_var: = 1 then type = 2; Else If c_var: = 1 then type = 3; Else type = 0; Run;However I keep getting an error code because I am not allowed to group the variables.
Try the WHICHN function.
Something like this:
DATA have;
INPUT id
a_var_a a_var_b a_var_c a_var_d a_var_e
b_var_a b_var_b b_var_c b_var_d
c_var_a c_var_b c_var_c d_var_d;
if whichn(1,of a_var:)>0 then type = 1;
DATALINES;
01 1 0 0 0 0 0 0 0 0 0 0 0 0
02 0 1 0 0 0 0 0 0 0 0 0 0 0
03 0 0 1 0 0 0 0 0 0 0 0 0 0
04 0 0 0 1 0 0 0 0 0 0 0 0 0
05 0 0 0 0 1 0 0 0 0 0 0 0 0
06 0 0 0 0 0 1 0 0 0 0 0 0 0
07 0 0 0 0 0 0 1 0 0 0 0 0 0
08 0 0 0 0 0 0 0 1 0 0 0 0 0
09 0 0 0 0 0 0 0 0 1 0 0 0 0
10 0 0 0 0 0 0 0 0 0 1 0 0 0
11 0 0 0 0 0 0 0 0 0 0 1 0 0
12 0 0 0 0 0 0 0 0 0 0 0 1 0
13 0 0 0 0 0 0 0 0 0 0 0 0 1
;
Run;
By the way, if you have hundreds of variables, might it not make more sense to name them a1-a100, etc.?
Hi Paige,
This won't work because I am trying to create 3 categories of the "type" variable (0,1,2, and 3) based on how the conditions are met.
@Doyinsola wrote:
"type" is:
1 If any of the group a vars (a_var:) are equal to 1
2 If any of the group b vars (b_var:) are equal to 1
3 If any of the group c vars (c_var:) are equal to 1
else 0
I feel like writing some slightly obscure code:
code = (whichn(1, of a_var:)>0 )*1 + (whichn(1, of b_var:)>0 )*2 + (whichn(1, of c_var:)>0 )*3 ;
which assumes that you have 1's in only one of the a_ b_ or c_ var groups. If this results in a 4 or 5 then your problem description is still incomplete.
Thank you!
Some slight modifications but this is the code that eventually worked.
DATA have;
INPUT id
a_var_a a_var_b a_var_c a_var_d a_var_e
b_var_a b_var_b b_var_c b_var_d
c_var_a c_var_b c_var_c c_var_d;
if whichn (1, of a_var:) =>1 then type=1;
else if whichn (1, of b_var:) =>1 then type=2;
else if whichn(1, of c_var:) =>1 then type=3;
else type = 0;
DATALINES;
01 1 0 0 0 0 0 0 0 0 0 0 0 0
02 0 1 0 0 0 0 0 0 0 0 0 0 0
03 0 0 1 0 0 0 0 0 0 0 0 0 0
04 0 0 0 1 0 0 0 0 0 0 0 0 0
05 0 0 0 0 1 0 0 0 0 0 0 0 0
06 0 0 0 0 0 1 0 0 0 0 0 0 0
07 0 0 0 0 0 0 1 0 0 0 0 0 0
08 0 0 0 0 0 0 0 1 0 0 0 0 0
09 0 0 0 0 0 0 0 0 1 0 0 0 0
10 0 0 0 0 0 0 0 0 0 1 0 0 0
11 0 0 0 0 0 0 0 0 0 0 1 0 0
12 0 0 0 0 0 0 0 0 0 0 0 1 0
13 0 0 0 0 0 0 0 0 0 0 0 0 1
14 0 0 0 0 0 0 0 0 0 0 0 0 0
;
Run;
@Doyinsola wrote:
Hi Paige,
This won't work because I am trying to create 3 categories of the "type" variable (0,1,2, and 3) based on how the conditions are met.
That's right, I said "something like this". It covers the case where any of the "a" variables equals 1. Obvious modifications to the code get what you want, that I suspect you can make without further issue.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.