How to create a variable based on multiple conditions with lots of var...

SAS-questioner

Hi, I have a lots of variables, q1 to q20 let's say. And I want to create a new variable by several conditions based on the 20 variables:

1. if any one of the 20 variables > 0 then flag=1; (as long as anyone of them > 0)

2.else if any one of the 20 variables = -7 or -8 or -9 then flag=.; (for the rest of the values which <=0)

3. else if ALL of the 20 variables = 0 then flag=0; (for the rest of the values which =0)

For example

q1  q2  q3 .........   q20     flag
20  0    0              0       1
20  20   0              0       1
20  0   -7              0       1
0   0    0              0       0
0  -8    0              0       .
-8 -7   -7              0       .

Because there are too many variables, I tried array:

flag=0;
array q q1-q20;
       do over q;
                    if q > 0 then flag=1;
                    else if q in ('7', '8', '9') then flag=.;
        end;
run;

But the else if here didn't work. Does anyone know how to deal with many variables and multiple conditions? Thank you!

PaigeMiller

data want;
    set have;
    array q q1-20;
    /* Condition 1 */
    if max(of q:)>0 then flag=1;
    /* Condition 2 */
    else do i=1 to 20;
       if q(i) in (-9 -8 -7) then flag=.;
       leave;
    end;
    /* Condition 3 */
    else if max(of q:)=0 and min(of q:)=0 then flag=0;
run;

I wonder if there is a flaw in your logic. If q1 = -1 and all the other q's are equal to zero, then none of the conditions execute. The code mirrors your logic. However, this gives a value of FLAG equal to missing, is that what you want?

--
Paige Miller

SAS-questioner

Could you be more specific about the flaw? if q1= -7 or -8 or -9 and all the other q's are equal to 0, then flag should be missing. I tried your code, but error said "ERROR 160-185: No matching IF-THEN clause" for the last else there.

AhmedAl_Attar

Try this

data have;
	q1=1; q2=0;	q3= 2; q4=3; q5=4; output;
	q1=0; q2=0;	q3= 0; q4=0; q5=0; output;
	q1=0; q2=0;	q3=-9; q4=0; q5=-7;	output;
run;

/* This should work with your data set of 20 vars */
data want;
  set have;
  if (sum(of q:)=0) then flag=0;
  else if (max(of q:) > 0) then flag=1;
  else if (sum(whichn(-9, of q:),whichn(-8, of q:),whichn(-7, of q:)  ) > 0) then flag=-1;
run;

PaigeMiller

@SAS-questioner wrote:
Could you be more specific about the flaw? if q1= -7 or -8 or -9 and all the other q's are equal to 0, then flag should be missing. I tried your code, but error said "ERROR 160-185: No matching IF-THEN clause" for the last else there.

What happens if your logic encounters q1 = -1 (not negative seven) and the rest of the q values are zeros? It also is missing according to your stated logic.

When you get an error in the log, show us the entire log for this data step. Do not show us a partial log, and do not show us just the error messages.

--
Paige Miller

Patrick

Below option should work even if the sum of all q vars should result in zero (obs 7 in sample data).

data have;
  length flag_want 8;
  infile datalines truncover;
  input (q1-q20 flag_want) (:best32.);
  datalines;
20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
20 20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
20 0 -7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 -8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .
-8 -7 -7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .
-7 1 1 1 1 1 1 1  0 0 0 0 0 0 0 0 0 0 0 0 1
;
run;

data want;
  length flag_derived 8;
  set have;
  array a_q (*) q:;
  do i=1 to dim(a_q);
    /*  1. if any one of the 20 variables > 0 then flag=1 */
    if a_q[i]>0 then do; flag_derived=1; leave; end;
    /* 2.else if any one of the 20 variables = -7 or -8 or -9 then flag=. */
    else 
    if a_q[i] in (-7,-8,-9) then flag_derived=sum(flag_derived,2);
    /* 3. else if ALL of the 20 variables = 0 then flag=0 */
    else
    if a_q[i]=0 then flag_derived=sum(flag_derived,0);
  end;
  if flag_derived>1 then call missing(flag_derived);
  drop i;
run;

proc print data=want;
run;

Tom

Does not look like you need to use looping. Instead just use functions that can operate on multiple variables.

data have;
  input q1-q20 flag_want;
datalines;
20  0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
20 20  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
20  0 -7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
 0  0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 -8  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .
-8 -7 -7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .
-7  1  1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1
;

data want;
  set have;
  if max(0,of q1-q20) > 0 then flag=1;
  else if whichn(-7,of q1-q20)
       or whichn(-8,of q1-q20)
       or whichn(-9,of q1-q20)
       then flag=.;
  else if 0=min(0,of q1-q20)=max(0,of q1-q20) then flag=0;
run;

And perhaps if those special values, -7, -8 and -9 are the only valid values that are less then zero then the second IF can be simpler. For example you could use MIN() function.

  else min(0,of q1-q20)<0 then flag=.;

Or if -7, -8 and -9 are really just different special missing codes if you actually code the data using special missing values (instead of valid numbers) you could use the NMISS() function. For example by using .A, .B, and .C instead of -7, -8 and -9.

missing abc ;
data have;
  input q1-q20 flag_want;
datalines;
20  0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
20 20  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
20  0  A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
 0  0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0  B  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .
 B  A  A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .
 A  1  1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1
;

data want;
  set have;
  if max(0,of q1-q20) > 0 then flag=1;
  else if 0<nmiss(of q1-q20) then flag=.;
  else if 0=min(0,of q1-q20)=max(0,of q1-q20) then flag=0;
run;

Tom

If you really want to use looping try these loops.

data want;
  set have ;
  array q q1-q20;
  flag=0;
  do index=1 to dim(q) while(flag=0);
    if 0 < q[index] then flag=1;
  end;
  do index=1 to dim(q) while(flag=0);
    if q[index] in (-7 -8 -9) then flag=.;
    else if q[index] ne 0 then put 'WARNING: Unexpected value. ' q[index]=;
  end;
  drop index;
run;

Ksharp

data have;
  input q1-q20 ;
datalines;
20  0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
20 20  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
20  0 -7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
 0  0  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 -8  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .
-8 -7 -7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .
-7  1  1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1
;


data want;
  set have;
  array x{*} _numeric_ ;
if max(of x{*})>0 then flag=1;
 else if ((-7 in x) or (-8 in x) or (-9 in x)) and max(of x{*})<=0 then flag=.;
  else if max(of x{*})=0 and min(of x{*})=0 then flag=0;
run;

Patrick

@Ksharp Neat. Not sure if I was ever aware of syntax like (-7 in x). That's certainly useful.

dxiao2017

Hi @SAS-questioner , does this answer your question?

data have;
   input q1 q2 q3 q4 q5 q6;
   datalines;
20  0    0   4   5   0
20  20   0  -1  -2   0
20  0   -7   1   2   0
0   0    0   0   0   0
0  -8    0   -1  -3  0
-8 -7   -7   -1  -1  0
1   2   3    0   -1  -20
-9  20  1   2   3    1
-7  -1  0   0   -8  -9
;
run;
proc print data=have;run;
data want (drop=i);
   set have;
   array q[*] q:;
   do i=1 to dim(q);
      if q[i]>0 then flag=1;
      else if q[i] in (-7,-8,-9)
         then flag=.;
      else if max(of q:)=0 and 
              min(of q:)=0
         then flag=0;
   end;
run;
proc print data=want;run;

dxiao2017

And @SAS-questioner , according to your if/else condition, if any row has all values<=0, the flag of that row is missing, regardless there is -7, -8, and -9 in that row or not. I think this is also the question @PaigeMiller prompted. See the example:

data have;
   input q1 q2 q3 q4 q5 q6;
   datalines;
20  0    0   4   5   0
20  20   0  -1  -2   0
20  0   -7   1   2   0
0   0    0   0   0   0
0  -8    0   -1  -3  0
-8 -7   -7   -1  -1  0
1   2   3    0   -1  -20
-9  20  1   2   3    1
-7  -1  0   0   -8  -9
-1  -2  -3  -4  -5  -6
0   -1  -2  -3   0  -1
;
run;
proc print data=have;run;
data want (drop=i);
   set have;
   array q[*] q:;
   do i=1 to dim(q);
      if q[i]>0 then flag=1;
      else if q[i] in (-7,-8,-9)
         then flag=.;
      else if max(of q:)=0 and 
              min(of q:)=0
         then flag=0;
   end;
run;
proc print data=want;run;

How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Re: How to create a variable based on multiple conditions with lots of variables?

Join the 2025 SAS Hackathon!

SAS Training: Just a Click Away