Solved
Contributor
Posts: 61

# process 'if statement' more efficiently.

I have a small data set like this on the bottom.

The logic is: if any of the variable (S1-S5) contains 0, the sex is defined as 'M', otherwise as 'F'.

My question is: is there any way we can replace the repeated if statements with a more efficient coding style?

It will be very helpful when the variable goes from S1 to S200 something.

Thanks.

*;

data one;

input S1-S5 var2;

datalines;

0 . . . . 11

. 0 . . . 22

. . 0 . . 33

1 . . . . 44

. . . . 0 55

;

data two;

set one;

if S1=0 then sex='M';

else if S2=0 then sex='M';

else if S3=0 then sex='M';

else if S4=0 then sex='M';

else if S5=0 then sex='M';

else  sex='F';

run;

*;

Accepted Solutions
Solution
‎10-21-2014 05:47 PM
Posts: 1,270

## Re: process 'if statement' more efficiently.

data one;
input S1-S5 var2;
datalines;
0 . . . . 11
. 0 . . . 22
. . 0 . . 33
1 . . . . 44
. . . . 0 55
;

data want(drop=i);
set one;
sex = 'F';
array s(*) s1-s5;
do i=1 to dim(s);
if s(i) = 0 then sex='M';
end;
run;

All Replies
Super User
Posts: 23,754

## Re: process 'if statement' more efficiently.

If you only have 5 I wouldn't even bother with an array statement. By default the max/min/sum functions ignore missing values so something like the following can work.

I've included for the case where all may be missing with an else clause.

data two;

set one;

if max(s1,s2,s3,s4,s5)=0 then sex='M';

else if max(s1,s2,s3,s4,s5)=1 then sex='F';

else sex='U';

run;

Solution
‎10-21-2014 05:47 PM
Posts: 1,270

## Re: process 'if statement' more efficiently.

data one;
input S1-S5 var2;
datalines;
0 . . . . 11
. 0 . . . 22
. . 0 . . 33
1 . . . . 44
. . . . 0 55
;

data want(drop=i);
set one;
sex = 'F';
array s(*) s1-s5;
do i=1 to dim(s);
if s(i) = 0 then sex='M';
end;
run;

Super Contributor
Posts: 578

## Re: process 'if statement' more efficiently.

if s1 * s2 * s3 * s4 * s5 = 0 then set='M';

else sex='F';

Posts: 1,270

## Re: process 'if statement' more efficiently.

This will generate missing values but on the same lines we can do something like this.

data want;

set one;

if sum(of s = 0 then sex='M';

else sex='F';

run;

Contributor
Posts: 61

## Re: process 'if statement' more efficiently.

Hi, ALL,

Thanks for you quick help on this topic.

I personally think the array one works the best for my situation and beyond (e.g. when you have character variables instead of numeric ones in this case). All other responses helped to solve my problem. Thank you all.

Joe

Contributor
Posts: 54

## Re: process 'if statement' more efficiently.

DBailey suggested:

if s1 * s2 * s3 * s4 * s5 = 0 then sex='M';

else sex='F';

Sorry, but this doesn't work. The product of a missing value with anything is missing:

data one;

input S1-S5 var2;

datalines;

0 . . . . 11

. 0 . . . 22

. . 0 . . 33

1 . . . . 44

. . . . 0 55

;

data two;

set one;

if s1 * s2 * s3 * s4 * s5 = 0 then sex='M';

else sex='F';

run;

proc print data= two;

run;

Result:

 Obs S1 S2 S3 S4 S5 var2 sex 1 0 . . . . 11 F 2 . 0 . . . 22 F 3 . . 0 . . 33 F 4 1 . . . . 44 F 5 . . . . 0 55 F
PROC Star
Posts: 8,164

## Re: process 'if statement' more efficiently.

If you think arrays would be your best choice then I'd suggest using something like:

data two;

set one;

array s(*) \$ s1-s5;

if '0' in s then sex='M';

else sex='F';

run;

Contributor
Posts: 61