DATA Step, Macro, Functions and more

process 'if statement' more efficiently.

Accepted Solution Solved
Reply
Contributor
Posts: 61
Accepted Solution

process 'if statement' more efficiently.

I have a small data set like this on the bottom.

The logic is: if any of the variable (S1-S5) contains 0, the sex is defined as 'M', otherwise as 'F'.

My question is: is there any way we can replace the repeated if statements with a more efficient coding style?

It will be very helpful when the variable goes from S1 to S200 something.

Thanks.

*;

data one;

input S1-S5 var2;

datalines;

0 . . . . 11

. 0 . . . 22

. . 0 . . 33

1 . . . . 44

. . . . 0 55

;

data two;

set one;

  if S1=0 then sex='M';

else if S2=0 then sex='M';

else if S3=0 then sex='M';

else if S4=0 then sex='M';

else if S5=0 then sex='M';

else  sex='F';

run;

*;


Accepted Solutions
Solution
‎10-21-2014 05:47 PM
Trusted Advisor
Posts: 1,204

Re: process 'if statement' more efficiently.

data one;
input S1-S5 var2;
datalines;
0 . . . . 11
. 0 . . . 22
. . 0 . . 33
1 . . . . 44
. . . . 0 55
;

data want(drop=i);
set one;
sex = 'F';
array s(*) s1-s5;
do i=1 to dim(s);
    if s(i) = 0 then sex='M';
  end;
run;

View solution in original post


All Replies
Super User
Posts: 17,836

Re: process 'if statement' more efficiently.

If you only have 5 I wouldn't even bother with an array statement. By default the max/min/sum functions ignore missing values so something like the following can work.

I've included for the case where all may be missing with an else clause.

data two;

set one;

if max(s1,s2,s3,s4,s5)=0 then sex='M';

else if max(s1,s2,s3,s4,s5)=1 then sex='F';

else sex='U';

run;

Solution
‎10-21-2014 05:47 PM
Trusted Advisor
Posts: 1,204

Re: process 'if statement' more efficiently.

data one;
input S1-S5 var2;
datalines;
0 . . . . 11
. 0 . . . 22
. . 0 . . 33
1 . . . . 44
. . . . 0 55
;

data want(drop=i);
set one;
sex = 'F';
array s(*) s1-s5;
do i=1 to dim(s);
    if s(i) = 0 then sex='M';
  end;
run;

Super Contributor
Posts: 578

Re: process 'if statement' more efficiently.

how about

if s1 * s2 * s3 * s4 * s5 = 0 then set='M';

else sex='F';

Trusted Advisor
Posts: 1,204

Re: process 'if statement' more efficiently.

This will generate missing values but on the same lines we can do something like this.

data want;

set one;

if sum(of sSmiley Happy = 0 then sex='M';

else sex='F';

run;

Contributor
Posts: 61

Re: process 'if statement' more efficiently.

Hi, ALL,

Thanks for you quick help on this topic.

I personally think the array one works the best for my situation and beyond (e.g. when you have character variables instead of numeric ones in this case). All other responses helped to solve my problem. Thank you all.

Joe

Contributor dkb
Contributor
Posts: 53

Re: process 'if statement' more efficiently.

DBailey suggested:

how about

if s1 * s2 * s3 * s4 * s5 = 0 then sex='M';

else sex='F';

Sorry, but this doesn't work. The product of a missing value with anything is missing:

data one;

input S1-S5 var2;

datalines;

0 . . . . 11

. 0 . . . 22

. . 0 . . 33

1 . . . . 44

. . . . 0 55

;

data two;

set one;

if s1 * s2 * s3 * s4 * s5 = 0 then sex='M';

else sex='F';

run;

proc print data= two;

run;

Result:

ObsS1S2S3S4S5var2sex
10....11F
2.0...22F
3..0..33F
41....44F
5....055F
PROC Star
Posts: 7,363

Re: process 'if statement' more efficiently.

If you think arrays would be your best choice then I'd suggest using something like:

data two;

  set one;

  array s(*) $ s1-s5;

  if '0' in s then sex='M';

  else sex='F';

run;

Contributor
Posts: 61

Re: process 'if statement' more efficiently.

Thanks for Arthur's additional comments.

I really learned something from you guys.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 286 views
  • 8 likes
  • 6 in conversation