I'm trying to split the dataset 'have' into two datasets:
output_1 if any dummy variable in the z1-z7 series contains '1'
else
output_0 if the z1-z7 series are all '0'
as shown below.
However, my current code using an array gives the wrong results, as you can see from the resulting wrong_0 and wrong_1 datasets.
Any hints? What am I doing wrong here? I tried placing 'end' before and after the output statement, with no success. Thanks in advance!
data have;
input id $ z1 z2 z3 z4 z5 z6 z7;
cards;
a 0 0 0 0 0 1 1
b 0 0 0 0 0 0 0
c 1 0 0 0 0 0 0
d 0 1 1 0 0 0 0
e 0 0 0 0 0 0 1
f 0 0 0 0 0 0 0
;
data output_0;
input id $ z1 z2 z3 z4 z5 z6 z7;
cards;
b 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0
;
data output_1;
input id $ z1 z2 z3 z4 z5 z6 z7;
cards;
a 0 0 0 0 0 1 1
c 1 0 0 0 0 0 0
d 0 1 1 0 0 0 0
e 0 0 0 0 0 0 1
;
data wrong_1; set have; /*output data has N=6 rows instead of 4*/
array m z:;
do over m;
if m in ('1') then output;
end;
run;
data wrong_0; set have; /*output data has N=36 rows instead of 2*/
array m z:;
do over m;
if m in ('0') then output;
end;
run;
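For what it's worth, the row counts in the comments above are what you would expect when OUTPUT sits inside the loop: it runs once per matching element rather than once per row, and the data contain six 1s and thirty-six 0s in total. A minimal sketch of one way around that, keeping the array but only setting a flag inside the loop and writing each row once afterwards. The names any_one and i are hypothetical helpers introduced for illustration, not from the original post:

```sas
/* Sketch: flag the row inside the loop, then write it exactly once. */
data output_1 output_0;
  set have;
  array m {*} z1-z7;
  any_one = 0;                      /* hypothetical flag, reset for every row */
  do i = 1 to dim(m);
    if m{i} = 1 then any_one = 1;   /* numeric comparison, not the string '1' */
  end;
  if any_one then output output_1;  /* at least one dummy equals 1 */
  else output output_0;             /* all dummies are 0 */
  drop i any_one;
run;
```

The OUTPUT statements are now outside the DO loop, so each input row lands in exactly one of the two datasets.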
data have;
input id $ z1 z2 z3 z4 z5 z6 z7;
cards;
a 0 0 0 0 0 1 1
b 0 0 0 0 0 0 0
c 1 0 0 0 0 0 0
d 0 1 1 0 0 0 0
e 0 0 0 0 0 0 1
f 0 0 0 0 0 0 0
;
data output_0 output_1;
set have;
if max(of z1-z7)>0 then output output_1;
else output output_0;
run;
Or, using the variable list z: instead:
data output_0 output_1;
set have;
if max(of z:)>0 then output output_1;
else output output_0;
run;
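A quick way to check the split is to count the rows in each result. The sketch below assumes output_0 and output_1 exist in the WORK library after running one of the steps above; the column aliases dataset and n_rows are just illustrative names:

```sas
/* Sketch: report the number of rows in each output dataset. */
proc sql;
  select 'output_0' as dataset, count(*) as n_rows from output_0
  union all
  select 'output_1' as dataset, count(*) as n_rows from output_1;
quit;
```

The expected datasets in the question have 2 rows in output_0 (ids b and f) and 4 rows in output_1 (ids a, c, d and e), so any other counts mean the condition or the output targets are still off.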
Well, this is certainly confusing:
output_0 if any dummy variable in z1-z7 series contains '1'
else
output_1 if z1-z7 series are all '0'
because your code does the exact opposite.
However, this should get the data separated properly, except for the confusion stated above.
data wrong_1 wrong_0;
set have;
if sum(of z:)>0 then output wrong_1;
else output wrong_0;
run;
Sorry for the confusion, I will correct that. Below is the error I got.
ERROR: Array subscript out of range at line 377 column 15.
374 data p.wrong_1 p.wrong_0;
375 set p.have;
376 array m z1-z7;
377 if sum(of m)>1 then output p.wrong_1;
378 else output p.wrong_0;
379 run;
ERROR: Array subscript out of range at line 377 column 15.
id=a z1=0 z2=0 z3=0 z4=0 z5=0 z6=1 z7=1 _I_=. _ERROR_=1 _N_=1
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 1 observations read from the data set P.HAVE.
WARNING: The data set P.WRONG_1 may be incomplete. When this step was stopped there were 0
observations and 8 variables.
WARNING: Data set P.WRONG_1 was not replaced because this step was stopped.
WARNING: The data set P.WRONG_0 may be incomplete. When this step was stopped there were 0
observations and 8 variables.
WARNING: Data set P.WRONG_0 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.42 seconds
cpu time 0.01 seconds
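If you are using an array, your syntax is the issue:
if sum(of m)>1
should be
if sum(of m(*))>0
(the comparison also needs to be >0 rather than >1, since a row with a single 1 sums to exactly 1)
And if you are using the IN operator: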
data wrong_1 wrong_0;
set have;
array m z1-z7;
if 1 in m then output wrong_1;
else output wrong_0;
run;
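A possible reading of the log: the variable dump shows _I_=., which suggests SAS is treating the bare name m as an implicitly subscripted reference (m indexed by _I_). Outside a DO OVER loop _I_ is missing, hence "Array subscript out of range". Writing m{*} hands every element of the array to SUM instead. A minimal sketch, assuming the have dataset from the question; the >0 threshold is chosen here so that rows with a single 1 are also caught:

```sas
/* Sketch: pass the whole array to SUM with the {*} subscript. */
data wrong_1 wrong_0;
  set have;
  array m {*} z1-z7;
  if sum(of m{*}) > 0 then output wrong_1;  /* at least one dummy is 1 */
  else output wrong_0;                      /* every dummy is 0 */
run;
```

With the six rows from the question, this should put a, c, d and e in wrong_1 and b and f in wrong_0.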
ERROR: Array subscript out of range at line 377 column 15.
374 data p.wrong_1 p.wrong_0;
375 set p.have;
376 array m z1-z7;
377 if sum(of m)>1 then output p.wrong_1;
378 else output p.wrong_0;
379 run;
I showed simpler code that you have changed, causing the error. Run my exact code without changes.