I'm trying to split the dataset 'have' into
output_1 if any dummy variable in the z1-z7 series contains '1'
else
output_0 if the z1-z7 series are all '0'
as shown below.
However, my current code using an array gives wrong results, as you can see from the resulting wrong_0 and wrong_1 datasets.
Any hints? What am I doing wrong here? I tried placing 'end' before and after the output statement with no success. Thanks in advance!
data have;
input id $ z1 z2 z3 z4 z5 z6 z7;
cards;
a 0 0 0 0 0 1 1
b 0 0 0 0 0 0 0
c 1 0 0 0 0 0 0
d 0 1 1 0 0 0 0
e 0 0 0 0 0 0 1
f 0 0 0 0 0 0 0
;
data output_0;
input id $ z1 z2 z3 z4 z5 z6 z7;
cards;
b 0 0 0 0 0 0 0
f 0 0 0 0 0 0 0
;
data output_1;
input id $ z1 z2 z3 z4 z5 z6 z7;
cards;
a 0 0 0 0 0 1 1
c 1 0 0 0 0 0 0
d 0 1 1 0 0 0 0
e 0 0 0 0 0 0 1
;
data wrong_1; set have; /* output data has N=6 rows instead of 4 */
array m z:;
do over m;
if m in ('1') then output;
end;
run;
data wrong_0; set have; /* output data has N=36 rows instead of 2 */
array m z:;
do over m;
if m in ('0') then output;
end;
run;
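If you are using an array, your syntax is the issue:
if sum(of m)>1
should be
if sum(of m(*))>1
Note that >1 is true only when at least two of the variables are 1; use >0 to also catch rows with a single 1 (such as c and e).
If you would rather use the IN operator: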
data wrong_1 wrong_0;
set have;
array m z1-z7;
if 1 in m then output wrong_1;
else output wrong_0;
run;
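As for why the original loops return too many rows: OUTPUT sits inside the loop, so it runs once for every matching element rather than once per observation. The data contain six 1s and thirty-six 0s in total, which is exactly the 6 and 36 rows reported. A minimal sketch that keeps the loop but writes each observation once, assuming the same have dataset; the found flag, the index i, and the *_loop dataset names are introduced here for illustration only:

```sas
data wrong_1_loop wrong_0_loop;     /* hypothetical dataset names */
  set have;
  array m z1-z7;
  found = 0;                        /* hypothetical flag, reset for every observation */
  do i = 1 to dim(m);
    if m[i] = 1 then found = 1;     /* remember the hit, but do not output yet */
  end;
  /* OUTPUT now runs exactly once per observation, after the loop */
  if found then output wrong_1_loop;
  else output wrong_0_loop;
  drop i found;
run;
```

Moving 'end' alone does not help, because the decision has to be made once per row after all seven variables have been checked.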
data have;
input id $ z1 z2 z3 z4 z5 z6 z7;
cards;
a 0 0 0 0 0 1 1
b 0 0 0 0 0 0 0
c 1 0 0 0 0 0 0
d 0 1 1 0 0 0 0
e 0 0 0 0 0 0 1
f 0 0 0 0 0 0 0
;
data output_0 output_1;
set have;
if max(of z1-z7)>0 then output output_1;
else output output_0;
run;
Using the variable list z: instead:
data output_0 output_1;
set have;
if max(of z:)>0 then output output_1;
else output output_0;
run;
Well, this is certainly confusing:
output_0 if any dummy variable in z1-z7 series contains '1'
else
output_1 if z1-z7 series are all '0'
because your code does the exact opposite.
However, this should get the data separated properly, except for the confusion stated above.
data wrong_1 wrong_0;
set have;
if sum(of z:)>0 then output wrong_1;
else output wrong_0;
run;
Sorry for the confusion; I will correct that. Below is the error I got.
ERROR: Array subscript out of range at line 377 column 15.
374 data p.wrong_1 p.wrong_0;
375 set p.have;
376 array m z1-z7;
377 if sum(of m)>1 then output p.wrong_1;
378 else output p.wrong_0;
379 run;
ERROR: Array subscript out of range at line 377 column 15.
id=a z1=0 z2=0 z3=0 z4=0 z5=0 z6=1 z7=1 _I_=. _ERROR_=1 _N_=1
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 1 observations read from the data set P.HAVE.
WARNING: The data set P.WRONG_1 may be incomplete. When this step was stopped there were 0
observations and 8 variables.
WARNING: Data set P.WRONG_1 was not replaced because this step was stopped.
WARNING: The data set P.WRONG_0 may be incomplete. When this step was stopped there were 0
observations and 8 variables.
WARNING: Data set P.WRONG_0 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.42 seconds
cpu time 0.01 seconds
I showed simpler code that you have changed, causing the error. Run my exact code without changes.