I wanna count the consecutive rows (>=3) with RUN=0 between rows with RUN=10.
Have:
data have;
input ID day run;
datalines;
1 0 0
1 2 10
1 2 0
1 3 10
1 4 0
1 4 0
1 5 0
1 7 0
1 7 0
1 10 0
1 11 10
1 12 0
1 14 0
1 15 0
1 16 10
1 17 0
1 18 0
1 19 0
1 20 0
1 21 10
1 22 0
1 23 0
1 24 0
1 25 0
1 26 10
1 27 0
1 28 0
1 29 10
1 30 0
;;
want:
ID | DAY | RUN | COUNT |
1 | 0 | 0 | 1 |
1 | 2 | 10 | . |
1 | 2 | 0 | 1 |
1 | 3 | 10 | . |
1 | 4 | 0 | 2 |
1 | 4 | 0 | 2 |
1 | 5 | 0 | 2 |
1 | 7 | 0 | 2 |
1 | 7 | 0 | 2 |
1 | 10 | 0 | 2 |
1 | 11 | 10 | . |
1 | 12 | 0 | 3 |
1 | 14 | 0 | 3 |
1 | 15 | 0 | 3 |
1 | 16 | 10 | . |
1 | 17 | 0 | 4 |
1 | 18 | 0 | 4 |
1 | 19 | 0 | 4 |
1 | 20 | 0 | 4 |
1 | 21 | 10 | . |
1 | 22 | 0 | 5 |
1 | 23 | 0 | 5 |
1 | 24 | 0 | 5 |
1 | 25 | 0 | 5 |
1 | 26 | 10 | . |
1 | 27 | 0 | 1 |
1 | 28 | 0 | 1 |
1 | 29 | 10 | . |
1 | 30 | 0 | 1 |
I do appreciate any tips to get this output.
Note: All consecutive rows with RUN=0 and day <10 will be assigned the same value.
Is below a possible scenario and if yes how would the count look like??
1 0 0 1 2 10 1 2 0 1 3 0 1 3 10 1 4 0 1 4 0 1 5 0 1 7 0 1 7 0
Count would be:
1
.
1
1
.
2
2
2
2
2
Here a way to go:
data have;
input ID day run;
datalines;
1 0 0
1 2 10
1 2 0
1 3 10
1 4 0
1 4 0
1 5 0
1 7 0
1 7 0
1 10 0
1 11 10
1 12 0
1 14 0
1 15 0
1 16 10
1 17 0
1 18 0
1 19 0
1 20 0
1 21 10
1 22 0
1 23 0
1 24 0
1 25 0
1 26 10
1 27 0
1 28 0
1 29 10
1 30 0
;
run;
data inter1
HaveWithGroup(keep=ID day run group);
set have;
by run notsorted;
if first.run and run ne 10 then
do;
_group+1;
end;
if run=10 then call missing(group);
else group=_group;
if run=10 then group_cnt=.;
else group_cnt+1;
output HaveWithGroup;
if last.run and run ne 10 then
do;
if group_cnt<3 then count=1;
else count+1;
output inter1;
end;
run;
data want;
set HaveWithGroup;
call missing(count);
if _n_=1 then
do;
dcl hash h1(dataset:'inter1(keep=group count)', multidata:'n');
h1.defineKey('group');
h1.defineData('count');
h1.defineDone();
end;
if run ne 10 then h1.find();
run;
You want to find all instances of 3+ consecutive zeroes between 10's. When such a series is found you increment the sequence number by 1. But if a shorter series is found you reset the sequence number to 1. Is that correct?
data want (drop=nzero seq);
do Nzero=0 by 1 until (run=10 or eod);
set have end=eod;
end;
if nzero>=3 then seq+1;
else seq=1;
count=seq;
do until(run=10 or eod2);
set have end=eod2;
if run=10 then count=.;
output;
end;
run;
Notes:
If you have run values other than 10 and 0, then counting the consecutive records between 10's is no longer the same as counting "run=0" records. You'd have to modify the program to this:
data want (drop=threezeroesfound seq);
do until (run=10 or eod);
set have end=eod;
if lag2(run)=0 and lag(run)=0 and run=0 then threezeroesfound=1;
end;
if threezeroesfound then seq+1;
else seq=1;
count=seq;
do until(run=10 or eod2);
set have end=eod2;
if run=10 then count=.;
output;
end;
run;
Thanks.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.