I wanna count the consecutive rows (>=3) with RUN=0 between rows with RUN=10.
Have:
data have;
input ID day run;
datalines;
1 0 0
1 2 10
1 2 0
1 3 10
1 4 0
1 4 0
1 5 0
1 7 0
1 7 0
1 10 0
1 11 10
1 12 0
1 14 0
1 15 0
1 16 10
1 17 0
1 18 0
1 19 0
1 20 0
1 21 10
1 22 0
1 23 0
1 24 0
1 25 0
1 26 10
1 27 0
1 28 0
1 29 10
1 30 0
;;
want:
ID | DAY | RUN | COUNT |
1 | 0 | 0 | 1 |
1 | 2 | 10 | . |
1 | 2 | 0 | 1 |
1 | 3 | 10 | . |
1 | 4 | 0 | 2 |
1 | 4 | 0 | 2 |
1 | 5 | 0 | 2 |
1 | 7 | 0 | 2 |
1 | 7 | 0 | 2 |
1 | 10 | 0 | 2 |
1 | 11 | 10 | . |
1 | 12 | 0 | 3 |
1 | 14 | 0 | 3 |
1 | 15 | 0 | 3 |
1 | 16 | 10 | . |
1 | 17 | 0 | 4 |
1 | 18 | 0 | 4 |
1 | 19 | 0 | 4 |
1 | 20 | 0 | 4 |
1 | 21 | 10 | . |
1 | 22 | 0 | 5 |
1 | 23 | 0 | 5 |
1 | 24 | 0 | 5 |
1 | 25 | 0 | 5 |
1 | 26 | 10 | . |
1 | 27 | 0 | 1 |
1 | 28 | 0 | 1 |
1 | 29 | 10 | . |
1 | 30 | 0 | 1 |
I do appreciate any tips to get this output.
Note: All consecutive rows with RUN=0 and day <10 will be assigned the same value.
Is below a possible scenario and if yes how would the count look like??
1 0 0 1 2 10 1 2 0 1 3 0 1 3 10 1 4 0 1 4 0 1 5 0 1 7 0 1 7 0
Count would be:
1
.
1
1
.
2
2
2
2
2
Here a way to go:
data have;
input ID day run;
datalines;
1 0 0
1 2 10
1 2 0
1 3 10
1 4 0
1 4 0
1 5 0
1 7 0
1 7 0
1 10 0
1 11 10
1 12 0
1 14 0
1 15 0
1 16 10
1 17 0
1 18 0
1 19 0
1 20 0
1 21 10
1 22 0
1 23 0
1 24 0
1 25 0
1 26 10
1 27 0
1 28 0
1 29 10
1 30 0
;
run;
data inter1
HaveWithGroup(keep=ID day run group);
set have;
by run notsorted;
if first.run and run ne 10 then
do;
_group+1;
end;
if run=10 then call missing(group);
else group=_group;
if run=10 then group_cnt=.;
else group_cnt+1;
output HaveWithGroup;
if last.run and run ne 10 then
do;
if group_cnt<3 then count=1;
else count+1;
output inter1;
end;
run;
data want;
set HaveWithGroup;
call missing(count);
if _n_=1 then
do;
dcl hash h1(dataset:'inter1(keep=group count)', multidata:'n');
h1.defineKey('group');
h1.defineData('count');
h1.defineDone();
end;
if run ne 10 then h1.find();
run;
You want to find all instances of 3+ consecutive zeroes between 10's. When such a series is found you increment the sequence number by 1. But if a shorter series is found you reset the sequence number to 1. Is that correct?
data want (drop=nzero seq);
do Nzero=0 by 1 until (run=10 or eod);
set have end=eod;
end;
if nzero>=3 then seq+1;
else seq=1;
count=seq;
do until(run=10 or eod2);
set have end=eod2;
if run=10 then count=.;
output;
end;
run;
Notes:
If you have run values other than 10 and 0, then counting the consecutive records between 10's is no longer the same as counting "run=0" records. You'd have to modify the program to this:
data want (drop=threezeroesfound seq);
do until (run=10 or eod);
set have end=eod;
if lag2(run)=0 and lag(run)=0 and run=0 then threezeroesfound=1;
end;
if threezeroesfound then seq+1;
else seq=1;
count=seq;
do until(run=10 or eod2);
set have end=eod2;
if run=10 then count=.;
output;
end;
run;
Thanks.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.