DATA Step, Macro, Functions and more

Find consecutive rows

Reply
Occasional Contributor
Posts: 6

Find consecutive rows

I wanna count the consecutive rows (>=3) with RUN=0 between rows with RUN=10. 

Have:

data have;
input  ID day run;
datalines;
1 0 0
1 2 10
1 2 0
1 3 10
1 4 0
1 4 0
1 5 0
1 7 0
1 7 0
1 10 0
1 11 10
1 12 0
1 14 0
1 15 0
1 16 10
1 17 0
1 18 0
1 19 0
1 20 0
1 21 10
1 22 0
1 23 0
1 24 0
1 25 0
1 26 10
1 27 0
1 28 0
1 29 10
1 30 0
;;

want:

 

IDDAYRUNCOUNT
1001
1210.
1201
1310.
1402
1402
1502
1702
1702
11002
11110.
11203
11403
11503
11610.
11704
11804
11904
12004
12110.
12205
12305
12405
12505
12610.
12701
12801
12910.
13001

 

I do appreciate any tips to get this output.

Note: All consecutive rows with RUN=0 and  day <10 will be assigned the same value.

Respected Advisor
Posts: 4,173

Re: Find consecutive rows

Is below a possible scenario and if yes how would the count look like??

 

1 0 0
1 2 10
1 2 0
1 3 0
1 3 10
1 4 0
1 4 0
1 5 0
1 7 0
1 7 0

 

Occasional Contributor
Posts: 6

Re: Find consecutive rows

Count would be:

1

.

1

1

 .

2

2

2

2

2

 

Respected Advisor
Posts: 4,173

Re: Find consecutive rows

Here a way to go:

data have;
input  ID day run;
datalines;
1 0 0
1 2 10
1 2 0
1 3 10
1 4 0
1 4 0
1 5 0
1 7 0
1 7 0
1 10 0
1 11 10
1 12 0
1 14 0
1 15 0
1 16 10
1 17 0
1 18 0
1 19 0
1 20 0
1 21 10
1 22 0
1 23 0
1 24 0
1 25 0
1 26 10
1 27 0
1 28 0
1 29 10
1 30 0
;
run;

data inter1 
     HaveWithGroup(keep=ID day run group);
  set have;
  by run notsorted;

  if first.run and run ne 10 then
    do;
      _group+1;
    end;

  if run=10 then call missing(group);
  else group=_group;

  if run=10 then group_cnt=.;
  else group_cnt+1;

  output HaveWithGroup;

  if last.run and run ne 10 then 
    do;
      if group_cnt<3 then count=1;
      else count+1;
      output inter1;
    end;

run;

data want;
  set HaveWithGroup;
  call missing(count);
  if _n_=1 then 
    do;
      dcl hash h1(dataset:'inter1(keep=group count)', multidata:'n');
      h1.defineKey('group');
      h1.defineData('count');
      h1.defineDone();
    end;
  if run ne 10 then h1.find();
run;

Trusted Advisor
Posts: 1,022

Re: Find consecutive rows

You want to find all instances of 3+ consecutive zeroes between 10's.  When such a series is found you increment the sequence number by 1.  But if a shorter series is found you reset the sequence number to 1.  Is that correct?

 

data want (drop=nzero seq);
  do Nzero=0 by 1 until (run=10 or eod);
    set have end=eod;
  end;
  if nzero>=3 then seq+1;
  else seq=1;

  count=seq;
  do until(run=10 or eod2);
    set have end=eod2;
    if run=10 then count=.;
    output;
  end;
run;

Notes:

  1. This program assumes that run=10 or 0 - no other values appear, as in your sample data.
  2. The "do nzero=" loop starts at zero rather than one, so that when the "run=10" record countributes to the increment of nzero, it does not produce an overcount.
  3. I include the "or eod" and "or eod2" in the "until" conditions to accomodate the possibility that the last record of the dataset does not have run=10.

 

If you have run values other than 10 and 0, then counting the consecutive records between 10's is no longer the same as counting "run=0" records.  You'd have to modify the program to this:

 

data want (drop=threezeroesfound seq);
  do until (run=10 or eod);
    set have end=eod;
    if lag2(run)=0 and lag(run)=0 and run=0 then threezeroesfound=1;
  end;
  if threezeroesfound then seq+1;
  else seq=1;
  
  count=seq;
  do until(run=10 or eod2);
    set have end=eod2;
    if run=10 then count=.;
    output;
  end;
run;
Frequent Contributor
Frequent Contributor
Posts: 105

Re: Find consecutive rows

[ Edited ]

@mkeintz:

Thanks.

Trusted Advisor
Posts: 1,022

Re: Find consecutive rows

I'm not sure I fully comprehend. Please show some data before and after using the new criterion.
Ask a Question
Discussion stats
  • 6 replies
  • 343 views
  • 3 likes
  • 4 in conversation