As you pointed out, the tricky part here is that some rows count not only toward the current anchor but toward the previous anchor too. That is hard to handle in a single sequential pass of the data. My thought is to read the dataset twice using two set statements: the usual one in the data step to detect the anchor, and a second one inside a do loop that reads the obs after the anchor until all of the contiguous readmissions are counted. Here's what the data step looks like.

data want (drop=readm_flg: anch_flg2);
set have;
* only take action on anchor obs;
* this will preserve values of the anchor;
if anch_flg = 'y' then do;
next_obs = _n_ + 1;
readm_count = (readm_flg = 'y'); * reset readmit count to 0 or 1;
* this loop evaluates obs after anchor until next anchor or
* no readmission;
do until (readm_flg2 = 'n' or anch_flg2 = 'y');
* with direct access (point=), must manually detect reading
* past eof. cannot use end= with point=, so check before the read;
if next_obs > nobs then leave;
* start reading with obs after the anchor (point=next_obs);
set have (rename=(anch_flg=anch_flg2 readm_flg=readm_flg2)
keep=anch_flg readm_flg) point=next_obs nobs=nobs;
if _error_ then leave; * stop on any other read error;
if readm_flg2 = 'y' then readm_count + 1;
next_obs + 1; * prepare for next obs;
end;
output; * all contiguous readmissions have been counted so output;
end;
run;

It produces your desired output with the anchor row data preserved.
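
If you want to try the step on something small first, here is a minimal sketch of a test dataset. The id variable and every value in it are made up for illustration; the only things taken from the code above are the variable names anch_flg and readm_flg and their lowercase 'y'/'n' values.

```sas
* hypothetical sample data, values invented for illustration only;
data have;
input id anch_flg $ readm_flg $;
datalines;
1 y n
2 n y
3 n y
4 y y
5 n n
6 y n
7 n y
;
run;
```

Create this before running the data step, then look at the result with proc print data=want. By my reading of the logic you should get three rows, one per anchor: id 1 with readm_count 3 (obs 2, 3 and 4, where obs 4 is the next anchor but still counts toward the previous one), id 4 with readm_count 1 (its own readmission flag, then the run stops at obs 5), and id 6 with readm_count 1 (obs 7, the last obs in the dataset).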