Given your data are
sorted by ID/ENROLL_DATE
each time there is a gap in ENROLL_DATE, consec_months is set to 1, otherwise consec_months is incremented by 1
then you can restate the identification of the desired observation. You want the observation which satisfies
the most recent consec_months=1 record has enroll_date<=dx_date.
the upcoming record
Begins a new ID,
or
follows a GAP in enroll_date, indicated by a CONSEC_MONTH=1 in the following record
... and has an enroll_date > dx_date.
data have;
input id dx_date :ddmmyy10. enroll_date :ddmmyy10. insurance consec_months;
format dx_date ddmmyy10. enroll_date ddmmyy10.;
datalines;
99 12/01/2020 12/01/2020 1 1
99 12/01/2020 01/01/2021 1 2
99 12/01/2020 03/01/2021 1 1
99 12/01/2020 04/01/2021 1 2
99 12/01/2020 05/01/2021 1 3
99 12/01/2020 06/01/2021 1 4
;
data want (drop=_:);
set have (keep=id);
by id ;
merge have
have (firstobs=2 keep=enroll_date consec_months
rename=(enroll_date=_nxt_edate consec_months=_nxt_cm));
if consec_months=1 then _current_start=enroll_date;
retain _current_start; /*Start of current consecutive ENROLL_DATEs */
if _current_start<=dx_date and (last.id=1 or (_nxt_cm=1 and _nxt_edate>dx_date));
run;
The "trick" here is the use of the self-merge with the "firstobs=2" parameter to retrieve the upcoming enroll_date and consec_months.
... View more