I want to create an indicator for the row of data for each person that is the highest count of consecutive months of insurance BUT that also includes the month of diagnosis in that span.
In an example of one case, I have information about the enrollment status in insurance for the month of diagnosis and the 6 months after. The month of diagnosis is 12/2020 and they are enrolled that month and the next month (1/2021). However, they are not enrolled in insurance the following month (2/2021). They are enrolled again in insurance 3/2021 through 6/2021. I have created the counts of consecutive months of insurance with the 2 consecutive spans. I now need to create an indicator that will show which row (per person) has the highest count of consecutive months of insurance while in the same consecutive span as the diagnosis month.
This one case has a situation where the highest count of consecutive insurance is in a different date span than the diagnosis date. I've tried different combinations of first last but it's not doing the trick.
Here's the code and the screenshots of the tables:
*data of one person with 2 different consecutive spans;
data have;
input id dx_date :ddmmyy10. enroll_date :ddmmyy10. insurance consec_months;
format
dx_date ddmmyy10.
enroll_date ddmmyy10.;
datalines;
99 12/01/2020 12/01/2020 1 1
99 12/01/2020 01/01/2021 1 2
99 12/01/2020 03/01/2021 1 1
99 12/01/2020 04/01/2021 1 2
99 12/01/2020 05/01/2021 1 3
99 12/01/2020 06/01/2021 1 4
;
run;
proc print data=have;
run;
*manually showing the output I want for the indicator;
data want;
set have;
if enroll_date='01JAN2021'd and consec_months=2 then want_indicator=1;
else want_indicator=0;
run;
proc print data=want;
run;
Thank you!
There are two different spans of consecutive enrollment in this one case. One is 12/2020 to 1/2021 and the other is 3/2021 to 6/2021. I need the count of the first span (2nd row) because it contains the month of diagnosis (12/2020). Most of my other cases are continuously consecutive, so I was just taking the last row of each but here is an instance where that doesn't work.
Here is an attempt that works for your data. I've corrected your INFORMATs as they didn't seem to match your data:
data have;
input id dx_date :mmddyy10. enroll_date :mmddyy10. insurance consec_months;
format
dx_date ddmmyy10.
enroll_date ddmmyy10.;
datalines;
99 12/01/2020 12/01/2020 1 1
99 12/01/2020 01/01/2021 1 2
99 12/01/2020 03/01/2021 1 1
99 12/01/2020 04/01/2021 1 2
99 12/01/2020 05/01/2021 1 3
99 12/01/2020 06/01/2021 1 4
;
run;
data Want;
drop last_:;
set Have;
by id dx_date enroll_date;
retain last_enroll_date DX_Flag last_DX_Flag;
if first.id then do;
consec_months_derived = 1;
last_enroll_date = .;
DX_Flag = '';
end;
else do;
if enroll_date = intnx('MONTH', last_enroll_date, 1) then do;
consec_months_derived + 1;
if enroll_date >= dx_date and last_DX_Flag in ('','Y') then DX_Flag = 'Y';
end;
else do;
consec_months_derived = 1;
DX_Flag = 'N';
end;
end;
output;
last_enroll_date = enroll_date;
last_DX_Flag = DX_Flag;
run;
I've also created a Y/N flag rather than your 0/1 flag.
Given your data are
then you can restate the identification of the desired observation. You want the observation which satisfies
data have;
input id dx_date :ddmmyy10. enroll_date :ddmmyy10. insurance consec_months;
format dx_date ddmmyy10. enroll_date ddmmyy10.;
datalines;
99 12/01/2020 12/01/2020 1 1
99 12/01/2020 01/01/2021 1 2
99 12/01/2020 03/01/2021 1 1
99 12/01/2020 04/01/2021 1 2
99 12/01/2020 05/01/2021 1 3
99 12/01/2020 06/01/2021 1 4
;
data want (drop=_:);
set have (keep=id);
by id ;
merge have
have (firstobs=2 keep=enroll_date consec_months
rename=(enroll_date=_nxt_edate consec_months=_nxt_cm));
if consec_months=1 then _current_start=enroll_date;
retain _current_start; /*Start of current consecutive ENROLL_DATEs */
if _current_start<=dx_date and (last.id=1 or (_nxt_cm=1 and _nxt_edate>dx_date));
run;
The "trick" here is the use of the self-merge with the "firstobs=2" parameter to retrieve the upcoming enroll_date and consec_months.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.