I need to count number of patients with continuous enrollment after indexdt by month in claims data in SAS. Below is my data structure. startdt is the starting date of insurance coverage and enddt is the end date of insurance coverage. If there a gap between two start dates, the patient is categorized no continuous enrollment (continuous_enroll=0), otherwise has continuous enrollment (continuous_enroll=1).
data input; input ID $1. startdt:mmddyy10. enddt:mmddyy10. indexdt :mmddyy10. continuous_enroll; format startdt date9. enddt date9. indexdt date9. ; datalines; 1 1/1/2020 1/31/2020 3/12/2020 . 1 2/1/2020 2/28/2020 3/12/2020 1 1 3/1/2020 3/31/2020 3/12/2020 1 1 5/1/2020 5/31/2020 3/12/2020 0 2 8/1/2019 8/31/2019 3/16/2020 . 2 3/1/2020 3/31/2020 3/16/2020 0 3 4/1/2020 4/30/2020 4/13/2020 . 3 5/1/2020 5/31/2020 4/13/2020 1 3 8/1/2020 8/31/2020 4/13/2020 0 3 9/1/2020 9/31/2020 4/13/2020 1
; run;
I need to count the number of unique patients with continuous enrollment 1 month after index date. the number of unique patients with continuous enrollment 2 month after index date. the number of unique patients with continuous enrollment 3 month after index date. .... All the way up to 12 months.
Let me know if my question is not clear. Thank you for your help!
Assuming I understood what you mean.
data input;
input ID $1. startdt:mmddyy10. enddt:mmddyy10. indexdt :mmddyy10. continuous_enroll;
format startdt enddt indexdt mmddyy10. ;
datalines;
1 1/1/2020 1/31/2020 1/12/2020 .
1 2/1/2020 2/29/2020 1/12/2020 1
1 3/1/2020 3/31/2020 1/12/2020 1
1 4/1/2020 4/30/2020 1/12/2020 1
1 5/1/2020 5/31/2020 1/12/2020 1
1 6/1/2020 6/30/2020 1/12/2020 1
2 8/1/2019 8/31/2019 3/16/2020 .
2 3/1/2020 3/31/2020 3/16/2020 0
3 4/1/2020 4/30/2020 4/13/2020 .
3 5/1/2020 5/31/2020 4/13/2020 1
3 6/1/2020 6/30/2020 4/13/2020 1
3 8/1/2020 8/31/2020 4/13/2020 0
3 9/1/2020 9/30/2020 4/13/2020 1
;
data key;
set input(keep=id startdt enddt);
do date= startdt to enddt;
output;
end;
format date mmddyy10.;
keep id date;
run;
proc sql;
create table id as
select distinct id,indexdt from input;
quit;
%macro claim(month=);
data want_month_&month.;
if _n_=1 then do;
if 0 then set key;
declare hash h(dataset:'key',hashexp:20);
h.definekey(all:'yes');
h.definedone();
end;
set id;
do date=indexdt to intnx('month',indexdt,&month.,'s');
if h.check() ne 0 then do;delete;leave;end;
end;
run;
%mend;
%claim(month=1)
%claim(month=2)
%claim(month=3)
%claim(month=4)
%claim(month=5)
Do you intend that the "count the number of unique patients with continuous enrollment 1 month after index date" includes every patient counted in "the number of unique patients with continuous enrollment 2 month after index date"? or do you mean "exactly 1 month after and no more"?
As stated 1 month is a subset of the 2 month (and the 3, 4, 5 etc months) because you must be continuously enrolled for 1 month before qualifying as 2.
For your "count" report where does an individual that was continuously enrolled for 3 months, then has a break and then 5 continuous months get reported? I think your report is likely to be more difficult than the time period indicators.
Typically you would not have the data in that form to begin with. Instead you would normally have one dataset with the enrollment information and a separate dataset with the index dates.
The index date data would typically have one observation per ID (or one per instance if it was something like a influenza infection that can occur multiple times to the same individual.)
data index;
input ID $ indexdt :mmddyy. ;
format indexdt yymmdd10. ;
datalines;
1 1/12/2020
2 3/16/2020
3 4/13/2020
;
You might have the enrollment data as multiple monthly records like you show (or perhaps as longer periods). But for the analysis you need will want to collapse it into continuous periods of enrollment.
data enrollment_periods;
input ID $ period (startdt enddt) (:mmddyy.);
format startdt enddt yymmdd10. ;
datalines;
1 1 1/1/2020 6/30/2020
2 1 8/1/2019 8/31/2019
2 2 3/1/2020 3/31/2020
3 1 4/1/2020 6/30/2020
3 2 8/1/2020 9/30/2020
;
If your question is how to go from your monthly data to the collapsed period data then that has been asked and answered many times before on this forum.
Once you have the data in this form then you just need to find the period that surrounds the index date (you might need to make sure one exists) and then you can calculate how many months POST index date of coverage they have. And also how many months PRE index date of coverage they have. Both are important for most analyses. The PRE index date information is need to ensure your INDEX dates are correct. Perhaps the condition of interest pre-dates their insurance coverage and the date you have is just the first time it was recorded into your data source. The POST index date information is needed to allow you to know that you should be able to see any post index outcomes of interest.
So perhaps something like this:
proc sql ;
create table want as
select a.*
, b.period, b.startdt, b.enddt
, case when (b.enddt ge a.indexdt) then 1+intck('month',a.indexdt,b.enddt,'cont') else 0 end
as post_cont
, case when (.z lt b.startdt le a.indexdt) then 1+intck('month',b.startdt,a.indexdt,'cont') else 0 end
as pre_cont
from index a
left join enrollment_periods b
on a.id = b.id
and a.indexdt between b.startdt and b.enddt
order by a.id
;
quit;
Results
post_ Obs ID indexdt period startdt enddt cont pre_cont 1 1 2020-01-12 1 2020-01-01 2020-06-30 6 1 2 2 2020-03-16 2 2020-03-01 2020-03-31 1 1 3 3 2020-04-13 1 2020-04-01 2020-06-30 3 1
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.