BookmarkSubscribeRSS Feed
joy2013
Calcite | Level 5

I have a dataset containing Medicaid claims data, and I want to subset the data to include only individuals who have at least 11 months of continuous enrollment (with at least 15 days of enrollment per each month). The problem is, I have three year's worth of data (2005-2008), so an individual may have only a few months of enrollment in each year they were seen, but it may or may not total the number of months of continuous enrollment that I desire. For example, patient #1001 has claims for 2005 and 2006. Her enrollment began in September 2005 and continued through December 2005, giving her 4 months of continuous enrollment in 2005. Her enrollment then continued from January 2006 to July 2006, giving her 7 months of continuous enrollment in 2006, for a total of 11 months of continuous enrollment. Therefore, I would want to include this person in my dataset. Another example: patient #1003 has claims for 2007 and 2008. He had enrollment in 2007 starting in the month of May through December, giving him a total of 8 months of enrollment in 2007. However, in 2008 he only had one month of enrollment in January, giving him a grand total of 9 months of enrollment. Therefore, I do not want to include patient #1003 in my dataset. What I need is help with is determining the number of months of continuous enrollment, having one number per ID. The variables I have are ID, days_enrollment1-days_enrollment12 (these are number of days of enrollment for January[1]-December[12]), eligible_months (this is # of months eligible for Medicaid per year), and year (year of enrollment, 2005-2008). Again, an individual may have claims filed for multiple years, but I need to find a way to sum up the number of months of continuous enrollment. I feel like the solution to this problem is either really difficult or really easy. Please advise. Thank you in advance.

5 REPLIES 5
ballardw
Super User

You may need to clarify you problem a bit more. You say you have 3 years of data and are looking at 11 continuous months as a target. Suppose id XXXX has a first record point of 1/1/2005 and is continuously enrolled through 11/30/2005 for an obvious 11 months. Then has a gap and is enrolled from 3/1/2006 to 1/31/2007 for another 11 months continuous. Do both periods count? Only one period? If so which one?

And for a stickier bit for your definition of continuous: Suppose we have 9 months in a row of 30 or 31 days of enrollment (or a full month just to make it easy) with a following month of 20 days enrollment and the month after that of 18 days enrollment. Is the month with 18 days counted as continuous with the previous month as it could well be a break of up to 26/27/28 days.

A first step would be to create an appropriate SAS date variable representing, most likely the first day of enrollment per month and one record per period (which makes the cross year boundaries a lot easier).

Something like:

ID     FirstDayOfMonth EnrolledDays

1001 3/1/2005    22

1001 4/1/2005    30

1001 5/1/2005    20

1001 6/1/2005     0

With this structure you have a number of tools available such as First. and Last processing that allow some of what you're looking for. This is sometimes referred to as wide-to-long transposition.

Data trans (keep id FirstDayOfMonth EnrolledDays eligible_months);

     set have;

     array e days_enrollment1 - days_enrollment12;

     do j=1 to 12;

          EnrolledDays=e;

          FirstDayOfMonth= mdy(j,1,year);

          output;

     end;

     format FirstDayOfMonth mmddyy10.;

run;

The next step requires data in ID and date order so it may be necessary to sort the data.

proc sort data=trans; by id FirstDayOfMonth;run;

Data want;

     set trans;

     by ID;

     Retain months 0;

     if first.id then months=0; /* reset counter for first record of each id*/

     if EnrolledDays ge 15 then months +1; /* increment the counter to include current month*/

    /* NOTE: This counts as continuous the case I asked about with the possible 20 day gap*/

     Else If EnrolledDays < 15 then months=0; /* current month doesn't count for incrementing*/

run;

Depending on what you are doing you can get the max or any number of months associated with any ID.

joy2013
Calcite | Level 5

Thank you for your response. I actually want to include patients who had continuous enrollment (11 or more months, with at least 15 days within each month) within the first year of life. So below are examples of patients whom I do and do not want in the final cohort (all values, including DOB,  are made-up):

Do want:

ID          Year          DOB               Eligible_months

1001     2007          01/01/2007               12

This person had 12 months of continuous enrollment in the first year of their life.

Another version of the patient I do want:

ID          Year          DOB               Eligible_months

1002     2006          11/1/2006                 2

1002     2007          11/1/2006               10

This person had continuous enrollment (starting in Nov and Dec 2006 and continuing on for the first 10 months in 2007) for the entire first year of their life.

Don't want:

ID          Year          DOB               Eligible_months

1003     2005          02/2/2005               3

1003     2007          02/2/2005               1

1003     2009          02/2/2005               8

In this case, the person may have had a total of 12 months of enrollment, but only had 3 months of eligibility in the first year of life, 1 month in 2007, and 8 months in 2009, with having no more than 7 months of continuous enrollment.

I did what you outlined above, and followed that by proc summary to get the maximum number of months of continuous enrollment. But now I need the maximum number of months of continuous enrollment only within the first year of life.

ballardw
Super User

Proc summary nway;

where year(dob) = year;

Class id;

will get the summary by id and each calendar year if you keep the YEAR and DOB variable in the example code.

anuli98
Calcite | Level 5

Okay so this didn't work for me, so I passed off this part of the project to an actual biostatistician. However, I do have another question that is probably easier to answer (and for me to do). In the same dataset, I have something like the following:

ID          Date_of_Service     Procedure_Code

1001          4/17/05                         5661

1001          2/16/05                         5661

1001          5/03/07                         5661

1001          2/17/05                         7521

1001          3/15/06                         7521

1001          2/18/05                         4556

1001          9/19/08                         4556

I want to keep all three of the codes (5661, 7521, and 4556), but I only want to keep the one with the earliest date of service (i.e. for code 5661, I want the one from 2/16/05, from 7521 the one from 2/17/05, and for 4556, the one from 2/18/05). All the other dates can be ignored (for now) as I want the earliest date someone had a particular procedure done. How do I do this? Thank you.

ballardw
Super User

proc summary  data=have nway;

     class ID Procedure_code;

     var Date_of_Service;

     output out=want (drop=_type_ _freq_) min=;

run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 3005 views
  • 0 likes
  • 3 in conversation