BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tuckjosh
Fluorite | Level 6

Howdy,

 

I've spent days trying to figure out the solution to this problem I've been having. I've tried to employ pretty much all of the solutions to similar questions I've found online to my code, but none of them give me what I need! To be clear, I'm not looking for PDC or MPR. I'm simply looking to find the mean number of days between patients' refills!

 

I need to calculate the mean (SD) number of days between each patient's pharmacy claims for insulin. I'm using a massive commercial claims dataset. Say a patient has an index fill on 2/2/21 and refills their script on 5/1/21 and 7/1/21. Another patient fills an initial script on 3/4/21 and six refills thereafter... and on and on. The number of refills differs for each patient, as do the number of days supplied and dates of fills and refills. Our goal is to assess the temporal component of insulin filling. I.e., do patients do all of their filling in the first three months of the year? Is it staggered? If yes, how so? Etc. I'd ideally like to be able to correct for overlap and gaps between fills and prescription coverage, respectively. 

 

The data structure is something like this:

 

data temp;
input patient_id svc_dt days_supply_cnt;
format svc_dt mmddyy10.;
datalines;


1 10/21/19 60
1 11/10/19 15
1 1/30/20 30
2 5/1/19 20
2 10/4/19 10

3 1/1/19 70

3 4/11/19 4

3 12/2/19 100

3 5/19/20 10

3 10/1/20 100

4 10/29/19 100

4 10/29/19 100

;


run;

 

My supervisor believes one can simply sum the number days between each fill and divide that by the total number of claims the patient makes during the observation period, but I'm having trouble even getting that. Thanks so much for your help!

 

 

 

J. E. Tucker, MPH (he.him.his)
1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

PROC MEANS will do this easily, once you compute the actual number of days between refills for each patient.


So for patient 1, you have to compute the number of days between 10/21/19 and 11/10/19. (Let's say that's 21 days). And then you have to compute the number of days between 11/10/19 and 1/30/20, let's say that is 80 days. So the mean is then (80+21)/2. But that's really (max date – min date)/(purchases-1)

 

So

data temp;
input patient_id svc_dt days_supply_cnt;
format svc_dt mmddyy10.;
datalines;
1 10/21/19 60
1 11/10/19 15
1 1/30/20 30
2 5/1/19 20
2 10/4/19 10
3 1/1/19 70
3 4/11/19 4
3 12/2/19 100
3 5/19/20 10
3 10/1/20 100
4 10/29/19 100
4 10/29/19 100
;

proc means data=temp nway noprint;
    class patient_id;
    var svc_dt;
    output out=want min=min_svc_dt max=max_svc_dt n=n_purchases;
run;

Then you can take the results in data set WANT and do the final calculation in another DATA step:

(max date – min date)/(n_purchases-1)

 

 

 

--
Paige Miller

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

PROC MEANS will do this easily, once you compute the actual number of days between refills for each patient.


So for patient 1, you have to compute the number of days between 10/21/19 and 11/10/19. (Let's say that's 21 days). And then you have to compute the number of days between 11/10/19 and 1/30/20, let's say that is 80 days. So the mean is then (80+21)/2. But that's really (max date – min date)/(purchases-1)

 

So

data temp;
input patient_id svc_dt days_supply_cnt;
format svc_dt mmddyy10.;
datalines;
1 10/21/19 60
1 11/10/19 15
1 1/30/20 30
2 5/1/19 20
2 10/4/19 10
3 1/1/19 70
3 4/11/19 4
3 12/2/19 100
3 5/19/20 10
3 10/1/20 100
4 10/29/19 100
4 10/29/19 100
;

proc means data=temp nway noprint;
    class patient_id;
    var svc_dt;
    output out=want min=min_svc_dt max=max_svc_dt n=n_purchases;
run;

Then you can take the results in data set WANT and do the final calculation in another DATA step:

(max date – min date)/(n_purchases-1)

 

 

 

--
Paige Miller
tuckjosh
Fluorite | Level 6

Thanks, Paige, but how do I do this for a dataset with nearly 100,000 patients with >1 observation each? I.e., a class statement for patient_id is impossible.

J. E. Tucker, MPH (he.him.his)
PaigeMiller
Diamond | Level 26

No need to change anything. The code I gave works for 5 patients or 100000 patients. That is the power of PROC MEANS.

--
Paige Miller
Reeza
Super User

Honestly not sure how the mean and standard deviation data would answer your stated question. 

 

 Our goal is to assess the temporal component of insulin filling. I.e., do patients do all of their filling in the first three months of the year? 

 

 

Some other things you may want to consider:

  • First, I think that you can only refill for only 100 to 180 days at most (FDA rule) so you can't refill all during the first three months of the year. Not sure if your plans all have the same rules, ours do not so we also need to consider our various plan designs. This may only apply to certain drug classes, can't recall the details. 
  • Instead of this analysis an almost equivalent method is to look at the distribution of the DAYS SUPPLY. This way, large gaps or breaks don't influence your data. 
  • If people are getting their prescription every 100 days, they have 100 days on average. However, that doesn't tell you seasonality of when they get their drugs. You may want to exclude the less than 10 days. Those are usually an emergency refill to bridge a gap or someone forgot their meds at home.
  • It's worth understanding your local rules as well, for example during COVID for several months prescriptions were limited to a 30 days supply in certain area's to prevent a rush on people refilling drugs and insure an adequate supply. If your government did that as well, you may want to exclude that data from your analysis

 

I would create a graph of the days prescription filled weighted by the amount of drugs but it also depends on what you're looking to solve - a drug supply problem or looking at when your insurance claims might peak. May want to weight by costs and/or # of days supply. 

 

title 'Monthly demand for insulin drugs';
proc freq data=temp;
table svc_dt;
weight days_supply_cnt;
format svc_dt yymmn6.;
run;

title 'Quarterly demand for insulin drugs';
proc freq data=temp;
table svc_dt;
weight days_supply_cnt;
format svc_dt yyq.;
run;

My nickel of thought as an insurance analytics manager. 

ballardw
Super User

Main concern I would have with this approach would be cases were for some reason the "interval" really does not reflect some sort of "continuity". Since you do not have anything related to diagnosis, diagnosis date or even the drug(s) involved what happens if your patient has not had anything prescribed for a while because of changes in medical status (no longer prescribed for some reason)? Or patient just stopped coming in for medications for some reason?

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 981 views
  • 8 likes
  • 4 in conversation