SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
chuakp
Obsidian | Level 7

I have a database of prescription claims that contains the dispensing date and end date of each prescription, as defined by days supplied. The goal is to calculate how many whether the patient has continuous coverage without more than a 7-day gap. The issue is that there are times in which a patient may have 2 ore more prescriptions in which the dispensing and end dates overlap. For example, the data might look like this: 

 

PATIENT_ID       RX_ID   DISPENSE_DATE     END_DATE       

1                          A           1/1/2022                    1/8/2022            

1                          B           1/7/2022                    1/9/2022            

1                          C           1/8/2022                    1/11/2022          

1                          D           1/19/2022                  1/25/2022          

1                          E            1/26/2022                 1/30/2022

 

If you only consider the dispense_date and end_date, it looks like consider there is a 8-day gap between C and D. However, RX_ID B overlaps with A for 2 days (1/7/2022-1/8/2022), and C overlaps with A for 1 day (1/8/2022), and I would like to consider the patient as having coverage from 1/1/2022 through 1/14/2022 (i.e., 1/11/2022 plus the number of overlapping dates), in which case there is only a 5-day gap before D. Basically, I want to turn the above database into something like the following (the observations for D and E would stay the same because there is no overlap). 

 

PATIENT_ID       RX_ID   DISPENSE_DATE    END_DATE_COMBINED       

1                          A           1/1/2022                    1/14/2022                              

1                          D           1/19/2022                  1/25/2022                             

1                          E           1/26/2022                 1/30/2022

 

Note: while not in this example data, I'm aware that there could be situations in which prescription D could have started on 1/13/2022, in which case it overlaps with the new range of A after stitching A/B/C together. However, I don't care about that for the purposes of this analysis.

 

I'd appreciate any advice. Thanks.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You don't really have to look ahead (as long as the data is sorted by id and dispense_date).

But you might want to remerge the final result to get the whole overlapping period onto each observation in the group.

data want;
  set have ;
  by patient_id dispense_date;
  retain first_date last_date group;
  if first.patient_id then call missing(first_date,last_date,group);
  if dispense_date > last_date then do;
     group+1;
     first_date=dispense_date; last_date=end_date; 
  end;
  else do;
     first_date=min(first_date,dispense_date);
     last_date=max(last_date,end_date);
  end;
  format first_date last_date yymmdd10.;
run;

data want;
  do _n_=1 by 1 until(last.group);
    merge want groups;
    by patient_id group;
  end;
  do _n_=1 to _n_;
    set want(drop=first_date last_date);
    output;
  end;
run;

Tom_0-1666616820317.png

 

View solution in original post

6 REPLIES 6
Astounding
PROC Star

Just to point you in a good direction, here is what I would suggest.

 

Treat this as a two-step problem.  First assign a new variable GROUPNO based on whether the current observation belongs in the same group as a previous observation or not.  Then group them.  For example, start along these lines:

data want;
   set have;   by patient_id;
   if firstpatient_id then groupno=1;
   prior_end = lag(end_date);
   if ... whatever conditions you would like to indicate a new group should start ...
   then groupno + 1;
run;

This step should give you a result aong the lines of:

PATIENT_ID       RX_ID   DISPENSE_DATE     END_DATE    GROUP_NO      

1                  A      1/1/2022         1/8/2022        1      
1                  B      1/7/2022         1/9/2022        1
1                  C      1/8/2022         1/11/2022       1
1                  D      1/19/2022        1/25/2022       2
1                  E      1/26/2022        1/30/2022       3

From there, calculating the maxium END_DATE for each PATIENT_ID / GROUP_NO should be straightforward.

 

chuakp
Obsidian | Level 7

I appreciate the reply. To clarify, I am not trying to calculate the maximum end_date for each PATIENT_ID / GROUP_NO. I am trying to determine the number of days of overlap within GROUP_NO's, then add that number to the end of the the maximum end_date for each PATIENT/GROUP_NO. Part of my difficulty is determining the number of dates of overlap.

Tom
Super User Tom
Super User

Isn't the number of days of overlap just the difference between the sum of each individual record's duration and the duration of the overall period?  Which you can calculate if you know the first and last date for the period.

data have;
 input PATIENT_ID RX_ID $ DISPENSE_DATE :yymmdd. END_DATE :yymmdd. GROUP_NO ;
 format DISPENSE_DATE END_DATE yymmdd10.;
cards;
1 A 2022-01-01 2022-01-08 1
1 B 2022-01-07 2022-01-09 1
1 C 2022-01-08 2022-01-11 1
1 D 2022-01-19 2022-01-25 2
1 E 2022-01-26 2022-01-30 3
;

data want;
  set have ;
  by patient_id group_no;
  days_supp = end_date - dispense_date + 1;
  retain first_date last_date;
  if first.group_no then first_date = dispense_date ;
  if first.group_no then total_supp = days_supp;
  else total_supp + days_supp;
  last_date=max(last_date,end_date);
  total_days = end_date - first_date +1;
  overlap = total_supp - total_days;
  format first_date last_date yymmdd10.;
run;

proc print;
run;

Result

Tom_0-1666473499339.png

 

 

chuakp
Obsidian | Level 7

Thanks Tom. You've pointed me in the right direction for this problem.

 

One remaining issue is the creation of the groups. In the example data, Rx B overlaps with Rx A and Rx C overlaps with Rx B, so you can just compare the current observation to the previous one to group prescriptions together.

data have;
 input PATIENT_ID RX_ID $ DISPENSE_DATE :yymmdd. END_DATE :yymmdd. GROUP_NO ;
 format DISPENSE_DATE END_DATE yymmdd10.;
cards;
1 A 2022-01-01 2022-01-08 1
1 B 2022-01-07 2022-01-09 1
1 C 2022-01-08 2022-01-11 1
1 D 2022-01-19 2022-01-25 2
1 E 2022-01-26 2022-01-30 3
;

 

But imagine you instead had data like like this, in which C doesn't overlap with A but C does overlap with B. We still want to group C together with A and B, but you can't do that by comparing the current observation to the previous observation.

 

data have;

input PATIENT_ID RX_ID $ DISPENSE_DATE :yymmd. END_DATE :yymmdd.;

format DISPENSE_DATE END_DATE yymmdd10.;

cards;

1 A 2022-01-01 2022-01-08 

1 A 2022-01-04 2022-01-06

1 C 2022-01-07 2022-01-11

1 D 2022-01-19 2022-01-25 

1 E 2022-01-26 2022-01-30 

;

run;

 

 

Tom
Super User Tom
Super User

You don't really have to look ahead (as long as the data is sorted by id and dispense_date).

But you might want to remerge the final result to get the whole overlapping period onto each observation in the group.

data want;
  set have ;
  by patient_id dispense_date;
  retain first_date last_date group;
  if first.patient_id then call missing(first_date,last_date,group);
  if dispense_date > last_date then do;
     group+1;
     first_date=dispense_date; last_date=end_date; 
  end;
  else do;
     first_date=min(first_date,dispense_date);
     last_date=max(last_date,end_date);
  end;
  format first_date last_date yymmdd10.;
run;

data want;
  do _n_=1 by 1 until(last.group);
    merge want groups;
    by patient_id group;
  end;
  do _n_=1 to _n_;
    set want(drop=first_date last_date);
    output;
  end;
run;

Tom_0-1666616820317.png

 

chuakp
Obsidian | Level 7

Tom, thanks for your reply. I was able to figure it out using your code for grouping prescriptions together.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1154 views
  • 0 likes
  • 3 in conversation