Hi SAS Users, I have been having an issues with medication use overlapping. My data looks like this:
Patient_ID | Medication | start_dt | End_dt |
Tom | A | 8/29/2013 | 9/28/2013 |
Tom | A | 10/3/2013 | 11/2/2013 |
Tom | A | 11/4/2013 | 12/4/2013 |
Tom | A | 12/7/2013 | 1/6/2014 |
Tom | A | 1/22/2014 | 8/26/2014 |
Tom | B | 8/29/2013 | 9/28/2013 |
Tom | B | 10/1/2013 | 11/25/2013 |
Tom | B | 12/7/2013 | 1/6/2014 |
Tom | B | 1/12/2014 | 2/11/2014 |
Tom | B | 2/16/2014 | 3/18/2014 |
Tom | B | 3/30/2014 | 4/29/2014 |
Tom | B | 5/13/2014 | 7/7/2014 |
Tom | E | 9/6/2013 | 11/2/2013 |
Tom | E | 2/7/2014 | 2/14/2014 |
Jerry | C | 8/5/2013 | 12/27/2013 |
Jerry | C | 1/2/2014 | 5/23/2014 |
Jerry | C | 5/29/2014 | 7/27/2014 |
Jerry | D | 3/21/2014 | 6/14/2014 |
Jerry | D | 6/16/2014 | 8/8/2014 |
Jerry | E | 6/7/2014 | 6/19/2014 |
Joe | A | 3/28/2016 | 5/27/2016 |
Joe | A | 6/2/2016 | 7/28/2016 |
Joe | A | 8/3/2016 | 9/2/2016 |
Joe | A | 9/7/2016 | 11/5/2016 |
Joe | A | 11/7/2016 | 12/7/2016 |
Joe | A | 12/12/2016 | 1/11/2017 |
Joe | A | 1/13/2017 | 2/12/2017 |
Joe | A | 2/15/2017 | 4/15/2017 |
Joe | C | 7/25/2016 | 8/24/2016 |
Joe | C | 8/26/2016 | 9/25/2016 |
Joe | C | 10/10/2016 | 12/7/2016 |
Joe | C | 12/12/2016 | 1/11/2017 |
Joe | C | 1/20/2017 | 3/22/2017 |
Joe | D | 3/28/2016 | 5/28/2016 |
Joe | D | 6/6/2016 | 11/5/2016 |
Joe | D | 11/7/2016 | 12/7/2016 |
Joe | D | 12/12/2016 | 4/22/2017 |
Joe | E | 5/12/2016 | 5/15/2016 |
So each patient were prescribed medication at least three of the five Medication A, B, C, D, E. My goal is to identify patients with patients with at least three overlapping medications. Overlapping medication is defined if one medication's start_dt and end_dt interacts with the other medication's start_dt and end_dt.
For example, patient Tom's A prescription fill runs from 08/29/2013 - 9/28/2013; Tom's B prescription fill runs from 08/29/2013-9/28/2013; Tom's E prescription fill runs from 09/06/2013-11/02/2013. Hence, Tom's all three medication overlapped and should be flagged.
The end product table should looks like this
Patient_ID | flag_3_meds |
Tom | 1 |
Jerry | 1 |
Joe | 1 |
Note, a patient must have at least three medication overlaps, if A overlaps B, B overlaps C, but A doesn't overlap C, then the flag_3_meds=0.
So what is a functional algorithm to generate the flag_3_meds indicator?
If your data are already group by patient_id, then a two dimensional array (rows for medication, columns for dates) provides a neat way to record each medication history, and then search each date for 3 or more meds:
data have;
input Patient_ID $ Medication $ start_dt :mmddyy10. End_dt :mmddyy10.;
format start_dt end_dt mmddyy10.;
datalines;
Tom A 8/29/2013 9/28/2013
Tom A 10/3/2013 11/2/2013
Tom A 11/4/2013 12/4/2013
Tom A 12/7/2013 1/6/2014
Tom A 1/22/2014 8/26/2014
Tom B 8/29/2013 9/28/2013
Tom B 10/1/2013 11/25/2013
Tom B 12/7/2013 1/6/2014
Tom B 1/12/2014 2/11/2014
Tom B 2/16/2014 3/18/2014
Tom B 3/30/2014 4/29/2014
Tom B 5/13/2014 7/7/2014
Tom E 9/6/2013 11/2/2013
Tom E 2/7/2014 2/14/2014
Jerry C 8/5/2013 12/27/2013
Jerry C 1/2/2014 5/23/2014
Jerry C 5/29/2014 7/27/2014
Jerry D 3/21/2014 6/14/2014
Jerry D 6/16/2014 8/8/2014
Jerry E 6/7/2014 6/19/2014
Joe A 3/28/2016 5/27/2016
Joe A 6/2/2016 7/28/2016
Joe A 8/3/2016 9/2/2016
Joe A 9/7/2016 11/5/2016
Joe A 11/7/2016 12/7/2016
Joe A 12/12/2016 1/11/2017
Joe A 1/13/2017 2/12/2017
Joe A 2/15/2017 4/15/2017
Joe C 7/25/2016 8/24/2016
Joe C 8/26/2016 9/25/2016
Joe C 10/10/2016 12/7/2016
Joe C 12/12/2016 1/11/2017
Joe C 1/20/2017 3/22/2017
Joe D 3/28/2016 5/28/2016
Joe D 6/6/2016 11/5/2016
Joe D 11/7/2016 12/7/2016
Joe D 12/12/2016 4/22/2017
Joe E 5/12/2016 5/15/2016
Mary D 12/12/2016 4/22/2017
Mary A 5/12/2016 5/15/2016
;
%let mindate9=01jan2013;
%let maxdate9=31dec2017;
data want (keep=patient_id flag3_meds);
set have;
by patient_id notsorted;
array drg_dates{1:5,%sysevalf("&mindate9"d):%sysevalf("&maxdate9"d)} _temporary_;
retain min_date max_date;
if first.patient_id then call missing(of drg_dates{*},min_date,max_date);
row=indexc('ABCDE',medication);
min_date=min(min_date,start_dt);
max_date=max(max_date,end_dt);
do col=start_dt to end_dt;
drg_dates{row,col}=1;
end;
if last.patient_id;
flag3_meds=0;
do date=min_date to max_date until(flag3_meds=1);
flag3_meds=sum(0,drg_dates{1,date},drg_dates{2,date},drg_dates{3,date},drg_dates{4,date},drg_dates{5,date}) >=3;
end;
run;
Make sure to set the macrovars MINDATE9 and MAXDATE9 to cover the entire date range for your study.
What do you want for patients without any or fewer than 3 overlaps?
Your example data does not show any patient that does not have such and your desired result isn't very clear for that example.
First data should be in the form of a data step, as below. IF your date variables are not actually SAS dates you will need to create some additional variables that are for this approach to work. I also added a couple of records for a different patient that did not have an overlap of 3 to show the result one way.
This is a bit of brute force where we create a patient, drug and date data set. Then reshape that with transpose by patient and date. If multiple drugs appear on the same day the transposed data will show 2 or more populated output variables.
Then depending on what you may need, add the flag per record and summarize looking at the maximum value to see if the patient had any day with 3 or more.
Alternatively if you do not need to know about any of the patients with only 2 or fewer then transpose data can be used.
data have; input Patient_ID $ Medication $ start_dt :mmddyy10. End_dt :mmddyy10.; format start_dt end_dt mmddyy10.; datalines; Tom A 8/29/2013 9/28/2013 Tom A 10/3/2013 11/2/2013 Tom A 11/4/2013 12/4/2013 Tom A 12/7/2013 1/6/2014 Tom A 1/22/2014 8/26/2014 Tom B 8/29/2013 9/28/2013 Tom B 10/1/2013 11/25/2013 Tom B 12/7/2013 1/6/2014 Tom B 1/12/2014 2/11/2014 Tom B 2/16/2014 3/18/2014 Tom B 3/30/2014 4/29/2014 Tom B 5/13/2014 7/7/2014 Tom E 9/6/2013 11/2/2013 Tom E 2/7/2014 2/14/2014 Jerry C 8/5/2013 12/27/2013 Jerry C 1/2/2014 5/23/2014 Jerry C 5/29/2014 7/27/2014 Jerry D 3/21/2014 6/14/2014 Jerry D 6/16/2014 8/8/2014 Jerry E 6/7/2014 6/19/2014 Joe A 3/28/2016 5/27/2016 Joe A 6/2/2016 7/28/2016 Joe A 8/3/2016 9/2/2016 Joe A 9/7/2016 11/5/2016 Joe A 11/7/2016 12/7/2016 Joe A 12/12/2016 1/11/2017 Joe A 1/13/2017 2/12/2017 Joe A 2/15/2017 4/15/2017 Joe C 7/25/2016 8/24/2016 Joe C 8/26/2016 9/25/2016 Joe C 10/10/2016 12/7/2016 Joe C 12/12/2016 1/11/2017 Joe C 1/20/2017 3/22/2017 Joe D 3/28/2016 5/28/2016 Joe D 6/6/2016 11/5/2016 Joe D 11/7/2016 12/7/2016 Joe D 12/12/2016 4/22/2017 Joe E 5/12/2016 5/15/2016 Mary D 12/12/2016 4/22/2017 Mary A 5/12/2016 5/15/2016 ; /* get drug per date */ data useful; set have; do date=start_dt to end_dt; output; end; format date mmddyy10.; run; proc sort data=useful; by patient_id date; run; /* get a variable for each drug per day caution: if your data is large this may take awhile */ Proc transpose data=useful out=trans ; by patient_id date; var Medication; run; /* count by date*/ data daycount; set trans; Numdrugs= countw(catx(' ',of col:)); Flag_3_meds= (NumDrugs ge 3); run; /* summarize whether EACH patient ever had 3 or more */ proc summary data=daycount nway; class patient_id; var flag_3_meds; output out=want (drop=_:) max=; run; /* if you only want patients with 3 or more a slightly simpler problem */ Proc sql; create table wantalt as select distinct patient_id, 1 as Flag_3 from trans where not missing(col3) ; run;
Note that the Daycount data set can be used to determine
1) how many days were overlapped with 3 or more drugs
2) easily get 2, 4 or 5 overlaps following the similar logic
3) with a little extra work the specific combination (s) of overlap
If your data are already group by patient_id, then a two dimensional array (rows for medication, columns for dates) provides a neat way to record each medication history, and then search each date for 3 or more meds:
data have;
input Patient_ID $ Medication $ start_dt :mmddyy10. End_dt :mmddyy10.;
format start_dt end_dt mmddyy10.;
datalines;
Tom A 8/29/2013 9/28/2013
Tom A 10/3/2013 11/2/2013
Tom A 11/4/2013 12/4/2013
Tom A 12/7/2013 1/6/2014
Tom A 1/22/2014 8/26/2014
Tom B 8/29/2013 9/28/2013
Tom B 10/1/2013 11/25/2013
Tom B 12/7/2013 1/6/2014
Tom B 1/12/2014 2/11/2014
Tom B 2/16/2014 3/18/2014
Tom B 3/30/2014 4/29/2014
Tom B 5/13/2014 7/7/2014
Tom E 9/6/2013 11/2/2013
Tom E 2/7/2014 2/14/2014
Jerry C 8/5/2013 12/27/2013
Jerry C 1/2/2014 5/23/2014
Jerry C 5/29/2014 7/27/2014
Jerry D 3/21/2014 6/14/2014
Jerry D 6/16/2014 8/8/2014
Jerry E 6/7/2014 6/19/2014
Joe A 3/28/2016 5/27/2016
Joe A 6/2/2016 7/28/2016
Joe A 8/3/2016 9/2/2016
Joe A 9/7/2016 11/5/2016
Joe A 11/7/2016 12/7/2016
Joe A 12/12/2016 1/11/2017
Joe A 1/13/2017 2/12/2017
Joe A 2/15/2017 4/15/2017
Joe C 7/25/2016 8/24/2016
Joe C 8/26/2016 9/25/2016
Joe C 10/10/2016 12/7/2016
Joe C 12/12/2016 1/11/2017
Joe C 1/20/2017 3/22/2017
Joe D 3/28/2016 5/28/2016
Joe D 6/6/2016 11/5/2016
Joe D 11/7/2016 12/7/2016
Joe D 12/12/2016 4/22/2017
Joe E 5/12/2016 5/15/2016
Mary D 12/12/2016 4/22/2017
Mary A 5/12/2016 5/15/2016
;
%let mindate9=01jan2013;
%let maxdate9=31dec2017;
data want (keep=patient_id flag3_meds);
set have;
by patient_id notsorted;
array drg_dates{1:5,%sysevalf("&mindate9"d):%sysevalf("&maxdate9"d)} _temporary_;
retain min_date max_date;
if first.patient_id then call missing(of drg_dates{*},min_date,max_date);
row=indexc('ABCDE',medication);
min_date=min(min_date,start_dt);
max_date=max(max_date,end_dt);
do col=start_dt to end_dt;
drg_dates{row,col}=1;
end;
if last.patient_id;
flag3_meds=0;
do date=min_date to max_date until(flag3_meds=1);
flag3_meds=sum(0,drg_dates{1,date},drg_dates{2,date},drg_dates{3,date},drg_dates{4,date},drg_dates{5,date}) >=3;
end;
run;
Make sure to set the macrovars MINDATE9 and MAXDATE9 to cover the entire date range for your study.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.