Hi All . I ma trying to find the total number of patient who are on dual or triple drugs on the same time.
Here is my data
patient id rx_date drug name Rx_end_date
1 3/1/2003 a 3/31/2003
1 3/28//2003 b 4/28/2003
1 2/1/2003 c 7/4/2008
1 4/4/2005 a 5/4/2005
2 1/1/2004 a 2/1/2004
2 1/12/2004 b 2/12/2004
2 2/2/2009 a 3/2/2009
2 2/5/2009 b 3/5/2009
3 1/1/2004 d 2/1/2004
3 6/6/2004 d 8/6/2004
3 6/12/2004 b 7/12/2004
3 6/29/2004 a 7/29/2004
I want the following output considering that the deifnition of dual, triple, or more therapy are those with an rx_Date that occur within 30 days of the other therapy.
% of monotherapy, % of dual, % of truple, % of 3 or more
Hi @lillymaginta,
I've developed a program based on the following assumptions:
/* Create test data */
data have;
input id rx_date :mmddyy. drug $ Rx_end_date :mmddyy.;
format rx: mmddyy10.;
cards;
1 3/1/2003 a 3/31/2003
1 3/28/2003 b 4/28/2003
1 2/1/2003 c 7/4/2008
1 4/4/2005 a 5/4/2005
2 1/1/2004 a 2/1/2004
2 1/12/2004 b 2/12/2004
2 2/2/2009 a 3/2/2009
2 2/5/2009 b 3/5/2009
3 1/1/2004 d 2/1/2004
3 6/6/2004 d 8/6/2004
3 6/12/2004 b 7/12/2004
3 6/29/2004 a 7/29/2004
4 1/1/2004 e 1/26/2004
4 2/15/2004 b 3/1/2004
4 2/15/2004 c 4/10/2004
4 4/1/2004 d 4/1/2004
4 6/15/2004 a 6/18/2004
4 6/1/2004 b 6/18/2004
4 4/12/2004 e 7/2/2004
4 6/16/2004 c 8/2/2004
5 1/1/2004 e 1/26/2004
5 2/15/2004 e 3/1/2004
5 2/15/2004 e 4/10/2004
5 4/1/2004 e 4/1/2004
5 6/15/2004 e 6/18/2004
5 6/1/2004 e 6/18/2004
5 4/12/2004 e 7/2/2004
5 6/16/2004 e 8/2/2004
;
%let posttrt=30; /* post-treatment phase in days after last drug administration */
/* Create list of distinct drug names */
proc sql;
create table druglist as
select distinct drug as start
from have;
quit;
%let drugs=&sqlobs;
%put ***** Input dataset contains &drugs different drugs. *****;
/* Create informat DRUG. for the assignment "drug name --> sequential number"
and format MULTDRUG. for the final frequency table */
data infmt;
retain fmtname '@drug';
set druglist;
label=_n_;
run;
proc format cntlin=infmt;
value multdrug
1 = 'monotherapy'
2 = 'dual therapy'
3 = 'triple therapy'
4-high = '>3-drugs therapy'
;
run;
/* Create dataset reflecting the chronological course of medication */
data timebar;
set have(keep=id rx_date drug rename=(rx_date=date))
have(keep=id rx_end_date drug in=e);
by id;
if e then date=rx_end_date+&posttrt+1;
flag=1-2*e; /* flag=1 for drug start, flag=-1 for first day after post-trt. phase */
keep id date drug flag;
run;
proc sort data=timebar;
by id date;
run;
/* Compute cumulative counts per drug, aggregated on patient-date level */
data counts;
array dcount[&drugs] 3;
do until(last.date);
set timebar;
by id date;
if first.id then call missing(of dcount[*]);
dcount[input(drug, drug.)]+flag;
maxflag=max(maxflag, flag);
end;
keep id date dcount: maxflag;
run;
/* Determine maximum number of concomitant drugs per patient */
data maxcounts;
do until(last.id);
set counts(where=(maxflag>0)); /* Only days with a drug added are */
by id; /* candidates for the maximum! */
array dcount[&drugs];
do i=1 to &drugs;
dcount[i]=(dcount[i]>0); /* count the same drug only once */
end;
ndrugs=max(ndrugs, sum(of dcount[*]));
end;
keep id ndrugs;
label ndrugs = 'Maximum number of concomitant drugs';
run;
/* Count patients with (max.) monotherapy, dual therapy, etc. */
proc freq data=maxcounts;
format ndrugs multdrug.;
tables ndrugs;
run;
Please feel free to suggest any changes.
You say:
"the deifnition of dual, triple, or more therapy are those with an rx_Date that occur within 30 days of the other therapy."
Does your "supply" correspond to number of days of medication or number of pills. If the number of pills is it to be considered at all?
Does the duration of overlap matter? For instance your ID has dates of 2/1/2003 which is within 30 days (2 day overlap) of 3/1/2003 which is also within 30 days of 3/28/2003. Is that 2 "dual" or a single "triple".
Are multiple instances of the same "dual" therapy counted. Suppose Patient x is prescribed drugs A and B on 3/1/2003, 3/31/2003, 4/29/2003. Would that be a count of 3 instances of dual therapy?
How big is your data? This is a 'medicine' cabinet problem partially.
If its too big my solution may be cumbersome so it's important to know.
Hi yes please post it, my data is huge with thousands of records.
Your question is ambiguous. Can you explain why your output would end up with that ?
Sorry for the confusion, I may not be asing this in the right way. I will need need the % of people on mono, dual, and triple therapy.
Hi @lillymaginta,
I've developed a program based on the following assumptions:
/* Create test data */
data have;
input id rx_date :mmddyy. drug $ Rx_end_date :mmddyy.;
format rx: mmddyy10.;
cards;
1 3/1/2003 a 3/31/2003
1 3/28/2003 b 4/28/2003
1 2/1/2003 c 7/4/2008
1 4/4/2005 a 5/4/2005
2 1/1/2004 a 2/1/2004
2 1/12/2004 b 2/12/2004
2 2/2/2009 a 3/2/2009
2 2/5/2009 b 3/5/2009
3 1/1/2004 d 2/1/2004
3 6/6/2004 d 8/6/2004
3 6/12/2004 b 7/12/2004
3 6/29/2004 a 7/29/2004
4 1/1/2004 e 1/26/2004
4 2/15/2004 b 3/1/2004
4 2/15/2004 c 4/10/2004
4 4/1/2004 d 4/1/2004
4 6/15/2004 a 6/18/2004
4 6/1/2004 b 6/18/2004
4 4/12/2004 e 7/2/2004
4 6/16/2004 c 8/2/2004
5 1/1/2004 e 1/26/2004
5 2/15/2004 e 3/1/2004
5 2/15/2004 e 4/10/2004
5 4/1/2004 e 4/1/2004
5 6/15/2004 e 6/18/2004
5 6/1/2004 e 6/18/2004
5 4/12/2004 e 7/2/2004
5 6/16/2004 e 8/2/2004
;
%let posttrt=30; /* post-treatment phase in days after last drug administration */
/* Create list of distinct drug names */
proc sql;
create table druglist as
select distinct drug as start
from have;
quit;
%let drugs=&sqlobs;
%put ***** Input dataset contains &drugs different drugs. *****;
/* Create informat DRUG. for the assignment "drug name --> sequential number"
and format MULTDRUG. for the final frequency table */
data infmt;
retain fmtname '@drug';
set druglist;
label=_n_;
run;
proc format cntlin=infmt;
value multdrug
1 = 'monotherapy'
2 = 'dual therapy'
3 = 'triple therapy'
4-high = '>3-drugs therapy'
;
run;
/* Create dataset reflecting the chronological course of medication */
data timebar;
set have(keep=id rx_date drug rename=(rx_date=date))
have(keep=id rx_end_date drug in=e);
by id;
if e then date=rx_end_date+&posttrt+1;
flag=1-2*e; /* flag=1 for drug start, flag=-1 for first day after post-trt. phase */
keep id date drug flag;
run;
proc sort data=timebar;
by id date;
run;
/* Compute cumulative counts per drug, aggregated on patient-date level */
data counts;
array dcount[&drugs] 3;
do until(last.date);
set timebar;
by id date;
if first.id then call missing(of dcount[*]);
dcount[input(drug, drug.)]+flag;
maxflag=max(maxflag, flag);
end;
keep id date dcount: maxflag;
run;
/* Determine maximum number of concomitant drugs per patient */
data maxcounts;
do until(last.id);
set counts(where=(maxflag>0)); /* Only days with a drug added are */
by id; /* candidates for the maximum! */
array dcount[&drugs];
do i=1 to &drugs;
dcount[i]=(dcount[i]>0); /* count the same drug only once */
end;
ndrugs=max(ndrugs, sum(of dcount[*]));
end;
keep id ndrugs;
label ndrugs = 'Maximum number of concomitant drugs';
run;
/* Count patients with (max.) monotherapy, dual therapy, etc. */
proc freq data=maxcounts;
format ndrugs multdrug.;
tables ndrugs;
run;
Please feel free to suggest any changes.
Hi FreelanceReinhard, thank you for taking the time to answer my question! Your assumptions fit my data. However, the same patient can be mono, dual, triple as the duration of the study period is long. So they can be counted in each of the category if they fit under the definition for any period.
But this is a very helpful start point, ThankS!!!
No problem. The code below would replace the last two steps in the previous program in order to match your latest requirements. (The formatting of the last table should be improved, though, e.g., display proportions as percentages.) Please note that, with this code, patients with no drugs would be disregarded in the calculation of percentages.
/* Determine number of concomitant drugs per patient and date */
data numdrugs;
do until(last.id);
set counts;
by id;
array dcount[&drugs];
do i=1 to &drugs;
dcount[i]=(dcount[i]>0); /* count the same drug only once */
end;
ndrugs=sum(of dcount[*]);
if ndrugs then output;
end;
keep id date ndrugs;
label ndrugs = 'Number of concomitant drugs';
run;
/* Assign categories "mono", "dual", ... therapy */
data categ;
mono=0; dual=0; triple=0; many=0;
do until(last.id);
set numdrugs;
by id;
select(ndrugs);
when(1) mono=1;
when(2) dual=1;
when(3) triple=1;
otherwise many=1;
end;
end;
drop date ndrugs;
run;
/* Count patients with monotherapy, dual therapy, etc. */
title 'Number ("Sum") and proportion ("Mean") of patients with mono, dual, etc. therapy';
proc means data=categ sum mean;
var mono dual triple many;
run;
Thanks FreelanceReinhard, just to make sure I understand your coding correctly. To find the rx_end date I added rx_start + supply and this is how I assigned the end date. In this case the 30 days extra will not be needed. The code you provided would find the frequency of mono, dual even if the overlap was one day, would it be possible to set the overlap period to 30 days? in this case count those as dual if the two drugs were taken for 30 days together?
One more, would it be possible in an extra step to list the drugs names that were taken as dual?
@lillymaginta wrote:
Thanks @FreelanceReinh, just to make sure I understand your coding correctly. To find the rx_end date I added rx_start + supply and this is how I assigned the end date. In this case the 30 days extra will not be needed.
This is no problem. You can simply set macro variable POSTTRT to 0
%let posttrt=0;
or just delete it from the only place in the program where it occurs:
if e then date=rx_end_date+1;
I almost expected that the 30 days were already meant to be included in your end dates, but there were two observations in your sample data where this did not seem to be the case:
2 2/2/2009 a 3/2/2009
2 2/5/2009 b 3/5/2009
The difference between these dates is only 28 days.
@lillymaginta wrote:
The code you provided would find the frequency of mono, dual even if the overlap was one day, would it be possible to set the overlap period to 30 days? in this case count those as dual if the two drugs were taken for 30 days together?
I think, this would be fairly complicated with the current approach, especially in view of your next change request: "would it be possible in an extra step to list the drugs names that were taken as dual?" Presumably, the overlaps would need to be measured separately for each combination of two (and three, ...) drugs, because only then you could say "xxx% of the patients were [for at least 30 days] on 'a+b' dual therapy, yyy% were on 'a+c' dual therapy, zzz% were on 'a+b+c' triple therapy, ..." and so on.
Let's assume that 8 drugs were involved for one patient. This would mean 28 different potential dual therapies and 56 different potential triple therapies. The report would need to explain whether a 40-day 'a+b' dual therapy was counted as such if these 40 days included 30 with 'a+b+c' triple therapy. Also, a patient might have had separate phases of, say, 20 days of 'a+b' dual therapy. Would these be added? Maybe depending on how far they were apart?
You see, even writing the specifications would require considerable thought. Definitely, this new set of specifications would be worth a separate thread (if not a service contract) and more precise and complete information (e.g. the total number of different drugs to be considered).
I actually have a total of 10 med types. So drugs can be from a-j.
i want it to be strictly 30 ore more days overlap so those on 20 will be counted mono and those on 40 days overlap would be counted dual ( they are 30 or more).
Yes, after finding those on dual, I will need to know that out of dual therapy: a+b %, a+c %. At leat I would appreciate some insights on how to find those that overlap.
Agree it is very complicated, been stuck for a week, that's why am here 🙂
I'm glad to see that ballardw helped you with this in the other thread.
Thank you FreelanceReinhard for all the help and for all the suggestions! 5 stars!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.