Hello,
I am a bit new to SAS and any help with my issue would be of great help.
I have a dataset organized as below:
ID | DRUG | START_DT | DAYS_SUPP | END_DT |
1 | A | 2/17/10 | 30 | 3/19/10 |
1 | B | 5/6/09 | 30 | 6/5/09 |
1 | C | 7/9/11 | 60 | 9/7/11 |
1 | E | 3/1/10 | 90 | 5/30/10 |
1 | B | 1/1/09 | 90 | 4/1/09 |
1 | D | 2/1/09 | 30 | 3/3/09 |
1 | C | 5/6/12 | 90 | 8/4/12 |
2 | B | 4/1/12 | 60 | 5/31/12 |
2 | A | 7/1/10 | 30 | 7/31/10 |
2 | C | 8/3/10 | 90 | 11/1/10 |
2 | D | 11/1/13 | 90 | 1/30/14 |
2 | E | 12/5/13 | 90 | 3/5/14 |
2 | A | 2/1/11 | 90 | 5/2/11 |
I would like to identify individuals that are using two or more different drugs at the same time for at least one month. Use of drug A (or B) may begin before (or after) use of drug B (or A); also, duration of drug A can fall completely within the duration of drug B (and vice versa).
If possible, I would also like to keep track of the different types of drug combinations (e.g. AB, CD, ABCE etc).
If a person has used two or more drugs concurrently for at least one month, I would then like to capture that first date when this criteria is met (to merge with another type of dataset).
Any help is much appreciated!
Thank you very much!
I think I can handle all cases with :
data have;
input ID DRUG $ START_DT :mmddyy. DAYS_SUPP END_DT :mmddyy.;
datalines;;
1 A 2/17/10 30 3/19/10
1 B 5/6/09 30 6/5/09
1 C 7/9/11 60 9/7/11
1 E 3/1/10 90 5/30/10
1 B 1/1/09 90 4/1/09
1 D 2/1/09 30 3/3/09
1 C 5/6/12 90 8/4/12
2 B 4/1/12 60 5/31/12
2 A 7/1/10 30 7/31/10
2 C 8/3/10 90 11/1/10
2 D 11/1/13 90 1/30/14
2 E 12/5/13 90 3/5/14
2 A 2/1/11 90 5/2/11
2 F 12/16/13 30 1/14/14
;
data days;
set have;
dum = 1;
do day = start_dt to end_dt;
output;
end;
format day yymmdd10.;
keep id drug day;
run;
proc sort data=days; by id day drug; run;
data dayTable;
array d{32} $10;
length drugList $64;
do n = 1 by 1 until(last.day);
set days; by id day;
d{n} = drug;
end;
do nDrugs = 2 to n;
ncomb = comb(n, nDrugs);
do j = 1 to ncomb;
rc = lexcomb(j, nDrugs, of d{*});
call missing(drugList);
do k = 1 to nDrugs;
drugList = catx("-", drugList, d{k});
end;
output;
end;
end;
keep id day nDrugs drugList;
run;
proc sort data=dayTable; by id nDrugs drugList day; run;
data dayTableSeq;
retain seq;
set dayTable; by id nDrugs drugList;
if first.drugList or day-1 ne lag(day) then seq+1;
run;
data want;
length id nDrugs 8 drugList $64 duration startDate endDate 8;
retain startDate;
set dayTableSeq; by id nDrugs drugList seq;
if first.seq then startDate = day;
if last.seq then do;
endDate = day;
duration = endDate - startDate + 1;
/*if duration >= 28 then*/ output;
end;
format startDate endDate yymmdd10.;
keep id nDrugs drugList startDate endDate duration;
run;
I added a prescription to drug F to generate a 3-drugs case. Note that all sub-combinations are also listed, i.e. if drug combination D-E-F exists then so will combinations D-E, D-F, and E-F with durations equal or longer than the duration of D-E-F.
Post the expected output based on your sample data.
I think I can handle all cases with :
data have;
input ID DRUG $ START_DT :mmddyy. DAYS_SUPP END_DT :mmddyy.;
datalines;;
1 A 2/17/10 30 3/19/10
1 B 5/6/09 30 6/5/09
1 C 7/9/11 60 9/7/11
1 E 3/1/10 90 5/30/10
1 B 1/1/09 90 4/1/09
1 D 2/1/09 30 3/3/09
1 C 5/6/12 90 8/4/12
2 B 4/1/12 60 5/31/12
2 A 7/1/10 30 7/31/10
2 C 8/3/10 90 11/1/10
2 D 11/1/13 90 1/30/14
2 E 12/5/13 90 3/5/14
2 A 2/1/11 90 5/2/11
2 F 12/16/13 30 1/14/14
;
data days;
set have;
dum = 1;
do day = start_dt to end_dt;
output;
end;
format day yymmdd10.;
keep id drug day;
run;
proc sort data=days; by id day drug; run;
data dayTable;
array d{32} $10;
length drugList $64;
do n = 1 by 1 until(last.day);
set days; by id day;
d{n} = drug;
end;
do nDrugs = 2 to n;
ncomb = comb(n, nDrugs);
do j = 1 to ncomb;
rc = lexcomb(j, nDrugs, of d{*});
call missing(drugList);
do k = 1 to nDrugs;
drugList = catx("-", drugList, d{k});
end;
output;
end;
end;
keep id day nDrugs drugList;
run;
proc sort data=dayTable; by id nDrugs drugList day; run;
data dayTableSeq;
retain seq;
set dayTable; by id nDrugs drugList;
if first.drugList or day-1 ne lag(day) then seq+1;
run;
data want;
length id nDrugs 8 drugList $64 duration startDate endDate 8;
retain startDate;
set dayTableSeq; by id nDrugs drugList seq;
if first.seq then startDate = day;
if last.seq then do;
endDate = day;
duration = endDate - startDate + 1;
/*if duration >= 28 then*/ output;
end;
format startDate endDate yymmdd10.;
keep id nDrugs drugList startDate endDate duration;
run;
I added a prescription to drug F to generate a 3-drugs case. Note that all sub-combinations are also listed, i.e. if drug combination D-E-F exists then so will combinations D-E, D-F, and E-F with durations equal or longer than the duration of D-E-F.
the codes run with error.
What do you mean by "the same time"? you mean same START_DT ? data have; input ID DRUG $ START_DT :mmddyy. DAYS_SUPP END_DT :mmddyy.; datalines;; 1 A 2/17/10 30 3/19/10 1 B 5/6/09 30 6/5/09 1 C 7/9/11 60 9/7/11 1 E 3/1/10 90 5/30/10 1 B 1/1/09 90 4/1/09 1 D 2/1/09 30 3/3/09 1 C 5/6/12 90 8/4/12 2 B 4/1/12 60 5/31/12 2 A 7/1/10 30 7/31/10 2 C 8/3/10 90 11/1/10 2 D 11/1/13 90 1/30/14 2 E 12/5/13 90 3/5/14 2 A 2/1/11 90 5/2/11 2 F 12/16/13 30 1/14/14 ; proc sql; select * from have where days_supp gt 30 group by id,start_dt having count(distinct drug) ge 2; quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.