Hello All,
First of all I want to thank this forum and its members for helping out people like me who are new to SAS. So here I am again asking for help as I am stuck.
I have a claims dataset where I want to flag patients if they are taking 5 or more medications at any point of time.
I have:
PTID DRUG_CLASS DATE_FILLED DAYS_SUPPLY
1 A 14JUN2018 30
1 B 14JUN2018 45
1 C 30JUN2018 30
1 D 30JUN2018 30
1 E 30AUG2018 10
1 B 30AUG2018 30
1 C 07SEP2018 30
2 B 02JUN2018 45
2 F 02JUN2018 30
2 H 10JUN2018 15
2 C 10JUN2018 30
2 A 25JUN2018 15
2 H 25JUN2018 15
2 B 25JUN2018 30
2 A 10JUL2018 7
2 G 10JUL2018 30
2 H 10JUL2018 30
2 B 10JUL2018 30
2 H 10JUL2018 15
3 J 04JUN2018 10
3 B 04JUN2018 30
3 C 04JUN2018 30
3 J 10JUN2018 30
3 B 10JUN2018 10
3 C 10JUN2018 10
3 R 13JUL2018 15
3 K 13JUL2018 7
4 T 09JUL2018 90
4 K 09JUL2018 30
4 N 09JUL2018 30
5 B 22JUN2018 15
5 H 09JUL2018 30
5 J 09JUL2018 30
5 A 09JUL2018 45
5 B 22JUL2018 15
5 C 22JUL2018 30
5 H 08AUG2018 30
5 B 08AUG2018 15
5 N 10AUG2018 7
5 H 10SEP2018 30
5 B 10SEP2018 30
Also if any patient has filled the second prescription of the same class before his pills in hand are over we have to consider that fact as well. Like in this dataset patient id 3 filled drugs J, B and C for the second time before his in hands meds are over. So in JUNE 2018 he filled the prescription twice and therefore his J, B and C class of drugs will be over by 14th July 2018 and since he filled another two class of drugs on 13th of July 2018 therefore he should get flagged.
Hi @abhi309 ,
This is how I approached this problem
DATA have;
LENGTH PTID 3
DRUG_CLASS $1
DATE_FILLED 8
DAYS_SUPPLY 3;
FORMAT DATE_FILLED date9.;
INPUT PTID :1. DRUG_CLASS :$1. DATE_FILLED :date9. DAYS_SUPPLY ;
datalines;
1 A 14JUN2018 30
1 B 14JUN2018 45
1 C 30JUN2018 30
1 D 30JUN2018 30
1 E 30AUG2018 10
1 B 30AUG2018 30
1 C 07SEP2018 30
2 B 02JUN2018 45
2 F 02JUN2018 30
2 H 10JUN2018 15
2 C 10JUN2018 30
2 A 25JUN2018 15
2 H 25JUN2018 15
2 B 25JUN2018 30
2 A 10JUL2018 7
2 G 10JUL2018 30
2 H 10JUL2018 30
2 B 10JUL2018 30
2 H 10JUL2018 15
3 J 04JUN2018 10
3 B 04JUN2018 30
3 C 04JUN2018 30
3 J 10JUN2018 30
3 B 10JUN2018 10
3 C 10JUN2018 10
3 R 13JUL2018 15
3 K 13JUL2018 7
4 T 09JUL2018 90
4 K 09JUL2018 30
4 N 09JUL2018 30
5 B 22JUN2018 15
5 H 09JUL2018 30
5 J 09JUL2018 30
5 A 09JUL2018 45
5 B 22JUL2018 15
5 C 22JUL2018 30
5 H 08AUG2018 30
5 B 08AUG2018 15
5 N 10AUG2018 7
5 H 10SEP2018 30
5 B 10SEP2018 30
;
run;
PROC SORT Data=work.have;
BY PTID DRUG_CLASS DATE_FILLED;
run;
proc sql noprint;
select max (cnt)
into :g_maxCnt trimmed
from
(select count(*) as cnt
from work.have
group by PTID,DRUG_CLASS)
;
quit;
DATA want(KEEP=PTID DRUG_CLASS dt: sp: df: flag);
if (0) then SET work.have;
ARRAY dates {&g_maxCnt} 8 dt1 - dt&g_maxCnt;
ARRAY supplies {&g_maxCnt} 3 sp1 - sp&g_maxCnt;
ARRAY diffs {%eval(&g_maxCnt-1)} 4 df1 - df%eval(&g_maxCnt-1);
LENGTH flag 3;
FORMAT dt: date9.;
flag=0;
do _n_=1 by 1 until (last.drug_class);
SET work.have;
BY PTID DRUG_CLASS DATE_FILLED;
dates[_n_] = DATE_FILLED;
supplies[_n_] = DAYS_SUPPLY;
if(_n_ GT 1) then
do;
diffs[(_n_-1)] = dates[_n_] - dates[(_n_-1)];
if (flag NE 1) then
flag = ifn (0 < diffs[(_n_-1)] < supplies[(_n_-1)],1,0) ;
end;
end;
OUTPUT;
RUN;
PROC SORT Data=work.want;
BY PTID DRUG_CLASS dt1;
run;
proc sql;
select ptid, sum(flag)
from want
where flag = 1
group by ptid;
quit;
Hope this helps
How many records are you dealing with? One of the easiest to code and follow is to expand each prescription to one record per day based on the fill date and the duration then count by day the number of prescriptions. But if the data set is already large it will get pretty big as each of those is going to average multiplying the data set by about 30 times.
Also, I am not sure that I follow exactly how you are handling the overlap of the same med. Several of my medications are refilled 15 to 20 days before the prior fill is exhausted to make sure that I don't get a lapse in medication. So an "overlap" of those is actually more of an extension of the previous fill, at least in my sight.
Thank you @ballardw for your quick reply. The dataset is very big, so if I expand the dataset it will be very big.
You are right about the overlap. It does mean extension of the previous fill.
@ballardw I want to try and expand the dataset as you suggested. Can you please give me an idea how to do that? Thank you
Something like this will expand your data. I assumed that the Date_Filled will also be the first day of supply, tracked using Date_Now.
data want;
set have;
by PTID DRUG_CLASS;
format Date_Now date9.;
if first.DRUG_CLASS then do day = 1 to DAYS_SUPPLY;
Date_Now = Date_Filled + day -1;
output;
end;
run;
Thank you @SASKiwi. I expanded the dataset and transposed it. How can I flag the patients who are taking 5 or more med class for 180 days or more.
I wouldn't transpose it. I would do something like this to get the drug classes per day by patient:
proc sql;
create table want2 as
select PTID
,Date_Now
,count(distinct DRUG_CLASS) as DRUG_CLASS_Count
from want
group by PTID
,Date_Now
;
quit;
Hi @abhi309 ,
This is how I approached this problem
DATA have;
LENGTH PTID 3
DRUG_CLASS $1
DATE_FILLED 8
DAYS_SUPPLY 3;
FORMAT DATE_FILLED date9.;
INPUT PTID :1. DRUG_CLASS :$1. DATE_FILLED :date9. DAYS_SUPPLY ;
datalines;
1 A 14JUN2018 30
1 B 14JUN2018 45
1 C 30JUN2018 30
1 D 30JUN2018 30
1 E 30AUG2018 10
1 B 30AUG2018 30
1 C 07SEP2018 30
2 B 02JUN2018 45
2 F 02JUN2018 30
2 H 10JUN2018 15
2 C 10JUN2018 30
2 A 25JUN2018 15
2 H 25JUN2018 15
2 B 25JUN2018 30
2 A 10JUL2018 7
2 G 10JUL2018 30
2 H 10JUL2018 30
2 B 10JUL2018 30
2 H 10JUL2018 15
3 J 04JUN2018 10
3 B 04JUN2018 30
3 C 04JUN2018 30
3 J 10JUN2018 30
3 B 10JUN2018 10
3 C 10JUN2018 10
3 R 13JUL2018 15
3 K 13JUL2018 7
4 T 09JUL2018 90
4 K 09JUL2018 30
4 N 09JUL2018 30
5 B 22JUN2018 15
5 H 09JUL2018 30
5 J 09JUL2018 30
5 A 09JUL2018 45
5 B 22JUL2018 15
5 C 22JUL2018 30
5 H 08AUG2018 30
5 B 08AUG2018 15
5 N 10AUG2018 7
5 H 10SEP2018 30
5 B 10SEP2018 30
;
run;
PROC SORT Data=work.have;
BY PTID DRUG_CLASS DATE_FILLED;
run;
proc sql noprint;
select max (cnt)
into :g_maxCnt trimmed
from
(select count(*) as cnt
from work.have
group by PTID,DRUG_CLASS)
;
quit;
DATA want(KEEP=PTID DRUG_CLASS dt: sp: df: flag);
if (0) then SET work.have;
ARRAY dates {&g_maxCnt} 8 dt1 - dt&g_maxCnt;
ARRAY supplies {&g_maxCnt} 3 sp1 - sp&g_maxCnt;
ARRAY diffs {%eval(&g_maxCnt-1)} 4 df1 - df%eval(&g_maxCnt-1);
LENGTH flag 3;
FORMAT dt: date9.;
flag=0;
do _n_=1 by 1 until (last.drug_class);
SET work.have;
BY PTID DRUG_CLASS DATE_FILLED;
dates[_n_] = DATE_FILLED;
supplies[_n_] = DAYS_SUPPLY;
if(_n_ GT 1) then
do;
diffs[(_n_-1)] = dates[_n_] - dates[(_n_-1)];
if (flag NE 1) then
flag = ifn (0 < diffs[(_n_-1)] < supplies[(_n_-1)],1,0) ;
end;
end;
OUTPUT;
RUN;
PROC SORT Data=work.want;
BY PTID DRUG_CLASS dt1;
run;
proc sql;
select ptid, sum(flag)
from want
where flag = 1
group by ptid;
quit;
Hope this helps
Thank you @AhmedAl_Attar for your reply. I tried what you suggested but I didn't the desired result.
@abhi309 wrote:
Thank you @AhmedAl_Attar for your reply. I tried what you suggested but I didn't the desired result.
Then show the desired result for the given example data.
If you can't determine that, then maybe use a smaller example data set that you can show the result.
@ballardw I would have posted the results but the data I wanted was too complicated to post. Although I have made slight change in the method section of analysis, now what I want with the same data is, to flag the patient if he is taking 5 or more med class for 90 days or more.
So the data I got so far is;
PTID DRUG_CLASS TOTAL_DAYSUPPLY
1 A 30
1 B 75
1 C 60
1 D 30
1 E 10
2 A 22
2 B 105
2 C 30
2 F 30
2 G 30
2 H 75
3 B 40
3 C 40
3 J 40
3 K 7
3 R 15
4 K 30
4 N 30
4 T 90
5 A 45
5 B 75
5 C 30
5 H 90
5 J 30
5 N 7
6 A 45
6 B 90
6 C 90
6 D 120
6 E 90
6 F 200
I have added one more patient (PT ID: 6). I have calculated total number of supply days for each medication class. Now I want to flag the patients who are taking 5 or more drugs for 90 or more days. Like in the above data only patient 6 will be flagged since he has filled B, C and E for 90 days and drug D and F for 120 and 200 days respectively. Other patients don't get flagged.
Hope this clarifies what I want.
Thank you again everyone
With that data, if I understand what you want which is questionable, this may help.
data have; input PTID $ DRUG_CLASS $ TOTAL_DAYSUPPLY; datalines; 1 A 30 1 B 75 1 C 60 1 D 30 1 E 10 2 A 22 2 B 105 2 C 30 2 F 30 2 G 30 2 H 75 3 B 40 3 C 40 3 J 40 3 K 7 3 R 15 4 K 30 4 N 30 4 T 90 5 A 45 5 B 75 5 C 30 5 H 90 5 J 30 5 N 7 6 A 45 6 B 90 6 C 90 6 D 120 6 E 90 6 F 200 ; proc summary data=have nway; class ptid DRUG_CLASS; var TOTAL_DAYSUPPLY; output out=summed (drop=_: where=(TOTAL_DAYSUPPLY ge 90)) sum=; run; proc sql; create table want as select distinct ptid from summed group by ptid having count(DRUG_CLASS) ge 5 ; quit;
You probably should look at the Summed set created by proc summary. It is getting the total of the days supply for each ptid and drug_class. The Where is a data set option on the out= bit that selects only the drugs with 90+ days total.
From that point there are several ways to see if the count of Drug_class is 5 or greater.
If you can't create a data step please at least paste the data as text into a text box created by using the </> icon and pasting text.
Thank you @ballardw this works.
Sorry I accepted the other reply as solution by mistake. I don't know how to change that.
Thanks again
Hi @abhi309
This will give you what you are looking for
data have;
input PTID $ DRUG_CLASS $ TOTAL_DAYSUPPLY;
datalines;
1 A 30
1 B 75
1 C 60
1 D 30
1 E 10
2 A 22
2 B 105
2 C 30
2 F 30
2 G 30
2 H 75
3 B 40
3 C 40
3 J 40
3 K 7
3 R 15
4 K 30
4 N 30
4 T 90
5 A 45
5 B 75
5 C 30
5 H 90
5 J 30
5 N 7
6 A 45
6 B 90
6 C 90
6 D 120
6 E 90
6 F 200
;
run;
proc sql;
create table want as
select agg.ptid
, agg.drug_cnt
, agg._90Plus_cnt
from
/* in-line aggregation */
(select ptid
, count(*) as drug_cnt
, sum(CASE WHEN TOTAL_DAYSUPPLY >= 90 then 1 else 0 end) as _90Plus_cnt
from have
group by ptid) agg
/* Desired filtering condition */
where agg.drug_cnt GE 5
and agg._90Plus_cnt GE 5
order by ptid;
quit;
Thank you @AhmedAl_Attar this works. I really appreciate your time and help.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.