Hello SAS experts,
I would like to request for a help to calculate number of days covered & 80% adherence for multiple prescriptions for a patient . Use the Drug ID field of NDC codes to determine if the prescriptions are the same or different. Here are different scenarios I need to capture. Any help would be greatly appreciated.
a) If multiple prescriptions for different medications are dispensed on the same day, calculate the number of days covered by a medication using the prescriptions with the longest days supply.
b) For multiple different prescriptions dispensed on different days with overlapping days supply, count
each day in the treatment period only once .
c) If multiple prescriptions for the same medication are dispensed on the same day
or on different days, sum the days supply and use the total to calculate the number of days covered by a medication . For example, three prescriptions for the same medication are dispensed on the same day, each with a 30-day supply. Sum the days supply for a total of 90 days covered by the medication. Subtract any days supply that extends beyond December 31 of the measurement year.
Here is the sample data:
Mbrid NDC DOS supply drug_id
3A790 551110 02/10/2016 90 d04105
3A790 651112 02/10/2016 60 d04106
3A790 012390 05/20/2016 90 d04107
3A790 123456 06/20/2016 90 d04108
3A790 581111 08/24/2016 90 d04109
3A790 581113 08/24/2016 90 d04109
3A790 591114 09/24/2016 90 d04109
3A790 591115 10/16/2016 90 d04109
Thanks,
KK
You might want to share what the desired output would look like for your example input data. And what your individual variables mean.
I guess that mbrid is something like "member id" and you are looking for a report on each individual mbrid?
Is "measurement year" the same as the year of DOS?
You don't describe what 80% adherence would be as far as I can see.
So you only care about a single year? What about drugs issued near the end of the prior year? Is that data included?
First filter out records with no supply in 2016 and sort them by mbrid drugid and dos. This means you can capture reissue of a drug in consecutive records and see whether the supply overlaps:
data have;
input Mbrid :$5. NDC DOS :mmddyy10. supply drug_id :$6.;
format dos date9.;
datalines;
3A790 551110 02/10/2016 90 d04105
3A790 651112 02/10/2016 60 d04106
3A790 012390 05/20/2016 90 d04107
3A790 123456 06/20/2016 90 d04108
3A790 581111 08/24/2016 90 d04109
3A790 581113 08/24/2016 90 d04109
3A790 591114 09/24/2016 90 d04109
3A790 591115 10/16/2016 90 d04109
run;
%let refyear=2016;
%let lobound=%sysfunc(inputn(01jan&refyear,date9.));
%let upbound=%sysfunc(inputn(31dec&refyear,date9.));
%let caldays=%eval(1+&upbound-&lobound);
%put _user_;
proc sort ;
by mbrid drug_id dos;
where dos<"31dec&refyear"d and dos+supply>="01jan&refyear"d;
run;
The LOBOUND and UPBOUND values are being generated as the numeric values for 01jan2016 and 31dec2016, to use latter as array bounds. The array COVDATES will therefore be indexed on date value:
data want (keep mbrid pdc);
array covdates{&lobound:&upbound} _temporary_;
set have;
by mbrid drug_id;
if first.mbrid then call missing(of covdates{*});
possible_start=lag(dos)+lag(supply);
if first.drug_id then start_date=dos;
else start_date=max(dos,possible_start);
end_date=(start_date+supply-1);
start_date=max(start_date,&lobound);
end_date=min(end_date,&upbound);
do d=start_date to end_date;
covered{d}=1;
end;
if last.mbrid;
pdc=sum(of covered{*})/dim(covered);
run;
A start date and end date have to be determined for each supply. If it's the first record for a given drug the startdate is DOS and enddate is DOS+SUPPLY. But later records have to be examined to see by how much, if any, overlap there is with preceding records for the same drug:
data want (keep=mbrid pdc ncovered);
array covdates{&lobound:&upbound} _temporary_;
retain end_date;
set have;
by mbrid drug_id;
if first.mbrid then call missing(of covdates{*},end_date);
if first.drug_id then start_date=dos;
else start_date=max(dos,end_date+1);
end_date=start_date+supply-1;
do d=max(start_date,"01jan&refyear"d) to min(end_date,"31dec&refyear"d);
covdates{d}=1;
end;
if last.mbrid;
ncovered=sum(of covdates{*});
pdc=ncovered/&caldays;
run;
Notice the "do d=" loop only coveres the 2016 portion of start_date to end_date.
Great! Thanks a lot mkeintz . I really appreciate your quick response. I will test run the code against the real data and let you know. Thanks a lot again.
Hi Mkeintz,
Could you please help me to tweak your code if the "have" dataset has index_date and end_date. I need PDC between index_date and end_date. Also can you please tell me where you are looking for same or different medications? I need to find if multiple prescriptions for different medications are dispensed on the same day, calculate the number of days covered by a medication using the prescriptions with the longest days supply, For multiple different prescriptions dispensed on different days with overlapping days supply, count each day in the treatment period only once. If multiple prescriptions for the same medication are dispensed on the same dayor on different days, sum the days supply and use the total to calculate the number of days covered Thank you in advance for your response.
data have;
input Mbrid :$5. NDC DOS :mmddyy10. supply drug_id :$6. index_date :mmddyy10. end_date :mmddyy10.;
format dos index_date end_date date9.;
datalines;
3A790 551110 02/10/2016 90 d04105 02/10/2016 12/31/2016
3A790 651112 02/10/2016 60 d04106 02/10/2016 12/31/2016
3A790 012390 05/20/2016 90 d04107 02/10/2016 12/31/2016
3A790 123456 06/20/2016 90 d04108 02/10/2016 12/31/2016
3A790 581111 08/24/2016 90 d04109 02/10/2016 12/31/2016
3A790 581113 08/24/2016 90 d04109 02/10/2016 12/31/2016
3A790 591114 09/24/2016 90 d04109 02/10/2016 12/31/2016
3A790 591115 10/16/2016 90 d04109 02/10/2016 12/31/2016
run;
I
Because the data are PROC SORTed by MBRID DRUG_ID DOS, all records for the same drug are consecutive and in chronological order. If you study use "FIRST." automatic dummy variables that arises from use of the BY statement in the data step, you'll see it is easy to determine whether the record-in-hand is the first record for a given drug, or not. If it's the first, I use DOS as startdate. But if not I compare DOS to the prior (DOS+SUPPLY) to see whether there is an overlap. If there is, then I move startdate to the maximum of the current DOS vs (1+prior end_date).
Why do you think I am double counting a day with multiple drugs? Did you simulate some test data and look at the results? If not I recommend you create a dataset having 2 records with the same date range but different drugs. Run it through the program and see whether it provides what you requested. My understanding of your request was the you wanted to count a day as covered if it had ANY drug.
Similarly, for cases with 2 records of the same drug on the same DOS, do you believe that the program fails to recognize the increased supply, and therefore extends the days covered? Again, make a test data set, run the program to test your assumption. It will help you understand the code.
If any of these tests do not produce what you request, then provide a dataset of the incoming data, and a dataset of the desired resulting data. That will make it clear to me.
Finally, instead of 01jan2016 through 31dec2016, you want a different coverage period: index_date through end_date (not to be confused with the end_date I created in my proposed solution). Are those values constant over the entire data set? If so, just change the Macrovariables LOBOUND and UPBOUND. And the proc sort, can
use a WHERE statement of
where dos<&upbound and dos+supply>=&lobound;
Use of &refyear is no longer useful. You'll also have to change the bounds of the DO loop in the data step, but I leave it to you to determine the appropriate "tweak".
You should be able to determine how to change the remainder of the program
Thanks MKeintz. I will take a deeper dibe into the code and test it. One more clarification please. I noticed that you created 2 want datasets with PDC calculation in each of them. Are you showing 2 different methods of PDC calculation ?
That was just sloppy editing on my part. They are a single technique, and I should have shown just one example.
Use the second if you want the NCOVERED variable as well as PDC.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.