I am working with data on prescription drugs (drug-level data) with start and end dates for prescriptions for large number of individuals. For each individual in drug, say, XYZ, I want to identify the first date when they had filled prescriptions for at least 40 days within 80 days time-frame without a gap of more than 20 days (allowable gap of 20 days). Below is a part of the data.
data drug_dates;
input id $ start_dt end_dt;
informat start_dt yymmdd10.
end_dt yymmdd10.;
format start_dt yymmdd10.
end_dt yymmdd10.;
datalines;
A 2017/01/01 2017/01/10
A 2017/01/16 2017/02/04
A 2017/03/02 2017/03/11
A 2017/04/01 2017/04/15
A 2017/05/06 2017/05/15
A 2017/05/26 2017/06/14
;
run;
Below is the logic I tried to use. Basically, I expanded all the prescriptions vertically, with each row for each day, and tried to identify the gaps. But I still cannot identify the first date which indicates that they have reached 40 days of use in 80 days without gap of more than 20 days. I am not sure how to do an iterative sum across rows. Thank you!
data drug_dates1;
set drug_dates;
do each_day=start_dt to end_dt;
output;
end;
format each_day yymmdd10.;
run;
proc sql;
create table all_dates as
select distinct id, min(start_dt) as min_start format yymmdd10., max(end_dt) as max_end format yymmdd10.
from drug_dates
group by id;
quit;
data all_dates1;
set all_dates;
do each_day=min_start to max_end;
output;
end;
format each_day yymmdd10.;
run;
proc sql;
create table combined as
select *
from all_dates1 a left join drug_dates1 b
on a.id=b.id and a.each_day=b.each_day
order by id, each_day;
quit;
You may want to introduce some windowing variables, then do selection based on their values. In this example, these variables are called within80days and within40days.
proc sql;
select *,
case
when -80 <= end_dt - start_dt <= 80
then 'Yes'
else 'No'
end as within80days,
case
when -40 <= end_dt - start_dt <= 40
then 'Yes'
else 'No'
end as within40days
from drug_dates
;
quit;
Reference
Jack Shostak, SAS Programming in the Pharmaceutical Industry, 2nd Ed.
Equivalent to
-80 <= end_dt - start_dt <= 80
for symmetric range is
abs(end_dt-start_dt) le 80
I often miss negative signs
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.