Hi there,
I am working with pharmacy claim data, and would like to calculate how many consecutive days a person was on a prescription. The data I have is constructed by claim, so each claim corresponds to one prescription, if a patient get one drug on 10/21/2015 for 20 days of supply, then he got an refill on 11/10/2015 for 20 days, since there was no gap between the refill and the end date of last prescription, their days of supply would be summed up. Thus I would like to know how many consecutive days this person was on a prescription.
The data structure is something like this:
data temp;
input patid fill_dt : mmddyy10. days_sup;
format fill_dt mmddyy10.;
datalines;
1 10/21/2015 20
1 11/10/2015 20
1 11/30/2015 30
2 5/1/2013 20
2 5/20/2013 10
2 6/10/2013 3
;
run;
if a claim's end date which is defined as the fill_dt + days_sup is less than or equal to the next claim's fill_dt, I would like to sum their's days_sup together.
Therefore the desired output would be like:
1 10/21/2015 70
2 5/1/2013 30
2 6/10/2013 3
I have struggle with for a few days. Any help would be appreciated! Thanks!
This works for your example data:
proc sort data=temp; by patid fill_dt; run; data want; set temp; by patid fill_dt; retain series_date day_tot; format series_date mmddyy10.; if first.patid then do; series_date=fill_dt; day_tot=days_sup; end; else if fill_dt le (series_date+day_tot) then do; day_tot=day_tot+days_sup; end; else if fill_dt gt (series_date+day_tot) then do; output; series_date=fill_dt; day_tot=days_sup; end; if last.patid then do; output; call missing(series_date, day_tot); end; keep patid series_date day_tot; label series_date = 'First Fill date for sequence' day_tot = 'Total days supply for series' ; run;
BY processing allows you identify the first and last record for a patid value to set/reset accumulating variables. Retain says to keep the variable values across iterations of the data step.
Specific OUTPUT statements control when the data is actually written to the data set.
This works for your example data:
proc sort data=temp; by patid fill_dt; run; data want; set temp; by patid fill_dt; retain series_date day_tot; format series_date mmddyy10.; if first.patid then do; series_date=fill_dt; day_tot=days_sup; end; else if fill_dt le (series_date+day_tot) then do; day_tot=day_tot+days_sup; end; else if fill_dt gt (series_date+day_tot) then do; output; series_date=fill_dt; day_tot=days_sup; end; if last.patid then do; output; call missing(series_date, day_tot); end; keep patid series_date day_tot; label series_date = 'First Fill date for sequence' day_tot = 'Total days supply for series' ; run;
BY processing allows you identify the first and last record for a patid value to set/reset accumulating variables. Retain says to keep the variable values across iterations of the data step.
Specific OUTPUT statements control when the data is actually written to the data set.
Thank you ballardw. The solution works perfectly. I always have difficulty with retain and output statements, it is hard to get what I want by playing with them. Any references? Thanks!
data temp;
input patid fill_dt : mmddyy10. days_sup;
end_date=fill_dt+days_sup;
format fill_dt end_date mmddyy10.;
datalines;
1 10/21/2015 20
1 11/10/2015 20
1 11/30/2015 30
2 5/1/2013 20
2 5/20/2013 10
2 6/10/2013 3
;
run;
data temp;
set temp;
by patid;
if first.patid or fill_dt>lag(end_date) then group+1;
run;
data want;
set temp;
by group;
retain date;
if first.group then do;sum=0;date=fill_dt;end;
sum+days_sup;
if last.group;
format date mmddyy10.;
keep patid sum date;
run;
proc print;run;
Thank you Ksharp. The idea of creating another group indicator is fantastic!
data have;
input patid fill_dt : mmddyy10. days_sup;
format fill_dt mmddyy10.;
datalines;
1 10/21/2015 20
1 11/10/2015 20
1 11/30/2015 30
2 5/1/2013 20
2 5/20/2013 10
2 6/10/2013 3
;
run;
data temp;
set have;
by patid;
retain k;
if first.patid then do; grp=0;k=0;end;
if not(fill_dt <= k) then grp+1;
k=intnx('days',fill_dt,days_sup);
drop k;
run;
proc sql;
create table want as
select patid ,fill_dt,sum(days_sup) as sum
from temp
group by patid, grp
having fill_dt=min(fill_dt)
order by patid, fill_dt;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.