Hi all, I made a simple sample and hope this explains my question clearly. What I want to do is calculate the proportion of days covered by medication. The data set "Have" contains end_date: the end date of the medication observation period. stat_dt: the start date of the medication observation period. fill_dt: when patients refill their drug. supply_days: supply days of each fill. days_in: I calculate this by "end_date" - "start_date", is the observations period in days between end date and start date. It is obvious each patient will have different observation length. Firstly I transposed "Have" twice to a wide format with each row representing one patient, " have_fill_dates" transposed fill date and "have_days_supply" transposed supply days, then I merged two data sets and created many dummy days representing each day in observation period. Then I got the problem, how to specific different length for each patient in array statement? I used "####" to represent the problems need fix. The first reason I create many single dummy days during the observation period rather than just sum up supply days and then divide by observation period, which is much more convenient than this array is that I will later need to calculate proportion by month and by quarter, I need these dummy days. the second reason is that there are some overlaps, such as the second observation of stuyd_id 2, he refilled his drugs on " 09/08/2016" which is within 30 days after previous refill date "08/22/2016", and I need to move "09/08/2016" to "09/22/2016" to then assume patients refill their drugs right after exhausting previous fill. My primary question is how to create different array length? "####" part in below program. And the second question is any suggestion on how to move the fill_dt right after exhausting the previous fill more efficient in the array do loop part. for example, move "09/08/2016" to "09/22/2016" for the second observation of study id 2. Thank so much! data have;
informat study_id $4.;
informat end_date mmddyy10.;
informat fill_dt mmddyy10.;
informat supply_days best8.;
informat start_dt mmddyy10.;
informat days_in best9.;
input study_id end_date fill_dt supply_days start_dt days_in;
format end_date fill_dt start_dt mmddyy10.;
datalines;
1 02/13/2018 07/07/2017 30 08/27/2017 170
1 02/13/2018 08/25/2017 30 08/27/2017 170
1 02/13/2018 12/19/2017 35 08/27/2017 170
1 02/13/2018 01/23/2018 25 08/27/2017 170
1 02/13/2018 03/09/2018 35 08/27/2017 170
2 10/14/2017 08/22/2016 30 10/27/2016 352
2 10/14/2017 09/08/2016 30 10/27/2016 352
2 10/14/2017 12/13/2016 30 10/27/2016 352
2 10/14/2017 01/11/2017 30 10/27/2016 352
2 10/14/2017 02/04/2017 30 10/27/2016 352
2 10/14/2017 02/11/2017 30 10/27/2016 352
2 10/14/2017 05/01/2017 30 10/27/2016 352
2 10/14/2017 05/28/2017 30 10/27/2016 352
2 10/14/2017 08/01/2017 25 10/27/2016 352
;
run;
proc sort data=have; by study_id end_date;run;
proc transpose data=have out=have_fill_dates (drop=_name_) prefix=fill_dt;
by study_id end_date start_dt days_in;
var fill_dt;
run;
proc transpose data=have out=have_days_supply (drop=_name_) prefix=days_supply;
by study_id end_date start_dt days_in;
var supply_days;
run;
/*merge fill dates and days supply*/
data both_havex;
merge have_fill_dates have_days_supply;
by study_id;run;
/*Need help*/
data both_have;
set both_havex;
array daydummy(days_in or ####) day1-day####;
array filldates(*) fill_dt1 - fill_dt141;
array days_supply(*) days_supply1 - days_supply141;
do ii=1 to ####; daydummy(ii)=0;
end;
do ii=1 to ####;
do i=1 to 141 while (filldates(i) ne .);
if filldates(i) <= pre_2y_dt + ii -1 <= filldates(i)+days_supply(i)-1
then daydummy(ii)=1;
end;
end;
drop i ii;
dayscovered=sum(of day1 - day####);
p_dayscovered=dayscovered/####;
run;
... View more