Hello everyone,
I have a data of prescription drug supplies. I want to adjust this data based on a time window of 90 days. Here are what I have and what I want:
data have;
input id date_start:yymmdd10. date_stop:yymmdd10. duration increment_start increment_stop;
datalines;
1 1/15/2008 1/30/2008 16 1 1
1 1/31/2008 2/14/2008 15 1 1
1 4/13/2008 5/2/2008 20 1 2
1 5/30/2008 6/29/2008 31 2 2
1 7/7/2008 8/6/2008 31 2 3
2 1/24/2008 2/18/2008 26 1 1
2 2/19/2008 2/19/2008 1 1 1
2 4/21/2008 5/11/2008 21 1 2
2 5/12/2008 5/13/2008 2 2 2
2 7/11/2008 8/3/2008 24 2 3
;
data want;
input id date_start:yymmdd10. date_stop:yymmdd10. duration increment_start increment_stop;
datalines;
1 1/15/2008 1/30/2008 16 1 1
1 1/31/2008 2/14/2008 15 1 1
1 4/13/2008 4/13/2008 1 1 1
1 4/14/2008 5/2/2008 19 2 2
1 5/30/2008 6/29/2008 31 2 2
1 7/7/2008 7/12/2008 6 2 2
1 7/13/2008 8/6/2008 25 3 3
2 1/24/2008 2/18/2008 26 1 1
2 2/19/2008 2/19/2008 1 1 1
2 4/21/2008 4/22/2008 2 1 1
2 4/23/2008 5/11/2008 19 2 2
2 5/12/2008 5/13/2008 2 2 2
2 7/11/2008 7/21/2008 11 2 2
2 7/22/2008 8/3/2008 13 3 3
;
Any idea or help is much appreciated!
This will also take care of durations longer than 90 days
data want;
retain is0dt;
set have; by id;
if first.id then is0dt = date_start;
ds = date_stop;
do while (date_start <= ds);
inc = 1 + intck("day90", is0dt, date_start, "c");
date_stop = min(ds, intnx("day", is0dt, inc*90-1));
output;
date_start = intnx("day", date_stop, 1);
end;
drop is0dt ds;
run;
@Alireza_Boloori wrote:
Hello everyone,
For example, for the third observation for ID = 1, the supply of drug is started before day 90, while the end date of supply is after 90 days. So, I had to split that observation into two rows, and also modify each row (e.g., data_start, data_stop, duration, etc.)
In term of the variables shown how do we know that the " supply of drug is started before day 90" and "the end date of supply is after 90 ". I might guess correctly but the problem should include the variable names used and rules involving the variables. I am guessing that the day_counter variables are involved but it is not explicitly clear which is start or end, similar with the time_window variables.
There seems to be an implied manipulation of the dates and duration_day variable but that rule is not very clear.
You likely know a whole lot about what each of these are because you have been working with them but we have little or not context and it helps if we don't have to guess which what goes where.
@Alireza_Boloorican you please post sample code that you are working with?
what have you tried?
You have a lot listed above with rules that are still unclear based on data that is not present in your sample data due to patient engagement in fulfilling prescriptions or having those prescriptions renewed.
Please post what you have tried in the form of a datastep, SQL query.
Thanks
@VDD Thanks. I've made my description simpler and removed unnecessary variables.
Also, please find the code that I use for this splitting. It only creates the duplicated row, but I couldn't find any way to modify the variables across the duplicated rows.
data want;
set have;
output;
if increment_start~= increment_stop then do;
output;
end;
run;
This will also take care of durations longer than 90 days
data want;
retain is0dt;
set have; by id;
if first.id then is0dt = date_start;
ds = date_stop;
do while (date_start <= ds);
inc = 1 + intck("day90", is0dt, date_start, "c");
date_stop = min(ds, intnx("day", is0dt, inc*90-1));
output;
date_start = intnx("day", date_stop, 1);
end;
drop is0dt ds;
run;
@PGStats Thank you very much for your help!
Sir @PGStats Elegance at best
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.