I am trying to calculate the maximum daily dose in the last 7 days [t-7, t]. Each row has a dose and the respective day and I need a new column with the maximum value in the last seven days (that day inclusive). I can have situations where some days aren't available.
Please see below the type of data I have and what I'm aiming for:
data have;
input id $ dose day;
cards;
A 1 1
A 1 2
A 1 3
A 1 4
A 1 5
A 2 6
A 2 7
A 3 8
A 1 9
;
run;
data want;
input id $ dose day max7days;
cards;
A 1 1 1
A 1 2 1
A 1 3 1
A 1 4 1
A 1 5 1
A 2 6 2
A 1 7 2
A 3 8 3
A 1 9 3
;
run;
I'm fairly new SAS and would appreciate suggestions on the best method to perform this calculation. Please let me know if I should provide mor details so you can help me. Thanks in advance!
If you have SAS/ETS there are procs for doing things like that.
You could just let PROC SQL join the dataset with itself and use the MAX() aggregate function.
proc sql;
create table want as
select a.id, a.day, a.dose
, max(b.dose) as max7days
from have a
inner join have b
on a.id = b.id
and b.day between a.day-6 and a.day
group by a.id, a.day, a.dose
;
quit;
But it might be a lot faster to do it in a data step.
If all DAY values are present it is pretty simple:
data want2;
set have;
by id day ;
array history[7] _temporary_;
if first.id then call missing(of history[*]);
history[ceil(day/7)] = dose;
max7dose = max(of history[*]);
run;
But if there are gaps then you will need to work a little harder.
data want2;
set have;
by id day ;
array history[7] _temporary_;
lag_day = lag(day);
if first.id then call missing(of history[*]);
else do lag_day=lag_day+1 to day-1;
history[ceil(lag_day/7)]=.;
end;
history[ceil(day/7)] = dose;
max7dose = max(of history[*]);
drop lag_day;
run;
If you have SAS/ETS there are procs for doing things like that.
You could just let PROC SQL join the dataset with itself and use the MAX() aggregate function.
proc sql;
create table want as
select a.id, a.day, a.dose
, max(b.dose) as max7days
from have a
inner join have b
on a.id = b.id
and b.day between a.day-6 and a.day
group by a.id, a.day, a.dose
;
quit;
But it might be a lot faster to do it in a data step.
If all DAY values are present it is pretty simple:
data want2;
set have;
by id day ;
array history[7] _temporary_;
if first.id then call missing(of history[*]);
history[ceil(day/7)] = dose;
max7dose = max(of history[*]);
run;
But if there are gaps then you will need to work a little harder.
data want2;
set have;
by id day ;
array history[7] _temporary_;
lag_day = lag(day);
if first.id then call missing(of history[*]);
else do lag_day=lag_day+1 to day-1;
history[ceil(lag_day/7)]=.;
end;
history[ceil(day/7)] = dose;
max7dose = max(of history[*]);
drop lag_day;
run;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.