Between a "Start_Date" and "End_Date", which are user input prompt values, I would like to find the maximum two week total demand value by inv_id.
I have attache an Excel file that has a daily_dmd tab which contains the table that has the summary table of total demand by day by inv_id. I can provide the code as necessary that generated this data set.
Also in the Excel file is the list of all the date values and the date 14 days prior for which I would want find the total two week demand by INV_ID and then find the max of all those listed date ranges.
I do not have access to the PROC EXPAND method which is the rub.
End goal is to have a table with two columns; INV_ID and MAX_TWOWEEKDMD.
Thanks in advance for all help.
Below code first creates a rolling sum per inv_id over a given date range (here 14 days) and then selects the row with the max sum per inv_id.
%let start_dt='01jan2018'd;
%let stop_dt='31dec2020'd;
%let day_range=14;
libname source xlsx '~/test/DataSets.xlsx';
data have;
set source.daily_dmd;
INV_TXN_DT2=datepart(INV_TXN_DT);
format INV_TXN_DT2 date9.;
run;
/* create rolling sum */
proc sql;
create table rolling_sum as
select
inv_id,
INV_TXN_DT2,
INV_TXN_DT2-&day_range+1 as prior_dt format=date9.,
daily_dmd,
(
select sum(daily_dmd)
from have i
where
i.inv_id=o.inv_id
and i.INV_TXN_DT2 between INV_TXN_DT2-&day_range+1 and o.INV_TXN_DT2
) as
TWOWEEKDMD
from have o
where INV_TXN_DT2 between &start_dt and &stop_dt
order by inv_id, INV_TXN_DT2
;
quit;
And now the rolling sum code combined with the selection of the max sum
/* select max value from rolling sum
DISTINCT result set because there can be multiple dates with a max sum
*/
proc sql;
create table want as
select distinct
inv_id,
max(TWOWEEKDMD) as MAX_TWOWEEKDMD
from
(
select
inv_id,
(
select sum(daily_dmd)
from have i
where
i.inv_id=o.inv_id
and i.INV_TXN_DT2 between INV_TXN_DT2-&day_range+1 and o.INV_TXN_DT2
) as
TWOWEEKDMD
from have o
where INV_TXN_DT2 between &start_dt and &stop_dt
)
group by inv_id
having max(TWOWEEKDMD) = TWOWEEKDMD
;
quit;
Below code first creates a rolling sum per inv_id over a given date range (here 14 days) and then selects the row with the max sum per inv_id.
%let start_dt='01jan2018'd;
%let stop_dt='31dec2020'd;
%let day_range=14;
libname source xlsx '~/test/DataSets.xlsx';
data have;
set source.daily_dmd;
INV_TXN_DT2=datepart(INV_TXN_DT);
format INV_TXN_DT2 date9.;
run;
/* create rolling sum */
proc sql;
create table rolling_sum as
select
inv_id,
INV_TXN_DT2,
INV_TXN_DT2-&day_range+1 as prior_dt format=date9.,
daily_dmd,
(
select sum(daily_dmd)
from have i
where
i.inv_id=o.inv_id
and i.INV_TXN_DT2 between INV_TXN_DT2-&day_range+1 and o.INV_TXN_DT2
) as
TWOWEEKDMD
from have o
where INV_TXN_DT2 between &start_dt and &stop_dt
order by inv_id, INV_TXN_DT2
;
quit;
And now the rolling sum code combined with the selection of the max sum
/* select max value from rolling sum
DISTINCT result set because there can be multiple dates with a max sum
*/
proc sql;
create table want as
select distinct
inv_id,
max(TWOWEEKDMD) as MAX_TWOWEEKDMD
from
(
select
inv_id,
(
select sum(daily_dmd)
from have i
where
i.inv_id=o.inv_id
and i.INV_TXN_DT2 between INV_TXN_DT2-&day_range+1 and o.INV_TXN_DT2
) as
TWOWEEKDMD
from have o
where INV_TXN_DT2 between &start_dt and &stop_dt
)
group by inv_id
having max(TWOWEEKDMD) = TWOWEEKDMD
;
quit;
The sql approach shows that SQL can be forced to do something it is not well-suited for - namely rolling sums across rows. But if the data are already sorted by INV_ID INC_TXN_DT, then there is a much, much simpler and more efficient approach.
data want (drop=_:);
array two_weeks {0:13} _temporary_;
call missing(of two_weeks{*});
do until (last.inv_id);
set have;
by inv_id;
_dmd_date=datepart(inc_txn_dt);
_lag_dmd_date=lag(dmd_date);
if first.id=0 and (_dmd_date-1>_lag_dmd_date) then do _d=_lag_dmd_date+1 to _dmd_date-1;
two_weeks{mod(_d,14)}=0;
end;
two_weeks{mod(_dmd_date,14)}+daily_dmd;
max_two_weeks=max(max_two_weeks,sum(of two_weeks{*}));
end;
run;
I suspect this approach is much closer to how PROC EXPAND works underneath the hood.
Code corrected per @ErinKSimmons 's note.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.