That doesn't look like it needs dynamic programming. You can use a temporary array to build your own stack of previous value. Use the MOD() function to index into it in a wrap around fashion. The only thing that might need to be dynamic is finding the maximum or B (or LAG) to define how large a stack of lagged values to make. But why not just make it larger then you ever expect to need?
First let's make your posted printout into an actual dataset. I will change the column headers into valid variable names.
Also let's calculate the relative row number (if you are missing rows then fix that first or else the logic will need to be more complex). (Note you could do that in the next step if you wanted instead).
data have ;
length name $32 lag row qty want_lagN want_sum want_rolling 8;
input name $ lag qty want_lagN want_sum want_rolling ;
if name ne lag(name) then row=0;
row+1;
cards;
Dog 2 1 . 1 .
Dog 2 2 . 3 .
Dog 2 3 1 6 5
Dog 2 4 3 10 7
Dog 2 5 6 15 9
Dog 2 6 10 21 11
Cat 3 1 . 1 .
Cat 3 2 . 3 .
Cat 3 3 . 6 .
Cat 3 4 1 10 9
Cat 3 5 3 15 12
;
Now just read in the data. Reset the "stack" when changing to a new "NAME".
data want ;
array lags [0:100] _temporary_;
set have;
by name notsorted;
if first.name then call missing(of lags[*] sum rolling_sum );
sum+qty;
lagn=lags[mod(row,lag)];
if not missing(lagN) then rolling_sum=sum-lagN;
lags[mod(row,lag)]=sum;
run;
Results:
want_ want_ rolling_ Obs name lag row qty lagN want_sum rolling sum sum lagn 1 Dog 2 1 1 . 1 . 1 . . 2 Dog 2 2 2 . 3 . 3 . . 3 Dog 2 3 3 1 6 5 6 5 1 4 Dog 2 4 4 3 10 7 10 7 3 5 Dog 2 5 5 6 15 9 15 9 6 6 Dog 2 6 6 10 21 11 21 11 10 7 Cat 3 1 1 . 1 . 1 . . 8 Cat 3 2 2 . 3 . 3 . . 9 Cat 3 3 3 . 6 . 6 . . 10 Cat 3 4 4 1 10 9 10 9 1 11 Cat 3 5 5 3 15 12 15 12 3
I can't really relate your narrative to the code you've posted. Because you write that the code returns the correct result, I've kept it as original as possible and only added some logic to dynamically generate your IF statements based on what's in the Loc table.
PROC IMPORT DATAFILE = "/folders/myfolders/SAS Practice/CPS_MODEL_PRACT/sample_loop_CPS.xlsx"
DBMS = XLSX
OUT = LOOP_SAMPLE REPLACE;
RUN;
PROC SQL;
CREATE TABLE LOC AS
SELECT DISTINCT Loc, RLT AS RLT
FROM LOOP_SAMPLE;
QUIT;
filename codegen temp;
data _null_;
file codegen;
/* file print;*/
set loc;
length code $80;
code=cats('IF Loc ="',loc,'" THEN A = LAG',rlt,'(COUNT);');
if _n_>1 then code=catx(' ','else',code);
put code;
run;
PROC SORT DATA = LOOP_SAMPLE;
BY Loc;
RUN;
DATA HAVE;
SET LOOP_SAMPLE;
COUNT + Qty;
%include codegen / source2;
ROLLING_TOTAL = COUNT - A;
RUN;
PROC MEANS DATA = HAVE MAX;
BY Loc;
VAR ROLLING_TOTAL;
RUN;
filename codegen clear;
Macro variables &skus and &rlt will have a single value during compilation time of the SAS data step and for this reason the SAS code is static.
IF Loc IN(&SKUS) THEN A = LAG&RLT(COUNT);
You can't have something dynamic the way you have it in mind because during compilation time of the SAS data step all variables and functions must be "spelled out". You can't create additional variables or functions during data step execution time.
Good that you posted additional code. Below answered one of my questions. You also would need a By Loc; for this to work.
by loc;
IF FIRST.Loc = 1 THEN COUNT = 0;
What I still don't understand and also can't see anywhere in your code: You're talking about a rolling window and quarterly. I can't see any attempt in your code to have such a rolling window. It's just a "lagged" summing up per Loc (which could easily get implemented via a Proc SQL). Any sort of rolling window would only sum entries within a specified date range based on the date of the current row.
Can you please further explain this rolling window thing?
A | B |
Dog | 2 |
Cat | 3 |
A | Lag | Qty | lag(Cum_Qty) | Cum_Qty | Rolling Total |
Dog | 2 | 1 | 1 | ||
Dog | 2 | 2 | 3 | ||
Dog | 2 | 3 | 1 | 6 | 5 |
Dog | 2 | 4 | 3 | 10 | 7 |
Dog | 2 | 5 | 6 | 15 | 9 |
Dog | 2 | 6 | 10 | 21 | 11 |
Cat | 3 | 1 | 1 | ||
Cat | 3 | 2 | 3 | ||
Cat | 3 | 3 | 6 | ||
Cat | 3 | 4 | 1 | 10 | 9 |
Cat | 3 | 5 | 3 | 15 | 12 |
The example above is what I am looking for essentially, which you have provided in terms of getting across what I am doing just for that specific dataset, but not if I introduce more variables in column A with different values for lag. How the rolling window works is the Rolling Total of 5 for instance is 2+3 in Qty (2 values or a "window" of 2) and 7 is 4+3 (continuing the window of 2). The fixed window moves forward each row (or day if there was a date column). For Cat the Rolling Total of 9 is 4+3+2 (3 values for a window of 3) and 12 is 5+4+3 (fixed 3 value window). I want to use the value Lag equaling 2 or 3 as an input for the lag function but lagN(Cum_Qty) won't allow you to directly insert the value corresponding to the Lag column, which is why I developed a separate table using Proc Sql to then get a macro variable inserted into the lag function so I could end up with lag&lag(Cum_Qty) which would resolve to 2 for Dog and 3 for Cat (hence the dynamic aspect). Below I have posted the progress I have made.
PROC SQL;
CREATE TABLE LOC AS
SELECT DISTINCT Loc, RLT AS RLT
FROM LOOP_SAMPLE;
QUIT;
So Cum_Qty is just summing up all rows (lagged, per loc) meaning if you've got dates from 2016 to 2020 it contains in the end the sum of all these years?
ok. I'll give it a go. The solution will likely be something along the line as solved here.
@rich93johnson wrote:
At the end of 4 years you would have the Rolling Total being equivalent to the total of Qty of December 31st, 2020 and December 30th, because the window of 2 moves with, subtracting out all the other days.
So do you want a rolling 3 month look-back period or do you want to start afresh every quarter? If starting afresh each quarter: Would you include lagged values from the previous quarter or only include them if they are from the same quarter?
@Tom s code nicely demonstrates how much more efficient a SAS data step can process sequentially as compared to traditional SQL.
I've applied @Tom s code to more realistic sample data - and I've also posted the SQL I came up with.
Interestingly the results start to differ starting with day 93.
I believe what's missing in @Tom 's code is logic for a time based windows for column sum. I leave it to you/Tom to investigate and eventually fix.
...actually: I believe something in my SQL is not working as it should. The data step approach is so much better that I'm now too lazy to fix this. I wish I could give Tom 10 likes for what he's posted.
proc datasets lib=work kill nolist;
run;quit;
data LOOP_SAMPLE;
FORMAT
Item BEST12.
Date DATE9.
Loc $CHAR4.
Qty BEST12.
RLT BEST12.;
call streaminit(12345);
/* do date='03oct2016'd to '12jan2017'd;*/
do date='03oct2016'd to '12jan2020'd;
do item=1 to 3;
do loc='DDSP','DDCN','DDCT';
if loc='DDSP' then rlt=20;
else if loc='DDCN' then rlt=10;
else rlt=20;
qty=ceil(rand('uniform',0,999));
output;
end;
end;
end;
stop;
run;
proc datasets lib=work nolist nowarn mt=(data view);
delete v_lag_cum_qty v_cum_qty want1;
run;
quit;
proc sql;
create view v_lag_cum_qty as
/* create table v_lag_cum_qty as*/
select
l.loc, l.item, l.date,
sum(r.qty) as lag_cum_qty
from
loop_sample l
left join
loop_sample r
on
l.loc=r.loc and l.item=r.item
and r.date between intnx('month',l.date-l.rlt,-3,'s')+1 and l.date-l.rlt
group by l.loc, l.item, l.date
;
create view v_cum_qty as
/* create table v_cum_qty as*/
select
l.loc, l.item, l.date,
sum(r.qty) as cum_qty
from
loop_sample l
left join
loop_sample r
on
l.loc=r.loc and l.item=r.item
and r.date between intnx('month',l.date,-3,'s')+1 and l.date
group by l.loc, l.item, l.date
;
create table want1 as
select
l.*,
r1.lag_cum_qty,
r2.cum_qty,
r2.cum_qty - r1.lag_cum_qty as Rolling_Total
from
loop_sample l
inner join
v_lag_cum_qty r1
on l.loc=r1.loc and l.item =r1.item and l.date=r1.date
inner join
v_cum_qty r2
on l.loc=r2.loc and l.item =r2.item and l.date=r2.date
order by l.loc, l.item, l.date
;
quit;
proc print data=want1(obs=95 firstobs=88);
run;
proc sort data=LOOP_SAMPLE;
by loc item date;
run;
data want2;
set LOOP_SAMPLE;
by loc item date;
array lags [0:100] _temporary_;
if first.item then
call missing(of lags[*] row sum rolling_sum );
row+1;
sum+qty;
lagn=lags[mod(row,rlt)];
if not missing(lagN) then rolling_sum=sum-lagN;
lags[mod(row,rlt)]=sum;
run;
proc print data=want2(obs=95 firstobs=88);
run;
That doesn't look like it needs dynamic programming. You can use a temporary array to build your own stack of previous value. Use the MOD() function to index into it in a wrap around fashion. The only thing that might need to be dynamic is finding the maximum or B (or LAG) to define how large a stack of lagged values to make. But why not just make it larger then you ever expect to need?
First let's make your posted printout into an actual dataset. I will change the column headers into valid variable names.
Also let's calculate the relative row number (if you are missing rows then fix that first or else the logic will need to be more complex). (Note you could do that in the next step if you wanted instead).
data have ;
length name $32 lag row qty want_lagN want_sum want_rolling 8;
input name $ lag qty want_lagN want_sum want_rolling ;
if name ne lag(name) then row=0;
row+1;
cards;
Dog 2 1 . 1 .
Dog 2 2 . 3 .
Dog 2 3 1 6 5
Dog 2 4 3 10 7
Dog 2 5 6 15 9
Dog 2 6 10 21 11
Cat 3 1 . 1 .
Cat 3 2 . 3 .
Cat 3 3 . 6 .
Cat 3 4 1 10 9
Cat 3 5 3 15 12
;
Now just read in the data. Reset the "stack" when changing to a new "NAME".
data want ;
array lags [0:100] _temporary_;
set have;
by name notsorted;
if first.name then call missing(of lags[*] sum rolling_sum );
sum+qty;
lagn=lags[mod(row,lag)];
if not missing(lagN) then rolling_sum=sum-lagN;
lags[mod(row,lag)]=sum;
run;
Results:
want_ want_ rolling_ Obs name lag row qty lagN want_sum rolling sum sum lagn 1 Dog 2 1 1 . 1 . 1 . . 2 Dog 2 2 2 . 3 . 3 . . 3 Dog 2 3 3 1 6 5 6 5 1 4 Dog 2 4 4 3 10 7 10 7 3 5 Dog 2 5 5 6 15 9 15 9 6 6 Dog 2 6 6 10 21 11 21 11 10 7 Cat 3 1 1 . 1 . 1 . . 8 Cat 3 2 2 . 3 . 3 . . 9 Cat 3 3 3 . 6 . 6 . . 10 Cat 3 4 4 1 10 9 10 9 1 11 Cat 3 5 5 3 15 12 15 12 3
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.