Hi @novinosrin - sorry about that.
In the data that I have, there is no ID variable and so I replied that the lag worked which it did. I am just curious to know how having an ID variable will impact the code as I assumed it to be same as using a by statement.
Apologies about that. I am trying to learn this thing to an extent where I don't have to come back to the forum if I do get a request sooner or later to include an ID variable.
If you have an ID variable, then you need a way to prevent the initial sums of one ID from being contaminated by data from the prior ID - i.e., whenever you start a new ID, the sum has to be reset to zero or missing: And in addition, while the rolling sum requires subtracting the lag12(A) value, it should do so only when the lag12(region) is the same as the current region. I.e.:
data want;
set have;
by region;
if first.region then sum_a=.;
sum_a + a + -1*ifn(lag12(region)=region,lag12(a),0);
run;
This uses the following features of SAS
But you ask about doing this for a set of variables (say A, B, and C). The natural SAS approach is to declare A B and C as an array, and to define an analogous array with variables SUMA, SUMB, and SUMC. You might be tempted to do this, which just puts the sum statement (and it associated IFN and LAG functions) inside a loop over the 3 variables A, B, C.
data want;
set have;
by region;
array vars {3} a b c;
array sums {3} suma sumb sumc;
if first.region then call missing(of sums{*});
do v=1 to 3;
sums{v} + vars{v} + -1*ifn(lag12(region)=region,lag12(vars{v}),0);
end;
run;
But this would give the wrong result, because the "LAG12(region)" function would be run 3 times for each operation. The LAG function is a FIFO queue updater, so in this case it would step through the underlying history of region values 3 times as fast as you want. The best option is to take the LAG12(region) evaluation outside of the loop:
data want (drop=_LGREG12 v);
set have;
by region;
array vars {3} a b c;
array sums {3} suma sumb sumc;
if first.region then call missing(of sums{*});
_LGREG12=lag12(region);
do v=1 to 3;
sums{v} + vars{v} + -1*ifn(_LGREG12=region,lag12(vars{v}),0);
end;
run;
regards,
Mark
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.