DATA Step, Macro, Functions and more

LAG for missing values - moving average

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 17
Accepted Solution

LAG for missing values - moving average

Hi,

 

 I am trying to create moving averages by looking at lag - lag 1 and lag2 values of rewards redemption. Here is the formula current = (lag - lag2) / (lag2 - lag3).

 

However, the current value is not stored sequencially in the rows on update. Meaning,

 

when i find row4 current value is missing, i am using this formula to come up with row4 rewards. Now, row4 -> current = 100

 

When moving to row5 - new rewards calculated in row4 isnt available . lag1 -> row4> current = 0 (i guess lag values are not dynamic)

 

 

 

DATA vtest;
SET v10;
by v_id group id;

current  = red1;


lag_red1 = lag(newredemption1);
lag2_red1 = lag2(newredemption);
lag3_red1 = lag3(newredemption);


expected_increase_pts = (lag_red1*(lag_red1 / lag2_red1)) - lag2_red1;
previos_redeem_growth = (lag_red1 - lag2_red1) / (lag2_red1 - lag3_red1);

current  = current  + (expected_increase_pts * previos_redeem_growth);

retain current ;

 

RUN;


Accepted Solutions
Solution
‎03-24-2016 09:31 AM
Super User
Posts: 11,343

Re: LAG for missing values - moving average

It would really help to understand what you are attempting if you provide some data in the form of a datastep for input and what you expect for the output for that example data.

 

 

I do wonder, since you say "current" is sometimes missing that you want

current  = sum(current , (expected_increase_pts * previos_redeem_growth));

 

Something to consider: you need to address all of the potential missing values before doing division with the lagged values.

Lag3 will not exist until you get to the 4th observation. So you are going to have division by missing which you may want to consider.

And if your newredemption variable is evermissing you're going to get intermittent other calculations with missing values.

 

Lag values are only for variables read in through a SET or MERGE statement. If you want to keep a calculated variable then you use RETAIN. When to reset to initial values or missing for retained variables is sometimes interesting. You might want something like:

 

Retain TempCurrent 0; /* this set an intial value of 0*/

 

<misc calculations>

if missing(current) then current=sum(TempCurrent, (expected_increase_pts * previos_redeem_growth));

else current =  sum(current , (expected_increase_pts * previos_redeem_growth));

/* and then reset the retained value*/

TempCurrent = current;

View solution in original post


All Replies
Solution
‎03-24-2016 09:31 AM
Super User
Posts: 11,343

Re: LAG for missing values - moving average

It would really help to understand what you are attempting if you provide some data in the form of a datastep for input and what you expect for the output for that example data.

 

 

I do wonder, since you say "current" is sometimes missing that you want

current  = sum(current , (expected_increase_pts * previos_redeem_growth));

 

Something to consider: you need to address all of the potential missing values before doing division with the lagged values.

Lag3 will not exist until you get to the 4th observation. So you are going to have division by missing which you may want to consider.

And if your newredemption variable is evermissing you're going to get intermittent other calculations with missing values.

 

Lag values are only for variables read in through a SET or MERGE statement. If you want to keep a calculated variable then you use RETAIN. When to reset to initial values or missing for retained variables is sometimes interesting. You might want something like:

 

Retain TempCurrent 0; /* this set an intial value of 0*/

 

<misc calculations>

if missing(current) then current=sum(TempCurrent, (expected_increase_pts * previos_redeem_growth));

else current =  sum(current , (expected_increase_pts * previos_redeem_growth));

/* and then reset the retained value*/

TempCurrent = current;

Super User
Posts: 10,035

Re: LAG for missing values - moving average

[ Edited ]

Yes. LAG() is not dynamic. Use ARRAY instead .

Post some real data to explain your question well.

Occasional Contributor
Posts: 17

Re: LAG for missing values - moving average

Thank you 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 294 views
  • 0 likes
  • 3 in conversation