04-13-2017 12:26 PM
I'm using code that uses the lag function, but for some reason the lag1 function is returning values from two periods before. I've attached a screenshot where you can see the issue. I'm not sure what's causing this, but my whole code relies heavily on lags, so hopefully this is something that can be resolved.
retain vol_x vol_x1 vol_x2;
if Price_USD gt 0 then vol_24hr_units = Vol_24hr/Price_USD;
vol_1hr_units = vol_24hr_units/24 + (vol_24hr_units-lag1(vol_24hr_units))/2;
vol_x = (vol_1hr_units + lag1(vol_1hr_units) + lag2(vol_1hr_units) + lag3(vol_1hr_units) + lag4(vol_1hr_units))/5;
price_x = (Price_USD + lag1(Price_USD) + lag2(Price_USD) + lag3(Price_USD) + lag4(Price_USD))/5;
04-13-2017 12:35 PM
Using the lag function gets (methinks) complicated with computed values. I think you actually want to keep the last five values in a FIFO array and simply retain them. I or someone else can provide a better suggestion if you provide an example dataset in the form of a datastep.
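A minimal sketch of the FIFO idea, not the poster's actual code: the dataset `have`, the variable `x`, and the stand-in calculation `y = x * 2` are all hypothetical. It keeps the last five computed values in a temporary array (which SAS retains automatically) and averages them once five values are available.

```sas
/* Hypothetical toy input */
data have;
  do x = 10, 20, 30, 40, 50, 60;
    output;
  end;
run;

data want;
  set have;
  array last5{5} _temporary_;       /* temporary arrays keep their values across passes */
  y = x * 2;                        /* stand-in for the real computed value */
  last5{mod(_n_ - 1, 5) + 1} = y;   /* overwrite the oldest of the five slots */
  /* MEAN ignores missing values, so wait until all five slots are filled */
  if _n_ >= 5 then y_avg5 = mean(of last5{*});
run;
```

Because the array is filled after `y` is computed, the average always includes the current value, which avoids the timing questions that come with LAG.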
Art, CEO, AnalystFinder.com
04-13-2017 12:55 PM
Thanks for your reply. I've been through the code many times and it just doesn't make sense why it's not working. I'll post an example. There's a useful macro on the forum that creates the SAS code for your dataset. Do you have the code to create it?
04-13-2017 01:02 PM
This is happening because you apply the LAG function to VOL_X before actually computing VOL_X. One easy fix would be to replace the creation of VOL_X1 and VOL_X2 with these statements:
VOL_X2 = VOL_X1;
VOL_X1 = VOL_X;
Since all 3 variables are being retained, that's all that would be needed. Just place these statements at the same location where VOL_X1 and VOL_X2 are now being computed.
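As a rough sketch of how that fix behaves, here is a toy data step. The dataset `have`, the variable `x`, and the calculation `x * 2` are hypothetical stand-ins for the real input and the real VOL_X formula.

```sas
/* Hypothetical toy input */
data have;
  do x = 10, 20, 30, 40;
    output;
  end;
run;

data want;
  set have;
  retain vol_x vol_x1 vol_x2;
  vol_x2 = vol_x1;   /* value of vol_x from two passes ago */
  vol_x1 = vol_x;    /* value of vol_x from the previous pass */
  vol_x  = x * 2;    /* stand-in for the real vol_x calculation */
run;
```

The order of the two shift statements matters: VOL_X2 has to pick up the old VOL_X1 before VOL_X1 is overwritten. With this toy input, VOL_X is 20, 40, 60, 80, VOL_X1 is missing, 20, 40, 60, and VOL_X2 is missing, missing, 20, 40.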
04-13-2017 01:03 PM - edited 04-13-2017 01:08 PM
You are storing the result of LAG1() and LAG2() of VOL_X at the top of your data step and then later on you are changing the value of VOL_X. Since the LAG() function just returns values that it stored when it EXECUTED on earlier passes, this affects which values it returns.
So your code does these steps on each pass through the data step:
vol_x = some value (either missing or retained from previous step);
vol_x1 = lag1(vol_x); (LAG stores that retained value, not the new one)
vol_x = some new value;
So the resulting values of VOL_X1 will be, in order:
_n_=1 - Missing since there is no lagged value to return on the first observation.
_n_=2 - Missing (since that is the first value of VOL_X that you saved)
_n_=3 - some new value that you calculated on the first pass through the data step and retained over to the second pass.
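The sequence above can be reproduced with a small experiment. This is only an illustrative sketch: the dataset `have`, the variables `x`, `y`, `y_lag_top`, and `y_lag_bottom`, and the calculation `x * 2` are made up for the demonstration.

```sas
/* Hypothetical toy input */
data have;
  do x = 10, 20, 30, 40;
    output;
  end;
run;

data demo;
  set have;
  retain y;
  y_lag_top = lag1(y);      /* runs before y is updated: queues the retained y */
  y = x * 2;                /* y gets its new value here */
  y_lag_bottom = lag1(y);   /* runs after the update: queues the new y */
run;

proc print data=demo;
run;
```

Here `y` is 20, 40, 60, 80. `y_lag_top` comes out as missing, missing, 20, 40 (two periods behind), while `y_lag_bottom` comes out as missing, 20, 40, 60 (the expected one-period lag). Each LAG call has its own queue, and what it returns depends only on what the variable held at the moment that call executed.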