I am currently having trouble producing future forecasts with proc varmax when the model includes differencing. Varmax picks up the change in the independent variable (a +100 shock) properly when the variables are not differenced. With differencing, however, the first forecasted value is exactly the estimated beta coefficient of the model, and all remaining forecasted values are 0.
Is there a way to correctly calculate and plot future forecasts in a differenced environment with proc varmax?
(example code of what I am attempting to do):
proc varmax data=forecast_dataset plots=ALL;
model retvar = ratevar_shock100 / noint dif=(retvar(1) ratevar_shock100(1));
output lead=24 back=12 out=forecasts;
run;
All values in the data set are populated except in the last 12 rows, where 'retvar' is missing; those are the values to be forecasted.
Thank you for your time and any suggestions.
-Ryan
Hello Ryan -
Many thanks for your question about using VARMAX with differencing in the model.
Our R&D team is currently investigating the question at hand. It would really help if you could provide us with a complete example, including data (even "faked" data will work).
Another thought which crossed my mind was whether a simpler approach might also help address your problem. If I understand correctly, you would like to forecast "retvar" using "ratevar_shock100" as an input variable (assuming that there is cross-correlation between the two). In this case a UCM or ARIMAX approach might be applicable as well.
Thanks!
Udo
Essentially we have 96 rows of observed data, including 96 observations of "retvar" and 96 observations of "ratevar." It is known that retvar is directly correlated with ratevar. What we are trying to do is add 12 "observations" to ratevar that are all equal to the last observed ratevar + 1. To do this we add a column "ratevar_shock100" to the data set. It contains the observed values of ratevar up to the 96th observation, and after that every value equals the 96th (last) ratevar observation + 1. Over those same 12 rows retvar is missing, and these missing values are what we want to forecast (i.e. see how retvar adapts as the +1 shock is induced in ratevar). The last 12 observations + 12 to-be-forecasted observations are as follows:
retvar | ratevar_shock100 |
0.298528 | 0.2836085 |
0.287183 | 0.2828952 |
0.288975 | 0.2500238 |
0.213738 | 0.2418523 |
0.241152 | 0.2398289 |
0.217088 | 0.2389318 |
0.223159 | 0.2431974 |
0.243546 | 0.2464614 |
0.234781 | 0.2378387 |
0.235236 | 0.2213375 |
0.222661 | 0.2134652 |
0.183533 | 0.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
. | 1.2088864 |
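As a rough sketch, a data set laid out like this could be built with a DATA step along the following lines. The input data set name observed is an assumption for illustration (96 rows holding retvar and ratevar); the loop counter i is just a helper.

```sas
/* Sketch only: WORK.OBSERVED is a hypothetical data set with the   */
/* 96 observed rows of retvar and ratevar.                          */
data forecast_dataset;
   set observed end=last;
   ratevar_shock100 = ratevar;          /* rows 1-96: observed values */
   output;
   if last then do;
      /* Append 12 future rows: input held at last value + 1,       */
      /* response left missing so it can be forecasted.             */
      ratevar_shock100 = ratevar + 1;
      ratevar = .;
      retvar  = .;
      do i = 1 to 12;
         output;
      end;
   end;
   drop i;
run;
```

Any other columns in the observed data (for example a date variable) would simply carry the last row's values into the appended rows and would need their own handling.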
We have many models to forecast this data, some using further variables, but our simplest model is:
proc varmax data=forecast_dataset plots=ALL;
model retvar = ratevar_shock100 / noint dif=(retvar(1) ratevar_shock100(1));
output lead=24 back=12 out=forecasts;
run;
And even this model is zeroing out as described in the original post.
For now we are just running varmax on the observed data set to obtain values for the beta coefficients in the model, and then manually calculating the forecasts. However, we would like varmax to compute the forecasts accurately, as that would be much more efficient and require much less code/manual calculation.
Here is an example of the forecasts we are obtaining from proc varmax with the above model statement:
retvar_actual | retvar_forecasts | ratevar_shock100 |
0.298527538 | 0.344868859 | 0.2836085 |
0.287183252 | 0.298459817 | 0.2828952 |
0.288975225 | 0.284062431 | 0.2500238 |
0.213737544 | 0.28819942 | 0.2418523 |
0.241152046 | 0.213545441 | 0.2398289 |
0.217087828 | 0.241066875 | 0.2389318 |
0.223158987 | 0.217492805 | 0.2431974 |
0.243545745 | 0.223468873 | 0.2464614 |
0.234780894 | 0.242727104 | 0.2378387 |
0.235235685 | 0.233214265 | 0.2213375 |
0.222660901 | 0.234488287 | 0.2134652 |
0.183533031 | 0.222226188 | 0.2088864 |
. | 0.094940294 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
. | 0 | 1.2088864 |
And here is an example of the correct forecast data, obtained by manual calculation using the beta value from running varmax on the observed data set:
retvar_actual | retvar_forecasts | ratevar_shock100 |
0.298527538 | 0.298459817 | 0.2836085 |
0.287183252 | 0.284062431 | 0.2828952 |
0.288975225 | 0.28819942 | 0.2500238 |
0.213737544 | 0.213545441 | 0.2418523 |
0.241152046 | 0.241066875 | 0.2398289 |
0.217087828 | 0.217492805 | 0.2389318 |
0.223158987 | 0.223468873 | 0.2431974 |
0.243545745 | 0.242727104 | 0.2464614 |
0.234780894 | 0.233214265 | 0.2378387 |
0.235235685 | 0.234488287 | 0.2213375 |
0.222660901 | 0.222226188 | 0.2134652 |
0.183533031 | 0.183533031 | 0.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
. | 0.278473325 | 1.2088864 |
Please let me know if this makes sense, and/or if you would like any more information.
Thanks a lot for your time,
-Ryan
Also, something I forgot to add that may be of value: the beta value for the model, obtained from running varmax on the observed data set, is XL0_1: 0.094940294.
This is what we use to manually calculate the forecasts outside of varmax. Interestingly, it is also exactly the first forecast value that varmax gives before the forecasts zero out.
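The manual calculation amounts to adding beta times the change in ratevar_shock100 to the previous level of retvar. Below is a minimal DATA step sketch of that arithmetic, assuming the model is exactly dif(retvar) = beta * dif(ratevar_shock100) with no intercept. The output data set name manual_forecasts and the helper variables last_level and dx are hypothetical.

```sas
/* Sketch only: rebuild level forecasts from the differenced model  */
/*    dif(retvar) = beta * dif(ratevar_shock100), no intercept.     */
data manual_forecasts;
   set forecast_dataset;
   retain last_level;
   beta = 0.094940294;              /* XL0_1 from varmax on observed data */
   dx = dif(ratevar_shock100);      /* change in the input vs. previous row */
   if not missing(retvar) then
      last_level = retvar;          /* observed rows: remember the level */
   else do;
      /* forecast rows: previous level plus beta times input change */
      retvar_forecast = last_level + beta * dx;
      last_level = retvar_forecast;
   end;
run;
```

In the first forecast row dx is 1, so the forecast is 0.183533031 + 0.094940294 = 0.278473325. After that dx is 0 and the level stays flat, which matches the forecast rows of the manually calculated table above. The values varmax returns (0.094940294 followed by zeros) are consistent with these same period-to-period changes being reported without being added back to the level of retvar.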
Hello Ryan -
Please excuse the delay. Our developers have been able to replicate the problem you are reporting.
At this point I would suggest opening a track with Technical Support, who are already aware of this communication.
Thanks!
Udo
- So do I get a SAS t-shirt or something for discovering this bug? 😛