kybowma
Calcite | Level 5

I posted earlier, but it seems that I accidentally deleted the post or it was removed. If this post/question is considered "taboo" or is posted in the wrong community, I apologize; please let me know.

 

My objective is to implement a model that was scored with PROC ARIMA in SAS. Working with SAS Tech Support, I obtained a simpler explanation of the underlying equation that the ARIMA procedure uses. Here is the correspondence:

 

y_t = y_{t-1} + C + w_1 x_{1,t} + w_2 x_{2,t} + ϕ((y_{t-1} - y_{t-2}) - w_1 x_{1,t-1} - w_2 x_{2,t-1}) + a_t

 

The random error term at time t, a_t, takes on its expected value of 0 and effectively drops out of the equation. In the forecast horizon, when actual values of the lags of y on the right-hand side of the equation are no longer available, the corresponding predicted values are used in place of the lagged y values.

 

*** Update 1 ***

Through trial and error I noticed that the equation above (if I implemented it correctly) produced a constant difference from the ARIMA output for type='Actual'. The difference is -C*ϕ, so the code below accounts for it and now replicates the ARIMA procedure's model fit.

New equation:

y_t = y_{t-1} + C + w_1 x_{1,t} + w_2 x_{2,t} + ϕ((y_{t-1} - y_{t-2}) - w_1 x_{1,t-1} - w_2 x_{2,t-1}) + a_t - Cϕ
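
One possible reading of where the extra -Cϕ term comes from, assuming PROC ARIMA treats this as a regression on the differenced series with AR(1) errors. The noise symbol n_t is introduced here only for illustration and does not appear in the Tech Support correspondence:

```latex
\begin{aligned}
y_t - y_{t-1} &= C + w_1 x_{1,t} + w_2 x_{2,t} + n_t, \qquad n_t = \phi\, n_{t-1} + a_t \\
n_{t-1} &= (y_{t-1} - y_{t-2}) - C - w_1 x_{1,t-1} - w_2 x_{2,t-1} \\
y_t &= y_{t-1} + C + w_1 x_{1,t} + w_2 x_{2,t}
      + \phi\big((y_{t-1} - y_{t-2}) - w_1 x_{1,t-1} - w_2 x_{2,t-1}\big) - C\phi + a_t
\end{aligned}
```

Under that assumption the constant effectively enters as C(1 - ϕ), which would explain the constant offset I observed.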

 

Here are the parameter estimates that I believe the model needs:

Capture.PNG

From the equation above, I believe these imply C=MU=-15.59089, ϕ=AR1,1=-0.57206, w1=NUM1=-0.02439 and w2=NUM2=0.05201.

 

Can someone help me calculate the values in the type='Forecast' portion of the data below? My implementation is close, but it is still off. For type='Actual' I match the ARIMA output to within rounding error. Any help would be greatly appreciated.

data arima_data;
    format date date9.;
    length type $ 8;
    do date='01JAN2000'd to intnx('month',today(),-12,'B');
        date=intnx('month',date,0,'B');
        type='Actual';
        y=ranuni(28269)*1000;
        x1=ranuni(28123)*1000;
        x2=ranuni(28722)*1000;
        output;
        date=intnx('month',date,0,'E');
    end;
    /* Out of time. */
    do date=date to today();
        date=intnx('month',date,0,'B');
        type='Forecast';
        y=.;
        x1=ranuni(28123)*1000;
        x2=ranuni(28722)*1000;
        output;
        date=intnx('month',date,0,'E');
    end;
run;

proc arima data=arima_data;
    /* Difference model with 2 x terms */
    identify var=y(1) crosscorr=(x1 x2) noprint;
    /* Including 1 lag. */
    estimate input=(x1 x2) p=1 noest noprint
        ar=-0.57206
        mu=-15.59089
        initval=(-0.02439 x1 0.05201 x2)
    ;
    forecast interval=month id=date out=arima_forecast lead=12 noprint;
run;quit;

data merge_arima_forecast;
    merge arima_data (in=mas)
          arima_forecast (in=fcst keep=date forecast residual)
    ;
    by date;
run;

data implementation;
    set merge_arima_forecast;
    by date;

    /* From SAS Tech Support:
    y(t)=y(t-1) + C + beta1*x1(t) + beta2*x2(t) + AR(1)((y(t-1) - y(t-2)) - beta1*x1(t-1) - beta2*x2(t-1)) + a(t)
    Assumed, C=Mu (-15.59089), AR(1)=Auto Regressive Parameter (-0.57206), beta1 (-0.02439) and beta2 (0.05201)
    a(t) is the error term which is essentially zero and drops out of the model.
    */

    y_lag=lag(y);
    y_lag2=lag2(y);
    x1_lag=lag(x1);
    x2_lag=lag(x2);

    c=-15.59089;
    ar=-0.57206;
    beta1=-0.02439;
    beta2=0.05201;

    /* Actuals */
    *if type='Actual' then imp_forecast=sum(y_lag,c,beta1*x1,beta2*x2,ar*((y_lag-y_lag2)-beta1*x1_lag-beta2*x2_lag));
    /* Updated with -c*ar, matches the ARIMA fit. */
    if type='Actual' then imp_forecast=sum(y_lag,c,beta1*x1,beta2*x2,ar*((y_lag-y_lag2)-beta1*x1_lag-beta2*x2_lag),-c*ar);
    /* Forecast */
    lag_imp_forecast=lag(imp_forecast);
    lag2_imp_forecast=lag2(imp_forecast);
    if type='Forecast' then imp_forecast=sum(lag_imp_forecast,c,beta1*x1,beta2*x2,ar*((lag_imp_forecast-coalesce(y_lag2,lag2_imp_forecast))-beta1*x1_lag-beta2*x2_lag),-c*ar);

    arima_forecast_diff=forecast-imp_forecast;

run;
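
A minimal sketch of how the recursion described by Tech Support might be written instead, assuming the Update 1 equation is correct and that merge_arima_forecast is sorted by date with no gaps. The names implementation_retain, prev_y, prev2_y, prev_x1 and prev_x2 are made up for this sketch. As far as I can tell, LAG(imp_forecast) cannot carry a prediction forward here, because on Forecast rows imp_forecast is still missing at the moment LAG is called, so the sketch uses RETAIN and substitutes a prediction only when the actual y is missing:

```sas
data implementation_retain;
    set merge_arima_forecast;
    by date;

    /* Values carried over from earlier rows (hypothetical names). */
    retain prev_y prev2_y prev_x1 prev_x2;

    c=-15.59089;
    ar=-0.57206;
    beta1=-0.02439;
    beta2=0.05201;

    /* Update 1 equation with the constant written as c*(1-ar).
       Skipped on the first two rows, where the lags are not yet available. */
    if n(prev_y, prev2_y, prev_x1, prev_x2)=4 then
        imp_forecast = prev_y + c*(1-ar) + beta1*x1 + beta2*x2
                     + ar*((prev_y-prev2_y) - beta1*prev_x1 - beta2*prev_x2);

    arima_forecast_diff=forecast-imp_forecast;

    /* Shift the history: use the actual y when present,
       otherwise the value just predicted. */
    prev2_y=prev_y;
    prev_y=coalesce(y, imp_forecast);
    prev_x1=x1;
    prev_x2=x2;
run;
```

Note that the first Forecast row here uses the last two actual y values rather than the previous fitted value, which is how I read "predicted value is used in place of the lagged y values" only when the actuals are no longer available. I have not confirmed that this reproduces the PROC ARIMA forecasts exactly.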
2 REPLIES
Ksharp
Super User
Post it in the Forecasting forum, since it is a time series analysis question.
kybowma
Calcite | Level 5

Hi Ksharp, I posted there originally, but it was flagged as spam by SAS. SAS has since marked it as a legitimate question.

 

https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/Implementation-of-AR-1-Model-from-AR...

 

I appreciate your feedback!
