BookmarkSubscribeRSS Feed
kybowma
Calcite | Level 5

My objective is to implement a model which was fit using the ARIMA procedure in SAS.  Via SAS technical support I was able to have them translate the models functional form (not the best at math) as follows:

 

yt = yt-1 + C +  w1x1t  + w2x2t  + ϕ ((yt-1 - yt-2) - w1x1t-1 - w2x2t-1 ) + at

 

The random error term at time t, at, takes on its expected value of 0 and effectively drops out of the equation.  In the forecast horizon, when actual values of the lags of y on the right-hand-side of the equation are no longer available, the corresponding predicted value is used in place of the lagged y values.

 

To test the above I create fake data and harvested the coefficients from the ARIMA procedures:

 

Capture.PNG

Where C=-15.59089, ϕ=-0.57206, w1=-0.02439 and w2=0.05201, the error term a(t) drops out and is essentially 0 as stated in the above.

 

Programmatically speaking, if I implement the above equation (if I implemented correctly) I should get the exact output that the ARIMA procedure (with forecast statement) produces.  Unfortunately, I am off by more than rounding errors.  The example data and manual scoring is in the following code, can someone please correct my mistakes so that I can match the ARIMA forecast output (FORECAST variable to imp_forecast variable).  My assumption that the model fit (Actuals data denoted by the "type" variable) is close, as the "arima_forecast_diff" is static during this time at -8.91.

*** UPDATE 1 ***

Given I was off by a constant for the fitted model (i.e. type='Actuals') I was able to determine the root cause of the difference and isolate to the following: If I add back -C*AR(1) (-1(-15.59089-1*-0.57206))=8.9189245334 which was the exact difference.  Still working on the forecasted difference.  So, the modeling equation becomes:

 

yt = yt-1 + C +  w1x1t  + w2x2t  + ϕ ((yt-1 - yt-2) - w1x1t-1 - w2x2t-1 ) + at  - C*ϕ

 

Which yields (not regarding rounding errors) the exact output from the FORECAST statement in the ARIMA procedure.

 

 

data arima_data;
    format date date9.;
    length type $ 8;
    do date='01JAN2000'd to intnx('month',today(),-12,'B');
        date=intnx('month',date,0,'B');
        type='Actual';
        y=ranuni(28269)*1000;
        x1=ranuni(28123)*1000;
        x2=ranuni(28722)*1000;
        output;
        date=intnx('month',date,0,'E');
    end;
    /* Out of time. */
    do date=date to today();
        date=intnx('month',date,0,'B');
        type='Forecast';
        y=.;
        x1=ranuni(28123)*1000;
        x2=ranuni(28722)*1000;
        output;
        date=intnx('month',date,0,'E');
    end;
run;

proc arima data=arima_data;
    /* Difference model with 2 x terms */;
    identify var=y(1) crosscorr=(x1 x2) noprint;
    /* Including 1 lag. */
    estimate input=(x1 x2) p=1 noest noprint
        ar=-0.57206
        mu=-15.59089
        initval=(-0.02439 x1 0.05201 x2)
    ;
    forecast interval=month id=date out=arima_forecast lead=12 noprint;
run;quit;

data merge_arima_forecast;
    merge arima_data (in=mas)
          arima_forecast (in=fcst keep=date forecast residual)
    ;
    by date;
run;

data implementation;
    set merge_arima_forecast;
    by date;

    /* From SAS Tech Support:
    y(t)=y(t-1) + C + beta1*x1(t) + beta2*x2(t) + AR(1)((y(t-1) - y(t-2)) - beta1*x1(t-1) - beta2*x2(t-1)) + a(t)
    Assumed, C=Mu (-15.59089), AR(1)=Auto Regressive Parameter (-0.57206), beta1 (-0.02439) and beta2 (0.5201)
    a(t) is the error term which is essentially zero and drops out of the model.
    */

    y_lag=lag(y);
    y_lag2=lag2(y);
    x1_lag=lag(x1);
    x2_lag=lag(x2);

    c=-15.59089;
    ar=-0.57206;
    beta1=-0.02439;
    beta2=0.05201;

    /* Actuals */
    *if type='Actual' then imp_forecast=sum(y_lag,c,beta1*x1,beta2*x2,ar*((y_lag-y_lag2)-beta1*x1_lag-beta2*x2_lag));
    /* Updated with -c*ar, matches the ARIMA fit. */
    if type='Actual' then imp_forecast=sum(y_lag,c,beta1*x1,beta2*x2,ar*((y_lag-y_lag2)-beta1*x1_lag-beta2*x2_lag),-c*ar);
    /* Forecast */
    lag_imp_forecast=lag(imp_forecast);
    lag2_imp_forecast=lag2(imp_forecast);
    if type='Forecast' then imp_forecast=sum(lag_imp_forecast,c,beta1*x1,beta2*x2,ar*((lag_imp_forecast-coalesce(y_lag2,lag2_imp_forecast))-beta1*x1_lag-beta2*x2_lag),-c*ar);

    arima_forecast_diff=forecast-imp_forecast;

run;

 

 

 

1 REPLY 1
SAS_Vpt
Calcite | Level 5

Thanks for the updates, this really helped me out.

 

Do you know the formula for a ARIMA p=1 q=1 model? 

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

Discussion stats
  • 1 reply
  • 1734 views
  • 0 likes
  • 2 in conversation