Re: forecasting equation in PROC ARIMA

Posted 09-16-2015 02:30 PM
(2840 views)

Hello,

I am using the NOEST option in the ESTIMATE statement of the ARIMA procedure, and it has led me to an unexpected result. I have fit a model to predict net charge-off (Y) data with unemployment rate (X) as the input, with no ARMA terms and no intercept just to isolate the issue, and extracted the parameter estimates, say Num1 (Shift=0) = 0.5 and Num1,1 (Shift=0) = -0.6.

The strange thing is that the trend of Y dictates the forecasts, when I expected the right-hand side of the forecast equation to be solely a function of X.

For example I have two data sets:

data test_y_increase;

input Annl_NCO_rate Unemployment_Rate_FB monthend_date;

cards;

1 14 1

1 13.1 2

2 12 3

2 11.5 4

4 10 5

4 9.9 6

4 8 7

5 7 8

6 6.2 9

. 5 10

. 4 11

. 3.3 12

. 2 13

. 1 14

;

run;

data test_y_decrease;

input Annl_NCO_rate Unemployment_Rate_FB monthend_date;

cards;

6 14 1

6 13.1 2

5 12 3

4 11.5 4

3 10 5

2 9.9 6

2 8 7

1 7 8

1 6.2 9

. 5 10

. 4 11

. 3.3 12

. 2 13

. 1 14

;

run;

In both datasets, the values of X are the same (decreasing). Given my parameters for the concurrent value and lag 1 of X, when I apply the published model to the series Y, the trend of Y seems to influence the forecasts. This is unexpected to me because the model I have applied is only a function of X.

Here is my ARIMA code:

%let NumFactor1 = .5;

%let NumFactor2 = -.6;

proc arima data=test_y_increase;

title "Increasing Y";

identify var=Annl_NCO_rate(1) crosscorr=( Unemployment_Rate_FB(1) ) CLEAR CENTER;

estimate input =( (1)Unemployment_Rate_FB )

initval =( &NumFactor1.$(&NumFactor2.)Unemployment_Rate_FB

)

noest

NOINT;

forecast id=monthend_date BACK=0 lead=5 out=out_test_increase;

run;

quit;

proc arima data=test_y_decrease;

title "Decreasing Y";

identify var=Annl_NCO_rate(1) crosscorr=( Unemployment_Rate_FB(1) ) CLEAR CENTER;

estimate input =( (1)Unemployment_Rate_FB )

initval =( &NumFactor1.$(&NumFactor2.)Unemployment_Rate_FB

)

noest

NOINT;

forecast id=monthend_date BACK=0 lead=5 out=out_test_decrease;

run;

quit;

title;

Do you know why these forecasts are not only a function of X, or what the forecast equation is?

Thanks,

Ryan

Accepted Solutions

Thank you to Kenneth Sanford for getting Donna Woodward's response:

Because the response variable in both models has a first difference applied to it, the forecasts are a function of the previous actual value of the response variable (when available) or of the previous forecast once lagged actuals are no longer available. If differencing had not been specified for the response variable in these two models, then the forecasts would indeed have been a function of the input variable, X, alone.

To illustrate how the lagged dependent variable is incorporated into the forecast equation when the response variable is differenced, let’s look at a simple random walk model:

Proc arima;

Identify var=y(1);

Estimate noint;

Run;

In backshift notation, this model is written as: $(1-B)y_t = a_t$

Performing the backshift operation, we get: $y_t - y_{t-1} = a_t$

The forecast model for $y_t$ is therefore: $y_t = y_{t-1} + a_t$.
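
The same expansion can be applied to the transfer-function model from the question. The following is a sketch rather than output from PROC ARIMA. It assumes the numerator is parameterized as $(\omega_0 - \omega_1 B)$, with $\omega_0$ standing for NumFactor1 and $\omega_1$ for NumFactor2 (symbols introduced here for illustration), and it ignores the CENTER option:

```latex
\begin{aligned}
(1-B)\,y_t &= (\omega_0 - \omega_1 B)(1-B)\,x_t + a_t \\
y_t &= y_{t-1} + \omega_0\,(x_t - x_{t-1}) - \omega_1\,(x_{t-1} - x_{t-2}) + a_t
\end{aligned}
```

Replacing $a_t$ with its expected value of zero gives the one-step forecast. The $y_{t-1}$ term is what carries the level of Y into the forecast, even though the remaining terms depend only on X.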

3 REPLIES 3

```
%let NumFactor1 = .5;
%let NumFactor2 = -.6;
data test_i;
input y x date;
dx = dif(x);
ldx = lag(dx);
lx = lag(x);
ly = lag(y);
cards;
1 14 1
1 13.1 2
2 12 3
2 11.5 4
4 10 5
4 9.9 6
4 8 7
5 7 8
6 6.2 9
. 5 10
. 4 11
. 3.3 12
. 2 13
. 1 14
;
run;
data test_d;
input y x date;
dx = dif(x);
ldx = lag(dx);
lx = lag(x);
ly = lag(y);
cards;
6 14 1
6 13.1 2
5 12 3
4 11.5 4
3 10 5
2 9.9 6
2 8 7
1 7 8
1 6.2 9
. 5 10
. 4 11
. 3.3 12
. 2 13
. 1 14
;
run;
proc arima data=test_i plots=none;
title "Increasing Y";
identify var=y(1) crosscorr=( x(1) ) noprint CLEAR;* CENTER;
estimate input =( (1)x )
initval =( &NumFactor1.$(&NumFactor2.)x
)
noest
NOINT;
forecast id=date BACK=0 lead=5 out=out_test_increase printall;
run;
quit;
data test_i;
set test_i;
retain tmp 0;
if _n_ <= 2 then ldx = -0.9;
if _n_ <= 1 then dx = -0.9;
tfInput = &NumFactor1.*dx - &NumFactor2.*ldx;
if ly ^= . then forecast = ly + tfInput;
else forecast = tmp + tfInput;
tmp = forecast;
run;
proc print data=test_i;
var y ly tfInput forecast;
run;
proc arima data=test_d plots=none;
title "Decreasing Y";
identify var=y(1) crosscorr=( x(1) ) CLEAR;* CENTER;
estimate input =( (1)x )
initval =( &NumFactor1.$(&NumFactor2.)x
)
noest
NOINT;
forecast id=date BACK=0 lead=5 out=out_test_decrease printall;
run;
quit;
data test_d;
set test_d;
retain tmp 0;
if _n_ <= 2 then ldx = -0.9;
if _n_ <= 1 then dx = -0.9;
tfInput = &NumFactor1.*dx - &NumFactor2.*ldx;
if ly ^= . then forecast = ly + tfInput;
else forecast = tmp + tfInput;
tmp = forecast;
run;
proc print data=test_d;
var y ly tfInput forecast;
run;
```

I am not quite sure I understand your question but here is what I make of it:

I am ignoring the CENTER option in your ARIMA code for simplicity. Your model spec is:

identify var=y(1) crosscorr=( x(1) );

estimate input =( (1)x) initval =( &NumFactor1.$(&NumFactor2.)x) noest NOINT;

The forecast function for this is:

tfInput = NumFactor1*dif(x) - NumFactor2*lag(dif(x))

forecast = lag(y) + tfInput when lag(y) is available

forecast = lag(forecast) + tfInput otherwise.

This does depend on y (and not just on x).

*************Verification code attached***************;
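
For readers without SAS at hand, the recursion above can be sketched in Python. This is a minimal illustration of the forecast function as stated in this reply, not a reimplementation of PROC ARIMA. The function name `forecast_path` and the list names are made up for the example, and the CENTER option is ignored, as in the reply.

```python
# Sketch of the forecast recursion described above (not PROC ARIMA itself).
W0, W1 = 0.5, -0.6  # NumFactor1 and NumFactor2 from the thread

# Same X in both datasets; Y is observed for the first 9 periods only.
X = [14, 13.1, 12, 11.5, 10, 9.9, 8, 7, 6.2, 5, 4, 3.3, 2, 1]
Y_INCREASE = [1, 1, 2, 2, 4, 4, 4, 5, 6]
Y_DECREASE = [6, 6, 5, 4, 3, 2, 2, 1, 1]


def forecast_path(y, x, w0=W0, w1=W1):
    """Multi-step forecasts for the periods of x that extend beyond y."""
    # dx[i] holds x[i + 1] - x[i], so dif(x) at period t is dx[t - 1].
    dx = [x[t] - x[t - 1] for t in range(1, len(x))]
    level = y[-1]  # last actual Y: the starting point of the recursion
    path = []
    for t in range(len(y), len(x)):
        # tfInput = NumFactor1*dif(x) - NumFactor2*lag(dif(x))
        tf_input = w0 * dx[t - 1] - w1 * dx[t - 2]
        # Previous actual for the first step, previous forecast afterwards.
        level = level + tf_input
        path.append(level)
    return path


inc = forecast_path(Y_INCREASE, X)
dec = forecast_path(Y_DECREASE, X)
for step, (a, b) in enumerate(zip(inc, dec), start=1):
    print(f"lead {step}: increasing={a:.2f} decreasing={b:.2f}")
```

Because X is identical in both datasets, the two forecast paths move by the same increments and stay a constant 5 apart, which is the gap between the last observed Y values (6 and 1). That constant offset is the dependence on Y that the question noticed.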

Thanks! A quick question: what if I had an AR term, say p=1 with AR = -.3? My ESTIMATE statement now looks like this:

proc arima data=test_i plots=none;

title "Increasing Y - ar";

identify var=y(1) crosscorr=( x(1) ) noprint CLEAR;* CENTER;

estimate p=1 input =( (1)x ) ar=&ar.

initval =( &NumFactor1.$(&NumFactor2.)x)

noest

NOINT;

forecast id=date BACK=0 lead=5 out=out_test_increase_ar printall;

run;

quit;

How would you code this using the logic in the test_i data step?
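
The thread does not answer this follow-up, so the following is only one possible extension, sketched in Python under stated assumptions. It assumes the AR factor is written as $(1 - \phi B)$ and applies to the noise series, meaning the differenced Y minus the transfer-function input, so that the noise forecast is $\phi$ times the previous noise value. It also ignores how PROC ARIMA initializes the recursion at the start of the series, so in-sample values may differ from SAS output. The name `forecast_path_ar1` is hypothetical.

```python
# Hedged sketch: the earlier recursion plus an AR(1) term on the noise series.
W0, W1 = 0.5, -0.6  # NumFactor1 and NumFactor2 from the thread
PHI = -0.3          # AR value mentioned in the question (assumed 1 - PHI*B form)

X = [14, 13.1, 12, 11.5, 10, 9.9, 8, 7, 6.2, 5, 4, 3.3, 2, 1]
Y_INCREASE = [1, 1, 2, 2, 4, 4, 4, 5, 6]


def forecast_path_ar1(y, x, w0=W0, w1=W1, phi=PHI):
    """Forecasts beyond the end of y with an AR(1) noise term (assumption)."""
    dx = [x[t] - x[t - 1] for t in range(1, len(x))]

    def tf_input(t):
        # NumFactor1*dif(x) - NumFactor2*lag(dif(x)) at period t (0-based)
        return w0 * dx[t - 1] - w1 * dx[t - 2]

    n = len(y)
    # Last observed noise: differenced y minus the transfer-function input.
    noise = (y[n - 1] - y[n - 2]) - tf_input(n - 1)
    level = y[-1]
    path = []
    for t in range(n, len(x)):
        noise = phi * noise  # AR(1) forecast of the next noise value
        level = level + tf_input(t) + noise
        path.append(level)
    return path


for step, value in enumerate(forecast_path_ar1(Y_INCREASE, X), start=1):
    print(f"lead {step}: {value:.2f}")
```

Setting `phi=0` reduces this to the recursion without the AR term, which gives a quick consistency check against the test_i logic.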


