How exactly does PROC ARIMA generate forecasts?


Posted 08-23-2021 04:10 PM
(831 views)

In general, a time series forecast depends on the initial values, that is, on the period from which the forecast starts. Suppose I have an AR(3) model estimated on the sample between 2010 Jan and 2020 Dec. Does PROC ARIMA use the first three months (2010 Jan, 2010 Feb, and 2010 Mar) as the initial condition to generate the forecast and roll forward, or does it use the last three months (2020 Oct, 2020 Nov, and 2020 Dec)?

5 REPLIES


Start with https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/etsug/etsug_arima_overview.htm

From the overview:

> The ARIMA procedure analyzes and forecasts equally spaced univariate time series data, transfer function data, and intervention data by using the autoregressive integrated moving-average (ARIMA) or autoregressive moving-average (ARMA) model. An ARIMA model predicts a value in a response time series as a linear combination of its own past values, past errors (also called shocks or innovations), and current and past values of other time series.


Thanks. I came here from the online help documentation, still confused. For example, this is the equation shown on the "Forecast details" page:

$$\hat{x}_{t+k} = \sum_{i=1}^{k-1} \hat{\pi}_i \hat{x}_{t+k-i} + \sum_{i=k}^{\infty} \hat{\pi}_i x_{t+k-i}$$

However, it is unclear which period the program treats as the first one when it starts generating forecasts. I also noticed an option "align" with possible values BEGINNING/MIDDLE/END, but with no specification of how these are defined.

Because of the iterative nature of time series forecasts, the forecasts will differ if we use different starting points.
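For the AR(3) case in the original question, the equation quoted in this reply can be written out step by step. This is only a sketch under stated assumptions: the series is mean-corrected, $t$ is the last observed period, and the $\pi$ weights of a pure AR(3) model equal the estimated AR coefficients for lags 1 to 3 and are zero beyond lag 3 (the symbols $\hat{\phi}_1, \hat{\phi}_2, \hat{\phi}_3$ are introduced here for illustration).

```latex
\begin{aligned}
\hat{x}_{t+1} &= \hat{\phi}_1 x_{t} + \hat{\phi}_2 x_{t-1} + \hat{\phi}_3 x_{t-2} \\
\hat{x}_{t+2} &= \hat{\phi}_1 \hat{x}_{t+1} + \hat{\phi}_2 x_{t} + \hat{\phi}_3 x_{t-1} \\
\hat{x}_{t+3} &= \hat{\phi}_1 \hat{x}_{t+2} + \hat{\phi}_2 \hat{x}_{t+1} + \hat{\phi}_3 x_{t} \\
\hat{x}_{t+k} &= \hat{\phi}_1 \hat{x}_{t+k-1} + \hat{\phi}_2 \hat{x}_{t+k-2} + \hat{\phi}_3 \hat{x}_{t+k-3}, \qquad k \ge 4
\end{aligned}
```

Under these assumptions, the recursion is seeded by the three observations ending at the forecast origin $t$ (2020 Oct, 2020 Nov, and 2020 Dec in the example), and observed values drop out entirely after the third step.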


Hello,

> In general, a time series forecast depends on the initial values or the period to start the forecast. (plus your question)

You are absolutely right!

I used to know this (I think one of our SAS/ETS courses states clearly how PROC ARIMA deals with this "initialization for forecast generation"), but I no longer dare to make a definitive statement on it.

It depends on the estimation method:

- METHOD=CLS (*infinite memory forecasts*, also called *conditional forecasts*)
- METHOD=ULS or METHOD=ML (*finite memory forecasts*, also called *unconditional forecasts*)

What method do you use? By default, METHOD=CLS (at least in SAS 9.4 Maintenance Level 7).

I think that in your AR(3) example, the CLS forecast starts with 2010 Jan, 2010 Feb, and 2010 Mar as the initial condition to generate the forecasts, because it assumes that the unknown values of the response series before the start of the data are equal to the mean of the series.
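One way to make the role of the starting values concrete is to separate in-sample one-step predictions from forecasts beyond the end of the data. The sketch below is the textbook AR(p) recursion, not PROC ARIMA's internal code; it assumes the AR coefficients and the mean are already estimated and held fixed, and the function names and numbers are hypothetical.

```python
def ar_in_sample_fits(series, phi, mean):
    """One-step-ahead predictions for each observed period.

    Values before the start of the data are replaced by the mean
    (the pre-sample assumption described in the reply above).
    """
    p = len(phi)
    padded = [mean] * p + list(series)
    fits = []
    for t in range(len(series)):
        # padded[t + p] is series[t]; its lag-(i + 1) value is padded[t + p - 1 - i]
        fits.append(mean + sum(phi[i] * (padded[t + p - 1 - i] - mean)
                               for i in range(p)))
    return fits


def ar_forecast(series, phi, mean, lead):
    """Forecasts for `lead` periods after the last observation.

    Each step uses the p most recent values, replacing unobserved
    future values with earlier forecasts.
    """
    p = len(phi)
    if len(series) < p:
        raise ValueError("need at least p observations")
    history = list(series)
    forecasts = []
    for _ in range(lead):
        nxt = mean + sum(phi[i] * (history[-1 - i] - mean) for i in range(p))
        forecasts.append(nxt)
        history.append(nxt)
    return forecasts


if __name__ == "__main__":
    phi = [0.5, 0.2, 0.1]   # hypothetical AR(3) coefficients
    mean = 10.0             # hypothetical series mean
    a = [9.0, 11.0, 10.5, 9.5, 10.2, 10.8, 9.9]
    b = [20.0, 0.0, 15.0] + a[3:]   # same data, different first three values
    # With fixed coefficients, only the last three values feed the forecasts.
    print(ar_forecast(a, phi, mean, 3) == ar_forecast(b, phi, mean, 3))
    # The very first in-sample prediction rests entirely on pre-sample values.
    print(ar_in_sample_fits(a, phi, mean)[0] == mean)
```

In this sketch the earliest observations matter for the in-sample predictions (and, in a real fit, for the parameter estimates), while the forecasts beyond the sample depend only on the last three observations.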

I also note this paragraph in the doc:

> A complete description of the steps to produce the series forecasts and their standard errors by using either of these methods is quite involved, and only a brief explanation of the algorithm is given in the next two sections. Additional information about the finite and infinite memory forecasts can be found in Brockwell and Davis (1991). The prediction of stationary ARMA processes is explained in Chapter 5, and the prediction of nonstationary ARMA processes is given in Chapter 9 of Brockwell and Davis (1991).

I will follow up on this question and maybe consult some internal resources on it.

Kind regards,

Koen


On top of my previous reply (see above!!) ...

The ALIGN= option is used to align the ID variable to the beginning, middle, or end of the time ID interval specified by the INTERVAL= option.

Suppose your interval is HOUR or DTHOUR, then 07h09 becomes 07h00 (align=beginning) or 07:29:59 (align=middle) or something close to 08h00 (align=ending).
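The hourly example can be sketched outside SAS as follows. This mirrors the values quoted in this reply and is not SAS's exact rule: the function name `align_hour` is hypothetical, and taking the end of the interval to be 07:59:59 is an assumption based on "something close to 08h00".

```python
from datetime import datetime, timedelta


def align_hour(ts, how):
    """Move a timestamp to the beginning, middle, or end of its hour."""
    start = ts.replace(minute=0, second=0, microsecond=0)
    if how == "beginning":
        return start
    if how == "middle":
        # 07:29:59, the value quoted in the reply (assumed, not verified)
        return start + timedelta(minutes=29, seconds=59)
    if how == "ending":
        # last whole second of the hour (assumed)
        return start + timedelta(minutes=59, seconds=59)
    raise ValueError("how must be 'beginning', 'middle', or 'ending'")


if __name__ == "__main__":
    stamp = datetime(2021, 8, 20, 7, 9)
    for how in ("beginning", "middle", "ending"):
        print(how, align_hour(stamp, how))
```

On this reading, the option changes only the timestamp attached to each period, not the forecast values themselves.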

Look at the data set WANT in this example and experiment with ALIGN=BEGINNING | MIDDLE | ENDING:

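```
/* Simulate 46 observations from an ARMA(1,1) process */
PROC IML;
   phi   = {1 -0.5};   /* AR polynomial coefficients */
   theta = {1 0.8};    /* MA polynomial coefficients */
   y = armasim(phi, theta, 0, 1, 46, -1234321);   /* mu=0, sigma=1, n=46, seed */
   *print y;
   create temp from y;
   append from y;
   close temp;
QUIT;

/* Attach an hourly datetime ID to each observation */
data have;
   set temp(rename=(col1=y));
   datetimehour=INTNX('dthour','20AUG2021:14:00:00'dt,_N_);
   format datetimehour datetime18.;
run;

/* Fit an AR(3) model and forecast 5 hours ahead; results go to WANT */
proc arima data=have;
   identify var=y;
   estimate p=3;
   FORECAST ALIGN=BEGINNING /* MIDDLE | ENDING */ LEAD=5
            ID=datetimehour INTERVAL=DTHOUR
            NOPRINT out=want;
quit;
/* end of program */
```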

Good luck,

Koen


Hi, Koen:

Thank you so much.

Even after reading the explanation on what "conditional forecast" and "unconditional forecast" are, I still cannot tell how the results will differ.

I think it is necessary for us to understand how SAS generates the forecast. From the practitioner's perspective, we usually start the forecast from the most recent observations (the last observations in the estimation sample). The forecast will be very different if SAS actually starts from the earliest observations used in estimation. If we don't understand the difference, we will misinterpret the results.
