
Forecasting Concepts Part 2: Forecasting the Right Way

Started 06-11-2019
Modified 06-11-2019

The Right Way

If ordinary regression is the wrong way to model time series data, then what is the right way? You can use time series methods such as exponential smoothing and ARIMA. This article describes ARIMA and exponential smoothing models (ESMs) in more detail.

Recall Ordinary Regression for Data Without a Time Component

If we have cross-sectional data, we can use ordinary regression. We recall the simplest ordinary regression model:

 

Y = β0 + β1X1 + ε

 

Here Y is the dependent variable (the "target," i.e., the variable being predicted), β0 is the Y-intercept, β1 is the slope, and X1 is the independent variable. The independent variable may also be called the input, underlying factor, or explanatory variable. ε is the error term.
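To make the regression mechanics concrete, here is a minimal Python sketch (the `ols_fit` helper and the sample data are my own, not from the article) that fits Y = β0 + β1X1 + ε with the closed-form least-squares formulas:

```python
# Illustrative sketch only: closed-form least squares for Y = b0 + b1*X + error.
# (ols_fit and the sample data are made up for this example.)

def ols_fit(x, y):
    """Return (intercept, slope) minimizing the sum of squared errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Perfectly linear data recover the exact line Y = 1 + 2X.
b0, b1 = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
```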

 

As shown in the example below, where calories and exercise are inputs for our dependent variable weight, we can have a slightly more complex model with multiple input variables.

 

Y = β0 + β1X1 + β2X2 + ε

Weight = β0 + β1(Calories) + β2(Exercise) + ε

 

Here Y is weight, X1 is calories, and X2 is exercise. We have three parameters: β0 (the intercept/constant) and β1 and β2 (the partial regression coefficients), plus, again, a combined error term ε.

Modeling For Data With a Time Component

But if we have data with a time component, we must use an appropriate time series model, such as an ARIMA model, an exponential smoothing model, an unobserved components model (UCM), or an intermittent demand model (IDM).

ARIMA (Auto-regressive Integrated Moving Average)

We recall that autocorrelation means that current values in a time series (Yt = Y at the current time t) are positively or negatively correlated with earlier values. The correlation between the current value Yt and the immediately previous value Yt-1 is called first order autocorrelation. The correlation between the current value Yt and two periods back Yt-2 is called second order autocorrelation, etc.
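Autocorrelation at any lag can be computed directly from its definition. A quick Python illustration (the `autocorr` helper and the invented series are my own):

```python
def autocorr(y, k):
    """Sample autocorrelation of series y at lag k."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y)
    cov = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    return cov / var

trend = list(range(10))   # a steadily rising (invented) series
r1 = autocorr(trend, 1)   # first-order autocorrelation: strong and positive
r2 = autocorr(trend, 2)   # second-order autocorrelation: weaker but still positive
```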

 

To address issues associated with autocorrelation and assumption violations (see Part 1 of this post), we essentially build lags into the model.

AR

Let’s compare a very simple first-order autoregressive model (p = 1) to an OLS regression model.

 

[Image: the OLS model, Y = β0 + β1X1 + ε, compared with its time series counterpart, in which the lagged value Yt-1 takes the place of X1]

 

In time series, the prediction Yt is now a function of the historic observations (Yt-n). In our simplest case, instead of X, we have Y at time t-1 as our independent variable. By convention, the Greek letter φ is commonly used instead of β to represent the autoregressive parameters, so our AR(1) model looks like this:

 

Yt = β0 + φ1Yt-1 + εt
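The AR(1) idea, regressing the series on its own first lag, can be sketched as a small simulation (Python; the true φ and the noise scale are invented for illustration):

```python
import random

random.seed(1)  # reproducible illustration

# Simulate Y_t = phi * Y_{t-1} + e_t (phi and the noise scale are invented),
# then estimate phi by least squares of Y_t on its own first lag -- the
# "lag on the observations" described above.
phi_true = 0.8
y = [0.0]
for _ in range(2000):
    y.append(phi_true * y[-1] + random.gauss(0, 1))

phi_hat = (sum(y[t] * y[t - 1] for t in range(1, len(y)))
           / sum(v * v for v in y[:-1]))  # recovers a value near 0.8
```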

 

MA

Moving average models are used to model short-lived abrupt patterns in the data. The Greek letter θ is commonly used to represent the moving average parameters. A simple first order moving average model looks like:

 

Yt = β0 + εt - θ1εt-1

 

Notice that here we are applying lags to the ERROR term and finding the appropriate coefficients (θ). This is in contrast to the autoregressive model above, where we applied lags to the OBSERVATIONS and found the appropriate coefficients (φ).
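The contrast shows up clearly in a small simulation (Python; θ and the sample size are invented). An MA(1) series leaves an autocorrelation footprint only at lag 1, then "forgets":

```python
import random

random.seed(7)  # reproducible illustration

def autocorr(y, k):
    """Sample autocorrelation of series y at lag k."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y)
    return sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / var

# Simulate Y_t = e_t + theta * e_{t-1} (theta is invented): the lag is on
# the ERRORS, not the observations.
theta = 0.6
e = [random.gauss(0, 1) for _ in range(5001)]
y = [e[t] + theta * e[t - 1] for t in range(1, len(e))]

r1 = autocorr(y, 1)  # near theta / (1 + theta**2), about 0.44
r2 = autocorr(y, 2)  # near zero: the MA(1) footprint stops after one lag
```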

ARIMA

In summary, we see that ARIMA includes:

  • AR (autoregressive) terms, where we place lags on the observations
  • Integrated terms, where we model differenced values between successive time points
  • MA (moving average) terms, where we consider lags on the error

ARIMA (p, d, q)

  • p is the number of autoregressive terms and adds lags on the observations (once the series has been made stationary)
  • d is the order of differencing needed to adjust the series to make it stationary
  • q is the number of moving average terms (lagged forecast errors)
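The d term can be sketched directly (Python; the `difference` helper and the series are my own): each pass of first-order differencing removes one observation and one order of trend.

```python
def difference(y, d=1):
    """Apply first-order differencing d times; each pass drops one observation."""
    for _ in range(d):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return y

quadratic = [t * t for t in range(8)]   # invented series with a strong trend
once = difference(quadratic, 1)         # still trending: 1, 3, 5, ...
twice = difference(quadratic, 2)        # constant: the trend is gone
```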

In summary:

[Image: summary table of the AR, I, and MA components of ARIMA(p, d, q)]

Graphic courtesy of Joe Katz and Anthony Waclawski

 

The term ARMA is for series that do not require differencing.  The term ARIMAX applies to an ARIMA model that also includes independent variables (also called “underlying factors” or “explanatory variables” or “inputs”).

Exponential Smoothing Model (ESM)

You can think of exponential smoothing models as special cases of ARIMA models.

 

[Image: table of exponential smoothing models and their equivalent ARIMA forms]
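One concrete example of that connection: simple exponential smoothing, which is equivalent to ARIMA(0,1,1), forecasts with a level that is an exponentially weighted average of past observations. A minimal Python sketch (the `ses_forecast` helper and the data are my own, for illustration):

```python
def ses_forecast(y, alpha):
    """One-step simple exponential smoothing forecast (alpha in (0, 1])."""
    level = y[0]
    for obs in y[1:]:
        # New level blends the newest observation with the old level;
        # older observations are discounted exponentially.
        level = alpha * obs + (1 - alpha) * level
    return level

data = [10.0, 12.0, 11.0, 13.0]   # invented series
f = ses_forecast(data, 0.5)       # blends all history, newest weighted most
rw = ses_forecast(data, 1.0)      # alpha = 1 reproduces the random walk forecast
```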

 

Random Walk

A random walk model simply says that today’s predicted value equals yesterday’s actual value (plus some error).

 

Yt = Yt-1 + εt

 

You can also think of a random walk as an ARIMA(0,1,0) model.

 

Yt - Yt-1 = εt

 

A random walk with drift says that today’s predicted value equals yesterday’s actual value plus a constant (plus error).

 

Or you could think of a random walk as a weighted average, where all of the weights are 0, except the most recent one, which is 1.
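That weighted-average view is easy to verify numerically (Python, with invented data; the drift estimate below is one simple choice, the average historic step):

```python
history = [95.0, 98.0, 102.0, 101.0]          # invented series
weights = [0.0] * (len(history) - 1) + [1.0]  # all weight on the newest value

# The weighted average collapses to the last observed value.
forecast = sum(w * v for w, v in zip(weights, history))

# Random walk with drift: add the average historic step to the last value.
drift = (history[-1] - history[0]) / (len(history) - 1)
forecast_with_drift = history[-1] + drift
```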

 

Yt = (1)Yt-1 + (0)Yt-2 + (0)Yt-3 + … + εt

 

In any case, although it has the advantage of simplicity, a random walk (without drift) is generally not a very helpful model for predicting far into the future. This is particularly true if there are trends, seasons, or cycles in the data, as shown below.

 

[Images: random walk forecast plot for the electricity generation series, with its 95% confidence interval, and the model fit statistics]

 

Random walks are called naïve models. We can see that our 95% confidence interval (the blue shaded area) is quite wide, and our MAPE is fairly high (8.02).
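For reference, MAPE (mean absolute percentage error) is straightforward to compute by hand. A Python sketch (the series below is invented):

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - p) / a)
                     for a, p in zip(actual, predicted)) / len(actual)

actual = [100.0, 110.0, 120.0]   # invented actual values
naive = [90.0, 100.0, 110.0]     # e.g., one-step-behind (naive) forecasts
err = mape(actual, naive)        # about 9.14
```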

Stationarity

A stationary time series is one whose statistical properties (e.g., mean, variance, autocorrelation) are constant over time. Time series with either trends (increasing or decreasing over time) or seasonality are not stationary. But never fear. You can create a stationary time series from a nonstationary time series by essentially modeling out the seasons, trends, etc.

 

Seasons and cycles might include things like:

  • daily (e.g., air temperature may peak in the afternoon every day)
  • weekly (e.g., shoe sales may peak weekly on Saturdays)
  • seasonal (e.g., natural gas usage may peak in winter)

If there is a trend, differencing is a commonly used transformation to make a nonstationary time series stationary. First-order differencing computes the difference between consecutive observations. Be aware that this eliminates one observation from your data set, because there is nothing to difference the initial observation against.

 

Seasonal differencing, e.g., for monthly data, computes the difference between an observation and the observation 12 time periods ago. For example, subtract January 2018 from January 2019, and so on. Be aware that a twelfth difference eliminates 12 observations from your data set. If you do not have a sufficiently long historic data set, this can be a problem.
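A small sketch (Python; the `seasonal_difference` helper and the data are my own) shows both effects at once: the seasonal pattern cancels out, and exactly 12 observations are lost.

```python
def seasonal_difference(y, season=12):
    """Subtract the value from `season` periods earlier; drops `season` points."""
    return [y[t] - y[t - season] for t in range(season, len(y))]

# Three years of invented monthly data: a repeating pattern plus annual growth.
pattern = [5, 3, 4, 6, 8, 9, 12, 11, 9, 7, 6, 5]
series = [pattern[m] + 10 * year for year in range(3) for m in range(12)]

diffed = seasonal_difference(series, 12)  # seasonality cancels; 24 points remain
```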

Examples

Below we see the same data (ELECGENSUBSET) that was modeled above with the random walk model.  But here instead we model with a seasonal autoregressive order of 1 and include an intercept in the model.

 

ar[1]=12

 

[Images: seasonal AR(1) model specification and the resulting forecast plot]

 

This gives us a MAPE of 3.47.

 

Let’s see how well a model with a seasonal moving average order of 1 that includes an intercept in the model performs.

 

ma[1]=12

 

[Images: seasonal MA(1) model specification and the resulting forecast plot]

 

Our MAPE is improved to 2.73.

 

The Easy Way

Use SAS forecasting tools and appropriate methods are built in! SAS Viya tools with forecasting include:

  • SAS Visual Analytics. I call this the gateway drug to forecasting.  SAS VA lets you quickly and easily forecast one time series at a time. You can also use the friendly point-and-click interface to conduct “what if” analyses where you can manipulate either the inputs (independent variables) or the outputs (target/goal) to see what will happen to your forecast.
  • SAS Visual Forecasting. A more advanced tool for knowledgeable data scientists and statisticians looking to forecast many time series, including hierarchical time series. Now it even includes neural network time series analyses. This product includes a user-friendly web interface (the Model Studio pipeline interface) as well as advanced options available through programming in SAS Studio 5 or open source languages such as Python or R. Note that the SAS Visual Forecasting license also includes the SAS 9 Forecast Server procedures and the SAS 9 ETS procedures.
  • SAS Econometrics. Includes methods specific to the economics domain. Accessible only through programming in SAS Studio 5 or open source; there is no pretty user interface. If you are an economist, this is for you.

My next article will compare SAS 9 forecasting and CAS forecasting procedures.

Sources and Additional Information

For a deeper dive into forecasting methods see:

Comments

I've received some requests for these electricity generation data.  They are publicly available at https://www.eia.gov/electricity/.

