
It's About "Time" for Bayesian


 

As Bayesian analysis grows in popularity, more and more areas of statistics are starting to explore the possibilities and advantages of incorporating Bayesian techniques. In this post, we discuss some tips and techniques for bringing Bayesian analysis to time series.

 

Let's begin our discussion of Bayesian time series structure with autoregressive elements. Time series analysis is no stranger to models in which the current value of a series (Y) depends on a weighted combination of parameters (phi) and past values of the series. Traditionally, the LAG() function was used in a preprocessing step before the analysis. This ultimately put the work on you to create variables within the data set that contained lagged values of the response series.
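
For illustration, a minimal sketch of that traditional preprocessing step might look like the following. The data set and variable names (work.sales_data, SALES) are hypothetical placeholders.

/* Traditional approach: create lagged copies of the response before modeling. */
data work.sales_lagged;
   set work.sales_data;            /* assumed input: one row per time point */
   sales_lag1 = lag1(sales);       /* value one time point prior            */
   sales_lag2 = lag2(sales);       /* value two time points prior           */
   sales_lag3 = lag3(sales);       /* value three time points prior         */
run;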

 

[Image: 01_damodl_blog1_lag.png]


 

In PROC MCMC, we have access to lead and lagged values for random variables that are indexed. What exactly do I mean by indexed? Two types of random variables are indexed in the MCMC procedure. The first is the response variable, our time series. In the MODEL statement, this variable is indexed by observations. The second is a random variable placed in the RANDOM statement. This variable is indexed by the SUBJECT= option.

 

[Image: 02_damodl_blog1_AR3.png]

To access both the lagged and lead values of these indexed variables, we state the variable name followed by .L# or .N# to access lagged or next values, respectively. For example, if our response series is named SALES and is placed in the MODEL statement, SALES.L2 represents the value of the SALES series two time points prior, and SALES.N3 represents the value of the SALES series three time points into the future. This greatly aids in the creation of models that benefit from lagged elements, such as a third-order autoregressive model, AR(3).
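
As a quick sketch, inside a PROC MCMC program where SALES is the response named in the MODEL statement, an AR(3) mean could be written as a programming statement like the one below (the names mu and phi1-phi3 are placeholders); a complete program using this idea appears after the discussion of initial conditions below.

/* AR(3) mean built from lagged values of the response series */
mu = phi1*sales.l1 + phi2*sales.l2 + phi3*sales.l3;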

 

[Image: 03_damodl_blog1_indexed.png]

 

When lagged values are used within a model, there is an issue that needs to be addressed. To model the total sales at time position 5 with an AR(3), we include the values from time positions 4, 3, and 2. That is not a problem, because those values exist within our response series. But what happens if we want to model position 3? We have the total sales at time positions 2 and 1, but the value needed for the third lag lies before the start of the series. You might now see the problem.

 

Do we simply drop these types of observations due to missing information in the model? No! In PROC MCMC, we can build in a way to account for the initial states of lagged variables that extend beyond our known data. What do I mean by initial states?

 

[Image: 04_damodl_blog1_ICOND.png]

 

As we approach the start of the time series, we run out of information for the lagged values. The ICOND= option, available in both the MODEL statement and the RANDOM statement, lets us account for these initial states (or initial conditions).

 

In our example, we can include ICOND=(alpha beta gamma) in the MODEL statement. These initial states are treated as parameters in the problem, and we place priors on them just like any other parameter in the model. Three items are listed because at most three initial states are needed at the very beginning of the series. With this technique, we do not lose data at the front of the series to missing lagged values.
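
Putting these pieces together, a minimal sketch of an AR(3) model for a hypothetical SALES series might look like the following. The data set name, parameter names, and prior choices here are illustrative only, not taken from the article's example.

proc mcmc data=work.sales_data nmc=50000 seed=27513 outpost=post_ar3;
   parms phi1 phi2 phi3 sigma2;                    /* AR coefficients and error variance    */
   parms alpha beta gamma;                         /* initial states, treated as parameters */
   prior phi: ~ normal(0, var=100);
   prior sigma2 ~ igamma(shape=2, scale=2);
   prior alpha beta gamma ~ normal(0, var=100);    /* priors on the initial states          */
   mu = phi1*sales.l1 + phi2*sales.l2 + phi3*sales.l3;       /* AR(3) mean                  */
   model sales ~ normal(mu, var=sigma2) icond=(alpha beta gamma);   /* initial states stand
                                              in for lags before the start of the series    */
run;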

 

[Image: 05_damodl_blog1_ICONDExample.png]

Now that we see how to use indexing and ICOND to our advantage in the MODEL statement, how does this help us in the RANDOM statement?

 

Performing a Bayesian time series analysis also enables you to use a dynamic linear model setup, a very general class of nonstationary time series models. With it, you can create models with time-varying coefficients and explore stochastic shifts in regression parameters.

 

To do this, we use random-effects models that specify time dependence between successive parameter values in the form of smoothness priors. The best application of this structure is for seasonality components.
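
Before turning to seasonality, here is a minimal sketch of this idea for a single time-varying regression coefficient with a random-walk smoothness prior. The data set (with time index T, predictor X, and response Y), parameter names, and prior choices are all illustrative assumptions.

proc mcmc data=work.series nmc=50000 seed=27513 outpost=post_dlm;
   parms beta0 tau2 sigma2;
   prior beta0 ~ normal(0, var=100);               /* initial state of the coefficient */
   prior tau2 sigma2 ~ igamma(shape=2, scale=2);
   /* Smoothness prior: the coefficient at time t is centered at its previous value */
   random beta ~ normal(beta.l1, var=tau2) subject=t icond=(beta0);
   m = beta*x;                                     /* time-varying effect of X on Y    */
   model y ~ normal(m, var=sigma2);
run;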

 

[Image: 06_damodl_blog1_dynamic.png]

 

As you recall, seasonal components are deviations from the trend, and they sum to zero across the length of the seasonal period. For example, let's look at sales data that has been accumulated to quarterly averages. Upon inspection, we determine that a seasonal pattern exists across the quarterly values.

 

In a deterministic approach, these quarterly seasonal components sum to exactly zero across four consecutive time points, because the period of quarterly data is four. In a more dynamic approach, the sum is zero only in the mean of a distribution, with additional variability around it.

 

[Image: 07_damodl_blog1_quarterly.png]

An additional benefit of using the dynamic approach for the seasonal component is that we can now use the lag and next elements, as well as initial conditions, during the modeling process. Because the seasonal components should sum to zero in the mean, we can model the seasonal component at the current time point as normally distributed around the negative of the sum of the previous seasonal components.
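
In PROC MCMC, that idea becomes a RANDOM statement of the following form; it is the same pattern used for the quarterly seasonal component in the full program below, but the variance parameter, subject variable, and initial-state names shown here are placeholders.

/* Quarterly seasonal component: centered at the negative sum of the three previous values.
   Three initial states are needed because the statement looks back three time points.     */
random s ~ normal(-s.l1 - s.l2 - s.l3, var=tau2_s) subject=t icond=(s2 s1 s0);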

 

There are other components we can entertain, such as a trend that follows a random walk with drift, where the drift itself can follow a first-order autoregressive process.

 

[Image: 08_damodl_blog1_advanced.png]

Let’s look at a code example.

 

proc mcmc data=UKcoal nmc=100000 seed=123456 outpost=posterior propcov=quanew;
   /* Model parameters: initial states, variance parameters, and coefficients */
   parms alpha0;                  /* initial state of the drift component               */
   parms mu0;                     /* initial state of the trend component               */
   parms s0 s1 s2;                /* initial states of the seasonal component (one block) */
   parms theta1;                  /* variance parameter for the trend                   */
   parms theta2;                  /* variance parameter for the drift                   */
   parms theta3;                  /* variance parameter for the seasonal component      */
   parms theta4;                  /* variance parameter for the observation noise       */
   parms theta_phi;               /* variance parameter for the prior on phi            */
   parms phi;                     /* AR(1) coefficient for the drift                    */

   /* Priors: normal priors on most parameters, inverse gamma priors on the thetas */
   prior phi ~ normal(0, var=exp(theta_phi));
   prior alpha0 ~ normal(0, var=theta2);
   prior mu0 ~ normal(0, var=100);
   prior s: ~ normal(0, var=theta3);
   prior theta: ~ igamma(shape=3/10, scale=10/3);

   /* Dynamic components: AR(1) drift, quarterly seasonal effect, and random walk with drift */
   random alpha ~ normal(phi*alpha.l1, var=exp(theta2)) subject=t icond=(alpha0);
   random s ~ normal(-s.l1-s.l2-s.l3, var=exp(theta3)) subject=quarter
          icond=(s2 s1 s0) monitor=(s);
   random mu ~ normal(mu.l1 + alpha.l1, var=exp(theta1)) subject=t icond=(mu0);

   /* Mean of the response at time t: trend plus seasonal component */
   x = mu + s;

   /* Likelihood for the observed series */
   model c ~ normal(x, var=exp(theta4));
run;

 

The nine PARMS statements declare the model parameters: the initial-state parameters used in the ICOND= options, the variance parameters, and the coefficient parameters. The three seasonal initial states (s0, s1, s2) are placed in the same block; this blocking is done to improve the mixing.

 

The PRIOR statements place normal priors on most parameters and inverse gamma priors on the variance components. Sampling the variance parameters on the logarithmic scale is also a way to improve the mixing.

 

The RANDOM statements define the random walk with drift and the seasonal effect; each contains lagged elements and is indexed by its SUBJECT= variable. The ICOND= options ensure that no observations are dropped at the start of the series, where the lagged values extend beyond the data.

 

Note that adding a MONITOR= option to the RANDOM statements, as is done for the seasonal component above, also presents those random effects within the diagnostic and posterior summary output.

 

The programming statement just before the MODEL statement, x = mu + s, relates the value of the response at time t to the other components of the model: the trend plus the seasonal effect.

 

The MODEL statement specifies that the response variable has a normal distribution with mean x and variance exp(theta4).

 

The PREDDIST statement can be included if posterior predictive distributions would be needed from a Bayesian scoring perspective.
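
For example, a line like the following inside the PROC MCMC step would save draws from the posterior predictive distribution of the response to an output data set (the data set name PRED is a placeholder):

/* Posterior predictive draws of the response, saved to WORK.PRED */
preddist outpred=pred;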

 

For more information about Bayesian time series, see the SAS Support sample on Bayesian time series, or check out the paper by Aric LaBarr, “The Bayesians are Coming! The Bayesians are Coming! The Bayesians are Coming to Time Series!”

 

 

Find more articles from SAS Global Enablement and Learning here.
