The first post in this series (Adventures with State Space Models: Introduction) introduced State Space Models (SSMs) and described how the various signal components in the data, such as trend, seasonality, and input variable effects, can be modeled individually. Modeling components individually provides information about how each one evolves over time, and, because the component models are additive, an overall forecast for the dependent variable is produced as a combination of the individual component forecasts. In this post, we’ll discuss SSMs in more detail. The discussion remains fairly basic in terms of components, and the demonstration focuses on adding a dynamic seasonal component to a model that’s similar to the one described previously. Additional dynamic components provide a straightforward way to introduce further details and to describe how different types of components are accommodated in the SSM.
Let’s revisit the scaled-down SSM specification from last time.
Recall that equation 1 is the Observation equation, and that equation 2 is the State equation. Equation 2 specifies how the model gets from one time interval to the next and regulates the model’s dynamic components. In the previous demonstration, the State represented a one-dimensional Level component for a time series, Sales. A random walk was a reasonable way to represent how the Level for Sales evolved over time, so we just renamed Alpha and built the model without thinking about the State details too much.
Adding dynamic components to the model means that the dimension of the State must increase. We’ll discuss what this implies for the State elements, the variance parameters, and so on. We’ll also describe ways to put general restrictions on how dynamic components in the model evolve. Finally, we’ll discuss how the State enters, or maps into, the Observation equation. This information isn’t critical for the model that we’ll build here, but it will be useful for future demonstrations.
First, let’s take a look at the data used in this post’s demonstrations. The plot shows quarterly observations on Widget sales that start in Q1 1990 and run through Q2 2021. The input, Promo, is a binary variable that flags three quarters in the data. There’s no discernible linear trend, but a fairly strong seasonal component is evident.
For a closer look at the seasonal pattern in the Widgets data, the plot shows a deterministic, or static, representation of the pattern. It’s derived as an additive seasonal decomposition.
Since the decomposition is deterministic, there are only four unique seasonal values, as shown below. Q1 is the seasonal peak quarter, and it’s about 5.7 units (Widgets) above the annual average. The seasonal trough in Q3 is about 6.7 units below the annual average. Note that zero represents the annual average (see the plot above); the four unique seasonal factors (SC) sum to zero.
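To make the idea of static seasonal factors concrete, here is a minimal Python sketch. It is not the decomposition used to produce the plots in this post. It assumes a simplified method (each quarter's average minus the overall average) and a made-up series, toy_sales, so the numbers are illustrative only. The function name seasonal_factors is hypothetical.

```python
def seasonal_factors(values, period=4):
    """Average deviation from the overall mean for each position in the cycle.

    Simplified stand-in for an additive seasonal decomposition; it assumes
    the series has no trend and uses whole cycles only.
    """
    n_full = len(values) - len(values) % period  # drop any incomplete cycle
    values = values[:n_full]
    overall = sum(values) / n_full
    factors = []
    for position in range(period):
        season_values = values[position::period]  # e.g. every Q1 observation
        factors.append(sum(season_values) / len(season_values) - overall)
    return factors


# Hypothetical quarterly data (three full years), NOT the Widgets series.
toy_sales = [106, 101, 93, 100, 107, 100, 94, 99, 105, 102, 93, 100]

factors = seasonal_factors(toy_sales)
print([round(f, 2) for f in factors])   # one factor per quarter
print(abs(sum(factors)) < 1e-9)         # the factors sum to zero
```

Because each factor is measured relative to the overall average, zero plays the same role here as in the plot: it marks the annual average, and the four factors cancel out over a full year.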
Adding a Dynamic Seasonal Component to the Model
In SSMs, model identification is relatively straightforward. We’ve determined that, at least to start, the model needs to accommodate three components: a static input, Promo, and two possibly dynamic components that will capture the level and the seasonal patterns. We’ll fit the model in the SSM Procedure (SAS/ETS) as follows.
If you’ve read the first post in this series, the following statements will be familiar:
STATE and COMPONENT are two new statements. These statements work together to create the model’s Seasonal component, Dyn_Seasonal.
There are two parameters or variances estimated in the model. The State Dimension of this model is 4. More details on the State Dimension are presented below.
The variance estimate associated with State_Seas is significant, which indicates that the seasonal component, Dyn_Seasonal, is dynamic and not deterministic. The variance estimate for Loclin is borderline in terms of significance. However, the plot of this component, below, indicates that a best-fit flat line is not a good match for this component’s evolution over time. The Regression Parameter Estimate for Promo represents the average impact on Widget sales over the three quarters in which Promo = 1.
Note, in SSMs, the terms ‘Model Parameters’ or simply ‘Parameters’ refer to variances or hyper-parameters that regulate the model’s dynamic properties. This is the reason that the Number of Parameters is listed as 2 in the Model Summary.
The plot below shows the evolution of the Dyn_Seasonal component. The amplitude of the pattern is relatively small from 1993 to 1995 and increases in the most recent data.
The plot below shows the evolution of the trend component, Loclin, over time.
The plot below overlays the forecast for Widgets (Loclin, Dyn_Seasonal and Promo) on the Loclin and Dyn_Seasonal components.
A Look at the Details
Now, we’re going to specify and fit essentially the same model but in a more detailed and involved way. What follows is not a recommendation for the best way to fit a model with dynamic Seasonal and Level components in the SSM Procedure. As we’ve seen in the syntax above, developers have provided convenient statements and options that will automatically handle the creation of commonly used model components. However, taking a more detailed or ‘roll-your-own’ approach for the model’s Seasonal component will hopefully provide intuition on parts of the model we’ve abstracted from so far and build a foundation for creating more general and interesting SSMs moving forward.
First, we’ll revisit the scaled down SSM specification and generalize it a bit.
We’ve added a term, or matrix, Z to the Observation equation. Z is the State Effect, and its main job is to map State elements into the domain of the Observation equation. This is what the Component statement was doing in the syntax above. The Component syntax peeled off the first element of the three-dimensional State, State_Seas (more below), to create the model component Dyn_Seasonal. State elements must be converted into model components before entering the Observation equation.
A T matrix has been added to Equation 2, and it’s called the State Transition. A new instance of the State is obtained by multiplying its previous instance by the square matrix T. In the previous post, the State was a one-dimensional random walk, so T was just the scalar, 1. There are a couple of details about SSMs that are useful to keep in mind. First, the T matrix must be square. We’ll see why below. Second, for the model to accommodate dynamic components, the raw materials or State elements that make up these components need to be specified in the State. The State, regardless of dimension, is a recursion, so State elements need to be written in this form.
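For readers without the images, the generalized specification described above can be sketched as follows. Only Alpha, Z and T are named in the text. The remaining symbols (y for the dependent variable, X and beta for the static regression effects, epsilon and eta for the observation and state disturbances) are assumed notation, so treat this as a plausible reconstruction rather than the exact equations shown in the post.

```latex
\begin{align}
  y_t          &= X_t \beta + Z \alpha_t + \varepsilon_t  \tag{1} \\
  \alpha_{t+1} &= T \alpha_t + \eta_{t+1}                 \tag{2}
\end{align}
```

With a one-dimensional random-walk State, both Z and T collapse to the scalar 1, which recovers the scaled-down model from the previous post.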
Let’s keep these details in mind and build some intuition with the specification of dynamic components in the SSM. We’ll start by revisiting the deterministic seasonal decomposition, described above. Because the four seasonal factors were derived using an additive decomposition, they sum to zero and are denominated in the units of the dependent variable, widgets.
For the listed seasonal factors to enter the SSM as a dynamic seasonal component, they need to be specified as part of the State. Two things need to happen. First, we need to add a variance term to convert the seasonal representation from deterministic to dynamic. Second, the equation needs to be rewritten as a recursion.
First, we’ll add the variance. Any four sequential seasonal factors represent a full cycle. A dynamic representation can be written as follows.
Now, the seasonal factors sum to zero on average (that is, in expectation) rather than exactly. Gamma is a normally distributed random variable with mean zero and standard deviation Sigma. If Gamma’s variance is zero, then the pattern reverts to being deterministic.
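In symbols, the dynamic representation just described can be sketched as below. SC, Gamma and Sigma follow the text. The time subscripts are an assumed convention.

```latex
\[
  SC_t + SC_{t-1} + SC_{t-2} + SC_{t-3} = \gamma_t ,
  \qquad \gamma_t \sim N(0, \sigma^2)
\]
```

Setting \sigma^2 = 0 forces \gamma_t = 0 in every period, so any four consecutive factors sum to exactly zero and the pattern repeats unchanged from year to year.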
Next, we’ll re-write the equation as a recursion, and fit it into the State equation, listed above.
Any seasonal factor can be specified as a linear combination of the preceding three factors plus the random variable, Gamma. The weights in the linear combination are all -1. Accommodating the State Transition, or T matrix, the linear combination looks like the following.
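Written out, the linear combination described here looks like the following sketch. The subscripts and the use of t+1 for the new factor are assumed conventions. SC denotes a seasonal factor and Gamma is the mean-zero random disturbance.

```latex
\[
  SC_{t+1}
  = \begin{bmatrix} -1 & -1 & -1 \end{bmatrix}
    \begin{bmatrix} SC_t \\ SC_{t-1} \\ SC_{t-2} \end{bmatrix}
    + \gamma_{t+1}
  = -SC_t - SC_{t-1} - SC_{t-2} + \gamma_{t+1}
\]
```

The weight vector has one row and three columns, so on its own it maps three State elements to a single new value.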
This is ok, but we’re not quite there yet. Remember, the T matrix must be square!
In this example, the cost of making the T matrix square is adding two identities, which simply carry the two most recent seasonal factors forward unchanged (e.g., multiply the second row of T by the current value of the State, element-wise, and sum to get the second value of the State at time t+1). The variance of the random variable Gamma has also been converted into a 3x3 covariance matrix.
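The square transition can be checked numerically with a short Python sketch. The State ordering (most recent factor first), the starting values, and the function names advance and simulate are assumptions made for illustration. The disturbance Gamma is added to the first State element only, which matches a covariance matrix whose only nonzero entry is the first diagonal element.

```python
import random

# Assumed State ordering: [SC_t, SC_{t-1}, SC_{t-2}].
T_SEASON = [
    [-1, -1, -1],  # new factor = minus the sum of the previous three
    [1, 0, 0],     # identity: SC_t becomes the second element at t+1
    [0, 1, 0],     # identity: SC_{t-1} becomes the third element at t+1
]


def advance(state, gamma=0.0):
    """One step of the State recursion: T times state, plus the disturbance."""
    new_state = [sum(w * s for w, s in zip(row, state)) for row in T_SEASON]
    new_state[0] += gamma  # only the newest factor receives a shock
    return new_state


def simulate(state, steps, sigma=0.0, seed=0):
    """Return the path of the first State element over the given steps."""
    rng = random.Random(seed)
    path = [state[0]]
    for _ in range(steps):
        gamma = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
        state = advance(state, gamma)
        path.append(state[0])
    return path


# Hypothetical starting factors, NOT estimates from the Widgets data.
start = [-7.0, 1.0, 6.0]
static_path = simulate(start, steps=8)              # zero variance
dynamic_path = simulate(start, steps=8, sigma=0.5)  # pattern drifts over time
print(static_path)
print([round(v, 2) for v in dynamic_path])
```

With sigma set to zero, the first element cycles through the same four values indefinitely, which is the deterministic case. A positive sigma lets the pattern drift from one cycle to the next.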
That takes care of the parts of the State equation that regulate how the seasonal pattern evolves. Recall, there’s also a dynamic Level component. It’s specified as a random walk using the TREND statement in the model we fit above.
Now, expand the State equation to include the level elements.
Notice that, as State elements are added, we get a block diagonal structure in the T and COV matrices. It’s worth pausing for a minute to consider this. The introduction to this series of posts began with the statement that the various signal components in the model, like seasonality and trend, are mutually independent, and that the component models are additive. The structure of the State equation with two dynamic elements in the T and COV matrices provides a straightforward illustration of how this works. Also note that the State has four elements.
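The block diagonal structure can be sketched in a few lines of Python. The State ordering (three seasonal elements followed by the level), the variance values, and the helper names block_diag and dot are assumptions made for illustration. The Z rows mimic what a component definition does: each one picks a single State element.

```python
def block_diag(*blocks):
    """Stack square blocks along the diagonal and fill the rest with zeros."""
    size = sum(len(block) for block in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = float(value)
        offset += len(block)
    return out


def dot(weights, state):
    """Inner product of a Z row with the State vector."""
    return sum(w * s for w, s in zip(weights, state))


T_SEASON = [[-1, -1, -1], [1, 0, 0], [0, 1, 0]]  # seasonal block
T_LEVEL = [[1]]                                   # random-walk level block
T = block_diag(T_SEASON, T_LEVEL)

# Hypothetical variances, NOT the estimates reported in this post.
sigma2_season, sigma2_level = 0.5, 0.1
COV = block_diag(
    [[sigma2_season, 0, 0], [0, 0, 0], [0, 0, 0]],
    [[sigma2_level]],
)

# Each Z row selects one State element to form a model component.
Z_SEASON = [1, 0, 0, 0]
Z_LEVEL = [0, 0, 0, 1]

state = [-7.0, 1.0, 6.0, 100.0]  # hypothetical State values
signal = dot(Z_SEASON, state) + dot(Z_LEVEL, state)

for row in T:
    print(row)
print(signal)
```

The zeros outside the diagonal blocks are what keep the components from interacting: the seasonal rows never touch the level element, and the level row never touches the seasonal elements.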
Now, we’ll use the SSM Procedure to specify and fit the model with dynamic Level and Seasonal components implementing some of the details we just described.
Because the State blocks are independent, we don’t need to worry about details like the order that they are specified in. Our job is to specify the individual State blocks that define the dynamic components. The software figures out how to put them together to produce the overall State equation.
The widgets forecast and the forecasts of the model’s individual components are essentially the same as before.
The purpose of presenting the ‘roll-your-own’ Seasonal component in an SSM was to describe details that will be useful in future posts on this topic. Now, we’ve got the foundation to move forward to more interesting models. Dynamic input variables are up next, so stay tuned for more SSM action!