Welcome to the third installment of our state space model (SSM) series. The first post (Adventures with State Space Models: Introduction) introduced SSMs as a collection of independent, additive components and detailed the differences between the two component types: dynamic and static. The second post (Adventures with State Space Models 2: More Dynamic Components and Details) focused on accommodating more than one dynamic component in the model and presented some necessary details. This post builds on the previous two and focuses on fitting an SSM with a dynamic input variable component.
The dynamic component we’ll introduce here evolves as a function of time, just like the dynamic components introduced in previous posts. However, for many analysts, familiarity with interpreting ordinary regression models makes the idea of dynamic inputs non-intuitive at first. A dynamic input variable’s estimated effect can vary across time. For example, an SSM that specifies unit sales as a function of a dynamic price input produces a trajectory, or time series, of estimated price effects. This is more general, and potentially far more informative, than the single point estimate produced by a static linear model. This post starts with an example of estimating a static model of unit sales as a function of price and other variables. We’ll then generalize the model to include price as a dynamic input.
Data: the Soda data consists of weekly observations on CASES, the case sales of soda. Other variables include the case price charged (OWNPRICE), competitor prices (COMPPRICE), and a binary variable that codes promotional activity (PROMOTION). The sales and price variables have been log transformed (LNCASES, LNOWNPRICE, LNCOMPPRICE), and there are about four years of data.
The Static Price Input Model: the SSM procedure syntax specifies LNCASES as a function of the ordinary, static regressors LNOWNPRICE, LNCOMPPRICE, and PROMOTION in the MODEL statement. The model also contains a possibly dynamic component, specified with the TREND statement: LOCALLT, a trend of the local linear (LL) type. More details are given below.
Details on the Local Linear trend component: the LL trend type is a more general version of the random walk (RW) trend introduced in this series’ previous post. The LL trend can be specified as follows:
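In standard textbook notation (assumed here; the symbols are not necessarily those used in the original post), the local linear trend is usually written as a Level equation followed by a Slope equation, each with its own disturbance variance:

```latex
\begin{aligned}
\mu_t   &= \mu_{t-1} + \beta_{t-1} + \eta_t, & \eta_t &\sim N(0, \sigma^2_{\mu}) \\
\beta_t &= \beta_{t-1} + \xi_t,              & \xi_t  &\sim N(0, \sigma^2_{\beta})
\end{aligned}
```

Here $\mu_t$ is the level, $\beta_t$ is the slope, and $\sigma^2_{\mu}$ and $\sigma^2_{\beta}$ are the two variances that determine how dynamic the trend is.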
Dynamic trend characteristics are a function of the equations’ variances. The top equation is the Level, and the bottom equation is the Slope. If both variances, sigma-squared MU and sigma-squared BETA, are zero, the LL trend reverts to a deterministic linear trend. If the slope equation variance is zero and the level equation variance is non-zero, Beta becomes a constant, and the LL trend reverts to a random walk with drift. Setting sigma-squared MU to zero, as shown in the syntax, results in an integrated random walk (IRW) trend representation if the slope equation variance is estimated to be non-zero. Here, the IRW was identified as the best trend representation for the data through a process of trial and error.
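To make these special cases concrete, here is a minimal Python simulation sketch of a local linear trend. It is an illustration only, not part of the SSM procedure; the function name `simulate_ll`, the starting values, and the standard deviations are all assumptions chosen for the example.

```python
import random

def simulate_ll(n, level0, slope0, level_sd, slope_sd, seed=0):
    """Simulate n steps of a local linear trend; returns (levels, slopes)."""
    rng = random.Random(seed)
    levels, slopes = [level0], [slope0]
    for _ in range(n - 1):
        eta = rng.gauss(0.0, level_sd) if level_sd > 0 else 0.0  # level shock
        xi = rng.gauss(0.0, slope_sd) if slope_sd > 0 else 0.0   # slope shock
        levels.append(levels[-1] + slopes[-1] + eta)  # Level equation
        slopes.append(slopes[-1] + xi)                # Slope equation
    return levels, slopes

# Both variances zero: deterministic linear trend.
linear, _ = simulate_ll(5, 10.0, 0.5, 0.0, 0.0)

# Slope variance zero, level variance non-zero: random walk with drift.
drift_levels, drift_slopes = simulate_ll(50, 10.0, 0.5, 1.0, 0.0)

# Level variance zero, slope variance non-zero: integrated random walk (IRW).
irw_levels, irw_slopes = simulate_ll(50, 10.0, 0.5, 0.0, 0.2)
```

In the IRW case, each one-step change in the level equals the current slope, so the level path is smooth while its direction wanders. That smoothness is why an IRW is often a reasonable trend choice for slowly evolving series.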
The model’s estimation results indicate a negative relationship between OWNPRICE and CASES, as expected. Because both variables have been log transformed, the parameter estimate can be interpreted as an elasticity: a one percent increase in OWNPRICE is associated with a 1.217 percent decrease in case sales, on average over the range of the data.
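Strictly speaking, reading a log-log coefficient as "percent for percent" is an approximation that works best for small price changes. The short Python sketch below computes the exact implied change for the reported estimate of -1.217; the function name `pct_change_in_sales` is an assumption introduced for illustration.

```python
def pct_change_in_sales(elasticity, pct_price_change):
    """Exact percent change in sales implied by a log-log coefficient."""
    ratio = 1.0 + pct_price_change / 100.0   # new price / old price
    return (ratio ** elasticity - 1.0) * 100.0

small = pct_change_in_sales(-1.217, 1.0)    # effect of a 1% price increase
large = pct_change_in_sales(-1.217, 10.0)   # effect of a 10% price increase
```

For a 1% price increase the exact figure is about -1.2%, very close to the coefficient itself. For a 10% increase it is about -11% rather than -12.17%, so the shortcut becomes less accurate as the price change grows.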
The estimated (Slope equation) variance of the IRW trend is significantly different from zero, indicating that the trend representation is dynamic.
Reported Information Criteria provide baseline fit measures.
The Dynamic Price Input Model: now we want to estimate a relationship between case sales and own price that can change as a function of time. Recall from the previous post that dynamic components start as STATE equation elements, require a variance that regulates the way they evolve, and need to be mapped into the domain of the dependent variable via a COMPONENT statement. For commonly used dynamic model components, the TREND statement does all this for us. However, there is no ready-made specification for a dynamic input variable component, so each step has to be written out explicitly. The following syntax implements the listed steps.
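As a language-neutral illustration of those three steps (this is not the SAS syntax), the Python sketch below filters a regression coefficient that follows a random walk. The coefficient is the state, `state_var` is the variance that regulates how it evolves, and the product of the coefficient and the input maps it into the units of the dependent variable. The function name, the synthetic data, and all numeric settings are assumptions made for the example, and the trend and other regressors are left out to keep it short.

```python
import math

def filter_dynamic_coefficient(y, x, obs_var, state_var,
                               prior_mean=0.0, prior_var=1e6):
    """Kalman filter for y_t = b_t * x_t + e_t with b_t = b_{t-1} + z_t.

    obs_var is Var(e_t); state_var is Var(z_t), which controls how quickly
    the coefficient is allowed to move. Returns the filtered b_t values.
    """
    b, p = prior_mean, prior_var
    estimates = []
    for y_t, x_t in zip(y, x):
        p = p + state_var              # predict: random-walk coefficient
        innovation = y_t - x_t * b     # one-step-ahead forecast error
        s = x_t * x_t * p + obs_var    # innovation variance
        k = p * x_t / s                # Kalman gain
        b = b + k * innovation         # update the coefficient
        p = (1.0 - k * x_t) * p        # update its variance
        estimates.append(b)
    return estimates

# Synthetic, noise-free data: a coefficient drifting from -0.35 to -1.15.
n = 50
log_price = [1.0 + 0.1 * math.sin(t) for t in range(n)]
true_coef = [-0.35 - 0.80 * t / (n - 1) for t in range(n)]
log_sales = [b * x for b, x in zip(true_coef, log_price)]

est = filter_dynamic_coefficient(log_sales, log_price,
                                 obs_var=1e-4, state_var=1e-2)
```

With a state variance of zero the filter would collapse to an ordinary static regression coefficient, which is the sense in which a significant variance estimate signals a dynamic input effect.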
New syntax:
The model’s estimated parameters are shown below. The significance of PVAR indicates that the OWNPRICE component is dynamic. However, the trend, LOCALLT, has become static with OWNPRICE in the model.
The estimated effects of the static input variables are roughly the same as in the previous model.
Measures of the penalized, overall fit have improved substantially relative to the baseline.
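"Penalized" here means that each criterion trades off likelihood against the number of estimated parameters, with smaller values indicating a better trade-off. The sketch below uses the textbook definitions of AIC and BIC; the exact definitions reported by a given procedure may differ in detail, and the log-likelihoods, parameter counts, and sample size are hypothetical values, not results from the Soda models.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion (textbook form); smaller is better."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion (textbook form); smaller is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical values, for illustration only (not taken from the Soda models).
static_fit = {"log_likelihood": -120.0, "n_params": 5}
dynamic_fit = {"log_likelihood": -100.0, "n_params": 6}
n_obs = 200

static_aic = aic(**static_fit)
dynamic_aic = aic(**dynamic_fit)
static_bic = bic(n_obs=n_obs, **static_fit)
dynamic_bic = bic(n_obs=n_obs, **dynamic_fit)
```

In this made-up comparison the dynamic model carries one extra parameter, but its likelihood gain outweighs the penalty, so both criteria fall.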
It appears that we have a better fitting model, and that the relationship between case sales and price is dynamic. Now, we’ll see what further information the model can provide about this estimated relationship. The OP_ELASTICITY component is plotted using the following syntax.
The relationship between price and sales is estimated to be inelastic, at about -0.35, in the early data. Consumers became more price sensitive over time. Own price elasticity reaches its largest magnitude, -1.15, in the week beginning 12JAN2002. During this week, consumer responses can be described as marginally elastic: a 1% increase in price is estimated to lead to a 1.15% decrease in sales. To summarize: the producer starts with a fair amount of pricing power, but it diminishes over time. It will also be interesting to discover whether consumers become more or less price sensitive at certain times of the year. To explore this, the following syntax accumulates the OP_ELASTICITY estimates to a monthly interval using an average accumulation method and then produces a seasonal component (SC) decomposition. The seasonal decomposition is additive in this case, so the SC values are denominated in units of the original series (OP_ELASTICITY) and re-scaled around zero.
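The two steps described above, average accumulation to months followed by an additive seasonal decomposition, can be sketched in Python as follows. This uses a classical centered moving-average decomposition, which is one common approach and may not match the procedure's exact algorithm. The function names and the synthetic series (including its seasonal pattern) are assumptions for illustration, not the estimated OP_ELASTICITY values.

```python
from collections import defaultdict
from datetime import date, timedelta

def monthly_mean(dates, values):
    """Accumulate observations to (year, month) buckets using the mean."""
    buckets = defaultdict(list)
    for d, v in zip(dates, values):
        buckets[(d.year, d.month)].append(v)
    keys = sorted(buckets)
    return keys, [sum(buckets[k]) / len(buckets[k]) for k in keys]

def additive_seasonal(keys, series, period=12):
    """Classical additive seasonal indices, re-centered to average zero."""
    half = period // 2
    by_season = defaultdict(list)
    for i in range(half, len(series) - half):
        # Centered 2 x period moving average: half weight on both end points.
        w = series[i - half:i + half + 1]
        trend = (0.5 * w[0] + sum(w[1:-1]) + 0.5 * w[-1]) / period
        by_season[keys[i][1]].append(series[i] - trend)
    raw = {m: sum(v) / len(v) for m, v in by_season.items()}
    center = sum(raw.values()) / len(raw)
    return {m: raw[m] - center for m in sorted(raw)}

# Weekly values averaged to months (tiny example).
wk_dates = [date(2000, 1, 1) + timedelta(weeks=k) for k in range(9)]
wk_keys, wk_means = monthly_mean(wk_dates, [1.0] * 5 + [3.0] * 4)

# Synthetic monthly series: linear drift plus a hypothetical seasonal pattern.
pattern = [0.01, 0.00, -0.01, -0.03, -0.01, 0.00,
           0.00, 0.01, 0.00, 0.00, 0.01, 0.02]
keys = [(2000 + i // 12, i % 12 + 1) for i in range(48)]
series = [-0.3 - 0.01 * i + pattern[i % 12] for i in range(48)]
sc = additive_seasonal(keys, series)
```

Because the decomposition is additive, each value in `sc` is an offset in the same units as the input series, and the twelve offsets average to zero.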
There are twelve unique seasonal component measures, one for each month of the year.
While the SC values are proportionally small compared to the OP_ELASTICITY estimates, the following inferences seem reasonable: on average, the producer has the most pricing power in December (_SEASON_=12), and consumers tend to be most price sensitive in April.
In the first three posts in this series, we’ve focused on advantages of SSMs, like flexibility and interpretability, in a one-Y-at-a-time, or univariate, context. The next post in this series introduces another advantage of SSMs: their facility in accommodating multivariate relationships. For our multivariate demonstration, we’ll be traveling back in time to the Yukon to explore population dynamics, so stay tuned for more SSM action!