Unobserved Components Model Diagnostic

Posted 11-14-2015 03:12 AM

I am using the **Unobserved Components Model** on two variables with 55 observations each. I have two objectives: the first is to decompose the data and analyze the individual components; the second is to forecast. As my data are annual, I use the trend-cycle model with dummy variables for structural breaks and outliers.

I am facing two problems.

- Which diagnostics should I use to check the fit of the model? The R-square of my model is negative and the adjusted R-square is missing.
- I know that I can plot the one-step-ahead residuals, the residual histogram, the Q-Q plot, the autocorrelation function, and the partial autocorrelation function. But how can I calculate the following?

- the prediction error variance
- the one-step-ahead prediction errors
- the normality statistic based on the third and fourth moments
- the heteroscedasticity statistic based on the ratio of the sums of squares of the last third and the first third of the prediction errors
- the Box-Ljung serial correlation statistic
- the Durbin-Watson statistic
- skewness and kurtosis

I am attaching my data and including the code that I am using. Kindly help me solve this issue.

```
/* Aluminium (al): first with a stochastic level and slope, then with a fixed level and no slope */
proc ucm data = metals;
id year interval = year;
model al=break1973 outlier1979 outlier1994 outlier2008;
irregular;
level ;
slope;
cycle;
cycle;
estimate plot=all;
run;
proc ucm data = metals;
id year interval = year;
model al=break1973 outlier1979 outlier1994 outlier2008;
irregular;
level variance=0 noest;
cycle;
cycle;
estimate plot=all;
run;
/* ZI (zi): first with a stochastic level and slope, then with a fixed level and a slope */
proc ucm data = metals;
id year interval = year;
model zi=break1973 outlier1988 outlier1995 outlier2006 outlier2008;
irregular;
level;
slope;
cycle;
cycle;
estimate plot=all;
run;
proc ucm data = metals;
id year interval = year;
model zi=break1973 outlier1988 outlier1995 outlier2006 outlier2008;
irregular;
level variance=0 noest;
slope;
cycle;
cycle;
estimate plot=all;
run;
```

8 REPLIES 8

In this post I will try to address your UCM questions so far. First, some general comments.

Modeling and forecasting a time series is not easy without some understanding of the series being modeled. Very often several models can be proposed that appear to fit the historical data reasonably well (this is true of both ARIMA models and UCMs). Model diagnostics (such as residual analysis) are useful, but they still require context to decide which of the discovered features of the model are real and which might not be. Cross-validation type methods, which are very effective in addressing overfitting in ordinary regression modeling, are not as effective in the time series setting. The policy for handling outliers discovered during the exploratory stage is also not clear-cut and (again) requires context.

In light of this, my personal preference is to try simple models that fit the data reasonably well and not to overfit the historical region. I leave outliers unhandled unless they distort the main features (such as trend) of the series. Without additional context, the model given at the end of this post seems adequate to me. Of course, whether the discovered cycle (of period 13 years) is "real" cannot be answered without domain information. Now, answers to your specific questions:

1. Negative R-square: The R-square in ordinary regression is based on "regression residuals" (Y - X beta-hat). The UCM R-square is based on "one-step-ahead" residuals. The one-step-ahead residual at a particular time is based only on data prior to that time point. Therefore, the UCM R-square is not guaranteed to be non-negative (this is mentioned in the UCM doc). Moreover, when the UCM contains dummy regressors, very often only a few non-missing one-step-ahead residuals are available for residual analysis. This is because non-missing residuals become available only after enough observations have been processed to initialize the diffuse components (which include the regressors) in the model. All of your models suffer from this shortage of non-missing residuals for residual analysis.

2. You can use the OUTFOR= option in the FORECAST statement to output the series forecasts, the residuals (and their standard errors), and many other quantities. UCM provides rich graphical support for residual analysis (as you have noticed). If you want to compute some of the statistics you mention by hand, you can apply PROC IML or PROC UNIVARIATE to the OUTFOR= data set.

My suggested program:

```
proc ucm data=metals;
   model ZI;
   irregular;
   level variance=0 noest checkbreak;
   slope;
   cycle plot=smooth;
   estimate plot=panel;
   forecast plot=decomp;
run;
```
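
A minimal sketch of point 2, assuming the `metals` data set and the `zi` series from the question. The data set name `ucm_for` is made up for illustration, and `lead=5` and `nlag=12` are arbitrary choices. The OUTFOR= data set should hold the one-step-ahead residuals in a variable named RESIDUAL and their standard errors in STD, so squaring STD gives a prediction error variance. PROC UNIVARIATE reports skewness, kurtosis, and its own normality tests; these are not the moment-based statistic itself but serve the same purpose. The heteroscedasticity ratio is not covered here and would need a short DATA step or PROC IML.

```sas
/* Save forecasts and one-step-ahead residuals (ucm_for is a hypothetical name) */
proc ucm data=metals;
   id year interval=year;
   model zi;
   irregular;
   level variance=0 noest;
   slope;
   cycle;
   forecast lead=5 outfor=ucm_for;
run;

/* Skewness, kurtosis, and normality tests for the residuals */
proc univariate data=ucm_for normal;
   var residual;
run;

/* Ljung-Box type chi-square check of the residual autocorrelations */
proc arima data=ucm_for;
   identify var=residual nlag=12;
run;

/* Durbin-Watson statistic from an intercept-only model of the residuals */
proc autoreg data=ucm_for;
   model residual = / dw=1 dwprob;
run;
```

With only 55 observations, and fewer non-missing residuals, these tests have little power, so the plots remain at least as informative as the numbers.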

Thank you, sir, for your reply and for pointing out these important points. It, together with the references (especially Harvey 1989, chapter 5, page 268, and 1992), helped me clarify my doubts. As you pointed out, I will not try to overfit the model, and I will drop the outliers.

However, sir, I still have one issue. I want to incorporate the 1973 structural break to show its impact on the data (real metal prices, which were distorted by the oil price shock). As you mentioned, the theory also supports this inclusion. Could you suggest how I can include and show the structural break in the model? Thank you once again.

Just include a level-shift variable that is zero before the event and 1 at and after the event in the input data set. Use this variable as a regressor. See the Nile level break detection example in the UCM doc.
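
A sketch of that step for the 1973 break, assuming `year` in `metals` is a SAS date value (which `id year interval=year` implies). The output data set name `metals_brk` is hypothetical; `break1973` follows the naming already used in the question.

```sas
/* Build the level-shift dummy: 0 before 1973, 1 from 1973 onward */
data metals_brk;
   set metals;
   /* YEAR() extracts the calendar year from a SAS date;            */
   /* if year were stored as a plain integer, use (year >= 1973)     */
   break1973 = (year(year) >= 1973);
run;

/* Use the dummy as a regressor in the UCM */
proc ucm data=metals_brk;
   id year interval=year;
   model al = break1973;
   irregular;
   level variance=0 noest;
   slope;
   cycle;
   estimate;
   forecast plot=decomp;
run;
```

The estimated coefficient of `break1973` is then the size of the level shift, and its significance can be read from the parameter estimates table.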

I tried the program below and then added the intervention (second program below). However, my R-square decreases instead of improving, while the model selection criteria (AIC and BIC) improve and the break is significant.

So is the model without the intervention better than the one with the intervention? Is the R-square deteriorating because my data set is small (55 points)?

```
proc ucm data=metals;
   model al;
   irregular;
   level variance=0 noest checkbreak;
   slope;
   cycle plot=smooth;
   estimate plot=panel;
   forecast plot=decomp;
run;
```

```
proc ucm data=metals;
   model al=break1973;
   irregular;
   level variance=0 noest checkbreak;
   slope;
   cycle plot=smooth;
   estimate plot=panel;
   forecast plot=decomp;
run;
```

When using the Unobserved Components Model, is it possible to include an independent variable and generate out-of-sample forecasts?

I am using the UCM, but whenever I include an independent variable (along with a fixed level, stochastic slope, and cycle) I get an error and the results do not contain out-of-sample forecasts.

Hi!

Did you solve the problem? I had the same problem: when I introduce intervention variables, the R-square drops considerably.

Regards,

In a UCM, regression coefficients are part of the state vector. They make up the section of the model state vector that has a diffuse prior. The diffuse Kalman filter recursively computes one-step-ahead forecasts of the model state and the response values. This recursive process also produces the one-step-ahead residuals. The one-step-ahead forecasts and residuals are set to missing until enough observations have been processed so that all the diffuse state elements can be estimated.

This scenario is similar to recursive estimation of the regression vector in the ordinary regression setting, where the observations are processed one at a time in sequence. In that setting one must first process enough observations for the resulting design matrix to be invertible before one can produce a valid estimate of the regression vector. When you specify an intervention variable in such a setting, say with the intervention at the 10th observation (i.e., the variable is zero for the first nine observations and 1 thereafter), you must process at least 10 observations for the design matrix to become invertible.

Because of this recursive nature of diffuse state estimation, the number of residuals available to compute residual-based fit statistics (such as RSquare) can become quite small when intervention variables are introduced as regressors in a UCM. UCM one-step-ahead residuals are not the same as regression residuals (i.e., PROC REG residuals). For UCMs the RSquare statistic need not increase when a regressor is added to the model (in fact, in many time series model settings, including ARIMA and UCM, RSquare can even be negative).

Whether adding an intervention improves the model can be judged from a variety of other considerations: whether the regression coefficient is significant, whether information criteria such as AIC and BIC have improved, and so on.
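
One way to see this effect is to count how many one-step-ahead residuals each model actually produces. The following is a sketch, assuming the `metals` data and the `break1973` regressor from earlier in the thread; `for_nobreak` and `for_break` are hypothetical data set names.

```sas
/* Model without the intervention */
proc ucm data=metals;
   id year interval=year;
   model al;
   irregular;
   level variance=0 noest;
   slope;
   cycle;
   forecast outfor=for_nobreak;
run;

/* Same model with the 1973 level shift as a regressor */
proc ucm data=metals;
   id year interval=year;
   model al = break1973;
   irregular;
   level variance=0 noest;
   slope;
   cycle;
   forecast outfor=for_break;
run;

/* N = residuals usable for fit statistics, NMISS = residuals set to missing. */
/* The WHERE clause drops any forecast-horizon rows, where al is missing.     */
proc means data=for_nobreak n nmiss;
   where al is not missing;
   var residual;
run;

proc means data=for_break n nmiss;
   where al is not missing;
   var residual;
run;
```

If the second model reports a noticeably smaller N, its RSquare is computed from a different and shorter set of residuals, so the two RSquare values are not directly comparable.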

Really great.

Thank you very much
