02-06-2014 12:37 AM
Just a basic question on SAS Forecast Studio (SFS).
If the best model selected by SFS (based on holdout MAPE) has one or more parameters that are not statistically significant, should that model be discarded in favor of one of the other candidate models?
02-06-2014 10:32 AM
In my opinion, you should use a holdout sample and select the best model based on the holdout statistic.
You may want to use out-of-sample data to assess performance of your winning model.
With regard to significance tests, you may want to check out this paper by J. Scott Armstrong: "Significance tests harm progress in forecasting" (ScienceDirect).
02-06-2014 10:59 AM
I want to chime in on what Udo says. Yes, use a predictive diagnostic for your forecasts.
The significance tests assess each parameter individually. A lack of significance tells you that the estimate of that single parameter is imprecise; it doesn't mean the parameter is zero. In fact, its effect could be very important to your predictive model. A small sample, collinearity, or a number of other factors could be inflating your standard errors.
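To make the collinearity point concrete, here is a minimal sketch in Python with NumPy (not SAS, and not what Forecast Studio does internally). It fits the same ordinary least squares regression twice on synthetic data, once with uncorrelated inputs and once with highly correlated inputs. The function names `ols_with_se` and `simulate`, the correlation values, and the sample size are all assumptions chosen for illustration.

```python
import numpy as np

def ols_with_se(X, y):
    """Return OLS coefficients and their standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # residual variance
    se = np.sqrt(sigma2 * np.diag(xtx_inv))   # standard errors of beta
    return beta, se

def simulate(rho, n=60, seed=0):
    """Fit y = x1 + x2 + noise where x2 has correlation about rho with x1."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n)
    z = rng.standard_normal(n)
    x2 = rho * x1 + np.sqrt(1.0 - rho**2) * z
    y = 1.0 * x1 + 1.0 * x2 + rng.standard_normal(n)
    X = np.column_stack([np.ones(n), x1, x2])
    return ols_with_se(X, y)

# Same true effects (both coefficients equal 1) in both runs;
# only the correlation between the inputs changes.
beta_lo, se_lo = simulate(rho=0.0)
beta_hi, se_hi = simulate(rho=0.99)

print("uncorrelated inputs: coef", beta_lo[1:], "SE", se_lo[1:])
print("collinear inputs:    coef", beta_hi[1:], "SE", se_hi[1:])
```

Both runs have the same true effects, yet the standard errors in the collinear run come out much larger, so the same effect can look insignificant purely because of how the inputs relate to each other.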
02-06-2014 12:57 PM
First of all, thanks very much Udo and ets_kps for replying to my question.
To elaborate a little more, I am adding a screenshot of a similar situation. It's an example of univariate time series analysis, with 20% of the data kept as holdout.
The best model selected by automatic forecasting is an ESM model whose Level and Trend estimates both have insignificant p-values.
Can we still consider this model the best model and use it for prediction? Do these p-values have anything to do with the quality or reliability of the forecasts?
Similar cases arise when input variables are added to the model to see whether they improve its predictive ability.
In such cases, the AR terms of the best (ARIMA) model chosen by SFS appear with insignificant p-values. Is it a matter of sample size (in training/holdout), so that we need to try a bigger sample to train the model? Or can we still use the best model, since it has the lowest MAPE and hence the highest forecast accuracy?
Thanks very much in advance.
02-10-2014 06:47 PM
I would go for the model with better holdout performance. You can use out-of-sample data on top of that to verify the predictive power of your model.
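As a rough illustration of that workflow, here is a minimal sketch in Python with NumPy rather than SAS. It selects among a few simple candidate forecasters by holdout MAPE and then checks the winner on a separate out-of-sample segment. The synthetic series, the 60/20/20 split, the three toy forecasters, and all function names are assumptions made for the example; they are not the models or defaults that Forecast Studio uses.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

def naive_forecast(train, horizon):
    """Repeat the last observed value."""
    return np.full(horizon, train[-1])

def drift_forecast(train, horizon):
    """Extend the straight line through the first and last observations."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return train[-1] + slope * np.arange(1, horizon + 1)

def ses_forecast(train, horizon, alpha=0.3):
    """Simple exponential smoothing with a fixed (assumed) alpha."""
    level = train[0]
    for value in train[1:]:
        level = alpha * value + (1.0 - alpha) * level
    return np.full(horizon, level)

# Synthetic trending series, used only so the example runs on its own.
rng = np.random.default_rng(42)
series = 50.0 + 0.5 * np.arange(100) + rng.normal(0.0, 2.0, size=100)

# Fit region, holdout region (for selection), out-of-sample region (for verification).
fit, holdout, oos = series[:60], series[60:80], series[80:]

candidates = {"naive": naive_forecast, "drift": drift_forecast, "ses": ses_forecast}

# Step 1: pick the candidate with the lowest holdout MAPE.
holdout_mape = {name: mape(holdout, f(fit, len(holdout))) for name, f in candidates.items()}
best = min(holdout_mape, key=holdout_mape.get)

# Step 2: refit the winner on fit + holdout and verify it on unseen data.
oos_mape = mape(oos, candidates[best](series[:80], len(oos)))

print("holdout MAPE by model:", holdout_mape)
print("selected:", best, "| out-of-sample MAPE:", round(oos_mape, 2))
```

Note that parameter p-values appear nowhere in this loop: the choice rests entirely on forecast error over data the model did not see.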
Ken - any thoughts from your end?
02-11-2014 12:00 AM
I am in agreement with Udo. The MAPE will tell you about the predictive power, which appears to be your primary objective. The components might not be statistically significant for a slew of reasons, including small sample size. It is possible that your final estimates will be statistically significant once you add back the held-out data (which you should do).
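The sample-size effect can be sketched with a small Python/NumPy example (an illustration only, not the ESM or ARIMA estimation that Forecast Studio performs). It estimates a linear trend on the first 80% of a synthetic series and then again on the full series, and compares the standard error of the slope. The series, the 80/20 split, and the helper name `slope_and_se` are assumptions for the example.

```python
import numpy as np

def slope_and_se(y):
    """Fit y = a + b*t by least squares; return b and its standard error."""
    n = len(y)
    t = np.arange(n, dtype=float)
    X = np.column_stack([np.ones(n), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - 2)                       # residual variance
    se = np.sqrt(sigma2 / np.sum((t - t.mean()) ** 2))     # SE of the slope
    return float(beta[1]), float(se)

# Synthetic series with a weak trend and noisy observations.
rng = np.random.default_rng(7)
n_total = 120
series = 10.0 + 0.05 * np.arange(n_total) + rng.normal(0.0, 3.0, size=n_total)

n_train = n_total * 4 // 5   # 80% for fitting, 20% held out

slope_train, se_train = slope_and_se(series[:n_train])
slope_full, se_full = slope_and_se(series)

print("training only: slope", round(slope_train, 4), "SE", round(se_train, 4))
print("full series:   slope", round(slope_full, 4), "SE", round(se_full, 4))
```

With more observations the slope is estimated more precisely, so a term that looks insignificant on the training portion may become significant after the final refit on all the data.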
Let us know if you would like any additional guidance. -Ken