DrSharma
Calcite | Level 5

Hi,

Just a basic question on SAS Forecast Studio (SFS).

If the best model selected by SFS (based on holdout MAPE) has one or more parameters that are not statistically significant, should that model be discarded in favor of one of the other models?

udo_sas
SAS Employee

Hello -

In my opinion, you should use holdout samples and select the best model based on the holdout statistic.

You may want to use out-of-sample data to assess performance of your winning model.

With regard to significance tests, you may want to check out this paper by J. Scott Armstrong: "Significance tests harm progress in forecasting" (ScienceDirect).

Thanks,

Udo

ets_kps
SAS Employee

I want to chime in on what Udo says. Yes, use a predictive diagnostic for your forecasts.

Each significance test refers to a single parameter, tested independently. What a lack of significance tells you is that the estimate of that one parameter is imprecise. It doesn't mean the parameter is zero. In fact, the effect could be very important to your predictive model. Small sample size, collinearity, or a number of other factors could be inflating your standard errors.
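To illustrate the point about imprecision, here is a minimal Python sketch (not SAS output; the data and the helper name `t_statistic` are made up for illustration). It computes the t-statistic of a sample mean at two sample sizes. The estimate is identical in both cases; only the standard error, and therefore the apparent significance, changes.

```python
import math
import statistics

def t_statistic(sample):
    """Return (estimate, standard_error, t) for the mean of `sample`."""
    n = len(sample)
    estimate = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return estimate, standard_error, estimate / standard_error

# Hypothetical data: a repeating pattern whose mean is 0.5.
pattern = [2.5, -1.5, 1.5, -0.5]
small = pattern * 2    # 8 observations
large = pattern * 25   # 100 observations

est_s, se_s, t_s = t_statistic(small)
est_l, se_l, t_l = t_statistic(large)

# Same estimate, but a much smaller standard error with more data.
print(f"n={len(small):3d}  estimate={est_s:.2f}  SE={se_s:.3f}  t={t_s:.2f}")
print(f"n={len(large):3d}  estimate={est_l:.2f}  SE={se_l:.3f}  t={t_l:.2f}")
```

With the small sample the t-statistic falls below the usual threshold of about 2, even though the underlying effect is exactly as large as in the big sample.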

Best-Ken

DrSharma
Calcite | Level 5

First of all, thanks very much, Udo and ets_kps, for replying to my question.

To elaborate a little, I am adding a screenshot of a similar situation. It is an example of univariate time series analysis, with 20% of the data kept as holdout.

The best model selected by automatic forecasting is an ESM model, with both the Level and Trend estimates showing insignificant p-values.

Can we still consider this model the best one and use it for prediction? Do these p-values have anything to do with the quality or reliability of the forecasts?

[Screenshot: Univariate - With 20Pct Holdout Data - Insignificant Parameter Estimates.png]

Similar cases arise when input variables are added to see whether they improve the model's predictive ability.

In such cases, the AR terms of the best (ARIMA) model chosen by Forecast Studio appear with insignificant p-values. Is this a matter of sample size (in training/holdout), so that we need a different (bigger) sample to train the model? Or can we still use the best model, since it has the lowest MAPE and hence the highest forecast accuracy?

Thanks very much in advance.

Regards

DrSharma

udo_sas
SAS Employee

Hello -

I would go for the model with better holdout performance. You can use out-of-sample data on top of that to verify the predictive power of your model.

Ken - any thoughts from your end?

Thanks,

Udo

ets_kps
SAS Employee

Hi DrSharma,

I agree with Udo. The MAPE tells you about predictive power, which appears to be your primary objective. The components may fail to reach statistical significance for a slew of reasons, including small sample size. Your final estimates may well become statistically significant once you add the held-out data back in and refit (which you should do).
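The workflow described above (pick the winner by holdout MAPE, then refit on all the data) can be sketched in plain Python. This is only an illustration under stated assumptions, not how Forecast Studio works internally: the series is synthetic, and `ses_forecast`, `drift_forecast`, and `mape` are hypothetical helpers standing in for real candidate models.

```python
def ses_forecast(train, horizon, alpha):
    """Simple exponential smoothing: flat forecast from the final level."""
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon

def drift_forecast(train, horizon):
    """Naive-with-drift: extend the average historical change."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (h + 1) for h in range(horizon)]

def mape(actual, forecast):
    """Mean absolute percentage error; assumes no actual value is zero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)

# Hypothetical trending series with a small alternating disturbance.
series = [100 + 2 * t + (3 if t % 2 else -3) for t in range(20)]

# Keep the last 20% of the observations as the holdout sample.
split = int(len(series) * 0.8)
train, holdout = series[:split], series[split:]

candidates = {
    "ses": lambda data, h: ses_forecast(data, h, alpha=0.3),
    "drift": drift_forecast,
}

# Score each candidate on the holdout only; p-values play no role here.
scores = {name: mape(holdout, fit(train, len(holdout)))
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)

# Refit the winner on the full series (training + holdout) before forecasting.
final_forecast = candidates[best](series, 4)
print(best, scores, final_forecast)
```

The selection step uses only forecast accuracy on data the model has not seen, and the final forecast comes from a refit that includes the held-out observations.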

Let us know if you would like any additional guidance. -Ken
