mmyrto
Quartz | Level 8

When we have a holdout sample for model selection, how can we automatically use the holdout sample for the predictions made with the best model?

Accepted Solutions
alexchien
Pyrite | Level 9

The holdout sample is used only for model selection. Once the model that performs best on the holdout sample has been selected, the full range of the data (including the holdout period) is used to fit that model and generate forecasts. Please refer to the HOLDOUT= option in the Forecast Server procedures User's Guide:

 

HOLDOUT=n
specifies the size of the holdout sample to be used for model selection. The holdout sample is a subset
of actual time series that end at the last nonmissing observation. If the ACCUMULATE= option is
specified, the holdout sample is based on the accumulated series. If the holdout sample is not specified,
the full range of the actual time series is used for model selection.

 

For each candidate model specified, the holdout sample is excluded from the initial model fit and
forecasts are made within the holdout sample time range. Then, for each candidate model specified,
the statistic of fit specified by the CRITERION= option is computed by using only the observations in
the holdout sample. Finally, the candidate model, which performs best in the holdout sample, based on
this statistic, is selected to forecast the actual time series.

The HOLDOUT= option is used only to select the best forecasting model from a list of candidate
models. After the best model is selected, the full range of the actual time series is used for subsequent
model fitting and forecasting. It is possible that one model will outperform another model in the
holdout sample but perform less well when the entire range of the actual series is used.

If the MODEL=BESTALL and HOLDOUT= options are used together, the last one hundred observations
are used to determine whether the series is intermittent. If the series is determined not to be
intermittent, holdout sample analysis is used to select the smoothing model.
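
To make the two stages described above concrete, here is a minimal sketch of the idea in plain Python. It is not how the SAS procedure is implemented: the three candidate models (`naive`, `mean_model`, `drift`) are toy stand-ins for the real candidate list, RMSE stands in for whatever statistic CRITERION= names, and the function and variable names are hypothetical. It only illustrates that the holdout sample decides which model wins, and that the winner is then refit on the full series before forecasting.

```python
import math

# Toy candidate models (assumptions for illustration only).
# Each takes the history to fit on and the number of steps to forecast.
def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon

def mean_model(history, horizon):
    """Repeat the mean of the history."""
    avg = sum(history) / len(history)
    return [avg] * horizon

def drift(history, horizon):
    """Extend the straight line through the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]

CANDIDATES = {"naive": naive, "mean": mean_model, "drift": drift}

def rmse(actual, predicted):
    """Root mean squared error, standing in for the CRITERION= statistic."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def select_and_forecast(series, holdout, lead):
    # Stage 1: exclude the last `holdout` observations from the fit,
    # forecast over that range, and score each candidate on it alone.
    fit_part, holdout_part = series[:-holdout], series[-holdout:]
    scores = {
        name: rmse(holdout_part, model(fit_part, holdout))
        for name, model in CANDIDATES.items()
    }
    best = min(scores, key=scores.get)
    # Stage 2: refit the winning model on the FULL series
    # (holdout included) and forecast `lead` steps ahead.
    return best, scores, CANDIDATES[best](series, lead)

series = [10, 12, 14, 16, 18, 20, 22, 24]
best, scores, forecasts = select_and_forecast(series, holdout=2, lead=3)
print(best, forecasts)  # drift [26.0, 28.0, 30.0]
```

In this sketch the last two observations never influence the stage 1 fits, but they do feed into the final forecasts, which is the behavior the documentation describes: no extra step is needed to bring the holdout sample back in.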
