
Tips and Tricks: Improve Forecast Accuracy: Q&A, Slides, and On-Demand Recording

Started ‎03-08-2024 by
Modified ‎03-08-2024 by

Tips and Tricks: Improve Forecast Accuracy Using Interactive Modeling in SAS® Visual Forecasting

Q&A, Slides, and On-Demand Recording

 

Watch this Ask the Expert session to learn how to improve your forecasting results by adjusting model parameters for individual forecasts in the easy-to-use user interface of SAS Visual Forecasting. 

 

 Watch the webinar

 

You will learn how to use the Interactive Modeling node in SAS Visual Forecasting to:

  • Tweak individual forecasts to improve accuracy.
  • Build a model from scratch.
  • Manually select a champion model.

 

The questions from the Q&A segment held at the end of the webinar are listed below and the slides from the webinar are attached.

 

Q&A

How do you know which is the best Model? What factors should I look at?

The best model depends on your domain. We used MAPE (mean absolute percent error), which is commonly used in economics and is very easy to interpret. Many economists choose the best model by MAPE while also keeping simplicity in mind. In your field, you might also consider the Akaike information criterion (AIC) or the Schwarz Bayesian criterion (SBC), both of which penalize complexity and so reward a less complex model. In biological science, you might use the root mean square error (RMSE). So the choice depends entirely on your field; look at what people have published and which criteria they use. If you have more detailed questions about your specific situation, you can ping me after the webinar or anytime during East Coast hours.
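
As a rough illustration of the criteria named above, here is a minimal Python sketch using the common textbook definitions of MAPE, RMSE, AIC, and SBC. The function names and sample numbers are made up for illustration, and SAS Visual Forecasting may compute its fit statistics with different conventions, so treat this as a sketch rather than a description of the software.

```python
import math

def mape(actual, forecast):
    """Mean absolute percent error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean square error, in the units of the series."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def aic(log_likelihood, n_params):
    """Akaike information criterion (textbook form): lower is better."""
    return 2 * n_params - 2 * log_likelihood

def sbc(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian criterion (textbook form): lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Made-up values, not the webinar data.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print("MAPE:", mape(actual, forecast))
print("RMSE:", rmse(actual, forecast))
```

MAPE is scale free, which is why it is easy to compare across series, while RMSE stays in the units of the data. AIC and SBC both add a penalty per parameter, and the SBC penalty grows with the number of observations.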

 

Can you see the code generated by the pipeline?

You can download the code for the model, but I can't quickly show that because it is a little bit involved.  See my SAS Communities blog at https://communities.sas.com/t5/SAS-Communities-Library/Downloading-Models-from-SAS-Visual-Forecastin....

 

How do we know if it is overfitting or not?

I'm a huge fan of a holdout sample and, frankly, I even do it outside of SAS Visual Forecasting. I like to hold out data completely outside of this process, then try the model I came up with and see whether it generalizes to the new data. You can only do that if you have a long enough time period, right? We don't always have that luxury, but in this case, as you can see here, I have 13 years, which is a pretty long time period.

 

BethEbersole_0-1709906691590.png

 

 

I could hold out the last three years and then even add a holdout within the software. It's comparable to predictive modeling with visual machine learning, where you can have a test data set as well as a validation data set. That's what I would do if I had the luxury of a long time period. If you don't have that luxury, you could subsample over the whole time period, try the model on that subsample, and then make sure that it generalizes to the data you pulled out.
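
The two-level holdout described above can be sketched in a few lines of Python. This is only an illustration of the idea: the helper name `split_holdout` and the placeholder data are assumptions, and the split inside SAS Visual Forecasting is configured in the node settings rather than written by hand.

```python
def split_holdout(series, holdout):
    """Split a time-ordered list into (fit, held_out), keeping time order."""
    if not 0 < holdout < len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    return series[:-holdout], series[-holdout:]

# Placeholder for 13 years of monthly values (156 points), not the webinar data.
monthly = list(range(156))

# Step 1: set the last three years aside entirely, outside the modeling tool.
modeling_data, external_holdout = split_holdout(monthly, 36)

# Step 2: within the remaining data, hold out the last year for model selection.
training, validation = split_holdout(modeling_data, 12)

print(len(training), len(validation), len(external_holdout))
```

Because the data are a time series, both splits take the most recent observations rather than a random sample, so the model is always evaluated on periods that come after the ones it was fit on.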

 

Below is a screenshot of how to set the holdout within the auto-forecasting node.

BethEbersole_1-1709906691600.png

 

 

Why doesn’t SAS VF always give me the best results by default?

First, let me say that SAS Visual Forecasting is a huge time saver and will automatically give you many excellent, highly accurate models if you have decent time series data for a reasonable length of time. But not every series in your data set will get a great result automatically. I like to use the analogy of a served dinner at a wedding or a conference where guests have three choices: fish, crab cakes, or beef tenderloin. They also get to choose their dessert, right? But then you get my cousin Eva, and she's vegan and gluten free. She's going to need a special meal, right? In the same way, some of your time series are not going to get a good forecast easily. Sometimes there's something kludgy about them. You want to put eyes on them: look at the graph and see whether there were errors. One thing we commonly dealt with when I was looking at Chesapeake Bay data was lab changes. We would see a sudden drop or a sudden increase in, for example, nitrates, but it was an artifact of the data. Anytime your automatic forecasting isn't giving a good forecast, you want to dive in with your own eyes and see what's going on. It could be a problem with the data, or it could be that you can get a slightly better model by tweaking it. I think of these as the problem children: series that just don't give you a good model by default. And in some cases, you are actually getting the best model that you can get with that data. That's what I tried to show with the naive model for petroleum in Louisiana, which is a random walk. It might be the best you're going to get with that data set.
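
For readers unfamiliar with the naive (random walk) model mentioned above, here is a minimal Python sketch of what such a forecast does: every future value is set to the last observed value. The function name and the sample numbers are made up for illustration and are not the webinar data.

```python
def naive_forecast(history, horizon):
    """Random-walk (naive) forecast: repeat the last observed value."""
    if not history:
        raise ValueError("history must contain at least one observation")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return [history[-1]] * horizon

history = [102.0, 98.5, 101.2, 99.8]  # made-up values
print(naive_forecast(history, 3))
```

A forecast this simple is a useful benchmark. If no richer model beats it on held-out data, the series may not contain much more structure to exploit.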

 

Is there a way to apply customization to the model at a higher level of hierarchy? For instance, to all "coal."

Not currently in the Interactive Modeling node.  The Interactive Modeling node lets you work with individual time series only.  But this is on the road map, and we hope to see it soon.

 

How do you select the historical data for forecasting?

I'm not sure what you mean, but in general you want the longest time period that you have. I would always start with the longest time period and truncate it only if, when you look at it, there are big changes. Depending on the field you're in, the data may be more volatile. If we're talking about stock market prices, you may want to truncate. But look at the whole time period first, and then decide to truncate only if the more recent data is really more relevant. If you're looking at something like water temperatures, you will probably want to use the whole time period, because there could be a long-term trend that you want to capture. So it depends on your data, but I would always start with the full time period before truncating.

 

Is there a way to choose the domain or field in SAS to get the best forecasting models?

No. That's why, for SAS Visual Forecasting, you need someone who is knowledgeable. You don't need 30,000 statisticians or even 20, but you need at least one forecasting expert who knows the field and the domain. If you want an even simpler tool, you could use SAS Visual Analytics, which is for anybody; you don't need any expertise to use it. It will give you its best forecast by default, and that might lead you to hire a forecaster and use SAS Visual Forecasting. But it won't tailor its defaults to your domain. You need someone on your team who knows that.

 

I missed the beginning of the webinar. Are there any "packages/modules," for lack of a better term, beyond Viya and Visual Analytics that need to be purchased to do these forecasts?

Everything I showed you is part of SAS Visual Forecasting. You don't have to buy anything extra on top of SAS Visual Forecasting. And this is a Model Studio interface, which is part of SAS Visual Forecasting. So, you get the programming, the Model Studio interface, you get all of that. 

 

Is there a sample size limit?

If you are asking about a) the data size limit, SAS Viya software products are designed to work with extremely large data sets and run the analytics efficiently in parallel, but you will also need to consider a practical limit based on your compute resources and configuration.  If you are instead asking about b) how much data to include in your holdout sample or out-of-sample region, this depends on how long your time series is and whether you have seasonal or other cycles in the data.  For example, in my data set we had seasonal data by month for 13 years.  You could set the holdout region as an integer (for example, 12 or 24 months) or as a percentage (for example, 10%). With cyclic or seasonal data, you want the holdout data to encompass at least one full cycle of the seasons.  For example, if you have monthly data that is seasonal, you will want to hold out at least one year (12 months).  For more information on holdout samples and the out-of-sample region in SAS Visual Forecasting, see https://go.documentation.sas.com/doc/en/vfcdc/default/vfug/n0mf6wi57q8huin1mroj229k4tur.htm#n09h96b3...
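
The rule of thumb above (an integer or a percentage, but at least one full seasonal cycle) can be written as a small Python helper. This is a sketch under stated assumptions: the function name `holdout_length`, rounding the percentage up, and the way the seasonal minimum is enforced are illustrative choices, not how SAS Visual Forecasting resolves its holdout settings.

```python
import math

def holdout_length(n_obs, season_length, count=None, percent=None):
    """Pick a holdout size from a count or a percentage of the series,
    raised if needed to cover at least one full seasonal cycle."""
    if (count is None) == (percent is None):
        raise ValueError("specify exactly one of count or percent")
    size = count if count is not None else math.ceil(n_obs * percent / 100)
    size = max(size, season_length)  # cover at least one full cycle
    if size >= n_obs:
        raise ValueError("series is too short for this holdout")
    return size

# 13 years of monthly data: 10% already covers more than one year.
print(holdout_length(156, 12, percent=10))
# 5 years of monthly data: 10% is only 6 months, so it is raised to 12.
print(holdout_length(60, 12, percent=10))
```

The second call shows why a percentage alone can be misleading on a short seasonal series: half a year of holdout never tests the model on the other half of the cycle.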

 

Is there an ARCH or GARCH option in automatic forecasting?

Although the auto-forecasting node in SAS Visual Forecasting can consider many types of models (exponential smoothing models, ARIMA and ARIMAX models, intermittent demand models, and unobserved component models), it does not consider ARCH or GARCH models.

 

How would you add an ARCH or GARCH model to the pipeline?

ARCH and GARCH models are supported via SAS/ETS procedures; SAS/ETS is included with SAS Visual Forecasting.  These procedures require coding.  PROC AUTOREG and PROC VARMAX are the two SAS/ETS procedures you can use to create ARCH and GARCH models.  It may be possible to add these via a SAS code node into the pipeline in the future.  I will continue to investigate this.
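
For context on what these models estimate, here is a minimal Python sketch of the standard GARCH(1,1) conditional variance recursion, sigma2[t] = omega + alpha * eps[t-1]^2 + beta * sigma2[t-1]. It is not the PROC AUTOREG or PROC VARMAX implementation, and it does no parameter estimation: the function name, parameter values, and residuals are all made up for illustration.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion, started at the
    unconditional variance omega / (1 - alpha - beta)."""
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    if not residuals:
        return []
    sigma2 = [omega / (1 - alpha - beta)]
    # Each variance depends on the previous residual and the previous variance.
    for eps in residuals[:-1]:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2

# Made-up residuals: the large shock at position 1 raises the next variance.
residuals = [0.1, 2.0, -0.3, 0.2]
print(garch11_variance(residuals, omega=0.2, alpha=0.3, beta=0.5))
```

Setting beta to zero reduces the recursion to ARCH(1), where the variance depends only on the previous squared residual.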

 

 

Recommended Resources

SAS Demo: Use SAS Visual Forecasting’s Interactive Modeling Node to Hone in On Accuracy by Beth Eber...

Interactive Modeling in SAS Visual Forecasting by Joe Katz

Please see additional resources in the attached slide deck.

 

Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow-up Q&A, slides, and recordings from other SAS Ask the Expert webinars.

Version history
Last update:
‎03-08-2024 09:28 AM
