An A to Z Overview of Forecasting in SAS® Q&A, Slides, and On-Demand Recording

Started 10-25-2022
Modified 12-14-2022

Watch this "Ask the Expert" session to learn how to use SAS procedures (PROC ESM, PROC ARIMA, PROC TIMESERIES) or the low-code/no-code UI in SAS Visual Forecasting to prepare your data and produce reliable forecasts. You will also learn how to scale your open source algorithms to run in a distributed manner in the cloud using SAS Viya.

 

Watch the webinar

 

You will learn:

  • How to program time series analysis in SAS.
  • What procedures are available and how they can be used.
  • How forecasting can be performed automatically in SAS Viya with no or little coding and be put into production effectively.
  • How to scale your open source algorithms to run in a distributed manner in the cloud.
  • Why time series analysis is more than just specifying a certain model. Data preparation and exploring the time series are important as well.

The questions from the Q&A segment held at the end of the webinar are listed below and the slides from the webinar are attached.

 

Q&A

How can we use SAS to do time-series cross-validation as described at https://otexts.com/fpp3/tscv.html ?

Cross-validation is a technique we introduced early on. A direct comparison can be found in the blogs here:

  1. Udo Sglavo on Cross-validation using SAS Forecast Server (Part 1 of 2)
  2. Udo Sglavo on Cross-validation using SAS Forecast Server (Part 2 of 2)

 

We also automated the process using Forecast Studio where you don’t need to write any code! Link for more information here: https://support.sas.com/resources/papers/proceedings14/SAS213-2014.pdf

We are in the process of developing automated cross-validation in SAS Visual Forecasting in SAS Viya as well and this will be available soon.
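
For readers who want to see the idea behind time-series cross-validation (evaluation on a rolling forecast origin) before opening the resources above, here is a minimal sketch in plain Python. It is not SAS code and not how Forecast Server or Forecast Studio implement it; the function names, the naive forecaster, and the sample data are assumptions made up for illustration.

```python
from statistics import mean

def naive_forecast(train, horizon):
    """Hypothetical baseline: repeat the last observed value."""
    return [train[-1]] * horizon

def rolling_origin_cv(series, min_train, horizon, forecaster):
    """Return the mean absolute error of each fold.

    Each fold trains on series[:origin] and is scored on the next
    `horizon` observations; the origin then moves forward by one step.
    """
    errors = []
    for origin in range(min_train, len(series) - horizon + 1):
        train = series[:origin]
        actual = series[origin:origin + horizon]
        predicted = forecaster(train, horizon)
        errors.append(mean(abs(a - p) for a, p in zip(actual, predicted)))
    return errors

# Made-up data, for illustration only.
series = [10, 12, 13, 12, 15, 16, 18, 17, 20, 21]
fold_errors = rolling_origin_cv(series, min_train=5, horizon=2,
                                forecaster=naive_forecast)
print(len(fold_errors), mean(fold_errors))
```

Every fold refits the model, which is why cross-validation for forecasting gets expensive quickly when you have many series.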

 

 

How do you replace missing values using splines in SAS?

The easiest way is to use PROC EXPAND from SAS/ETS or SAS Econometrics. PROC EXPAND converts a time series to a different time interval (granularity) and can interpolate values along the way. If your time series has missing data points, you can use PROC EXPAND to interpolate them; by default it fits a cubic spline to the nonmissing values. See the PROC EXPAND documentation: https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/etsug/etsug_expand_toc.htm .

Whether to interpolate the data or to use other methods should be decided up front. Note that splines, or any type of interpolation, are not always good practice for time series: interpolation methods usually do not capture cyclic components. Model-based imputation or data replacement are better approaches in such cases.
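
To make the mechanics of gap filling concrete, here is a small sketch in plain Python. Note that it swaps in simple linear interpolation rather than a spline, to keep the example short, and it is not PROC EXPAND code. The function name and the sample values are hypothetical.

```python
def interpolate_missing(values):
    """Fill interior gaps (None) by linear interpolation.

    Gaps at the very start or end are left as None, because there is
    no known neighbour on one side to interpolate from.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled

# Made-up series with two gaps.
series = [100.0, None, None, 130.0, 128.0, None, 140.0]
print(interpolate_missing(series))
```

The filled values lie on a straight line between their neighbours, so any seasonal peak or trough that fell inside a gap is lost. This is the limitation described above.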

 

 

Can you set up an automatic data batch run to feed the models?

Yes, we can set up such batch runs, and they are used very frequently at our customer sites. There are typically two steps. Data preparation captures the data, for example from an SAP system or other databases, and prepares it at the appropriate granularity. The prepared data then feeds into forecasting. In one setup, the data is prepared as a table and the Model Studio interface is opened manually to run the forecast. This is sometimes preferred where data scientists would like to have full control of the data and start the forecast manually.

The alternative is to run everything as a full batch job. Element #1 prepares the data. Element #2 takes an existing pipeline and runs it: it re-evaluates all models, determines whether a model should be refitted or recalibrated, creates the forecast, and writes the output data set. So the answer is yes, it can be done in both cases.

 

 

SAS is specializing in statistical analysis. Why has Python become an integral part of SAS? What can Python do that SAS cannot do in Advanced Analytics?

Why has Python become an integral part of SAS? It is because we are open to everything. We don't want to exclude Python programmers or any other programmers from our technologies and that’s why you can program in SAS, Python, R, Java and Lua. Our framework is flexible and we’ll keep adding new open source languages when they become popular.

The point is not that Python can do things that SAS can’t; every programming language has its advantages. In the open-source world there are algorithms coming out every day, and if our users want to take advantage of a specific algorithm that just came out (which we haven't tested yet and brought into SAS), they can still use it! The point is to get the best of both worlds.

We would also like to make the SAS Viya experience as convenient as possible for our users, so we give them the option to use their language of choice. We are confident that we can offer them the analytics they need out of the box, and at the same time we are open to users who would like to program in an open-source language.

 

 

How do you select p, q, and d in PROC ARIMA?

You use the autocorrelation function plots and other statistics to decide on the best choice of these parameters. In PROC ARIMA, you specify p and q in the ESTIMATE statement and the differencing order d in the IDENTIFY statement.
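
As an illustration of what those autocorrelation plots are built from, here is a minimal sketch in plain Python (not PROC ARIMA code). It computes the sample autocorrelation of a made-up series before and after first differencing. The helper names and the data are assumptions for this example only.

```python
def difference(series, d=1):
    """Apply first differencing d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    return [
        sum((series[t] - m) * (series[t - k] - m) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

# Made-up trending series.
series = [2, 4, 5, 7, 8, 11, 12, 14, 17, 18]
raw_acf = acf(series, max_lag=3)
diff_acf = acf(difference(series, d=1), max_lag=3)
print([round(r, 2) for r in raw_acf])
print([round(r, 2) for r in diff_acf])
```

A slowly decaying autocorrelation in the raw series is a common sign that differencing (d > 0) is needed. The pattern that remains after differencing, together with the partial autocorrelations, then guides the choice of p and q.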

 

The automated modeling option in pipelines or with PROC TSMODEL lets you select these parameters automatically. Based on the data, you can, for example, specify an error measure that you would like to minimize, such as the MAPE or the root mean squared error. You then ask the software to find the ARIMA model that best fits the data. These procedures are more advanced: they automatically compare different model approaches and select the parameters.
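
The selection logic described here can be sketched in a few lines of plain Python. This is only an illustration of choosing the candidate with the smallest error measure, not the algorithm SAS uses. The candidate names, their forecasts, and the holdout values are made up.

```python
from math import sqrt

def mape(actual, predicted):
    """Mean absolute percentage error (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical holdout values and forecasts from two candidate models.
holdout = [100.0, 110.0, 120.0]
candidates = {
    "model_a": [98.0, 112.0, 118.0],
    "model_b": [90.0, 100.0, 135.0],
}

# Pick the candidate with the smallest MAPE on the holdout period.
best = min(candidates, key=lambda name: mape(holdout, candidates[name]))
print(best, round(mape(holdout, candidates[best]), 2), rmse(holdout, candidates[best]))
```

Swapping `mape` for `rmse` in the `min` call changes the selection criterion; the two measures do not always agree on the winner.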

 

In the SAS Visual Forecasting UI, if you are not happy with some of your forecasts and want to dig deeper, you can create your own models or modify existing ones using the interactive modeling node. You can open the node and create a new model from scratch, setting your own parameters in a point-and-click manner. Or you can take an ARIMA model that was automatically created by the system and modify its parameters interactively. The interactive modeling node shows all the plots that a forecaster needs to decide on the right parameters, for example the autocorrelation, the partial autocorrelation, and the white noise plot. All those plots are generated automatically inside the node. For more information check the article here: https://communities.sas.com/t5/SAS-Communities-Library/Honing-in-on-Troublesome-Time-Series-Interact...

 

 

What kind of background knowledge (e.g., in statistics, cs, etc.) should one have to start getting into time series modelling using these cutting-edge techniques?

I would say that a basic statistics course, or basic statistics knowledge that lets you understand the data and statistical modeling, is something you should have when you start. When I started running time series forecasting projects, I was not an expert either. I started with a lot of exponential smoothing models and ARIMA models. There are two beginner SAS forecasting courses to get you started. If you have experience in data analysis combined with some business reasoning, I think that's a good start. The automated UIs that we have are also great for both beginners and experts because they are simple to use and configurable at the same time. We have embedded best practices in pre-made templates and AI automations. Even if you want to run neural networks, we have auto-tuning functionality, so the hyperparameter choices and the difficult trial-and-error process that you had to go through by yourself in the past are handled automatically. You can also experiment by running different automatic techniques in parallel and then comparing your results. Again, some statistical knowledge and one or two courses will guarantee success in the long run. But you can automate everything nowadays, so the effort is much lower.

 

 

I have 4 years missing in the middle of a timeseries. Is that too much for PROC EXPAND?

Technically, this would work. However, the best option needs to be decided carefully from a business or functional point of view. It's a question of context. If you have a long time series with 10-20 years before the gap, a couple of years afterwards, and four years missing in the middle, interpolation might work.

 

It's also a question of whether the four missing years are exceptional years. For example, if the years 2020 and 2021 are missing, which might be biased by the effects of the coronavirus pandemic, simple projections will not make sense. Instead, you should study the influence of other events and influential variables.

 

 

Can a forecasting procedure be used for population projection for the next 50 years? My team has estimated annual population count for the last 30 years. Now, we’re getting a request for projecting future population.

You can definitely do that in SAS Visual Forecasting in various ways and experiment with multiple algorithms. However, the prediction intervals will widen as you move forward in time. To mitigate this issue, you could use causal variables with extrapolated future values based on assumptions, or you could investigate demographic models that are tailored to forecasting population.
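
To show why the intervals widen, here is a rough sketch in plain Python using a random walk with drift. This is a deliberately simple model chosen for illustration, not what SAS Visual Forecasting would necessarily select. The interval formula is the simplified random-walk one (half-width proportional to the square root of the horizon) and ignores the uncertainty in the estimated drift. The function name and the population figures are made up.

```python
from math import sqrt
from statistics import mean, stdev

def drift_forecast(series, horizon, z=1.96):
    """Point forecasts and approximate 95% intervals for a random walk with drift."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    drift = mean(diffs)      # average yearly change
    sigma = stdev(diffs)     # spread of the yearly changes
    rows = []
    for h in range(1, horizon + 1):
        point = series[-1] + h * drift
        half_width = z * sigma * sqrt(h)   # grows with the horizon
        rows.append((h, point, point - half_width, point + half_width))
    return rows

# Hypothetical annual population counts (in millions).
population = [50.0, 50.6, 51.1, 51.9, 52.4, 53.2]
for h, point, lower, upper in drift_forecast(population, horizon=4):
    print(h, round(point, 2), round(lower, 2), round(upper, 2))
```

Under this simplified formula the interval at a 50-year horizon is roughly seven times as wide as at one year (the square root of 50 is about 7.07), which is why assumptions about causal variables or a dedicated demographic model become important.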

 

 

What are the timelines associated with forecast capabilities? I.e., will you be able to do a 20-year forecast and include qualitative data influences that might influence the forecast in a few years from now, of which you do not have data but make assumptions?

It is technically possible, but because this is a very long forecast period, you need to verify from a functional perspective which model types and approaches fit best. You can use future values of causal variables based on your assumptions and then experiment with different algorithms to see what gives you the best results.

 

 

The few forecasting models I've created have quite wide confidence intervals around the forecast estimates. Is there a bootstrap type of technique that can improve confidence intervals?

Probably there is a reason that this happens. I would start by exploring if the data quality could be improved. You could also try to add causal variables in your data and extrapolate their values in the future to get more reliable results. Keep in mind that bootstrapping will most probably widen your prediction intervals rather than narrowing them (https://otexts.com/fpp2/bootstrap.html). I found a SAS resource on implementing bootstrapping in SAS that you may find useful here: https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2020/4647-2020.pdf 
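
For orientation, here is a minimal sketch of a residual bootstrap in plain Python. It is not taken from the linked SAS paper. It assumes a naive (random-walk) forecast, resamples the centred past one-step changes to simulate future paths, and reads the interval bounds from percentiles. All names and data are hypothetical.

```python
import random

def bootstrap_interval(series, horizon, n_paths=500, level=0.95, seed=1):
    """Percentile prediction bounds from simulated future paths."""
    rng = random.Random(seed)
    residuals = [b - a for a, b in zip(series, series[1:])]
    # Centre the residuals so the simulated paths have no built-in drift.
    centre = sum(residuals) / len(residuals)
    residuals = [r - centre for r in residuals]
    paths = []
    for _ in range(n_paths):
        current, path = series[-1], []
        for _ in range(horizon):
            current += rng.choice(residuals)   # resample one past error
            path.append(current)
        paths.append(path)
    tail = (1 - level) / 2
    bounds = []
    for h in range(horizon):
        values = sorted(path[h] for path in paths)
        lower = values[int(tail * (n_paths - 1))]
        upper = values[int((1 - tail) * (n_paths - 1))]
        bounds.append((h + 1, lower, upper))
    return bounds

# Made-up series.
series = [20.0, 22.0, 21.0, 24.0, 23.0, 26.0]
for h, lower, upper in bootstrap_interval(series, horizon=3):
    print(h, round(lower, 2), round(upper, 2))
```

The bootstrap replaces the normality assumption with the empirical distribution of past errors. It does not remove the underlying uncertainty, so it should not be expected to narrow the intervals.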

 

 

Is there a preferred resource available for time series cross-validation, and do you have any thoughts on the usefulness of cross-validation?

Cross-validation is worth trying, but that doesn’t mean that other validation methods won’t work well with your data. Also, cross-validation is usually a lot more computationally expensive in forecasting than it is for predictive models. Cross-validation is a technique we introduced early on. A direct comparison can be found in the blogs here:

  1. Udo Sglavo on Cross-validation using SAS Forecast Server (Part 1 of 2)
  2. Udo Sglavo on Cross-validation using SAS Forecast Server (Part 2 of 2)

We also automated the process using Forecast Studio where you don’t need to write any code! Link for more information here: https://support.sas.com/resources/papers/proceedings14/SAS213-2014.pdf

We are in the process of developing automated cross-validation in SAS Visual Forecasting in SAS Viya as well and this will be available soon.

 

 

Recommended Resources

Using the TIMESERIES procedure to check the continuity of your timeseries data

Replace MISSING VALUES in TIMESERIES DATA using PROC EXPAND and PROC TIMESERIES

Have a look at your TIMESERIES data from a bird's-eye view - Profile their missing value structure

Simulate timeseries data with a SAS DATA Step and SAS Functions

Step-by-step guide for using Open-Source models in SAS Visual Forecasting

How to incorporate Recurrent Neural Networks in your SAS Visual Forecasting pipelines

 

Please see additional resources in the attached slide deck.

 

Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow-up Q&A, slides and recordings from other SAS Ask the Expert webinars.
