
Forecasting is a Snap in SAS Visual Analytics 8.2 on SAS Viya 3.3

Started ‎04-25-2018 by
Modified ‎04-27-2018 by

Visual Analytics 8.2 lets you forecast future values from historical time series data. You can also:

 

  • Include underlying factors, which may improve the accuracy of your forecast
  • Conduct goal seeking—see how the underlying factors (inputs, independent variables) need to change in order to achieve the goal you seek
  • Conduct scenario analysis—see how changes you make in underlying factors will affect your dependent variable (target)

This post demonstrates how to create forecasts in Visual Analytics 8.2 using the interactive reporting interface (Explore and Visualize Data interface), and then apply goal seeking and scenario analysis.

 

The first step is to open the Explore and Visualize Data shortcut in Visual Analytics 8.2. You will need to use a data set with a date variable. In this example, we use the ELECGENSUBSET data, created from publicly available data on electricity generation by various energy sources (Petroleum, Wind, Solar, Nuclear, etc.) in the United States. In this demo, we will forecast electricity generation in megawatt hours (Generation_MWH).

 

We pull in the data.

1.png

 

Our screen looks as follows:

 

2.png

 

Next, open the Objects pane on the left. Under Analytics we find the Forecasting object and drag it onto the canvas.

 

3.png

 

The Roles pane is on the right. Let’s remove Frequency from the Measure role and set the following roles:

 

4.jpg 

A green box highlights the underlying factors (inputs) that are significant. Here we see that CrudeOilImports is a significant input.

 

5.png

 

In the Options pane, we change the Forecast horizon to 12 so that the forecast is now for 12 months. Notice that the shaded green area around the forecast is the confidence interval.

 

6.png

 

We can click the expand icon (four arrows pointing outward) in the top right of the canvas to get more information. When we select the Dependent Variables Results, we see that the model used is an ARIMA (autoregressive integrated moving average).

 

7.png 

Next let’s try Goal Seeking! In the Roles pane, under Forecast, we select the What If button.

 

8.png 

Let’s leave the radio button set to Goal Seeking and use the mouse to increase one of the values of the dependent variable (target), in this case Generation_MWH, by dragging it upward. Then we select Apply.

 

9.png
All else being equal, we can see how much CrudeOilImports would need to increase to reach the new electricity generation (Generation_MWH) goal.

 

10.png

We might also be interested in a Scenario Analysis. At the top left of the What-If Analysis, we select the snowman (vertical ellipsis) and select Start over. Notice that we could use a table to change values rather than the chart, if we wish. Personally, I like dragging the points around with the mouse, but to each her own.

11-1.png

 

Now we select Scenario Analysis and raise the independent variable (input), which is CrudeOilImports. We need to remember to select Apply.

 

12-1.png

 

We see that the forecasted electricity generation (Generation_MWH) increased a few months after our manual increase in CrudeOilImports.

13-1.png

 

Remember that time series data cannot be analyzed like other data using simpler methods like ordinary least squares (OLS) regression. The reason is that time series violate assumptions of those methods. For example, OLS regression assumes that the error terms are independent and identically distributed (IID). This assumption is violated in time series data because of autocorrelation (also called serial correlation). In time series data the observations (and errors) from consecutive time periods are commonly correlated and thus not independent.
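
To make the autocorrelation point concrete, here is a minimal Python sketch (not part of Visual Analytics, and using simulated data rather than the ELECGENSUBSET data). It fits an OLS trend line to two made-up series, one with IID errors and one with AR(1) errors, and compares the lag-1 autocorrelation of the residuals. The coefficient 0.8, the trend, and the series length are arbitrary assumptions chosen for illustration.

```python
import random


def lag1_autocorr(values):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(values)
    mean = sum(values) / n
    num = sum((values[t] - mean) * (values[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in values)
    return num / den


def ols_residuals(y):
    """Residuals from an OLS fit of y on a linear time index 0..n-1."""
    n = len(y)
    x = list(range(n))
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 500

# Errors that satisfy the IID assumption.
iid_errors = [rng.gauss(0, 1) for _ in range(n)]

# AR(1) errors: each error carries over 80% of the previous one (assumed value).
ar_errors = []
prev = 0.0
for _ in range(n):
    prev = 0.8 * prev + rng.gauss(0, 1)
    ar_errors.append(prev)

trend = [10 + 0.05 * t for t in range(n)]
y_iid = [level + e for level, e in zip(trend, iid_errors)]
y_ar = [level + e for level, e in zip(trend, ar_errors)]

r_iid = lag1_autocorr(ols_residuals(y_iid))
r_ar = lag1_autocorr(ols_residuals(y_ar))

print(f"lag-1 autocorrelation of OLS residuals, IID errors:   {r_iid:.2f}")
print(f"lag-1 autocorrelation of OLS residuals, AR(1) errors: {r_ar:.2f}")
```

In the second series the residuals stay strongly correlated from one period to the next, which is exactly the situation where OLS standard errors and significance tests can no longer be trusted.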

 

Excel and other software will not stop you from making this grave mistake, and so people can get very wrong results trying to analyze time series data. The beauty of SAS Visual Analytics is that it saves the novice from making this kind of mistake (and reporting inaccurate results!) by applying methods that are designed for time series data. Time series models used in Visual Analytics include exponential smoothing and ARIMA models. ARIMA stands for autoregressive integrated moving average.

 

A Few More Details for Those Who Are Interested

 

ARIMA (p, d, q)

 

  • p is the number of autoregressive terms and adds lags on the observations (once the series has been made stationary)
  • d is the order of differencing needed to adjust the series to make it stationary
  • q is the number of moving average terms (lagged forecast errors)
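
For readers who like to see the pieces together, one common way to write an ARIMA(p, d, q) model uses the backshift operator B, where B y_t = y_{t-1}. The symbols below (phi, theta, c, epsilon) are standard textbook notation rather than anything taken from the Visual Analytics interface, and the sign in front of the moving average coefficients is a convention that differs between references.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d}\, y_t
  = c + \left(1 - \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
```

Here the phi terms are the p autoregressive coefficients, (1 - B)^d applies d rounds of differencing, the theta terms are the q moving average coefficients on past forecast errors, c is an optional constant, and epsilon_t is the error at time t.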

A stationary time series is one whose statistical properties (e.g., mean, variance, autocorrelation) are constant over time. Time series with either trends (increasing or decreasing over time) or seasonality are not stationary. Homoscedasticity (also called homogeneity of variance) means that the variance of the errors is constant. Heteroscedasticity is the opposite and means that the error variance changes. I illustrate this below. Notice the “cone” shape of the heteroscedastic data.

14-1.png

 

You may also hear the term ARIMAX models. This is simply an ARIMA model that also includes independent variables (also called “underlying factors” or “explanatory variables” or “inputs”).

 

Also of note: Random-walk models, random-trend models, autoregressive models, and exponential smoothing models are special cases of ARIMA models.

 

  • Random walk = ARIMA (0,1,0)
  • Simple exponential smoothing = ARIMA (0,1,1) without constant
  • Damped-trend linear exponential smoothing = ARIMA (1,1,2) without constant
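
The first two equivalences can be checked numerically. The following is a small Python sketch, not SAS code: the function names and the toy series are made up for illustration, the first forecast is simply initialized to the first observation, and the ARIMA(0,1,1) recursion assumes the convention in which the moving average term is subtracted, so that theta = 1 - alpha.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead simple exponential smoothing forecasts.

    forecasts[t] is the forecast for series[t]; the final entry is the
    forecast for the next, not yet observed, period.
    """
    forecasts = [series[0]]  # assumed initialization
    for y in series:
        previous = forecasts[-1]
        forecasts.append(previous + alpha * (y - previous))
    return forecasts


def arima011_forecasts(series, theta):
    """One-step-ahead forecasts from ARIMA(0,1,1) without constant.

    Model: y[t] - y[t-1] = e[t] - theta * e[t-1], so the forecast for the
    next period is the latest observation minus theta times the latest error.
    """
    forecasts = [series[0]]  # same assumed initialization
    for y in series:
        error = y - forecasts[-1]
        forecasts.append(y - theta * error)
    return forecasts


toy_series = [12.0, 15.0, 14.0, 18.0, 21.0, 19.0, 24.0]  # made-up values
alpha = 0.3

ses = ses_forecasts(toy_series, alpha)
arima = arima011_forecasts(toy_series, theta=1 - alpha)
max_gap = max(abs(a - b) for a, b in zip(ses, arima))
print("largest difference between the two forecast paths:", max_gap)

# With theta = 0 the model collapses to ARIMA(0,1,0), a random walk:
# each forecast is just the most recent observation.
random_walk = arima011_forecasts(toy_series, theta=0.0)
print("random walk forecasts repeat the last value:", random_walk[1:] == toy_series)
```

The two forecast paths agree up to floating-point rounding, which is what "special case" means here: the smoothing weight alpha and the moving average coefficient theta are two ways of describing the same recursion.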

You can create a stationary time series from a nonstationary time series in different ways, for example:

 

  • Differencing, such as:
    • First order differencing, which is computing the difference between consecutive observations
    • Seasonal differencing, e.g., for monthly data, computing the difference between an observation and the observation 12 time periods earlier (12 months prior is the same season)
  • Transformations: e.g., taking the log transformation of the series
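
The operations in this list take only a few lines of code. Here is a minimal Python sketch on a made-up monthly series (a linear trend plus a repeating 12-month pattern); the helper name `difference` and all of the numbers are assumptions for illustration, not anything Visual Analytics exposes.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Toy monthly data: three years of a rising trend plus a fixed seasonal pattern.
seasonal_pattern = [0, 2, 5, 3, 1, -1, -4, -6, -3, 0, 2, 1]
monthly = [100 + 2 * t + seasonal_pattern[t % 12] for t in range(36)]

# First-order differencing: change from one month to the next.
first_diff = difference(monthly, lag=1)

# Seasonal differencing: change from the same month one year earlier.
seasonal_diff = difference(monthly, lag=12)

# Log transformation: often used when the variability grows with the level.
log_series = [math.log(v) for v in monthly]

print("first differences (first 6): ", first_diff[:6])
print("seasonal differences (first 6):", seasonal_diff[:6])
```

In this toy series the seasonal differences are all equal, because subtracting the value from 12 months earlier cancels the repeating pattern and leaves only the yearly growth. Real data will not be this tidy, but the idea is the same.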

 

Figuring out the best p, d, and q for a series is a complicated art. SAS Visual Analytics analyzes the series and makes these decisions for you. Yes, if you are a hotshot, statistically savvy data scientist with domain knowledge, you might be able to improve the accuracy by tweaking the ARIMA model in your own code. But often you will get decent accuracy automatically by letting Visual Analytics do the work!

 

For Forecasting, SAS Visual Analytics automatically tests multiple forecasting models against your data, and then selects the best model.

 

The forecast model can be any one of the following:

  • ARIMA
  • Damped trend exponential smoothing
  • Linear exponential smoothing
  • Seasonal exponential smoothing
  • Simple exponential smoothing
  • Winters method (additive)
  • Winters method (multiplicative)

The VA Forecasting object was introduced to SAS Viya in December 2017 for Viya 3.3. Unlike Forecast Server or SAS Visual Forecasting, SAS Visual Analytics does not let you tamper with the forecast. You get what you get and you don’t pitch a fit. There is always a trade-off between flexibility (having lots of choices and requiring advanced data science knowledge) and simplicity. Visual Analytics is a tool that brings advanced analytics to the masses. For this reason, I see it as a gateway drug of analytics. No statistical knowledge is required to use Visual Analytics, but once the users get a taste of the power of analytics, they often start craving more, and more, and more. The next thing you know, they go back to school to study machine learning or data science or statistics, much to the chagrin of their parents who just wanted them to get a job as a civil engineer and finally move out of the basement. I’m just saying. It can happen.

 

For more advanced forecasting ability, you may want to use Visual Forecasting on Viya. See Patricia Neri's post on Visual Forecasting.

Comments

I've received some requests for these electricity generation data.  They are publicly available at https://www.eia.gov/electricity/.

