SAS has a strong portfolio of forecasting products that offer state-of-the-art forecasting solutions to customers in many industries worldwide. Last month, SAS Visual Forecasting 8.1 on SAS Viya™ was released. What is it? How does it work? I recently attended an enablement event from R&D, where I learned about SAS Visual Forecasting 8.1 for the first time. By the end of the training, I was amazed by its capabilities and ease of use! This is the first in a series of blogs showcasing how to use SAS Visual Forecasting 8.1. It provides background on and an overview of this new tool, and it explains how Visual Forecasting is designed around object-oriented concepts. To illustrate, I will share a small code snippet with you.
New, Open, and Distributed Forecasting Platform
SAS Visual Forecasting 8.1 is built on SAS Viya™, a new analytic platform powered by Cloud Analytic Services (CAS). As a result, its highly parallelized and distributed architecture is designed to model and forecast time series at large scale. This provides the speed and scalability needed to create models and generate forecasts for millions of time series. Because CAS has an open architecture, you can access SAS Visual Forecasting 8.1 using SAS or third-party programming interfaces such as Python, Lua, and Java (integration with R will be available in a future release). Data scientists who do not know SAS programming can use SAS Visual Forecasting for their forecasting projects; more importantly, they get consistent and accurate forecast results faster, regardless of the programming language they use. Furthermore, forecasters can now develop their forecasting projects with data from their individual session-scoped CAS libraries or from shared CAS libraries. They can also easily share their data, modeling components, and forecasting results without moving data in the CAS environment.
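To see why forecasting many time series parallelizes so well, here is a minimal Python sketch. It is not how CAS distributes work and it uses no SAS API: local threads stand in for CAS workers, the placeholder model simply repeats the last value, and the function names and data are hypothetical. The point is only that each series can be forecast independently of the others.

```python
from concurrent.futures import ThreadPoolExecutor


def naive_forecast(series, lead):
    """Placeholder model (an assumption): repeat the last observed value."""
    return [series[-1]] * lead


def forecast_all(series_by_group, lead, workers=4):
    """Forecast each group on its own, spreading the groups across workers."""
    keys = list(series_by_group)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task touches one series only, so tasks never wait on each other.
        results = pool.map(lambda k: naive_forecast(series_by_group[k], lead), keys)
        return dict(zip(keys, results))


# Hypothetical data: two series keyed by (region, product).
data = {
    ("Region1", "Product1"): [10.0, 12.0, 11.0],
    ("Region2", "Product2"): [5.0, 7.0, 9.0],
}
forecasts = forecast_all(data, lead=2)
```

Because no series depends on another, adding workers (or, in a real distributed system, machines) lets the same code handle more series without changing the per-series logic.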
New Object-Oriented Design, New Scripting Language and New Engine
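SAS Visual Forecasting 8.1 has a completely new design that provides efficient, dynamic, and flexible ways to model and forecast time series. It includes three important concepts and features, each covered in its own section of this blog: packages and objects, a scripting language, and the CAS actions and CAS-enabled procedures that serve as the engine.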
When you use these new concepts and features to create time series models and forecasts, you'll find an improved modeling experience: the process is more efficient, flexible, and productive. This is especially true when you are dealing with a large number of time series (for example, more than 10,000).
Packages and Objects in SAS Visual Forecasting
The software components of SAS Visual Forecasting are organized into packages and objects. A package is a collection of objects that is defined around a specific class of time series functions. Here is a list of available packages in Visual Forecasting 8.1.
Following the object-oriented design, the objects in a package come with methods that allow you to interact with them. These objects are robust, and they capture and control the state of your modeling and forecasting processes. Examples of objects include the data frame, the model specification, and the diagnose and forecast results. Here is a subset of the objects you can find in the ATSM package. For the full list, see SAS Visual Forecasting 8.1: Time Series Packages.
During the development cycle of time series modeling and forecasting, these objects are the building blocks of the programs that you submit to CAS for execution. It is worth noting that the ATSM repeater object provides a mechanism to restore rows from a CAS table to make them available for use by other ATSM objects. The repeater objects in ATSM, TSM, and other packages, along with the scripting language described below, make time series modeling and forecasting dynamic and flexible, which is exactly what you want from such a process. These objects also enable you to build your own applications, such as incremental forecasting and rolling simulation.
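As a rough illustration of what a rolling simulation application might look like, here is a small Python sketch. It assumes that "rolling simulation" means rolling-origin evaluation: forecast from a cutoff, compare with the actuals, move the cutoff forward, and repeat. It does not use ATSM objects or any SAS API; the function names, the mean model, and the data are all hypothetical.

```python
def mean_forecast(history, lead):
    """Placeholder model (an assumption): forecast the mean of the history."""
    level = sum(history) / len(history)
    return [level] * lead


def rolling_simulation(series, min_train, lead, model=mean_forecast):
    """Forecast from successive origins and return the mean absolute error."""
    errors = []
    for origin in range(min_train, len(series) - lead + 1):
        history = series[:origin]                 # data known at this origin
        actual = series[origin:origin + lead]     # what happened afterwards
        predicted = model(history, lead)
        errors.extend(abs(a - p) for a, p in zip(actual, predicted))
    return sum(errors) / len(errors)


# Hypothetical monthly values with a steady upward trend.
series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
mae = rolling_simulation(series, min_train=3, lead=1)
```

Each origin only ever sees the data that would have been available at that point in time, which is what makes the resulting error a fair estimate of out-of-sample accuracy.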
Scripting Language used in SAS Visual Forecasting
Objects in a package are the building blocks of programs that model and forecast time series. The scripting language is the glue: its programming statements let you manipulate the objects and combine them into a customized workflow. This scripting language is a subset of the DATA step programming language, which includes expressions, functions, and some control flow. Programs written in it are compiled and executed on the CAS server in Visual Forecasting. The language works mainly with scalars, strings, and arrays, and it also supports logic, loops, functions, subroutines, and objects. I'll show you a sample using this scripting language in the code example section below. In the current release of SAS Visual Forecasting, programming is the only way to create models, tune models, and generate forecasts for time series. With the objects and the scripting language, you have a dynamic and flexible vehicle for delivering creative and efficient time series modeling and forecasting in both development and production environments.
CAS Actions and CAS Enabled Procedures in SAS Visual Forecasting
So you have objects and the scripting language to create your own modeling and forecasting workflows or processes. What you need now is an engine to run your programs. This is exactly what CAS actions and CAS-enabled forecasting procedures provide. There are two main CAS actions used in Visual Forecasting:
There are also two CAS-enabled forecasting procedures used in Visual Forecasting:
SAS programmers may find the TSMODEL procedure a bit easier to use than a CAS action, which needs to be run within PROC CAS. Within SAS Studio, you can write your time series modeling and forecasting programs using objects and the scripting language, and run them within a CAS session. If you are not a SAS 9 programmer, you can use Python, Lua, or Java to write programs that call CAS actions, which are executed against input data in CAS tables on the CAS server.
A Small Code Example of How You Can Model and Forecast with SAS Visual Forecasting
By now, I hope you have a fair understanding of how SAS Visual Forecasting works and how you can write and run your programs for time series modeling and forecasting. Next, let me show you a small code example to illustrate how things are put together within PROC TSMODEL. Assume that you have a timestamped data set (for example, the pricedata dataset available from SASHELP) and you want to accumulate the data into monthly intervals and create a time series data set at the lowest level in the hierarchy (in the pricedata dataset, that would be Region, Product Line, and Product Name). Next, you want to use the ATSM package to automatically model and forecast the time series data, using sale as the dependent variable and price as the independent variable. Here is the code you may write within SAS Studio:
/* This script illustrates the use of the ATSM package to diagnose the time series
   and select the best model to generate the final forecasts. */

/* Create a CAS session with the option specified */
cas mycas;

/* Create a SAS library using mycas session */
libname mylib cas sessref = mycas;

/* Load pricedata into a CAS table using Data Step */
data mylib.pricedata;
    set sashelp.pricedata;
run;

/* One single pass through the data in the CAS table */
/* The TSMODEL options specify data input and output, how time series are created */
proc tsmodel data   = mylib.pricedata
             outobj = (
                        outFor  = mylib.outFor
                        outEst  = mylib.outEst
                        outStat = mylib.outStat
                      );
    by regionname productline productname;
    id date interval=month;
    var sale  /acc = sum;
    var price /acc = avg;

    /* Load the ATSM package */
    require atsm;

    /* Scripting language programming statements */
    submit;

        /* Declare ATSM objects */
        declare object dataFrame(TSDF);
        declare object diagnose(DIAGNOSE);
        declare object diagSpec(DIAGSPEC);
        declare object forecast(FORENG);
        declare object outFor(OUTFOR);
        declare object outEst(OUTEST);
        declare object outStat(OUTSTAT);

        /* Setup dependent and independent variables */
        rc = dataFrame.initialize();
        rc = dataFrame.addY(sale);
        rc = dataFrame.addX(price);

        /* Setup time series diagnose specifications */
        rc = diagSpec.open();
        rc = diagSpec.setArimax('identify', 'both');
        rc = diagSpec.setEsm('method', 'best');
        rc = diagSpec.setTransform('transform', 'auto');
        rc = diagSpec.close();

        /* Diagnose time series to generate candidate model list */
        rc = diagnose.initialize(dataFrame);
        rc = diagnose.setSpec(diagSpec);
        rc = diagnose.run();

        /* Run model selection and forecast */
        rc = forecast.initialize(diagnose);
        rc = forecast.setOption('lead', 12, 'holdoutpct', 0.1);
        rc = forecast.run();

        /* Collect forecast results */
        rc = outFor.collect(forecast);
        rc = outEst.collect(forecast);
        rc = outStat.collect(forecast);

    endsubmit;
run;
This code generates output in three CAS tables and also provides a summary of the time series process that has been executed on the input data. Have you observed that once the timestamped data has been loaded into a CAS table, it only takes a single pass of the data to do the following?
This is a huge consideration in the context of big data. You want to minimize the number of passes through a very large dataset because each pass is expensive in terms of resources and performance. This is just one of the improvements from SAS 9 to Viya™ in terms of forecasting process efficiency. If you are familiar with SAS 9 forecasting products, you know that it takes four passes through the dataset to complete the same tasks: a first pass to sort the timestamped data using proc sort, a second pass to create the desired time series using proc timedata, a third pass to diagnose the time series using proc hpfdiagnose, and a fourth pass to forecast the time series using proc hpfengine.
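To make the single-pass idea concrete outside of SAS, here is a conceptual Python sketch. It is not what PROC TSMODEL does internally: the function names, the two toy candidate models, and the sample rows are hypothetical, and price is accumulated but not used as a regressor. It reads unsorted timestamped rows once, accumulates them into monthly buckets (sum for sale, average for price), and then picks a champion model on a holdout sample before forecasting.

```python
from collections import defaultdict
from datetime import date


def accumulate_monthly(rows):
    """One pass over raw rows: sum sale and average price per (group, month)."""
    buckets = defaultdict(lambda: {"sale": 0.0, "price_sum": 0.0, "n": 0})
    for group, day, sale, price in rows:          # the only pass over raw data
        b = buckets[(group, day.year, day.month)]
        b["sale"] += sale
        b["price_sum"] += price
        b["n"] += 1
    series = defaultdict(list)
    for group, year, month in sorted(buckets):    # orders buckets, not raw rows
        b = buckets[(group, year, month)]
        series[group].append(((year, month), b["sale"], b["price_sum"] / b["n"]))
    return dict(series)


def naive(history, lead):
    """Toy candidate: repeat the last value."""
    return [history[-1]] * lead


def mean(history, lead):
    """Toy candidate: repeat the historical mean."""
    return [sum(history) / len(history)] * lead


CANDIDATES = {"naive": naive, "mean": mean}


def select_and_forecast(values, lead, holdout):
    """Choose the candidate with the lowest holdout MAE, then forecast."""
    train, test = values[:-holdout], values[-holdout:]

    def mae(model):
        preds = model(train, holdout)
        return sum(abs(a - p) for a, p in zip(test, preds)) / holdout

    champion = min(CANDIDATES, key=lambda name: mae(CANDIDATES[name]))
    return champion, CANDIDATES[champion](values, lead)


# Hypothetical, deliberately unsorted rows: (group, date, sale, price).
key = ("East", "Widget")
rows = [
    (key, date(2024, 2, 10), 20.0, 3.0),
    (key, date(2024, 1, 5), 10.0, 2.0),
    (key, date(2024, 3, 15), 30.0, 3.0),
    (key, date(2024, 1, 20), 5.0, 4.0),
]
monthly = accumulate_monthly(rows)
sales = [sale for _, sale, _ in monthly[key]]
champion, forecast = select_and_forecast(sales, lead=2, holdout=1)
```

Only the accumulated buckets are ordered afterwards, and there are far fewer of them than raw rows, so the expensive full sort of the timestamped data is avoided.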
Conclusions
SAS Visual Forecasting 8.1 is available on SAS Viya™ and leverages the power of CAS. It is designed for data scientists and forecasters to produce time series models and forecasts in a structured and robust way, with an emphasis on process efficiency and flexibility. The software components are organized into packages, and each package contains a number of objects. In practice, users without advanced time series forecasting knowledge can use the ATSM package to automatically generate a list of candidate models and choose a champion model. Users with advanced time series forecasting knowledge can construct custom time series models using the TSM package. Users can also combine ATSM models and TSM models to generate the champion model for each time series. There is a lot more that we can do with SAS Visual Forecasting. My next blog will feature managing data with the TSMODEL procedure.
Acknowledgements: Special thanks go to Joe Katz from Product Management, and to Alex Chien and Mike Leonard from R&D, for their valuable suggestions and for providing product and technical reviews of this blog.