
SAS Visual Forecasting 8.1 – Incremental Forecasting Using ATSM Objects

by Jack_Zhang on 08-15-2017 02:46 PM

SAS Visual Forecasting 8.1 provides objects and a scripting language that support many interesting and powerful applications. For example, you can perform hierarchical forecasting, create rolling simulations to check the stability of a forecast model, and generate an incremental forecast after a forecast model is deployed. While it is possible to implement these applications with SAS 9 forecasting products, I find it much easier to do so with the objects and scripting language provided in SAS Visual Forecasting. In this blog, I provide two code snippets that show how to use the ATSM objects to forecast both existing and new time series during an incremental forecast iteration. For the existing time series, the process reuses the previously selected models to generate a forecast that is based on the updated historical data. For the new time series, it diagnoses and estimates the models before generating a forecast. Avoiding a full re-diagnosis of every series is an important and much needed feature when the data contains millions of time series.

 

Overview of Incremental Forecasting Scenarios

 

Imagine the following scenario.  You have a dataset with historical data for millions of time series.  You have gone through one or more iterations to model and forecast these time series.  After a certain period, additional historical data is loaded into the forecast system and added to the original dataset.  How do you want to update the models and forecast results based on the incremental data? Below, I will cover two scenarios for which you can leverage different strategies to reuse the forecast results from a previous forecast iteration.

 

Scenario I: The incremental data includes only data for existing time series

 

In this scenario, you have the following modes, or options, to update the models and forecast results for the existing time series: 

 

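table1.png (image not recovered). The sample code for Scenario I describes four modes:

  • SELECT: re-select the best model from the previously diagnosed model class list (FMSG), and then forecast. The previous parameter estimates (INEST) do not need to be replayed.
  • FIT: keep the selected model, re-estimate its parameters from scratch on the latest data, and then forecast.
  • UPDATE: keep the selected model, re-estimate its parameters using the previous estimates as initial values, and then forecast.
  • FORECAST: keep the selected model and its previous parameter estimates fixed, and only generate the forecast.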

 

Scenario II: The incremental data includes data for both existing and new time series

 

In this scenario, you can still use the options that are listed under the Scenario I paragraph to update the model and the forecast results for the existing time series.  For the new time series in the updated data, however, you have to run the “diagnose” and “select” tasks, before you can generate forecast results.

 

Using ATSM Objects to Run an Incremental Forecast – Scenario I

 

In this scenario, the FORENG object that is included in the ATSM package is utilized.  The forecast is generated by two methods that are associated with the FORENG object: replay() and setOption().  The value for the “TASK” option in the setOption() method determines what update mode is used. If “forecast” is an instance of the FORENG object, an incremental forecast for Scenario I is generated by the following two statements:   

  • rc = forecast.replay(infmsg, inest);

This object method replay(infmsg, inest) replays the models and the parameter estimates for the “forecast” instance of the FORENG object. It restores a previously generated forecast model selection graph (FMSG) from the specified INFMSG instance. Optionally, it restores an INEST instance that supplies previously calculated parameter estimates of the selected model for the restored FMSG. This determines the selected path (model set) from the restored FMSG and supplies the selected models with parameter estimates, which are fixed for forecast-only runs, or which serve as initial values for update runs. Please note that the INEST is required for the FORECAST, FIT and UPDATE incremental update modes.

  • rc = forecast.setOption('name', value);

The setOption('name', value) method sets the named option for the FORENG object instance. For example, the following are valid name and value pairs for setOption parameters: 

 

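table2.png (image not recovered). The sample code in this blog uses the following pairs:

  • 'task', with the value 'select', 'fit', 'update', or 'forecast'
  • 'lead', with a numeric forecast horizon such as 12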

 

For more detailed information about the synopsis and method syntax of the ATSM FORENG object, see the SAS Visual Forecasting documentation.

Now, let us look at a code snippet to explore how the ATSM objects are used to run an incremental forecast for Scenario I, where the incremental data includes only updated data for existing time series.

 

/* this script illustrates the use of ATSM package to run incremental forecast with 
     SELECT, FIT, UPDATE, and FORECAST tasks */
cas mycas;
libname mylib cas sessref = 'mycas';

/*keep only the observations prior to 12/2002 */
data mylib.pricedata_i0;
    set sashelp.pricedata;
    if date < mdy(12,1,2002);
run;

/*data with additional observations after 11/2002 */
data mylib.pricedata_i1;
    set sashelp.pricedata;
run;

/*run full diagnose using data prior to 12/2002 and collect both the diagnosed model class list 
  and model parameter estimates */ 
proc tsmodel data      = mylib.pricedata_i0
             outobj    = (
                          outest       = mylib.outest  
                          outfor       = mylib.outfor  
                          outfmsg      = mylib.outfmsg 
                        ) ;
    id date interval = month;
    by regionname productline productname;
    var sale / acc = total;
    var price / acc = average;
    require atsm;
    submit;
        /*declare ATSM objects */
        declare object dataframe(TSDF);
        declare object diagspec(DIAGSPEC);
        declare object diagnose(DIAGNOSE);
        declare object forecast(FORENG);
        declare object outest(OUTEST);
        declare object outfor(OUTFOR);
        declare object outfmsg(OUTFMSG);

        /*setup TSDF object */
        rc = dataframe.initialize();
        rc = dataframe.addY(sale);
        rc = dataframe.addX(price, 'required', 'no', 'extend', 'stochastic'); 

        /*setup diagnose spec */
        rc = diagspec.open(); 
        rc = diagspec.setIDM('intermittent', 2); 
        rc = diagspec.setESM('method', 'best'); 
        rc = diagspec.setARIMAX('identify', 'both'); 
        rc = diagspec.close();  

        /*run diagnose */
        rc = diagnose.initialize(dataframe);
        rc = diagnose.setSpec(diagspec);
        rc = diagnose.run();

        /*run forecast */ 
        rc = forecast.initialize(diagnose); 
        rc = forecast.setOption('lead',12);
        rc = forecast.run(); 
         
        /*collect results */
        rc = outest.collect(forecast);
        rc = outfor.collect(forecast);
        rc = outfmsg.collect(forecast);
    endsubmit;   
run;

/*run forecast task using data after 11/2002 and the diagnosed model class list and 
  model parameter estimates from the previous runs */ 
proc tsmodel data      = mylib.pricedata_i1
             outscalar = mylib.outscalar
             inobj     = (
                          inest        = mylib.outest
                          infmsg       = mylib.outfmsg
                        )
             outobj    = (
                          outest       = mylib.outest2  
                          outfor       = mylib.outfor2 
                          outfmsg      = mylib.outfmsg2
                        );
    id date interval = month;
    by regionname productline productname;
    var sale / acc = total;
    var price / acc = average;
 
    /*output forecast replay and run return codes; use a different rc for the REPLAY
      and RUN so that you can track and output the replay rc and run rc rather than
      using a generic rc throughout the script. */
    outscalar fcstReplay_rc fcstRun_rc;
    require atsm;
    submit;
        /*declare ATSM objects */
        declare object dataframe(TSDF);
        declare object forecast(FORENG);
        declare object outest(OUTEST);
        declare object outfor(OUTFOR);
        declare object outfmsg(OUTFMSG);

        /*declare input objects for forecast replay */
        declare object inest(INEST);
        declare object infmsg(INFMSG);
        
        rc = dataframe.initialize();
        rc = dataframe.addY(sale);
        rc = dataframe.addX(price, 'required', 'no', 'extend', 'stochastic'); 
        
        rc = forecast.initialize(dataframe);
        /*replay the input objects for incremental forecast – forecast mode*/
        fcstReplay_rc = forecast.replay(infmsg, inest);
        rc = forecast.setOption('task','forecast');

        /*other options: to run a different mode, replace the replay() and
          setOption('task', ...) statements above with one of the pairs below.
          (SAS block comments do not nest, so the alternatives are kept inside
          this single comment.)

          fit mode: re-estimate the parameters of the selected model from
          scratch on the latest data, and then forecast
              fcstReplay_rc = forecast.replay(infmsg, inest);
              rc = forecast.setOption('task','fit');

          update mode: re-estimate the parameters of the selected model using
          the previous parameter estimates and the latest data, and then forecast
              fcstReplay_rc = forecast.replay(infmsg, inest);
              rc = forecast.setOption('task','update');

          select mode: re-select the best model from the model class list in
          infmsg; there is no need to replay the inest for this option
              fcstReplay_rc = forecast.replay(infmsg);
              rc = forecast.setOption('task','select');
        */
        rc = forecast.setOption('lead',12); 
        fcstRun_rc = forecast.run();    
        rc = outest.collect(forecast);
        rc = outfor.collect(forecast);
        rc = outfmsg.collect(forecast);
    endsubmit;   
run;

 

This sample code does the following:

  • Creates two data sets from the pricedata table and loads them into CAS tables.
  • Uses PROC TSMODEL and the ATSM package with the first CAS table, pricedata_i0, to run the forecast on time series with observations prior to December 2002.
  • Collects the forecast results, and saves OUTFMSG and OUTEST to CAS tables.
  • Uses PROC TSMODEL and the ATSM package to run an incremental forecast on the second CAS table, pricedata_i1. The incremental forecast is executed with the “task = FORECAST” option, where the replay method of the FORENG and repeater objects is used to restore the models and the parameter estimates that are saved in the CAS tables.

The sample code also includes the statements to run an incremental forecast utilizing the FIT, UPDATE and SELECT task option values.
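
Before relying on the incremental results, it can help to check the two return codes that the OUTSCALAR statement writes out. The following is a minimal sketch, assuming that mylib.outscalar from the incremental run above holds one row per BY group. It uses only PROC FREQ and the column names declared in the sample code, and it makes no assumption about the numeric values of the ATSM return codes.

```sas
/* sketch: count BY groups for each combination of replay and run return codes */
proc freq data = mylib.outscalar;
    tables fcstReplay_rc * fcstRun_rc / list missing;
run;
```

If the output shows a single combination, every series was handled the same way. Additional combinations point to series that are worth inspecting individually.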

 

Using the ATSM Objects to Run an Incremental Forecast – Scenario II

 

When the incremental data includes data for both existing and new time series, you need to have a mechanism to detect the new time series and handle them differently from the existing time series.   As discussed in Scenario I for existing time series, you have four options to run an incremental forecast without having to re-diagnose.  For new time series, however, you need to first diagnose the time series and select the forecast model before you can generate the forecast results. When you run an ATSM object instance, you receive one of the following return status codes: 

 

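table3.png (image not recovered). The sample code for Scenario II lists the following return status codes:

  • ATSM_OK_NOCHANGE: The method call produced no change in the object state.
  • ATSM_OK_NOTOPEN: The object is not open for modification.
  • ATSM_OK_NORESULT: No result is produced for the method call.
  • ATSM_OK_FORCESEL: Model selection mode (TASK=SELECT) is forced in the foreng.run method.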

 

For a new time series, which has no previously selected forecast model, the FORENG object by default uses the ESM BEST model to generate the forecast. However, you can use the return code generated by the REPLAY method to customize how new time series are handled. Here is another code snippet that shows how to use the ATSM objects to run an incremental forecast for Scenario II, where the incremental data includes both existing and new time series.

 

/* this script illustrates the use of ATSM package to run incremental forecast
  in the case where new series are observed in the incremental data. */
cas mycas;
libname mylib cas sessref = 'mycas';

/*drop time series where product = 1 from pricedata table */
data mylib.pricedata_i0;
    set sashelp.pricedata;
    if product ne 1;
run;

/*load full pricedata into a cas table */
data mylib.pricedata_i1;
    set sashelp.pricedata;
run;

/*run full diagnose using data without product = 1 and collect
   both the diagnosed model class list and model parameter estimates */ 
proc tsmodel data      = mylib.pricedata_i0
             outobj    = (
                          outest       = mylib.outest  
                          outfor       = mylib.outfor  
                          outfmsg      = mylib.outfmsg 
                        );
    id date interval = month;
    by regionname productline productname;
    var sale / acc = total;
    var price / acc = average;
    require atsm;
    submit;
        /*declare ATSM objects */
        declare object dataframe(TSDF);
        declare object diagspec(DIAGSPEC);
        declare object diagnose(DIAGNOSE);
        declare object forecast(FORENG);
        declare object outest(OUTEST);
        declare object outfor(OUTFOR);
        declare object outfmsg(OUTFMSG);

        /*setup TSDF object */
        rc = dataframe.initialize();
        rc = dataframe.addY(sale);
        rc = dataframe.addX(price, 'required', 'no', 'extend', 'stochastic'); 
  
        /*setup diagnose spec */
        rc = diagspec.open(); 
        rc = diagspec.setIDM('intermittent', 2); 
        rc = diagspec.setESM('method', 'best'); 
        rc = diagspec.setARIMAX('identify', 'both'); 
        rc = diagspec.close(); 
    
        /*run diagnose */
        rc = diagnose.initialize(dataframe);
        rc = diagnose.setSpec(diagspec);
        rc = diagnose.run();
     
        /*run forecast */
        rc = forecast.initialize(diagnose); 
        rc = forecast.setOption('lead',12);
        rc = forecast.run(); 

        /*collect results */
        rc = outest.collect(forecast);
        rc = outfor.collect(forecast);
        rc = outfmsg.collect(forecast);
    endsubmit;   
run;

/*run incremental forecasting with full pricedata table */ 
proc tsmodel data      = mylib.pricedata_i1
             outscalar = mylib.outscalar
             inobj     = (
                          inest        = mylib.outest
                          infmsg       = mylib.outfmsg
                        )
             outobj    = (
                          outest       = mylib.outest2  
                          outfor       = mylib.outfor2 
                          outfmsg     = mylib.outfmsg2
                        );
    id date interval = month;
    by regionname productline productname;
    var sale / acc = total;
    var price / acc = average;

    /*output forecast replay and run return code, use a different rc for the REPLAY 
      and RUN so that you can track and output the replay rc and run rc rather than 
      using a generic rc throughout the script. */
    outscalar fcstReplay_rc fcstRun_rc;
    require atsm;
    submit;
        /*declare ATSM objects */
        declare object dataframe(TSDF);
        declare object forecast(FORENG);
        declare object outest(OUTEST);
        declare object outfor(OUTFOR);
        declare object outfmsg(OUTFMSG);

        /*declare input objects for forecast replay */
        declare object inest(INEST);
        declare object infmsg(INFMSG);
        rc = dataframe.initialize();
        rc = dataframe.addY(sale);
        rc = dataframe.addX(price, 'required', 'no', 'extend', 'stochastic'); 
        rc = forecast.initialize(dataframe);

        /*replay the input objects for incremental forecasting */
        fcstReplay_rc = forecast.replay(infmsg, inest);
          
        /* ATSM return codes
           ATSM_OK_NOCHANGE   Method call produced no change in the object state.
           ATSM_OK_NOTOPEN    Object is not open for modification.
           ATSM_OK_NORESULT   No result is produced for the method call.
           ATSM_OK_FORCESEL   Model selection mode (TASK=SELECT) is forced
                              in the foreng.run method. */

        /* The statement fcstReplay_rc = forecast.replay(infmsg, inest); assigns the
           return code of the replay call to fcstReplay_rc. Because fcstReplay_rc is
           listed in the OUTSCALAR statement, the output table contains its value in
           the fcstReplay_rc column for each unique BY group in the dataframe. When a
           time series (a unique BY group) is not found in the infmsg or inest object,
           the return code is set to ATSM_OK_NORESULT, which also means that the time
           series does not exist in the previous model results. */

        /*diagnose time series if previous model results do not exist */
        if fcstReplay_rc = ATSM_OK_NORESULT then do;
            declare object diagnose(DIAGNOSE);
            rc = diagnose.initialize(dataFrame);
            rc = diagnose.run();
            rc = forecast.initialize(diagnose);
            rc = forecast.setOption('task','select');
        end;
        else do;
            rc = forecast.setOption('task','forecast');
        end;
        rc = forecast.setOption('lead',12); 
        fcstRun_rc = forecast.run();
        rc = outest.collect(forecast);
        rc = outfor.collect(forecast);
        rc = outfmsg.collect(forecast);     
    endsubmit;   
run;

 

This sample code does the following:

  • Subsets the pricedata table by dropping observations where product=1 and loads the resulting table into a CAS table called pricedata_i0.
  • Loads the full pricedata table into a CAS table called pricedata_i1.
  • Uses PROC TSMODEL and the ATSM package on the first CAS table, pricedata_i0, to run the forecast on time series where product is not equal to 1.
  • Collects the forecast results and saves OUTFMSG and OUTEST to CAS tables.
  • Uses PROC TSMODEL and the ATSM package to run an incremental forecast on the second CAS table, pricedata_i1. This process uses the “task=FORECAST” option for time series that do not include product 1. For time series that do include product 1, it first diagnoses, then selects the forecast model, and lastly generates the forecast. The replay method of the FORENG and repeater objects is used to restore the models and parameter estimates saved in the CAS tables.

This code snippet also illustrates how to use the ATSM object’s return status codes to identify the new time series in the incremental data.
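
As a cross-check on the return-code approach, you can also derive the list of new series directly from the BY variables of the two input tables. The sketch below is an illustration under stated assumptions rather than part of the ATSM workflow: work.new_series is a hypothetical table name, and the queries assume that the CAS tables are small enough to be read through the mylib libref by PROC SQL.

```sas
/* sketch: find BY groups that appear in the updated data but not in the original data */
proc sql;
    /* work.new_series is a hypothetical name used only for this illustration */
    create table work.new_series as
    select regionname, productline, productname
        from mylib.pricedata_i1
    except
    select regionname, productline, productname
        from mylib.pricedata_i0;

    /* show the replay return code that the incremental run recorded for those series */
    select s.regionname, s.productline, s.productname, s.fcstReplay_rc
        from mylib.outscalar as s
             inner join work.new_series as n
             on  s.regionname  = n.regionname
             and s.productline = n.productline
             and s.productname = n.productname;
quit;
```

If the return-code logic in the script works as intended, the series listed by the second query should all share the replay return code that marks a series as new, and no other series should carry that code.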

 

Conclusions

 

Incremental forecasting is a common practice in which you apply the models and parameter estimates generated during an earlier forecast iteration to forecast updated historical data. I covered two scenarios. In the first scenario, the incremental data included only data for existing time series. In the second scenario, the incremental data also included data for new time series that were not available in the previous forecast iteration. With SAS Visual Forecasting, and in particular with the ATSM objects and scripting language statements, incremental forecasting comes down to the replay and setOption methods of the FORENG object. Checking the ATSM object return status codes in the scripting language statements provides flexibility and makes it easy to identify new time series in the incremental data. There are quite a few good applications of SAS Visual Forecasting that we can explore in future blogs.

 

Relevant Blogs

 

SAS Visual Forecasting 8.1 – A New Scalable Efficient Flexible Forecasting Solution

  • Overview and background of SAS Visual Forecasting 8.1

SAS Visual Forecasting 8.1 – Data Manipulation with PROC TSMODEL

  • Overview of PROC TSMODEL from a data manipulation perspective

SAS Visual Forecasting 8.1 – Using Automatic Time Series Model (ATSM) Package

  • Overview of using ATSM objects for automatic time series modeling and forecasting

SAS Visual Forecasting 8.1 – Using Open Programming Interfaces

  • Overview of using open programming interfaces including Python for time series modeling and forecasting

Acknowledgements: Special thanks go to SAS R&D and Product Management for their valuable suggestions and for providing product and technical reviews of this series of blogs on SAS Visual Forecasting.

 
