
Have It Your Way – Forecasting With SAS® and Open Source Q&A, Slides, and On-Demand Recording

Started ‎02-21-2024 by
Modified ‎02-23-2024 by

Watch this Ask the Expert session to learn about the new DOSC node recently made available within SAS Visual Forecasting.

 

Watch the Webinar

 

You will learn:

  • Why SAS Visual Forecasting is embracing open source.
  • How the open source program is distributed to efficiently process multiple time series.
  • How to conveniently write compatible open source programs.
  • What output data and results are available after running the node.
  • How the node seamlessly integrates with the standard features of SAS Visual Forecasting.

 

The questions from the Q&A segment held at the end of the webinar are listed below and the slides from the webinar are attached.

 

Q&A

What language versions and packages are supported?

We leave it entirely to the customer and their site administrator to install Python or R, in whatever version they prefer, along with any packages they want on the back end. Some configuration is needed to point to the interpreter you installed, and we pick it up from there.

 

Can I share my open source scripts with coworkers or use them in different projects?

The short answer is yes, with two things to keep in mind. First, the algorithm in your open source program must be compatible with the data it runs against. For example, a program designed for weekly data naturally can’t be used against monthly data. Second, we share many predefined variables with the open source program; they are computed for each series and passed to the program. In this project, the time ID variable is called “date”, the dependent variable is “sale”, and the BY variables are “productLine”, “productName”, and “regionName”. Another project may use different variable names. If you rely on the predefined shared variables as much as possible instead of project-specific names, you shouldn’t run into compatibility issues, and moving your code from one project to another or sharing it with coworkers will go smoothly.
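As a rough illustration of the portability point, here is a minimal Python sketch of forecasting logic that never mentions project-specific names such as “date” or “sale”. Everything in it is an assumption made for illustration: the function `seasonal_naive`, its parameters, and the sample values are hypothetical, and the real names of the shared variables are not listed in this Q&A, so plain local stand-ins are used instead.

```python
def seasonal_naive(history, horizon, season_length):
    """Repeat the last full season of `history` for `horizon` future periods.

    `history` is assumed to be a non-empty list of past values of the
    dependent variable, in time order. Nothing here depends on how the
    project names its time ID, dependent, or BY variables.
    """
    if season_length <= 0 or len(history) < season_length:
        # Not enough data for a full season: repeat the last observed value.
        return [history[-1]] * horizon
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


# Hypothetical stand-ins for values the node would pass to the script.
past_values = [10, 12, 14, 11, 13, 15]
forecast = seasonal_naive(past_values, horizon=4, season_length=3)
print(forecast)
```

Because the function only sees a list of values, a horizon, and a season length, the same code could in principle be reused in a project whose dependent variable is not called “sale”, provided those inputs are taken from the shared variables rather than typed in by hand.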

 

In the past, I used the Jupyter notebook supplied by SAS for open source with SAS Viya. Is that now unnecessary?

You can continue working with Jupyter notebook as before; this feature does not interfere with that. It simply gives you a different way to submit open source programs. You can insert your open source program in the DOSC node and let the pipeline compare the accuracy of the open source forecasts against that of the other modeling nodes. You have access to many predefined shared variables, and you don’t need to worry about how the data is distributed across the workers and threads of your deployment. We take care of that for you. If you prefer to work in Jupyter notebook and use the EXTLANG package of the runTimeCode action directly, you have to do a lot of back-end coding to make the open source program fit the runTimeCode action syntax. So, I’d say this feature makes running open source programs much easier in this context.

 

In the demo, there were instances where there were non-zero values for one or more of the errors. But the exit status was still 0, why is that the case?

You are referring to the OUTOPENSRCSTATUS table, where the _ERRNO_ status can be 0 (indicating success) while other columns hold non-zero status codes. The _ERRNO_ column reports the error code from BY group processing in the TSMODEL procedure (or the runTimeCode action), while the first column, _EXITCODE_ (labeled “Exit status code of external language program execution”), reports any error code from the execution of the open source program itself. The OUTOPENSRCSTATUS table is meant to show, from several angles, whether your open source code is working as expected.
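To make the distinction between the two columns concrete, here is a small Python sketch that checks each one separately. Only the column names _EXITCODE_ and _ERRNO_ come from the answer above; the sample rows, the “regionName” values, and the `diagnose` helper are hypothetical, and the rows are plain dictionaries rather than an actual OUTOPENSRCSTATUS table.

```python
def diagnose(row):
    """Return a list of problems found in one status row (empty if none)."""
    problems = []
    if row["_EXITCODE_"] != 0:
        # The open source program itself reported a failure.
        problems.append("external program exit code %d" % row["_EXITCODE_"])
    if row["_ERRNO_"] != 0:
        # BY group processing reported a failure.
        problems.append("BY group processing error %d" % row["_ERRNO_"])
    return problems


# Hypothetical rows, one per BY group.
status_rows = [
    {"regionName": "Region1", "_EXITCODE_": 0, "_ERRNO_": 0},
    {"regionName": "Region2", "_EXITCODE_": 1, "_ERRNO_": 0},
    {"regionName": "Region3", "_EXITCODE_": 0, "_ERRNO_": 5},
]

for row in status_rows:
    issues = diagnose(row)
    print(row["regionName"], "ok" if not issues else "; ".join(issues))
```

The point of checking the columns independently is that a zero in one of them says nothing about the other, which is why both need to be read before concluding that a series was processed cleanly.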

 

What if I want to write code that processes more than one time series?

As Phil described in the introduction, the EXTLANG package takes the script and makes a copy of it available to the different worker nodes. The data also resides on different threads of different worker nodes. So, you write one program, and that same script is run against each series. You don’t need to write specific code for each series. However, you can add conditions to your open source program so that certain lines run only for a specific BY group. One of the variables that we share with the external language program is VF_BYGROUP. In Python, it is a dictionary; in R, it is a list. You can customize what the code does for different series by adding IF conditions on it. At the end of the day, you write one program, and the same program is run against all the data in your table.
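A minimal Python sketch of such a condition might look like the following. VF_BYGROUP is named in the answer above as a dictionary, but its exact contents are an assumption here: the sketch supposes that the keys are the BY variable names from the demo project and invents the values. The helper functions and the window sizes are hypothetical, and VF_BYGROUP is defined locally only so the example runs outside the node.

```python
# Local stand-in for the dictionary the node shares with a Python script.
# Keys are assumed to be the BY variable names; the values are made up.
VF_BYGROUP = {
    "productLine": "Line01",
    "productName": "Product01",
    "regionName": "Region1",
}


def smoothing_window(by_group):
    """Pick a moving-average window based on the current BY group."""
    if by_group.get("regionName") == "Region1":
        return 3  # shorter window for this one region only
    return 6      # default window for every other series


def moving_average_forecast(history, window):
    """Average the most recent `window` values of a non-empty history."""
    recent = history[-window:]
    return sum(recent) / len(recent)


history = [4, 6, 8, 10, 12, 14]
window = smoothing_window(VF_BYGROUP)
print(window, moving_average_forecast(history, window))
```

The script stays a single program; only the branch taken differs from one series to the next, depending on the BY group values it receives.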

 

Which settings are needed to run open source code from Python in SAS Viya? For example, read Excel data.

The feature presented today assumes that the data is already loaded to the deployment and available in memory. This happens when you create a project in SAS Visual Forecasting, which is where you import the data.

 

 

Recommended Resources

Forecasting with the Distributed Open Source Code node

5 things you should know about SAS Visual Forecasting

Scalable Cloud-Based Time Series Analysis and Forecasting Using Open-Source Software

Forecasting with SAS free ebook

Please see additional resources in the attached slide deck.

 

Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow up Q&A, slides and recordings from other SAS Ask the Expert webinars.

 

