How to score your SAS Model Studio Astore model in Python

In my previous blog I described several ways to use a predictive model developed in SAS Model Studio for batch scoring big data in SAS Viya. This blog demonstrates how you can score these models from Python using the Jupyter notebook environment.

Model developers that prefer visual interfaces might work in SAS Model Studio to develop accurate machine learning models, while programmers can take advantage of the SAS SWAT interface to build their models in SAS Viya from a Python program. For this tutorial I will assume that the SAS Model Studio interface has been used to develop the machine learning model.

So, you developed your favorite predictive model in SAS Visual Data Mining and Machine Learning (SAS VDMML) using SAS Model Studio. Using the pipeline comparison facility, you decided which model won the model tournament and will be used to batch score new data. The video “How to compare models in SAS” steps through this process. The figure below shows the selected champion model in the pipeline comparison tab of SAS Model Studio.

Figure: SAS Model Studio Pipeline Comparison

You now have two options to score your favorite model from the Python Jupyter notebook environment:

Publish the model from SAS Model Studio to score from Python
Download an API endpoint to score the model from Python

It is important to note that with both approaches the model scoring is NOT executed in the Python environment. Model publishing and the API endpoint both provide an integration mechanism: the Python program calls the scoring process, and the scoring itself runs in the SAS Viya environment. This allows Python programmers to take advantage of the powerful SAS Viya processing.

For this showcase we will use a model that provides its scoring assets in a so-called analytical store, or Astore. An Astore is a binary file that contains the state from a predictive analytic procedure, such as a random forest or gradient boosting. This state is created from the results of the training phase of model development. Astores can be created from predictive models developed in SAS VDMML or in SAS Enterprise Miner.

Publish the model from SAS Model Studio to score from Python

From the SAS Model Studio interface, you can use the publishing facility. In the Pipeline Comparison tab, select Publish Model from the overflow menu as shown in the figure below.

Figure: SAS Model Studio Model Publishing

In the publishing wizard, select the publishing destination CAS for batch scoring and provide a name for the published model. Publishing creates an entry for the model in the destination table; by default, that table is called SAS_MODEL_TABLE. Publishing destinations are usually defined by your SAS Administrator in SAS Environment Manager. For more details, please refer to the online documentation.

In order to score the model published to the CAS server from a Python program, a connection needs to be established from your Python environment to a running CAS server. This can be done using an authentication request API call as shown in the figure below.

Figure: Connect to CAS Server
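The code from the figure is not reproduced here. As a minimal sketch, assuming the SAS SWAT package is installed (`pip install swat`), a connection could be opened along these lines. The host name, the port, and the environment variable names `CAS_USER` and `CAS_PASSWORD` are placeholders chosen for this illustration, so substitute the values for your own environment.

```python
import os


def cas_settings(hostname, port, env=os.environ):
    """Collect connection settings; credentials are read from environment variables.

    CAS_USER and CAS_PASSWORD are hypothetical variable names used only in this sketch.
    """
    return {
        "hostname": hostname,
        "port": port,
        "username": env.get("CAS_USER"),
        "password": env.get("CAS_PASSWORD"),
    }


def connect_to_cas(settings):
    """Open a CAS session with SWAT (imported lazily so cas_settings works without it)."""
    import swat  # assumption: the SWAT package is installed

    return swat.CAS(**settings)


# Example call (placeholder host and port; requires a reachable CAS server):
# conn = connect_to_cas(cas_settings("cas.example.com", 5570))
```

Reading the credentials from the environment keeps them out of the notebook itself, which matters when notebooks are shared.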

In Jupyter notebook, we can now use the published model for batch scoring from Python using the CAS Actions “runModel” or “runModelLocal”. You need to provide the required parameters to the program according to your settings.

inTable:
- caslib = input CAS library for scoring table
- name = name of scoring input table
outTable:
- caslib = output CAS library for scored table
- name = name of scored output table
modelName: name of published model (same as in publishing wizard)
modelTable: name of table that holds published models (by default “SAS_MODEL_TABLE”)

Figure: Score Astore model with runModelLocal CAS Action
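The parameters listed above can be sketched as a call through SWAT. This is an illustration rather than the code from the figure: it assumes `conn` is an open `swat.CAS` session, that the action lives in the `modelPublishing` action set, and that the model table can be addressed by name (the optional `model_caslib` argument is an addition for this sketch). All function and argument names are hypothetical.

```python
def run_model_local_params(model_name, in_caslib, in_table, out_caslib, out_table,
                           model_table="SAS_MODEL_TABLE", model_caslib=None):
    """Build the parameter dictionary for runModelLocal from the settings above."""
    model_tbl = {"name": model_table}
    if model_caslib is not None:
        # Assumption: the model table may need an explicit caslib in your setup.
        model_tbl["caslib"] = model_caslib
    return {
        "modelName": model_name,    # same name as in the publishing wizard
        "modelTable": model_tbl,    # table that holds the published models
        "inTable": {"caslib": in_caslib, "name": in_table},
        "outTable": {"caslib": out_caslib, "name": out_table},
    }


def score_published_model(conn, params):
    """Load the action set and run the scoring in CAS; conn is an open CAS session."""
    conn.loadactionset("modelPublishing")
    return conn.modelPublishing.runModelLocal(**params)


# Example call (placeholder names; requires an open CAS session `conn`):
# params = run_model_local_params("my_model", "casuser", "score_input",
#                                 "casuser", "score_output")
# result = score_published_model(conn, params)
```

Building the dictionary separately makes it easy to inspect the settings in a notebook cell before any work is sent to the server.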

Running this code from Python will process the scoring of the input table in the SAS Viya environment and create the scored table. Both the input and the output table will be held in memory in the SAS Viya environment. In order to make the scored table available to other users or applications in SAS Viya, it needs to be promoted.

Figure: Promotion of the scored CAS Table
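Promotion can be sketched with the `promote` action of the `table` action set. As before, this assumes `conn` is an open `swat.CAS` session; the function name and the example library and table names are placeholders, and the code in the figure may differ.

```python
def promote_scored_table(conn, caslib, name):
    """Promote a session-scope CAS table to global scope so other sessions can use it."""
    return conn.table.promote(caslib=caslib, name=name)


# Example call (placeholder names; requires an open CAS session `conn`):
# promote_scored_table(conn, "casuser", "score_output")
```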

Download an API endpoint to score the model from Python

As a second option, we can use an API endpoint that is created automatically for batch scoring and can be called from different front ends, such as SAS, Python or REST. In the Pipeline Comparison tab of SAS Model Studio, select Download score API from the overflow menu. Then choose Python as the front end.

Figure: Download Scoring API for Python

Copy the provided Python code snippet into a program in Jupyter notebook and insert the required parameters to run the program.

Host: URL of the running CAS server
Port: Port of the running CAS server (optional)
datasourceUri: SAS Viya link to the input table for the batch scoring
outputCasLibName: Name of the output library for the scoring output table
outputTableName: Name of the scoring output table

Figure: Score Astore Model from Python using an API Call
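The downloaded snippet itself is the authoritative code and is not reproduced here. As a small companion sketch, the helper below only gathers the five values listed above in one place and fails early if a required one is blank, so they can be reviewed before being inserted into the downloaded program. The function name is hypothetical, and nothing here calls the scoring API.

```python
def score_api_settings(host, datasource_uri, output_caslib, output_table, port=None):
    """Gather the values the downloaded snippet asks for and reject blank required ones."""
    settings = {
        "host": host,
        "port": port,  # optional, may stay None
        "datasourceUri": datasource_uri,
        "outputCasLibName": output_caslib,
        "outputTableName": output_table,
    }
    missing = [key for key, value in settings.items() if key != "port" and not value]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))
    return settings


# Example call (all values are placeholders):
# settings = score_api_settings("https://viya.example.com", "<link to input table>",
#                               "casuser", "score_output")
```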

Running this code in Jupyter notebook will trigger the execution of the scoring in SAS Viya and create the scoring output table in the CAS environment.

Figure: Scored Table in CAS

Hopefully the examples in this blog demonstrated how easy it is to score Astore models in CAS from a Python program using Jupyter notebook.

Finally, I would like to thank my colleagues at SAS who helped review and publish this blog.

SAS Communities Library