Model Execution in SAS Model Manager

One of the key capabilities of SAS Model Manager is the ability to execute SAS, Python, and R models, so that these models can be used to easily score new data. In fact, SAS Model Manager supports multiple execution environments, from batch execution within SAS Viya to transactional REST-API endpoints hosted on SAS Viya or outside of it via containers.

Understanding these execution environments and their interactions with open source is a helpful skill for debugging and resolving issues in a timely manner. Today’s article will focus on the three most popular execution engines within SAS Model Manager with tips for administrators for addressing common issues.

SAS Cloud Analytic Services

SAS Cloud Analytic Services provides the run-time environment for data management and analytics on SAS Viya. Thus, you will hear CAS referenced as a place to store data, but also as an analytical engine. CAS operates in-memory to provide the best performance for big data. CAS supports execution of SAS, Python, and R models.

Within SAS Model Manager, CAS acts as both a publishing destination and the execution engine behind scoring tests. Scoring tests provide a quick way to execute a model against a table within CAS. To create a new scoring test in CAS, a user selects a model, a version of that model, an input data set, and an output library.

Executing models within CAS takes just a few clicks for the user, but SAS Model Manager performs several steps behind the scenes. SAS Model Manager wraps the modeling code in a language called DS2. DS2 offers several advantages, including parallel computation even within CPU-bound processes. This code includes steps to read data from CAS, execute the model code, and write the resulting data back to CAS. For Python models, an interface between DS2 and Python, called PyMAS, is invoked within the DS2 code. PyMAS handles the movement of data into and out of the Python model score code. For R models, SAS Model Manager automatically generates a Python wrapper around the R score code, with the R model embedded within a Python process using the rpy2 package. This means that PyMAS can also act as the intermediary between the DS2 code and the wrapped R model.

Tip: When executing Python and R models, do you notice pypgm, resultCode, and revision as columns in the output table? These are variables used by SAS Model Manager within the DS2 code for handling score code and result codes. Each PyMAS package instance represents exactly one Python module revision, and you may only see a value populated in the revision column when a new revision is invoked. If your Python code fails, a -1 is returned for the revision. Furthermore, a resultCode value of 0 indicates successful execution for that row of data, while a non-zero value indicates a failure. A list of resultCode meanings can be found here.
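To make the resultCode check concrete, here is a minimal Python sketch that counts successes and failures in a scored output table. It assumes the rows have already been pulled out of CAS into a list of dictionaries; the sample rows, the customer_id column, and the specific non-zero code are hypothetical and only there for illustration.

```python
from collections import Counter


def summarize_result_codes(rows):
    """Count rows by resultCode and collect the positions of failed rows."""
    counts = Counter(row["resultCode"] for row in rows)
    # Per the tip above, 0 means success and any non-zero value means failure.
    failed = [i for i, row in enumerate(rows) if row["resultCode"] != 0]
    return counts, failed


# Hypothetical scored rows, as they might look after fetching the output table.
scored_rows = [
    {"customer_id": 101, "EM_EVENTPROBABILITY": 0.82, "resultCode": 0},
    {"customer_id": 102, "EM_EVENTPROBABILITY": None, "resultCode": 1},
    {"customer_id": 103, "EM_EVENTPROBABILITY": 0.13, "resultCode": 0},
]

counts, failed = summarize_result_codes(scored_rows)
print(f"{counts[0]} rows scored successfully, {len(failed)} failed")
print("Failed row positions:", failed)
```

A quick summary like this can help you decide whether a problem affects every row (often an environment or score code issue) or only some rows (often a data issue).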

SAS Micro Analytic Service

SAS Micro Analytic Service (MAS) is a stateless, memory-resident, high-performance program execution service that is available as an out-of-the-box publishing destination in SAS Model Manager. The MAS destination hosts the model on SAS Viya, which reduces the amount of configuration work required to score new data using the model. Additionally, the MAS destination exposes a REST API endpoint to the model, making it easy to incorporate the model into a wider process or application. MAS is best suited for ad-hoc transactional scoring in near-real time. For example, MAS is a great fit for determining if a credit card purchase is fraudulent right after the card is swiped. MAS is also great for scoring data in an application right after the applicant clicks submit.

We can deploy a SAS or Python model into MAS in just a few clicks from the SAS Model Manager user interface. First, select your model(s) and hit the Publish button. From the Publish Models dialog, select the MAS destination. Give your published model a memorable name before clicking the Publish button. In just a few seconds, we are ready to start scoring data over the REST API using our model!

Just like model execution in CAS, SAS Model Manager automatically wraps the scoring code in DS2 and leverages PyMAS as the interface between DS2 and Python. But within MAS, a transaction is scored via a POST request with the input data as a JSON payload. This means that the input data to score is sent to the deployed model via the REST API, and the resulting output is returned as part of that same call.
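As a rough illustration of that POST request, the sketch below builds (but does not send) a scoring call using only the Python standard library. The endpoint path, the name/value payload shape, the step name "score", and the host, module name, input variables, and token are all assumptions made for this example, so check the MAS REST API documentation for your release before relying on them.

```python
import json
import urllib.request


def build_mas_payload(inputs):
    """Turn a dict of input values into a list of name/value pairs (assumed body shape)."""
    return {"inputs": [{"name": name, "value": value} for name, value in inputs.items()]}


def build_score_request(base_url, module_id, step_id, inputs, token):
    """Assemble a POST request for one transaction; nothing is sent here."""
    # Assumed URL layout: <host>/microanalyticScore/modules/<module>/steps/<step>
    url = f"{base_url}/microanalyticScore/modules/{module_id}/steps/{step_id}"
    body = json.dumps(build_mas_payload(inputs)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical host, module name, inputs, and token.
request = build_score_request(
    base_url="https://viya.example.com",
    module_id="fraud_model",
    step_id="score",
    inputs={"amount": 129.99, "merchant_category": "5411"},
    token="<access-token>",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send the request against a real environment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the payload builder separate from the request makes it easy to inspect the JSON body when a call fails.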

SAS Container Runtime

Models can also be deployed outside of SAS Viya. SAS Model Manager can build containers for SAS, Python, and R models that are published to container registries in Azure, Docker, GCP, and AWS. These containers are lightweight, portable, standardized, and scalable. Using SAS Container Runtime, you can take advantage of cloud infrastructure, or a local installation, to deploy applications that have a small footprint and are highly scalable and highly available. This ensures that resources can be fully utilized to quickly execute the largest number of models, and it reduces the effort required to manage the environment.

Upon publishing, SAS Model Manager will build the container, install dependencies, add the scoring code, and push the container to the specified registry. SAS Model Manager leverages different base containers, depending on the score code type of the model. This allows us to minimize the container footprint by only including what is necessary.

After deploying the container into Kubernetes or Docker, users can score data via REST API with a JSON payload, just like with MAS. The difference is that containers provide scalability and the ability to execute without SAS Viya.

Installation and Configuration

When installing SAS Viya, there are a few key steps to ensure models run smoothly within SAS Viya.

First, ensure that the ASTORES persistent storage volume (PV) and the overlays are configured during the deployment of SAS Viya. The overlays mount the directory paths that point to the ASTORES persistent volume claim (PVC).

Next, for running Python models, also ensure that the Kubernetes persistent volume (PV) and overlays for the customer-prepared Python installation (and any required packages) are configured during the deployment of SAS Viya. Additionally, the PyMAS package should be configured with the environment variables pointing to the Python executable file and the mas2py.py file. There are also minimum versioning and packaging requirements for the Python environment.

Finally, for running R models, complete the steps for Python, including the setup for rpy2, and ensure that the Kubernetes persistent volume (PV) and overlays for the customer-prepared R installation (and any required packages) are configured during the deployment of SAS Viya.

Tip: When working with CAS and MAS destinations, take note that only one Python environment and one R environment are supported. The Python and R languages, as well as their many packages and libraries, are not necessarily designed to play nicely across versions. If you are seeing issues when unloading and calling your model, there may be a version mismatch in model dependencies. You can mitigate these issues when deploying to a container by leveraging the requirements.json file.
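As a sketch of what pinning dependencies might look like, the Python snippet below generates the contents of a requirements.json file from a dictionary of package versions. The list-of-steps layout with "step" and "command" keys is an assumption based on what tools such as python-sasctl generate, and the package names and versions are hypothetical; confirm the expected format in the SAS Model Manager documentation.

```python
import json


def build_requirements(pinned_packages):
    """Build one install step per package, pinned to an exact version."""
    return [
        {"step": f"install {name}", "command": f"pip install {name}=={version}"}
        for name, version in pinned_packages.items()
    ]


# Hypothetical versions; use the ones the model was actually trained with.
pinned = {"pandas": "1.5.3", "scikit-learn": "1.2.2"}

requirements = build_requirements(pinned)
requirements_json = json.dumps(requirements, indent=2)
print(requirements_json)
# To save it alongside the other model files:
# with open("requirements.json", "w") as f:
#     f.write(requirements_json)
```

Pinning exact versions means the container is built with the same packages used during training, which avoids the mismatch described in the tip.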

Administration

Running open-source code can introduce security risks. Thus, running open-source code in SAS Model Manager through the scoring test or performance monitoring requires escalated privileges. Folks with approval to run open-source code should be added by an Administrator to the CASHOSTACCOUNTREQUIRED group.

Before users can publish models to a destination, the destination must be created by an administrator. A MAS destination is often available out-of-the-box, and an admin can quickly create a CAS destination in Environment Manager. For container destinations, I recommend leveraging the Command Line Interface. There are examples for creating each destination in GitHub.

Code Requirements

To run Python and R models within SAS Viya, the score code files must follow a specific format. Data scientists can find the score code format required for Python models and R models in the SAS Model Manager documentation. This score code must include import statements for any necessary modules or libraries, a score function definition, and an output statement. Packages like Python-sasctl and R-sasctl can also help generate correctly formatted model score code.
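To give a feel for those three pieces, here is a minimal sketch of Python score code. It assumes the output statement is a string on the first line of the score function that lists the output variable names, and it substitutes hard-coded, hypothetical coefficients for a trained model that would normally be loaded from a file. Treat the documentation as the authority on the exact format.

```python
import math  # import statements for any modules the score code needs

# Hypothetical coefficients standing in for a trained model loaded from disk.
INTERCEPT = -1.5
COEFFICIENTS = {"income": 0.00002, "debt_ratio": 2.0}


def score(income, debt_ratio):
    "Output: EM_EVENTPROBABILITY, EM_CLASSIFICATION"
    # Return missing outputs rather than raising an error on missing inputs.
    if income is None or debt_ratio is None:
        return None, None
    # Logistic regression: linear combination passed through a sigmoid.
    z = (
        INTERCEPT
        + COEFFICIENTS["income"] * income
        + COEFFICIENTS["debt_ratio"] * debt_ratio
    )
    probability = 1.0 / (1.0 + math.exp(-z))
    classification = "1" if probability >= 0.5 else "0"
    return probability, classification


# Local check before registering the model.
probability, classification = score(income=50000, debt_ratio=0.5)
print(round(probability, 4), classification)
```

Calling the score function locally with a few representative rows is a cheap way to catch errors before running a scoring test.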

Want to learn more? Check out these resources:

Did I miss any helpful tips? Add them in the comments below!