
Score Millions of Records in Minutes Using Python Models: Python Container Scoring Optimization

Started ‎10-02-2023 by
Modified ‎10-02-2023 by

SAS Viya is fast. Earlier this year, an independent group analyzed and reviewed the speed at which SAS Viya could load and pre-process data, train and score models, and calculate accuracy. They found that, on average, SAS Viya was 30 times faster than all other solutions in the study. And it makes sense that SAS Viya is so fast. SAS has been in the analytics game for almost 50 years and is staffed by the best statistical programmers and computer scientists, who can optimize not only the engine running the code but also the code itself.

 

In my tests, I have seen SAS Viya score 10 million records of data using a SAS model in about 2 minutes. But what about a real challenge? How fast can SAS Viya run non-SAS code in a non-SAS engine? The SAS Model Manager team was challenged to run Python models faster. At minimum, the team was tasked with scoring 10 million records using a Python model in 45 minutes. I am happy to report that the SAS Model Manager team not only met that challenge but exceeded it! After optimizing our Python base container, SAS Model Manager can deploy Python models into a container in 2-3 minutes. Once the container is built and running, the team’s benchmarks show that a single container can score 10 million records of data in under 3 minutes. That is over 15 times faster than our goal. Additionally, that means we are scoring 60,000 rows per second!
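To make the arithmetic behind those numbers explicit, here is a small sketch that uses only the figures quoted above. The 3-minute value is the upper bound on the measured run time, so the speedup and throughput it produces are lower bounds rather than exact benchmark results.

```python
ROWS = 10_000_000        # records scored in the benchmark
GOAL_MINUTES = 45        # original target for the team
MEASURED_MINUTES = 3     # upper bound: the run finished in under 3 minutes


def rows_per_second(rows, minutes):
    """Average throughput for a run of the given length."""
    return rows / (minutes * 60)


# Lower bound on the speedup relative to the 45-minute goal
speedup = GOAL_MINUTES / MEASURED_MINUTES

# Lower bound on throughput (a run of exactly 3 minutes)
floor_throughput = rows_per_second(ROWS, MEASURED_MINUTES)

# Run time implied by the quoted 60,000 rows per second
seconds_at_60k = ROWS / 60_000

print(f"Speedup over the goal: at least {speedup:.0f}x")
print(f"Throughput at exactly 3 minutes: {floor_throughput:,.0f} rows/s")
print(f"Run time implied by 60,000 rows/s: {seconds_at_60k:.0f} s")
```

A run of exactly 3 minutes works out to roughly 55,556 rows per second, so the 60,000 rows per second figure corresponds to a run of about 167 seconds, which is consistent with "under 3 minutes."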

 


 

SAS Model Manager is the ModelOps solution for registering, managing, comparing, testing, monitoring, and deploying models on SAS Viya. SAS Model Manager can build and deploy containers for SAS, Python, and R models using a technology called SAS Container Runtime. SAS Container Runtime allows models to run in any OCI-compliant system using a lightweight and scalable container that manages the dependencies required by that model. These containers are executed outside of SAS Viya without needing to pay any additional fees to SAS.

 


 

To leverage the faster scoring of Python models inside a container, you will need SAS Viya 2023.09 or later. Next, you will need to:

  • Configure the container registry where SAS Viya will push the container. Registries in Docker, Azure, GCP, and AWS are supported.
  • Configure the container publishing destination in SAS Model Manager.
  • Register the Python model into SAS Model Manager with the requirements.json file. This file tells SAS Model Manager what packages and package versions the Python model needs to run so that the packages are included in the container. If you need help getting the model into SAS Model Manager with the requirements file, don't fear. Python-sasctl can help.
  • From SAS Model Manager, publish the Python model to the container publishing destination.
  • Run the container. If you are leveraging Docker, you can use the following command to start your container on the specified port:

 

docker run -p <port-number>:8080 <container-name>

 

  • Send data to the container via REST API. If you published Python models to a container before 2023.09, note that new endpoints are available. They align the Python container with the SAS base container, which already had a streamlined REST call. The old endpoints still work as expected, but they are slower. If you are calling your container from Python, you can use the following block of code:

 

import json
import pprint

import pandas as pd
import requests

# Specify URL
url = 'http://localhost:<port-number>/<model-name>'

# Make POST request with data
# (`data` is assumed to be a pandas DataFrame holding the rows to score)
response = requests.post(
    url=url,
    headers={"Content-Type": "application/json"},
    data=json.dumps(data.to_json())
)

# View results
new_container_results = pd.DataFrame(response.json()["data"])
pprint.pprint(response.json()["metadata"])
pprint.pprint(new_container_results)
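For the registration step above, it can help to see what a requirements.json file might look like. The sketch below builds one install step per package with the standard library only. The step/command layout is an assumption based on what python-sasctl appears to generate, and the package names and versions are hypothetical, so check the SAS Model Manager documentation before relying on it.

```python
import json

# Hypothetical package pins; replace with the versions your model was trained with.
PACKAGES = {"pandas": "1.5.3", "scikit-learn": "1.2.2"}


def build_requirements(packages):
    """Return one install step per package, pinned to an exact version.

    The "step"/"command" keys are an assumed layout, not a confirmed schema.
    """
    return [
        {"step": f"install {name}", "command": f"pip install {name}=={version}"}
        for name, version in packages.items()
    ]


# Print the JSON text; saving it as requirements.json next to the model
# files is left to the reader.
print(json.dumps(build_requirements(PACKAGES), indent=4))
```

Pinning exact versions keeps the packages in the container in line with the environment where the model was trained.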

 

The following demo shows the container in action:

 

 

Want to learn more? Check out these resources!
