
Running SAS Models in Azure Synapse and Databricks Without Invoking SAS

Started ‎10-13-2022, modified ‎10-13-2022

In SAS Viya, we can publish and run a SAS scoring model in several target data platforms:

 

  • Hadoop Cloud Services
  • Cloudera Data Platform
  • Databricks
  • Azure Synapse Analytics
  • Teradata

 

A question that often comes up is whether SAS models, once published, can be run directly from within the target data platform, without running a SAS program. This makes sense when you want to embed a scoring step in a larger data engineering process without mixing technologies or handling complex integration points.

 

Recently, such capabilities have been added to Azure Synapse and Databricks. It is now possible to run SAS models inside Azure Synapse and Databricks without invoking SAS or running a SAS program.

 

To do so, we will use the Scala and Python APIs released in SAS Viya 2021.2.2. Keep in mind that to use these APIs:

 

  • SAS In-Database Technologies for Databricks or Azure Synapse must be licensed (it is included in some SAS Viya offerings and can be added to others)
  • The SAS Embedded Process must be installed on the target platform

 

Let’s highlight some of the important instructions by looking at a Scala example on Azure Synapse.

 

First, you have to import the package that contains the implementation of the Model class:

 

import com.sas.spark.scoring._

 

To score data, we need to load the input table into a Spark dataset:

 

var inDataset = spark.table("default.hmeq_spark")

 

Then, we need to create a model object from a model that was previously published to ADLS from SAS Viya:

 

var mymodel = Model.create(inDataset,"abfss://blobdata@mystorageaccount.dfs.core.windows.net/models/01_gradboost_astore/01_gradboost_astore.is")

 

ABFSS is the driver used in Azure Synapse to access blobs in ADLS. 01_gradboost_astore is the name of the SAS model published to ADLS from SAS.
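These URIs are easy to get wrong, so it can help to assemble them from their parts. Below is a minimal, hypothetical Python helper (the function name and the `<container>/models/<model>/<model>.is` layout are illustrative, matching the path used in this example, not a SAS-mandated convention):

```python
def abfss_model_uri(container: str, account: str, model_name: str) -> str:
    """Assemble the abfss:// URI of a model published to ADLS.

    Assumed layout: <container>/models/<model_name>/<model_name>.is
    """
    return (f"abfss://{container}@{account}.dfs.core.windows.net"
            f"/models/{model_name}/{model_name}.is")

# Rebuilds the URI used in the Scala example above:
print(abfss_model_uri("blobdata", "mystorageaccount", "01_gradboost_astore"))
# → abfss://blobdata@mystorageaccount.dfs.core.windows.net/models/01_gradboost_astore/01_gradboost_astore.is
```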

 

Optionally, we can set some options on the model:

 

mymodel.setDBMaxText(2000)
mymodel.setTraceON

 

Check the documentation for additional information on the options available.

 

Then we are ready to run the SAS model. This produces an output Spark dataset:

 

var dfout = mymodel.run

 

Finally, we may want to save the output dataset as a Spark table:

 

dfout.write.mode("overwrite").saveAsTable("default.hmeq_spark_astore_api")

 

Here we go! We have run a SAS scoring model directly in the Azure Synapse ecosystem, and we can immediately leverage the scoring insights contained in the output Spark table.

 

What about an example with Python and Databricks?

 

Here are the equivalent Python instructions used against Databricks in this case:

 

from sasep.model import Model

hmeqin = spark.table("default.hmeq_prod")

mymodel = Model.create(hmeqin, "dbfs:/mnt/adls/models/01_gradboost_astore/01_gradboost_astore.is")

mymodel.setDBMaxText(2000)
mymodel.setTraceON()

hmeqout = mymodel.run()

hmeqout.write.mode("overwrite").saveAsTable("default.hmeq_out_api")

 

Notice that in this case we have to mount the ADLS blob container (or an S3 bucket if we run on AWS) to the Databricks file system, hence the dbfs:/ path pointing to a mount point.
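A sketch of how the dbfs:/ model path relates to the mount point is shown below. The helper function, the `/mnt/adls` mount point, and the folder layout are hypothetical, chosen to match the path in the Python example above; the commented `dbutils.fs.mount` call is Databricks-specific and only runs in a Databricks notebook:

```python
def dbfs_model_path(mount_point: str, model_name: str) -> str:
    """Build the dbfs:/ URI of a model under an ADLS mount point.

    Assumed layout: <mount_point>/models/<model_name>/<model_name>.is
    """
    return f"dbfs:{mount_point}/models/{model_name}/{model_name}.is"

# In a Databricks notebook, the container would first be mounted, e.g.:
# dbutils.fs.mount(
#     source="abfss://blobdata@mystorageaccount.dfs.core.windows.net/",
#     mount_point="/mnt/adls",
#     extra_configs=configs)  # OAuth settings for your service principal

print(dbfs_model_path("/mnt/adls", "01_gradboost_astore"))
# → dbfs:/mnt/adls/models/01_gradboost_astore/01_gradboost_astore.is
```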

 

You can find complete examples in the documentation. Both the Scala and Python APIs can be used interchangeably with Azure Synapse and Databricks.

 

Many thanks to my colleagues Maggie Marcum, Josh Mcclung, David Ghazaleh and Alex Fang for their help.
