
Submit workloads to SAS Viya from Python

Started ‎04-11-2022 by
Modified ‎04-11-2022 by

SAS Viya provides several mechanisms for integrating the Python language with SAS Viya's data and analytics capabilities. One such tool is SASPy, a module that creates a bridge between Python and SAS, allowing Python developers, who may not necessarily be familiar with SAS code, to leverage the power of SAS directly from a Python client. In this post, we'll look at the setup required to configure SASPy to access SAS Viya (deployed on Kubernetes) from a Jupyter notebook.

 

What is SASPy?

 

The open-source SASPy Python module converts Python code to SAS code and runs the code in SAS. It provides Python APIs to SAS so that you can start a SAS session and run analytics from Python. You can move data between SAS data sets and pandas DataFrames and exchange values between Python variables and SAS macro variables. You can use the module in interactive line mode, in batch Python, and in Jupyter notebooks. The results include ODS output and can be returned as pandas DataFrames. Not all Python methods are supported, but you can customize the module to add or modify methods.
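As a rough illustration of the data and macro-variable exchange described above, here is a minimal sketch. It assumes `sas` is an already-open SASPy session and `df` is a pandas DataFrame; the helper name `round_trip`, the table name `demo`, and the macro variable name `row_count` are made up for this example.

```python
def round_trip(sas, df, table="demo"):
    """Copy a DataFrame to SAS and back, and pass one value through a macro variable."""
    # Python -> SAS: df2sd() uploads the DataFrame as a SAS data set
    # (into the session's default library, normally WORK).
    sas.df2sd(df, table=table)

    # SAS -> Python: sd2df() reads the data set back into a new DataFrame.
    returned = sas.sd2df(table)

    # Python variable -> SAS macro variable -> Python variable.
    sas.symput("row_count", len(df))
    row_count = sas.symget("row_count")

    return returned, row_count
```

With a live session you would call it as `returned, n = round_trip(sas, my_dataframe)`.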

 

Requirements and Setup

 

You will need:

1. A SAS Viya environment. I'm running the latest Stable cadence of SAS Viya on Kubernetes, but SASPy supports any version from SAS 9.4 onwards.

2. A Python environment. In my lab, I downloaded and installed Anaconda which includes Python and Jupyter notebook.

3. SASPy. Go to the GitHub repository and download and install by following the instructions. I had an older version of Python installed on my client machine, so to make sure the SASPy module was made available to the 'correct' Python (the one that comes with Anaconda), I launched the command line prompt from the Anaconda Navigator interface and installed SASPy from there by running: pip install saspy.
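If you have more than one Python on the machine, as I did, a quick check like the one below can confirm which interpreter is running and whether it can see SASPy. This is only a sketch using the standard library; the helper name `describe_module` is my own.

```python
import importlib.util
import sys


def describe_module(name="saspy"):
    """Return (interpreter path, module file location or None if not installed)."""
    spec = importlib.util.find_spec(name)
    location = spec.origin if spec is not None else None
    return sys.executable, location


interpreter, location = describe_module("saspy")
print("Python interpreter:", interpreter)
print("saspy found at:", location if location else "not installed for this interpreter")
```

Run it from the same Jupyter notebook or prompt you plan to use. If the interpreter path is not the Anaconda one, or the location is reported as missing, SASPy was installed into a different Python.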

 

af_1_saspyinstall.png


  

To configure, specify the connection parameters in the sascfg_personal.py configuration file for SASPy to connect to your Viya deployment per the instructions. In the config file, you can specify the preferred connection method (for SAS Viya, we're limited to HTTP/HTTPS connections) and authentication information. Mine looks something like:

 

SAS_config_names=['httpsviya']

SAS_config_options = {'lock_down': False,
                      'verbose'  : True,
                      'prompt'   : True
                     }

SAS_output_options = {'output' : 'html5'}       # not required unless changing any of the default

httpsviya = {'ip'      : 'gelcorp.sas.com',
             'context' : 'Data Mining compute context',
             'authkey' : 'HTTP_Dev_Henrik',
             'options' : ["fullstimer", "memsize=1G"]
             }

 

Note that in this example, we're using an authinfo file, which is placed in the user's home directory and whose contents look like:

 

HTTP_Dev_Henrik user Henrik password lnxsas
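If you prefer to create that file from Python, something along these lines would do it. Treat it as a sketch under assumptions: the helper name `write_authinfo` is hypothetical, and the file names (`.authinfo` on Linux and macOS, `_authinfo` on Windows) are my reading of the SASPy documentation, so verify them for your version. Because the password is stored in plain text, the sketch also restricts the file to its owner.

```python
import os
import stat
from pathlib import Path


def write_authinfo(authkey, user, password, directory=None):
    """Write a one-line authinfo file and restrict it to the owner.

    Note: this overwrites any existing authinfo file in the directory.
    """
    home = Path(directory) if directory is not None else Path.home()
    # Assumed naming convention: '_authinfo' on Windows, '.authinfo' elsewhere.
    filename = "_authinfo" if os.name == "nt" else ".authinfo"
    path = home / filename
    path.write_text(f"{authkey} user {user} password {password}\n")
    # Owner read/write only (mode 600); this has limited effect on Windows.
    path.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return path
```

The first argument must match the 'authkey' value in sascfg_personal.py, for example `write_authinfo("HTTP_Dev_Henrik", "Henrik", "lnxsas")`.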

 

Demonstration

 

Once everything is configured, we can try it by running some code. Here's my demo Jupyter notebook...  

 

af_2_jupyter-1024x1018.png

 

...and a running commentary of what happened in 4 steps:

 

  • (1) In the first step, we import the SASPy libraries. Then we establish a connection to SAS Viya by referring to the connection defined in the config file, and a SAS Compute Server (SAS Programming Runtime) session is launched. As per the (configurable) settings in sascfg_personal.py, the Data Mining context is used.
  • (2) Python methods are then used to query a sample SAS data set in various ways. SASPy converts the Python code to SAS code, which it executes in the Compute Server session. The results of the queries are returned directly as pandas DataFrames.
  • (3) We then embed a fragment of actual SAS code and run it.
  • (4) The session stays open until we end it with the endsas() method.
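For readers who can't make out the screenshot, the four steps above can be sketched roughly as follows. This is not a copy of my notebook: the data set (sashelp.cars), the PROC MEANS fragment, and the helper names `connect` and `run_demo` are stand-ins chosen for illustration. The SASPy calls used are SASsession, sasdata, head, describe, submit and endsas.

```python
def connect(cfgname="httpsviya"):
    """(1) Start a SAS Compute Server session from a named configuration."""
    # Imported here so the rest of the sketch can be read without SASPy installed.
    import saspy
    return saspy.SASsession(cfgname=cfgname)


def run_demo(sas):
    """Walk through steps 2 to 4 against an already-open session."""
    # (2) Point at a SAS data set and query it; results='pandas' asks SASPy
    #     to return DataFrames rather than rendered HTML.
    cars = sas.sasdata("cars", libref="sashelp", results="pandas")
    first_rows = cars.head()
    summary = cars.describe()

    # (3) Submit a fragment of actual SAS code; submit() returns a dict
    #     holding the SAS log ('LOG') and the listing output ('LST').
    result = sas.submit("proc means data=sashelp.cars; var msrp; run;")

    # (4) Close the session explicitly.
    sas.endsas()
    return first_rows, summary, result["LOG"]
```

In a notebook you would run `sas = connect()` followed by `first_rows, summary, log = run_demo(sas)`. With 'prompt' set to True in the config, SASPy may ask for any connection details it cannot find.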

 

By the way, if we look in the logs (using Kibana), we can see the SAS code that actually ran in the compute session:  

 

af_3_saspy_kibana-1024x533.png

 

Additional Resources

 

Refer to the SASPy documentation for more information about installation, configuration and usage.

 

Of course, SASPy is not the only way to integrate Python and SAS. The Python SWAT package can be used to interact with CAS directly using the CAS REST API, allowing developers to run CAS actions from Python. Other built-in capabilities and optional modules also exist for things like Deep Learning, ESP, for accessing Python from a SAS program, and more.

 

Quite a lot of material is available for further information, including several blog posts, YouTube videos (including this excellent tutorial), and the official documentation.

 

And to wrap up, a note on using SASPy with SAS Workload Management. I have previously written about submitting SAS 9.4 Grid Manager jobs from Python. That capability is now available for any SAS Viya deployment that includes a SAS Workload Management license. In Workload Management, any 'compute' (excluding CAS) workload is submitted as a job to SAS Workload Orchestrator, which uses its built-in smarts to determine where to send it for execution. So in effect, any SAS Compute Server session is grid-enabled, including sessions launched from Python. The workloads are submitted as jobs to the Workload Orchestrator Manager, which starts a launcher pod on an appropriate node in the cluster to execute the code. An admin can then interact with and manage the jobs as though they were any other job, and can also use the additional monitoring and administration functions provided by Workload Management.

 

Thank you for reading.

 

Find more articles from SAS Global Enablement and Learning here.

Comments

Very useful and interesting blog @AjmalFarzam .

 

Great to see you make the distinction between Python users running code within a SAS Compute Server session and running CAS actions within the CAS Server. This is really helpful, as it illuminates the opportunity for SAS programs that do not leverage CAS to be run. And as you note in your wrap-up, these kinds of sessions can fall under the control of SAS Viya Workload Orchestrator.

 

And great to see links to SAS and user-generated content like the YouTube tutorial.

 

Thanks for putting all of this together.

 

--Simon

 

Tried it now, it's a great way of submitting and getting results from Python... Thanks, Ajmal!

Version history
Last update:
‎04-11-2022 08:18 PM