
Exploring the configuration: using Python with SAS Analytics Pro

Started ‎12-18-2022 by
Modified ‎12-18-2022 by

In this post we will look at using Python with SAS Analytics Pro. More precisely, we will look at calling SAS (Analytics Pro) from a Python programming environment.

 

SAS provides several mechanisms for integrating the Python language with SAS data and analytics capabilities. One such tool is SASPy, which is a module that creates a bridge between Python and SAS (the SAS Foundation). In this post we will look at the configuration required to integrate SASPy with Analytics Pro.

 

What is SASPy?

 

SASPy provides Python APIs to the SAS system, allowing the Python programmer to start a SAS session and run analytics from Python through a combination of object-oriented methods and explicit SAS code submission. Data can be moved between SAS data sets and Pandas dataframes; SASPy also allows the exchange of values between Python variables and SAS macro variables.
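
As a rough illustration of those exchanges, here is a minimal sketch. It assumes that ‘sas’ is an already-started saspy.SASsession and that ‘df’ is a Pandas dataframe; the function name, the table name ‘pydata’ and the macro variable name ‘cutoff’ are made up for the example.

```python
def round_trip(sas, df, macro_value):
    """Exchange data and a macro value with an already-started SASPy session.

    `sas` is assumed to be a saspy.SASsession and `df` a Pandas dataframe.
    The table name and macro variable name below are illustrative only.
    """
    # Pandas dataframe -> SAS data set WORK.PYDATA
    sas_table = sas.df2sd(df, table="pydata", libref="work")

    # Python value -> SAS macro variable &cutoff, then read it back
    sas.symput("cutoff", macro_value)
    cutoff = sas.symget("cutoff")

    # SAS data set -> Pandas dataframe
    df_back = sas.sd2df(table="pydata", libref="work")
    return sas_table, cutoff, df_back
```

The session itself is created with saspy.SASsession(), which is covered later in this post once the connection has been configured.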

 

Let’s have a look at how this works with Analytics Pro.

 

SASPy connectivity

 

SASPy supports several connection methods, which are described in the SASPy documentation.

 

When connecting to Analytics Pro, regardless of where it is running (Windows, Linux, Intel macOS, etc.), the SSH (STDIO over SSH) connection method must be used. The SASPy documentation describes this method as the one for connecting to SAS environments running on a Linux platform. Because Analytics Pro runs in a container whose base image is built on Linux [Red Hat Universal Base Image (UBI) 8], this is the method that applies, whatever the operating system of the Docker host.

 

Note, in late 2020, Apple began the transition from Intel processors to Apple silicon in Mac computers. Analytics Pro is currently not supported on devices using this new CPU architecture.

Configuring SAS Analytics Pro

 

The Analytics Pro documentation describes the required configuration, see Enable Use of SASPy.

 

In my testing I was running Analytics Pro on a Linux server. The following graphic illustrates the environment that I used for my testing.

 

MG_1_202211_SASPy_overview_withAD.png


  

 

As previously stated, the SASPy connection uses SSH, so either password-based or passwordless (key-based) SSH is required. Passwordless SSH is usually preferred because it removes the need to prompt the user for a password when connecting to Analytics Pro.

 

To enable SSH in the Analytics Pro container, you must provide the following:

 

  • Enable the SSH port, port 22 by default. Port 22 in the container needs to be mapped to a port on the Docker host. This is done on the ‘docker run’ command using the ‘--publish’ parameter. For example, ‘--publish 8022:22’ maps port 22 in the container to port 8022 on the Docker host.
  • SSHD configuration: an ‘sshd.conf’ configuration file is required in the sasinside/sasosconfig directory. The file doesn’t have to have any content.

 

In addition to the SSH configuration, two Linux capabilities are required: ‘AUDIT_WRITE’ and ‘SYS_ADMIN’. You enable them with ‘--cap-add’ parameters on the ‘docker run’ command. The capabilities are required by the container operating system (UBI 8) when enabling SSH; they are not an Analytics Pro requirement.
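
The SSH-related pieces described above can be collected in a small Python helper. This is only a sketch: the function names are made up for the example, and the returned list contains just the SSH-related arguments, not the image name, volumes or other options that a complete ‘docker run’ command for Analytics Pro needs.

```python
from pathlib import Path


def ensure_sshd_conf(sasinside: Path) -> Path:
    """Create an empty sshd.conf under sasinside/sasosconfig if it is missing."""
    conf = sasinside / "sasosconfig" / "sshd.conf"
    conf.parent.mkdir(parents=True, exist_ok=True)
    conf.touch(exist_ok=True)  # the file does not need any content
    return conf


def ssh_docker_args(host_port: int = 8022) -> list:
    """Return only the SSH-related 'docker run' arguments."""
    return [
        "--publish", f"{host_port}:22",  # host port -> container port 22
        "--cap-add", "AUDIT_WRITE",
        "--cap-add", "SYS_ADMIN",
    ]
```

The returned list would need to be merged into the rest of the ‘docker run’ command line that you already use to start Analytics Pro.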

 

Once you have started Analytics Pro with this configuration, it is ready for SASPy connections from Python programming clients.

 

Python environment configuration

 

The configuration of the Python environment is fairly straightforward. The SAS documentation states that along with installing the SASPy package, you also need to install the ‘wheel’ and ‘pandas’ packages.

 

You also need to generate an SSH key pair (public and private keys) when using passwordless SSH.
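
As a rough sketch, the key pair could be generated by calling ‘ssh-keygen’ from Python. This assumes that an OpenSSH ‘ssh-keygen’ is available on the PATH. The RSA key type, the 4096-bit size and the empty passphrase are assumptions made for the example rather than requirements from the SAS documentation, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def keygen_command(key_path: Path) -> list:
    """Build an ssh-keygen command for an RSA key pair with an empty passphrase."""
    # -N "" sets an empty passphrase (an assumption, so that no prompt is needed)
    return ["ssh-keygen", "-t", "rsa", "-b", "4096", "-N", "", "-f", str(key_path)]


def generate_key_pair(key_path: Path) -> Path:
    """Run ssh-keygen and return the path of the public key (<key_path>.pub)."""
    key_path.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(keygen_command(key_path), check=True)
    return key_path.with_name(key_path.name + ".pub")
```

For example, generate_key_pair(Path.home() / ".ssh" / "my_rsa_key") would create ‘my_rsa_key’ and ‘my_rsa_key.pub’ in the user’s .ssh folder.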

 

It is important to note that a Windows client may not have an OpenSSH client installed. If it is missing, there are a couple of options:

 

  • Git for Windows, which installs its own SSH client.
  • There is also a GitHub project for OpenSSH, see PowerShell/Win32-OpenSSH.

 

Once you have generated the SSH key pair, the public key needs to be copied to the Analytics Pro container.

 

Using a Linux programming client

 

On a Linux client you can use the ‘ssh-copy-id’ command to copy the SSH key to the Analytics Pro container. For example:

 

ssh-copy-id -i identities_file -p port login_username@docker_server

 

where:

 

  • identities_file: is the SSH key (for example, ‘my_rsa_key’).
  • login_username: is the username for login to Analytics Pro.
  • docker_server: is the host name of the Docker host running the Analytics Pro container.
  • port: is the port of the Docker host that is being mapped to port 22 on the Analytics Pro container.

 

Using a Windows programming client

 

The OpenSSH client for Windows doesn’t provide the ‘ssh-copy-id’ command. So, manual steps are needed to copy the public key to the Analytics Pro container.

 

The contents of the public key must be copied to the ‘authorized_keys’ file in the user’s .ssh folder. This is the .ssh folder in the user’s home directory in the Analytics Pro container (typically /home/username/.ssh). Depending on the setup of the environment it may be necessary to create the .ssh folder prior to creating the authorized_keys file.

 

In my environment I also used the ‘ASKPASS’ utility to help with the SSH commands. It is used to pass the password to the SSH command. For example, I ran the following commands from PowerShell ISE to copy the public key to Analytics Pro.

 

# Create the user's .ssh directory
$env:ASKPASS_PASSWORD = 'xxxxxxx'
$env:SSH_ASKPASS_REQUIRE = "force"
$env:SSH_ASKPASS = "C:\Program Files\OpenSSH\askpass_util.exe"
ssh -o StrictHostKeyChecking=accept-new -p 8022 docker_server -l username "mkdir .ssh"

# Use askpass to copy the SSH public key to the remote host
# (note: '>' overwrites any existing authorized_keys file)
$env:ASKPASS_PASSWORD = 'xxxxxxx'
$env:SSH_ASKPASS_REQUIRE = "force"
$env:SSH_ASKPASS = "C:\Program Files\OpenSSH\askpass_util.exe"
type $env:USERPROFILE\.ssh\my_rsa_key.pub | ssh -p 8022 docker_server -l username "cat > .ssh/authorized_keys"
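
The same two steps could also be scripted from Python, which works on any client that has an ‘ssh’ command on the PATH. The following is only a sketch with hypothetical function names. It does not use the ASKPASS utility, so ssh is expected to prompt for the password, and unlike the commands above it appends to ‘authorized_keys’ rather than overwriting it and also tightens the permissions, which is an addition of mine.

```python
import subprocess
from pathlib import Path

# Remote shell commands: create ~/.ssh if needed, then append the key read from stdin
REMOTE_SCRIPT = (
    "mkdir -p ~/.ssh && chmod 700 ~/.ssh && "
    "cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
)


def install_key_command(host: str, user: str, port: int) -> list:
    """Build the ssh command that installs a public key supplied on stdin."""
    return ["ssh", "-p", str(port), "-l", user, host, REMOTE_SCRIPT]


def install_public_key(pub_key: Path, host: str, user: str, port: int = 8022) -> None:
    """Pipe the public key file to the remote authorized_keys file."""
    subprocess.run(
        install_key_command(host, user, port),
        input=pub_key.read_bytes(),
        check=True,
    )
```

For example: install_public_key(Path.home() / ".ssh" / "my_rsa_key.pub", "docker_server", "username").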


SASPy configuration

 

The final set-up step, once you have the SSH key copied to Analytics Pro, is to create the saspy configuration file, called ‘sascfg_personal.py’ by default.

 

Below is an example of the SSH profile when using my Windows client as the Python programming environment. Note that backslashes in Windows paths, such as the one in the ‘identity’ parameter, need to be escaped as ‘\\’ so that Python reads them correctly.

 

SAS_config_names   = ['ssh']
SAS_config_options = {'lock_down': False,
                      'verbose'  : True,
                      'prompt'   : True
                     }
#SAS_output_options = {'output'  : 'html5'} # not required unless changing any of the defaults

ssh                = {'saspath'  : '/opt/sas/viya/home/SASFoundation/sas',
                      'ssh'      : 'C:\\Program Files\\OpenSSH\\ssh',
                      'identity' : 'C:\\Users\\student\\.ssh\\my_rsa_key',
                      'host'     : 'docker_server',
                      'luser'    : 'username',
                      'port'     : '8022',
                      'options'  : ["-fullstimer"]
                     }

 

Looking at the ‘ssh’ profile:

 

  • The ‘saspath’ parameter specifies the path to the SAS Foundation executable in the Analytics Pro container.
  • The ‘ssh’ parameter is the path to the SSH command on the programming client. In the profile on my Linux client this was set to ‘/usr/bin/ssh’.
  • The ‘identity’, ‘host’, ‘luser’ and ‘port’ parameters provide the information for the SSH connection.
  • The ‘options’ parameter is used to specify options on the SAS session.
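
For comparison, a profile for a Linux client might look like the following sketch. Only the ‘ssh’ and ‘identity’ values differ from the Windows profile; the ssh command is at its usual location of /usr/bin/ssh, and the identity path shown here is a hypothetical location, so adjust it to wherever your private key is stored.

```python
SAS_config_names = ['ssh']

ssh = {'saspath'  : '/opt/sas/viya/home/SASFoundation/sas',
       'ssh'      : '/usr/bin/ssh',
       'identity' : '/home/student/.ssh/my_rsa_key',  # hypothetical key location
       'host'     : 'docker_server',
       'luser'    : 'username',
       'port'     : '8022',
       'options'  : ["-fullstimer"]
      }
```

No escaping is needed in this case because Linux paths use forward slashes.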

 

Start programming with SAS in Python

 

With the set-up completed you are now ready to start programming in Python, using SAS data and PROCs. For example, here is a simple program that I used to query the SASHELP.CLASS table (using my Windows client).

 

#!/usr/bin/env python 
# coding: utf-8 
import saspy 
import pandas as pd 

# Start the session with Analytics Pro 
sas = saspy.SASsession(cfgfile='c:\\Users\\student\\saspy\\sascfg_personal.py', cfgname='ssh', results='text') 

# Query SAS data 
mydata = sas.sasdata("CLASS","SASHELP") 
mydata.head() 
mydata.describe() 

# Close the session 
sas.endsas()
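
The program above uses the object-oriented methods. Explicit SAS code can also be submitted with the session’s submit() method, which returns a dictionary holding the SAS log (‘LOG’) and the listing output (‘LST’). The following is a minimal sketch that would be called before the session is closed; the helper name and the PROC MEANS step are only illustrations, and ‘sas’ is assumed to be an open saspy.SASsession such as the one created above.

```python
# Illustrative SAS code to run against the SASHELP.CLASS table
PROC_MEANS = """
proc means data=sashelp.class mean min max;
   var age height weight;
run;
"""


def run_sas_code(sas, code):
    """Submit SAS code through an open SASPy session; return (log, listing)."""
    result = sas.submit(code, results="text")
    return result["LOG"], result["LST"]
```

For example, log, listing = run_sas_code(sas, PROC_MEANS) followed by print(listing) shows the PROC MEANS output, and print(log) shows the SAS log, which is useful when a step fails.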

 

This resulted in the following output.

 

MG_2_202211_windows_query_class_data-1024x445.png

 


Conclusion

 

As can be seen, the set-up of Analytics Pro and the Python programming environment is not complex. The only real complication arises on a Windows client: there isn’t a ‘ssh-copy-id’ command, so you have to perform manual steps to copy the public key to the Analytics Pro container.

 

A final note on using a Windows client: the SASPy configuration file and the Python script files need to be UTF-8 encoded.

 

I hope this is helpful and thanks for reading. @MichaelGoddard.

 

 

