Let's start with a gentle introduction to the Python CAS client by doing some basic operations like creating a CAS connection and running a simple action. You'll need to have Python installed as well as the SWAT Python package from SAS, and you'll need a running CAS server.
We will be using Python 3 for our example, but the same code will work in Python 2.7. Specifically, we will be using the IPython interactive prompt (type 'ipython' rather than 'python' at your command prompt). The first thing we need to do is import SWAT and create a CAS session. We will use 'cas01' as our CAS hostname and 12345 as our CAS port number. In this case, we will use username/password authentication, but other authentication mechanisms are also possible depending on your configuration.
# Import the SWAT package which contains the CAS interface
In [1]: import swat
# Create a CAS session on host cas01, port 12345
In [2]: conn = swat.CAS('cas01', 12345, 'username', 'password')
In [3]: conn
Out[3]: CAS('cas01', 12345, 'username', protocol='cas', name='py-session-1', session='0f60f4f0-ff84-6843-a8cc-00232492ae09')
As you can see above, we have a session on the server cas01. It has been assigned a unique session ID and a more user-friendly name (here, 'py-session-1'). The protocol='cas' value shows that we are using the binary CAS protocol as opposed to the REST interface. We can now run CAS actions in the session. Let's begin with a simple one: listnodes.
# Run the listnodes action
In [4]: nodes = conn.listnodes()
NOTE: Information is available on 6 nodes.
In [5]: nodes
Out[5]:
[nodelist]
Node List
    name        role  connected    IP Address
0  cas01  controller        Yes  10.37.10.172
1  cas02  controller        Yes  10.37.10.174
2  cas03      worker        Yes  10.37.10.175
3  cas04      worker        Yes  10.37.10.176
4  cas05      worker        Yes  10.37.10.191
5  cas06      worker        Yes  10.37.10.193
+ Elapsed: 0.00413s, user: 0.005s, sys: 0.008s, mem: 0.413mb
The listnodes action returns a CASResults object (which is just a subclass of Python's ordered dictionary). It contains one key ('nodelist') which holds a Pandas DataFrame. We can now grab that DataFrame to do further operations on it.
# Grab the nodelist DataFrame
In [6]: df = nodes['nodelist']
In [7]: df
Out[7]:
Node List
    name        role  connected    IP Address
0  cas01  controller        Yes  10.37.10.172
1  cas02  controller        Yes  10.37.10.174
2  cas03      worker        Yes  10.37.10.175
3  cas04      worker        Yes  10.37.10.176
4  cas05      worker        Yes  10.37.10.191
5  cas06      worker        Yes  10.37.10.193
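Since the results object is described above as an ordered-dictionary subclass, the usual dictionary idioms should apply to it. The sketch below uses a plain OrderedDict as a stand-in so that it runs without a CAS server; the stand-in value (a list of tuples) and the key 'no_such_table' are made up for illustration, whereas a real result would hold a DataFrame under 'nodelist'.

```python
from collections import OrderedDict

# Stand-in for the object returned by conn.listnodes(); an assumption made
# so the example runs without a server. A real result holds a DataFrame here.
results = OrderedDict()
results['nodelist'] = [('cas01', 'controller'), ('cas03', 'worker')]

# List the keys to discover which tables the action returned
keys = list(results.keys())
print(keys)

# Iterate over key/value pairs in the order they were added
for key, value in results.items():
    print(key, type(value).__name__)

# .get() returns None instead of raising KeyError when a key is absent
missing = results.get('no_such_table')
print(missing)
```

Checking the keys first is a handy habit when an action returns more than one table.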
# Use DataFrame selection syntax to subset the columns.
In [8]: roles = df[['name', 'role']]
In [9]: roles
Out[9]:
Node List
    name        role
0  cas01  controller
1  cas02  controller
2  cas03      worker
3  cas04      worker
4  cas05      worker
5  cas06      worker
# Extract the worker nodes using a DataFrame mask
In [10]: roles[roles.role == 'worker']
Out[10]:
Node List
    name    role
2  cas03  worker
3  cas04  worker
4  cas05  worker
5  cas06  worker
# Extract the controllers using a DataFrame mask
In [11]: roles[roles.role == 'controller']
Out[11]:
Node List
    name        role
0  cas01  controller
1  cas02  controller
In the code above, we are doing some standard DataFrame operations: first selecting a subset of columns, then using boolean expressions to filter the rows down to only the worker nodes or only the controller nodes. Pandas DataFrames support many other ways of slicing and dicing your data. If you aren't familiar with them, you'll want to get acquainted with them on the Pandas web site.
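To make the mask mechanics a little more explicit, here is a minimal sketch that runs without a CAS server. It assumes a hand-built DataFrame standing in for the node list; the variable names (is_worker, workers, controllers, role_counts) are chosen for illustration only.

```python
import pandas as pd

# Hand-built stand-in for the 'nodelist' table shown earlier (no server needed)
df = pd.DataFrame({
    'name': ['cas01', 'cas02', 'cas03', 'cas04', 'cas05', 'cas06'],
    'role': ['controller', 'controller', 'worker', 'worker', 'worker', 'worker'],
})

# Comparing a column to a value yields a boolean Series, one entry per row
is_worker = df['role'] == 'worker'

# Indexing the DataFrame with that Series keeps only the rows marked True
workers = df[is_worker]

# DataFrame.query expresses the same kind of filter as a string expression
controllers = df.query("role == 'controller'")

# value_counts tallies how many nodes hold each role
role_counts = df['role'].value_counts()

print(workers['name'].tolist())
print(controllers['name'].tolist())
print(role_counts.to_dict())
```

The mask form and the query form select the same kind of rows; which one reads better is largely a matter of taste.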
When you are finished with a CAS session, it's always a good idea to clean up.
# Close the connection
In [12]: conn.close()
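In a script, as opposed to an interactive session, one way to make sure close() is called even when an action fails is the standard library's contextlib.closing. The sketch below is not taken from the SWAT documentation; it uses a hypothetical FakeConnection class as a stand-in so that it runs without a server, and relies only on the connection having a close() method, as shown above.

```python
from contextlib import closing


class FakeConnection:
    """Hypothetical stand-in for a CAS connection; only close() matters here."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


fake = FakeConnection()

try:
    # closing() calls fake.close() when the block exits, even on an exception
    with closing(fake) as conn:
        raise RuntimeError("simulated failure while running an action")
except RuntimeError:
    pass

print(fake.closed)
```

With a real session, the object returned by swat.CAS('cas01', 12345, 'username', 'password') would presumably take the place of FakeConnection().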
Those are the very basics of connecting to CAS, running an action, and manipulating the results on the client side. You should now be able to jump to other topics on the Python CAS client to do some more interesting work.
You can download the Jupyter notebook for this article from GitHub at https://github.com/sassoftware/sas-viya-programming/tree/master/communities.