
Using PROC PYTHON to augment your SAS programs

Started 05-10-2022
Modified 06-01-2022

It's been there for a while (2021.1.3 – July 2021), but I've not been able to look at it until recently. PROC PYTHON is a pretty cool feature! It allows a SAS programmer to include some Python processing logic in her/his code. Not only can you run Python statements from SAS, but you can also leverage built-in accelerators (PROC PYTHON callback methods) to streamline the exchange between SAS and Python: share macro variables, invoke SAS functions, run SAS code, and transfer data back and forth between SAS and Python.
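As a rough illustration of these callbacks, here is a minimal sketch of Python code that could sit between submit and endsubmit. It assumes a macro variable named REGION was set earlier (for example with %let REGION=EMEA;); that name, its value and the REGION_LOWER variable are made up for the example. The small stand-in class is not part of SAS: it only lets the sketch run outside PROC PYTHON, where the SAS object is not available.

```python
# Outside PROC PYTHON the SAS callback object does not exist, so this
# hypothetical stand-in mimics three callbacks just enough to run the sketch.
try:
    SAS  # provided automatically inside PROC PYTHON
except NameError:
    class _StandInSAS:
        def __init__(self):
            self.macros = {"REGION": "EMEA"}  # made-up macro variable
            self.submitted = []               # SAS code "sent" by the sketch

        def symget(self, name):
            return self.macros[name]

        def symput(self, name, value):
            self.macros[name] = str(value)

        def submit(self, code):
            self.submitted.append(code)

    SAS = _StandInSAS()

# Read a SAS macro variable from Python
region = SAS.symget("REGION")

# Write a macro variable back so that later SAS steps can use it
SAS.symput("REGION_LOWER", region.lower())

# Run a piece of SAS code from Python
SAS.submit("proc print data=sashelp.class(obs=3); run;")
```

Inside a real PROC PYTHON step, only the last three statements would be needed.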

 

My colleagues Marinela Profi and Wilbram Hazejager wrote a nice article about this SAS-Python integration. I recommend you read it to learn more about how it works behind the scenes, to discover the new Python Code Editor available in SAS Studio, and to see how to integrate Python in a SAS Studio Flow.

 

For my part, I just want to walk through a few examples and see how Python can help SAS in some situations. I am certainly not a Python expert, but from a purely technical standpoint I see the integration of Python in SAS as beneficial in at least three ways:

 

 

  • When external custom Python functions/methods that achieve a specific business task already exist

 

Some companies have invested a lot in Python resources and assets. If a specific computation or function that has a lot of value for a company already exists in its Python assets and needs to be integrated with SAS or applied to SAS data, it is now possible to call it easily from PROC PYTHON. Customers no longer need to convert Python code to SAS code or design ungoverned jobs that chain different technologies together (by the way, governance is an important point highlighted in Marinela and Wilbram's blog).
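To make this concrete, here is a minimal sketch of reusing an existing Python function as-is. The loyalty_tier function, its thresholds and the customer rows are all hypothetical stand-ins for a real in-house asset and a real SAS table.

```python
# Hypothetical business rule standing in for an existing in-house Python asset.
def loyalty_tier(total_spend, years_active):
    """Return a tier label from spend and tenure (thresholds are made up)."""
    if total_spend >= 10000 and years_active >= 3:
        return "gold"
    if total_spend >= 2500:
        return "silver"
    return "standard"

# Stand-in for rows that would come from a SAS table.
customers = [
    {"customer_id": 1, "total_spend": 12500.0, "years_active": 5},
    {"customer_id": 2, "total_spend": 3000.0, "years_active": 1},
    {"customer_id": 3, "total_spend": 400.0, "years_active": 7},
]

# Apply the existing function to every row without rewriting it in SAS.
for row in customers:
    row["tier"] = loyalty_tier(row["total_spend"], row["years_active"])

tiers = [row["tier"] for row in customers]
```

Inside PROC PYTHON, the rows would more likely arrive as a Pandas DataFrame through the SAS.sd2df callback and go back to SAS through SAS.df2sd, with the function applied row by row in between.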

 

 

  • When specific tasks are not possible in SAS

 

Everything is possible with SAS 😊 ! Directly or indirectly! Running Python seamlessly from SAS opens new doors. I was recently looking for a way to read an Avro file from SAS without using a specific data platform like Hadoop: just an Avro file in a regular folder. I found a Python package (fastavro) and was able to use it from SAS very easily:

 

proc python ;
   submit ;

import pandas
from fastavro import reader

# Read every Avro record into a list; the file is closed when the block ends
with open('/data/avro/userdata_avro/userdata2.avro', 'rb') as fo:
    records = list(reader(fo))
df = pandas.DataFrame.from_records(records)
df['registration_dttm'] = pandas.to_datetime(df['registration_dttm'])

# Send the data to SAS using PROC PYTHON callback method
ds = SAS.df2sd(df,"userdata")

   endsubmit ;
run ;

The SAS.df2sd callback method enables a user to transfer data from a Pandas DataFrame to a SAS data set.

 

 

  • When specific tasks are easier to do in Python

 

Yes, sometimes things are easier in other languages. I needed to read a JSON file and grab information at different levels of the JSON structure, retaining some data fields along the way. Of course, I could have used the JSON library engine, which can read the same file the way I want, but it involves multiple steps: you either need to post-process the tables created by the JSON engine to build the target table, or you need to build the right JSON map. Not complicated, but not direct. PROC DS2 with the JSON package is another alternative, but a less straightforward one. With Python, you can leverage JSON "traversing" functions and specify directly how you want to build your target Pandas DataFrame and thus your SAS data set:

 

proc python ;
   submit ;

import pandas as pd
import json
# Load the whole JSON document; the file is closed when the block ends
with open("/data/json/smartFridges_brackets.json") as f:
    data = json.load(f)
df = pd.json_normalize(data, record_path=['Objects', 'Object', 'InfoItem', 'value'],
   meta=[['Objects', 'Object', 'id'],
       ['Objects', 'Object', 'type'],
       ['Objects', 'Object', 'InfoItem', 'name'],
       ['Objects', 'Object', 'InfoItem', 'description']])
df['dateTime'] = pd.to_datetime(df['dateTime'])
# Give the flattened columns shorter names
df.rename(columns = {
   'Text':'measure',
   'Objects.Object.id':'deviceId',
   'Objects.Object.type':'deviceType',
   'Objects.Object.InfoItem.name':'measureName',
   'Objects.Object.InfoItem.description':'measureDescription'
   }, inplace = True)

# Send the data to SAS using PROC PYTHON callback method
ds = SAS.df2sd(df,"smartFridges")

   endsubmit ;
run ;
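To make explicit what record_path and meta do in the json_normalize call above, the same flattening can be sketched with plain nested loops and only the standard library. The sample document below is hypothetical: it assumes the Objects / Object / InfoItem / value nesting implied by the call, and the real smartFridges file may differ in its details.

```python
import json

# Hypothetical document; the real smartFridges file may be shaped differently.
raw = """
{"Objects": {"Object": [
  {"id": "fridge-1", "type": "SmartFridge",
   "InfoItem": [
     {"name": "temperature", "description": "Inside temperature",
      "value": [{"dateTime": "2022-05-10T08:00:00", "Text": "4.1"},
                {"dateTime": "2022-05-10T09:00:00", "Text": "4.3"}]},
     {"name": "doorOpen", "description": "Door status",
      "value": [{"dateTime": "2022-05-10T08:30:00", "Text": "0"}]}
   ]}
]}}
"""
data = json.loads(raw)

rows = []
for obj in data["Objects"]["Object"]:      # one entry per device
    for item in obj["InfoItem"]:           # one entry per measure type
        for value in item["value"]:        # one entry per reading (the record path)
            rows.append({
                "measure": value["Text"],
                "dateTime": value["dateTime"],
                # Fields repeated from the parent levels (the "meta" part)
                "deviceId": obj["id"],
                "deviceType": obj["type"],
                "measureName": item["name"],
                "measureDescription": item["description"],
            })
```

Each reading becomes one row, and the device and measure attributes from the outer levels are repeated on every row, which is the shape the DataFrame has before it is sent to SAS.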

 

These were just a few examples to illustrate how Python can complement SAS in some situations. It's good to have another extension to add to our rich set of capabilities. Feel free to comment and share your experiences about using Python in a SAS context.

 

Thanks for reading.

 

Find more articles from SAS Global Enablement and Learning here.

Comments

Is this only available in SAS Viya? I don't see it listed under SAS 9.4.

@PaigeMiller Yes, only in SAS Viya beginning in October 2021 (with integrated Python editor). It's announced here.

You can find the sample data used in this article at this place: https://github.com/nicrobert/sas_samples/tree/main/python-blog-data

 
