pink_poodle
Barite | Level 11

What is the relationship between SAS and Python? How does Python benefit SAS (e.g., Proc IML), and, vice versa, how does SAS benefit Python (e.g., SASpy)? 

Reeza
Super User
SAS has saspy, which allows Python users to connect to SAS data and pass commands to SAS for it to execute. My understanding is that the benefit is that a business can use multiple languages, so switching technologies is not a requirement or a burden. The reality is that development in Python and R moves at a pace that an enterprise company will not be able to match. So working with R, Python, and SAS together is generally how it works these days.
pink_poodle
Barite | Level 11

Can a person run SAS from Python using SASpy without having SAS installed? 

Reeza
Super User

@pink_poodle wrote:

Can a person run SAS from Python using SASpy without having SAS installed? 


AFAIK, no. But they could access a server version or something not installed locally, e.g., SAS Viya.

The requirements and descriptions are here:

 

https://github.com/sassoftware/saspy

Tom
Super User

@pink_poodle wrote:

Can a person run SAS from Python using SASpy without having SAS installed? 


You cannot run SAS code without running SAS.  But SASpy can connect to the SAS software that you have.  So if you have a server running SAS you should be able to connect the Python running on your machine to that server without having to also install SAS on the same machine where the Python code is running.
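As a rough illustration of the remote setup described above, a saspy configuration for an SSH connection might look like the sketch below. The key names follow saspy's documented SSH configuration, but the host name and both paths are placeholders, not values from this thread. The file would normally be saved as sascfg_personal.py.

```python
# sascfg_personal.py (sketch): Python runs locally, SAS runs on a remote host.
# All values are placeholders; replace them with your site's settings.
SAS_config_names = ["ssh"]

ssh = {
    # SAS executable on the remote server (placeholder path)
    "saspath": "/opt/sasinside/SASHome/SASFoundation/9.4/bin/sas_en",
    # ssh client on the machine where Python runs (placeholder path)
    "ssh": "/usr/bin/ssh",
    # machine where SAS is installed (placeholder name)
    "host": "remote.linux.host",
    # extra SAS start-up options
    "options": ["-fullstimer"],
}

# With that file in place, a session would be opened with:
#   import saspy
#   sas = saspy.SASsession(cfgname="ssh")
```

Only Python and saspy need to be installed on the local machine in this arrangement; SAS itself stays on the server.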

novinosrin
Tourmaline | Level 20

I honestly don't know if this is relevant, and I'm really sorry if it's not, but I felt it doesn't hurt to share this piece that I received from a SAS magnate whom I dearly follow.

 

Here it is for what it's worth:->

 

  • Me: "Guru! Your opinion on data science? I have been wondering what the heck is behind the talk of data science. To me, statisticians/statistics seem real and green, and 'data science' appears rather hyped. Can you let me know your thoughts? Also, what do you think of SAS vs Python/R? I would appreciate your response/feedback."

  • Guru: WRT "data science", I totally agree: It's just a hyped-up buzzword. I know a few folks who had always been basically business analysts with some exposure to SAS but with little or no exposure to statistics and whose positions were all of a sudden internally rebranded as "data scientists". E.g., one of them, a friend of mine, was a BA, then became a QA (i.e. "quantitative analyst"), then SQA (the same plus "senior") and has now transmogrified into SDS ("senior data scientist"), all the while having been doing the same kind of work. Business (especially big) has a funny proclivity to using lofty buzzwords. 20 years ago, when my team was doing data management for Citi, we'd say we were "pulling data" - the activity nowadays designated as "creating an ETL". Waiters have become servers; steward(esse)s - flight attendants, and so forth ad infinitum. On the other note, I don't know any R, so can't offer any opinion on that. I do know a bit of Python and have actually played with it a little to see how to use it to do the same things I know how to do with SAS. Leaving aside the front end and graphics (where I assume Python has an edge), I've seen some things done easier with it than with SAS and some - in a much more convoluted way. And as I'm chiefly a back-end kind of guy, I'd never attempt to do heavy data lifting with Python, as the performance difference between it (interpreted) and SAS (compiled) is rather staggering. Other than that, I think that Python is a very good general programming language with decently clear and non-verbose syntax, good logging, etc. and that being well-versed in it is great. But I wouldn't pitch it against SAS in terms of "A vs B", sort of like either-or; rather, methinks they form a good complementary pair to know. Just my $.02 ;). Best

pink_poodle
Barite | Level 11

Thank you for sharing. Parts of his response are definitely relevant. 

     Leaving aside the front end and graphics (where I assume Python has an edge), I've seen some things done easier with it than with SAS and some - in a much more convoluted way. And as I'm chiefly a back-end kind of guy, I'd never attempt to do heavy data lifting with Python, as the performance difference between it (interpreted) and SAS (compiled) is rather staggering.

     Other than that, I think that Python is a very good general programming language with decently clear and non-verbose syntax, good logging, etc. and that being well-versed in it is great. But I wouldn't pitch it against SAS in terms of "A vs B", sort of like either-or; rather, methinks they form a good complementary pair to know

This is the part that needs elaboration. With Python at the front end and SAS at the back end, what is their intersection, and what is the benefit of this intersection to Python as well as SAS?

sastpw
SAS Employee

Well, I can try to elaborate on that aspect, though I’m sure I won’t do it complete justice.

 

From a mechanical point of view, the intersection is that the saspy module has 4 types of Python objects, each with an appropriate set of methods to let you connect to SAS, access SAS data, perform SAS analytics and access the tables, charts, plots, graphs,… that SAS produces from the various analytic and base procedures. All of this is with simple, pure Python syntax. The other part of this intersection is the ability to move data and variables between Python and SAS, as necessary. The default case is that there is no data movement between the two. Data being analyzed or queried, or viewed, stays in SAS. Only the results of what you do are returned to Python. However, the ability to move data between SAS Data Sets and Pandas Data Frames (either direction), as well as Python variables and SAS Macro Variables (again, both directions) is available with simple methods.
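To make the macro-variable exchange concrete, here is a minimal sketch. It assumes `sas` is an already connected `saspy.SASsession` and uses its `symput` and `symget` methods; the helper name `share_value` and the macro variable name `cutoff` are made up for illustration.

```python
def share_value(sas, name, value):
    """Copy a Python value into a SAS macro variable, then read it back.

    `sas` is assumed to be a connected saspy.SASsession.
    """
    sas.symput(name, value)    # Python variable -> SAS macro variable
    return sas.symget(name)    # SAS macro variable -> Python variable


# Usage, given a live session (not run here):
#   import saspy
#   sas = saspy.SASsession()
#   share_value(sas, "cutoff", 42)
```

No table data moves in this exchange; only the single value travels in each direction.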

 

As to the benefit, that’s less tangible, but nonetheless exists. From my time working with Python, I’ve been very impressed with the speed and ease with which I can accomplish my programming tasks. Sure, I prefer 370 Assembler (Mainframe) to other languages, and I’ve mostly written in C for a long time, but I can whip out a dozen lines of Python to prototype something that would take me 20 times as long to do in C. So what about the SAS language? Well, I find it similar, because I’m not a SAS programmer. I write SAS. That’s what I program, not what I program ‘in’. Calling a simple saspy method to do something complicated in SAS, where the SAS code required to accomplish that is much more than a single line of code, is more convenient for me.

 

So, Python is a ubiquitous language that is extremely easy to use and highly functional. It isn’t designed, however, to do what SAS does. Referring to the highlighted words in your post: back-end, heavy data lifting, performance. The SAS system is built to be all of those and more. It is the production backbone of many businesses. So, Python being Python, and SAS being SAS, I think that a seamless integration between the two makes clear sense in many applications and, to also quote the post, they ‘form a good complementary pair’.

 

Rapid development of business logic processes with the python language, leveraging the highly performant data processing and analytic production back end of SAS, makes for a pretty compelling picture in my head. Of course, this is nothing more than my opinion.

 

Oh, and as to the question about needing SAS to run saspy, as others have already mentioned, yes, you need SAS, but it doesn’t have to be installed on the same machine as saspy (it can be). SAS can be accessed remotely, which again, speaks to the aspect of it being a production backend server.

 

I’m sure this doesn’t completely answer your question, but I hope it helps continue the discussion 😊

 

Thanks,

Tom

 

 

pink_poodle
Barite | Level 11

Thank you, Tom, for an interesting and thorough reply. It is fascinating that you are using SAS methods in Python to develop SAS. This is clearly an example of Python's benefit to SAS. So, from what you are saying,

Data being analyzed or queried, or viewed, stays in SAS. Only the results of what you do are returned to Python.

Python doesn't usually modify SAS datasets, but Python uses the results of queries on SAS datasets, even to modify the framework of SAS software. 

 

Now, what is the benefit of SAS to Python? One example is a Python data entry screen that conducts data analysis using SAS methods and stores the results in SAS datasets. I recently read an article with an example of such interaction between Python and SAS, but using the sas-esppy* Python package instead of saspy.

 

* ESP - event stream processing - a process in which large streams of real-time data are processed with the aim of extracting insights and useful trends from them

sastpw
SAS Employee

Ah, reading your response to mine, I think I understand a little better what you were getting at regarding Python benefiting SAS.

With your reference to sas-esppy, I think that most of what I was saying was in that same vein: SAS benefiting Python.

I would say saspy is very similar in that regard; allowing Python programs to leverage the SAS backend: SAS benefiting Python.

I’m a little less clear on a couple of statements about Python benefiting SAS. I’m not sure that my response didn’t confuse that.

These two, I’m not 100% sure on: using SAS methods in Python to develop SAS, Python uses the results […] to modify the framework of SAS software

 

Let me state a few things differently and see if that changes what you’re thinking. All of the various parts of SAS I’ve developed are written in C (or assembler, way back when). I haven’t actually developed any SAS components (themselves) written in Python. That being said, I’ve researched, prototyped and/or productionized a variety of SAS components which interact with Python in recent years: both Python interacting with SAS and SAS interacting with Python. In these cases, the actual SAS parts of these are still written in C. Where I think I can answer your question about Python benefiting SAS is maybe with the following.

 

I have used Python itself to quickly prototype ‘proof of concept’s for some of these various Python/SAS interactions. The parts which would be the SAS components, that would eventually be written in C, I could use Python to show the POC or to quickly prove or disprove my design ideas, whereas having to actually implement the C code and interfaces to Python just to prove the concept would take significantly longer.

 

One of the other things I find that I like about Python is the OS level interfaces. For a ‘high level language’ it has ‘low level’ interfaces. That’s where I can actually use it to prototype parts that will eventually be written in C. I can prove out concepts with Process control, Inter process communication, I/O, and other low level concepts which take much more time to develop in C than in Python. But anything I can code in Python, I know I can accomplish in C; it just takes longer to write. So, in that way I think I can answer your question about how Python benefits SAS.
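As a small illustration of the kind of OS-level prototyping mentioned above (process control and inter-process communication), the sketch below starts a child process and exchanges text with it over pipes using only the standard library. It is not code from any SAS component; the child program and the helper name `ask_child` are invented for the example.

```python
import subprocess
import sys

# A tiny child program: read everything from stdin, write it back upper-cased.
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def ask_child(message):
    """Start a child Python process, send it text over a pipe, return its reply."""
    done = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],  # process control: launch the child
        input=message,                       # written to the child's stdin pipe
        capture_output=True,                 # collect the child's stdout and stderr
        text=True,                           # exchange str instead of bytes
        check=True,                          # raise if the child exits with an error
    )
    return done.stdout
```

The equivalent in C would involve creating the pipes, forking, redirecting file descriptors and waiting on the child by hand, which is the kind of effort a quick proof of concept avoids.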

 

Does that make more sense with regard to the question you are asking about the benefits? Hopefully I’m now on the same page as you. But, let me know! This is an interesting discussion.

 

Thanks,

Tom

pink_poodle
Barite | Level 11

Ok, the second part is much more clear now. To summarize some of the discussion,

 

1) The benefits of SAS to Python. SAS benefits Python by providing Python packages, such as SASpy and SAS-esppy, that allow the Python programs to access and analyze SAS datasets.

-> Can Python create and populate a SAS data set?

 

2) The benefits of Python to SAS. Python can be used to prototype components of the SAS framework, which are commonly written in C. Python also has a nice interface to operating systems. A level comparison to point 1) might also benefit from an example of Python to SAS data flow.

 

Once Python connects to SAS, the data flows mostly from SAS to Python. During this process, SASpy package helps Python convert SAS data sets to Pandas data frames, because Python cannot work with data sets directly. 

 

-> Does data set to data frame conversion make the reverse, Python to SAS, data flow more difficult to accomplish? You mentioned that Python can update SAS macro variables, what about SAS datasets?

 

 

sastpw
SAS Employee

Ok, so now on to some more saspy specifics to help clear up some of these subsequent questions.

First, I believe that seeing an example of saspy at work will help a lot; kinda the old ‘picture is worth a thousand words’.

Here is an example notebook that walks you through much of the basic functionality of saspy:

https://github.com/sassoftware/saspy-examples/blob/master/SAS_contrib/saspy_example_github.ipynb

 

This notebook contains only Python code. saspy is Python code. This is a Python program, just in a notebook UI format.

 

What you will see is that there are 4 kinds of Python objects in saspy: SASsession, SASdata, Analytic objects (SASstat, SASets, SASml, SASqc …), and the (Analytic) SASresults object.

 

The SASsession object has methods for appropriately scoped tasks: see which libnames are assigned in the session, assign a libname, get data set information for a given libref or data set, create a SASdata or Analytic object, and so on. It can also copy a SAS Data Set or View into Python as a Pandas Data Frame, or copy a Data Frame to SAS, creating a SAS Data Set and returning the SASdata object that refers to it so you can then work with it.
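A minimal sketch of a few of those session-level calls follows. It assumes `sas` is a connected `saspy.SASsession` and uses the `assigned_librefs`, `list_tables` and `sasdata` methods; the helper name `session_overview` is made up, and `sashelp.class` is used only because it ships with SAS.

```python
def session_overview(sas, libref="sashelp", table="class"):
    """Collect a few session-level facts without copying any table data.

    `sas` is assumed to be a connected saspy.SASsession.
    """
    librefs = sas.assigned_librefs()    # librefs currently assigned in the session
    tables = sas.list_tables(libref)    # tables and views found in one library
    data = sas.sasdata(table, libref)   # SASdata reference; the rows stay in SAS
    return {"librefs": librefs, "tables": tables, "data": data}


# Usage, given a live session (not run here):
#   import saspy
#   sas = saspy.SASsession()
#   overview = session_overview(sas)
```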

 

The SASdata object refers to a SAS Data Set or View. It is a Python object that is just a reference to a data set in the SAS session. This object has many methods which allow you to do things with that data, from viewing rows, to running summary statistics or producing plots or histograms or sub-setting and filtering.
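For instance, filtering and summarizing through a SASdata reference could look like the sketch below. It assumes `data` is a `SASdata` object and uses its `where`, `head` and `describe` methods; the helper name `explore` and the example condition are hypothetical.

```python
def explore(data, condition):
    """Filter a SASdata reference and look at the result.

    `data` is assumed to be a saspy SASdata object. The filtering and the
    summary statistics are computed by SAS; only the small result tables
    come back to Python.
    """
    subset = data.where(condition)   # new SASdata reference carrying a WHERE clause
    first_rows = subset.head()       # first few rows of the filtered data
    summary = subset.describe()      # summary statistics of the filtered data
    return first_rows, summary


# Usage, given a live session (not run here):
#   cls = sas.sasdata("class", "sashelp")
#   rows, stats = explore(cls, "age > 13")
```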

 

The Analytic objects allow you to run the different SAS Analytic procedures and retrieve all of the tabular, graphic and other results they produce by returning you the SASresults object containing those output artifacts.
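As a sketch of that flow, the helper below creates a SASstat object from the session and runs a regression through its `reg` method. It assumes `sas` is a connected `saspy.SASsession` and `data` a `SASdata` object; the helper name `fit_regression` and the example model string are illustrative, and which result artifacts come back depends on the procedure and the SAS installation.

```python
def fit_regression(sas, data, model):
    """Run PROC REG through saspy and list the artifacts it returned.

    `sas` is assumed to be a connected saspy.SASsession and `data` a
    SASdata object; `model` is a SAS model statement such as
    "weight = height".
    """
    stat = sas.sasstat()                        # Analytic object for SAS/STAT procedures
    result = stat.reg(data=data, model=model)   # SASresults object
    available = dir(result)                     # names of the tables and plots produced
    return result, available


# Usage, given a live session (not run here):
#   cls = sas.sasdata("class", "sashelp")
#   result, names = fit_regression(sas, cls, "weight = height")
```

Each name reported by `dir(result)` can then be accessed as an attribute of the result object to retrieve that table or plot.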

 

Everything returned by a saspy method is, of course, a Python object: dictionary, list, string, Boolean, or some other object. Again, it’s all Python language. So, rather than saying “Once Python connects to SAS, the data flows mostly from SAS to Python.”, I would say that “the results from running saspy methods flow mostly from SAS to Python”, as I think of ‘data’ as actually being data in data sets or data frames. You control if and when data is actually transferred between SAS and Python, and in which direction.
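An explicit transfer in both directions can be sketched as follows. It assumes `sas` is a connected `saspy.SASsession` and `df` a Pandas Data Frame, and uses the `df2sd` and `to_df` methods; the helper name `round_trip` and the table name `scratch` are made up for the example.

```python
def round_trip(sas, df, table="scratch"):
    """Copy a Data Frame to SAS, then copy the resulting data set back.

    `sas` is assumed to be a connected saspy.SASsession and `df` a pandas
    DataFrame. Data moves only because these two calls ask for it.
    """
    sd = sas.df2sd(df, table=table)   # Python -> SAS: creates a SAS data set, returns SASdata
    return sd.to_df()                 # SAS -> Python: returns a new DataFrame


# Usage, given a live session (not run here):
#   import pandas as pd
#   back = round_trip(sas, pd.DataFrame({"x": [1, 2, 3]}))
```

This also covers the earlier question of whether Python can create and populate a SAS data set: the first call does exactly that.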

 

I hope looking through that example code will make this more clear. Let me know!

 

Thanks,

Tom

 

pink_poodle
Barite | Level 11

Ok, I see the title "Now round trip the Data Frame back to a SAS Data Set" in the link you provided. This is a useful link and a nice explanation, thank you.

sastpw
SAS Employee

You're more than welcome. Good discussion. And I just updated that example notebook a little; I added a few newer methods showing some other neat features. saspy is constantly being extended, so there's always more to learn 🙂

 

Thanks!

Tom
