Hi,
How can I connect SAS ODA to Azure Databricks notebook using SASPY? In the past I was able to use SAS ODA with Jupyter notebook successfully using SASPY but not able to do the same using Azure Databricks.
With Jupyter notebook, I was able to store the sascfg_personal.py and connect but not sure how exactly to do that for databricks. This is the code I used to run in jupyter notebook:
Code to find the place where to store sascfg_personal.py file:
import saspy, os
# to check the location where to store the sascfg_personal.py file with required details
print(saspy.__file__.replace('__init__.py', 'sascfg_personal.py'))
For Databricks above code prints the location:
/databricks/python/lib/python3.8/site-packages/saspy/sascfg_personal.py but i don't know how to access that location in databricks.
Code to connect to SAS ODA:
import saspy
sas = saspy.SASsession(java='C:\\Program Files (x86)\\Java\\jre1.8.0_221\\bin\\java.exe', iomhost=['odaws01-usw2.oda.sas.com','odaws02-usw2.oda.sas.com','odaws03-usw2.oda.sas.com','odaws04-usw2.oda.sas.com'], iomport=8591, encoding='utf-8')
sas
Nothing happens when I run this code in databricks. I guess first I've to keep the correct location of the java.exe but again i don't know how to find that in databricks.
Could you please share some details for the connection setup of SAS ODA with Azure Databricks using SASPY. I couldn't find anything on the community or over internet.
Hey, yes %%SAS is a jupyter notebook thing, and doesn't exist on databricks. The link to the saspy issue I posted in the first response was that submitLST() doesn't work on this platform since databricks doesn't have their displayHTML implemented in a way that it can be called from executing code (unlike how jupyter and Zeppelin, at least) work. On those platforms, the submitLST() method will render the output for you. Since that can't be done on databricks, uset the submit() method and then type in their displayHTML() with the LST output from the submit call.
So for submitting code that returns html output, you would do this:
res = sas.submit(sascode)
print(res[‘LOG’])
displayHTML(res[‘LST’])
My understanding is that SASPy converts Python code to SAS code and then runs it. If the Python code connects to an external database then I don't think that will be runnable in SAS as there is no equivalent "Python-like" connection functionality in SAS. You would have to run the Databricks data extraction in Python itself, then convert it to a SAS dataset so SAS can handle it.
SAS has its own way to connect to external databases like Databricks using SAS/ACCESS products, but that has nothing to do with SASPy.
The two limitations with my project are:
- Use python to run the pre existing sas code and that's where saspy comes into picture.
- All our analytics code runs ni azure databricks notebooks and we don't want to use any other IDE.
Are you saying there isn't any support for connectivity between sas oda and azure databricks ?
I don't use SAS On-Demand for Academics but I suspect it does not support connecting to external databases. Perhaps someone with more ODA experience can provide more definitive answers.
I think what you're trying to do is get to SAS ODA from Databricks, right?
Here’s how I am able to get at ODA from databricks. I expect you can do the same, at least;
I don’t have a real account to really store files (sascfg_personal.py, ...) or install packages,
like SASPy, that stay there. If you can do that it might be easier than the following.
Submit these in their own cells:
# this is to install saspy, because it’s not there by default
pip.main(['install','saspy'])
# now import and see that it’s there
import saspy
saspy
# I submit this and then delete the cell; so user/pw aren’t visible in the notebook
import os
os.environ['omruser'] = ‘userid’
os.environ['omrpw'] = ‘pw’
# create a dict for the Configuration Definition (could just code these on SASsession, but this is cleaner looking)
oda = {'java' : '/usr/bin/java',
'iomhost' : ['odaws01-usw2.oda.sas.com','odaws01-usw2.oda.sas.com','odaws01-usw2.oda.sas.com'],
'iomport' : 8591,
'omruser' : os.environ['omruser'],
'omrpw' : os.environ['omrpw'],
}
# this will use ‘default’ in the example config, but you overrode everything so it will use IOM remote
sas = saspy.SASsession(**oda, display='databricks')
sas
When trying to render HTML, you have to use their ‘displayHTML()’ function in each cell you want to
render html output. They provide no way for saspy to render it for you, which stinks, but I can’t control that.
See issue https://github.com/sassoftware/saspy/issues/462 for more info on that.
So for submitting code that returns html output, you would do this:
res = sas.submit(sascode)
print(res[‘LOG’])
displayHTML(res[‘LST’])
Hope this works for you,
Tom
Thank you for your response @sastpw !!
I'm however not able to establish connection following the code which you provided. I replaced the userid and password with my userid and password of SAS ODA and while running the below connection step, I get the error, could you please provide your inputs.
I'm getting error in below step:
# this will use ‘default’ in the example config, but you overrode everything (i overrode already) so it will use IOM remote
sas = saspy.SASsession(**oda, display='databricks')
sas
This is the error which i get:
I was able to setup the connection using below details:
oda = {'java' : '/usr/bin/java',
'iomhost' : ['odaws01-usw2-2.oda.sas.com','odaws02-usw2-2.oda.sas.com'],
'iomport' : 8591,
'omruser' : os.environ['omruser'],
'omrpw' : os.environ['omrpw'],
'encoding' : 'utf-8'
}
I'm going to run some code and hopefully it works.
Thank you very much everyone for all your help so far.
Well, even though you're not connecting, that looks like you have it mostly correct. It shows that java was found and worked, i see that my iomhost had the same 3 hosts; don't know where I copied that wrong from; at least the host was right and it connects for me. The hosts have to be the right one for your region, based upon the doc here: https://support.sas.com/ondemand/saspy.html, and it looks like you changed it to be the US region based on that doc. I see there are now 2 us regions (I only remember there being one). Did you try the other?
So, the error you got is:
`We failed in getConnection The application could not log on to the server "odaws04-usw2.oda.sas.com:8591". The load balancing object spawner timed out. SAS process has terminated unexpectedly.`
which, again, means that you are using the correct configuration, and that java was found and the IOM client (what saspy used java for) was working. The error shows that it just couldn't start up a workspace server for you. I just tried each of the 2 US region lists of iomhost hostnames, and I connected with the one you're trying, yet I got the same error about the object spawner timing out when trying the other set of hosts. So, be sure which region you are supposed to be using, and try each of them to see if you can connect to either. And, if you know you're on the right one, try again if it couldn't connect the first time; could just be a resource/timing issue.
Also, as for the sascfg_personal.py; you're not using that with the code provided; you have the config you are passing in dynamically to the SASsession by passing in that oda dictionary. Python on databricks certainly wont be accessing any files from your PCs C drive. So that file there has no part in this.
If the US region is what you use, try each of the two (even just combine the sets of hosts into one list in your config), since if it fails for one it tries the next in the list - won't hurt to have all of them in case one works.
```
oda = {'java' : '/usr/bin/java',
'iomhost' : ['odaws01-usw2-2.oda.sas.com','odaws02-usw2-2.oda.sas.com',
'odaws01-usw2.oda.sas.com','odaws02-usw2.oda.sas.com','odaws03-usw2.oda.sas.com','odaws04-usw2.oda.sas.com'],
'iomport' : 8591,
...
```
.
haha, that's great. You tried what I was typing in before I could post it. Awesome! Yep, I can't connect with
those 2 hosts, but I can with the 4 in the first list. So good deal; just gotta get to the right set for your id it seems.
Great!
Tom
Thank you @sastpw !!
I've another set of errors now and I think I do see some posts from you on git on those databricks error (I'm not 100% sure if those were your posts though 🙂 )
I'm getting below errors while running these set of codes and not sure if it's limitation of saspy presently or if there is any workaround for that:
1. Code:
%%SAS sas
proc print data=sashelp.cars ;
run;
Error: UsageError: Cell magic `%%SAS` not found.
2. Code:
sas.submitLST("proc print data=sashelp.cars; run;", method='listorlog')
Error: NameError: name 'displayHTML' is not defined
Any hint on the resolution will be very helpful.
Hey, yes %%SAS is a jupyter notebook thing, and doesn't exist on databricks. The link to the saspy issue I posted in the first response was that submitLST() doesn't work on this platform since databricks doesn't have their displayHTML implemented in a way that it can be called from executing code (unlike how jupyter and Zeppelin, at least) work. On those platforms, the submitLST() method will render the output for you. Since that can't be done on databricks, uset the submit() method and then type in their displayHTML() with the LST output from the submit call.
So for submitting code that returns html output, you would do this:
res = sas.submit(sascode)
print(res[‘LOG’])
displayHTML(res[‘LST’])
Attached document sas-setup.docx contains Databricks notepad to run saspy from it’s Community Edition which is free for personal non-commercial use – there is a link in the notebook/word document
I start the document with my sascfg_personal.py configuration, then include a copy of the notebook, which I have called sas-setup
This builds on previous use of saspy in Jupyter notebooks and - after a couple of days fine tuning - doesn't require any software on local machine - just sign up for Databricks Community Edition and SAS OnDemand for Academics (or modify the sascfg_personal.py to use with other SAS connections)
Enjoy!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.