BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
pdwivedi08
Fluorite | Level 6

Hi,

 

How can I connect SAS ODA to Azure Databricks notebook using SASPY? In the past I was able to use SAS ODA with Jupyter notebook successfully using SASPY but not able to do the same using Azure Databricks. 

 

With Jupyter notebook, I was able to store the sascfg_personal.py and connect but not sure how exactly to do that for databricks. This is the code I used to run in jupyter notebook:

 

Code to find the place where to store sascfg_personal.py file:

import saspy, os
# to check the location where to store the sascfg_personal.py file with required details
print(saspy.__file__.replace('__init__.py', 'sascfg_personal.py'))

 

For Databricks above code prints the location: 

/databricks/python/lib/python3.8/site-packages/saspy/sascfg_personal.py but i don't know how to access that location in databricks.

 

Code to connect to SAS ODA:

 

import saspy
sas = saspy.SASsession(java='C:\\Program Files (x86)\\Java\\jre1.8.0_221\\bin\\java.exe', iomhost=['odaws01-usw2.oda.sas.com','odaws02-usw2.oda.sas.com','odaws03-usw2.oda.sas.com','odaws04-usw2.oda.sas.com'], iomport=8591, encoding='utf-8')
sas

 

Nothing happens when I run this code in databricks. I guess first I've to keep the correct location of the java.exe but again i don't know how to find that in databricks. 

 

Could you please share some details for the connection setup of SAS ODA with Azure Databricks using SASPY. I couldn't find anything on the community or over internet.

1 ACCEPTED SOLUTION

Accepted Solutions
sastpw
SAS Employee

Hey, yes %%SAS is a jupyter notebook thing, and doesn't exist on databricks. The link to the saspy issue I posted in the first response was that submitLST() doesn't work on this platform since databricks doesn't have their displayHTML implemented in a way that it can be called from executing code (unlike how jupyter and Zeppelin, at least) work. On those platforms, the submitLST() method will render the output for you. Since that can't be done on databricks, uset the submit() method and then type in their displayHTML() with the LST output from the submit call.

 

So for submitting code that returns html output, you would do this:

 

res = sas.submit(sascode)

print(res[‘LOG’])

displayHTML(res[‘LST’])

 

View solution in original post

12 REPLIES 12
SASKiwi
PROC Star

My understanding is that SASPy converts Python code to SAS code and then runs it. If the Python code connects to an external database then I don't think that will be runnable in SAS as there is no equivalent "Python-like" connection functionality in SAS. You would have to run the Databricks data extraction in Python itself, then convert it to a SAS dataset so SAS can handle it.

 

SAS has its own way to connect to external databases like Databricks using SAS/ACCESS products, but that has nothing to do with SASPy. 

pdwivedi08
Fluorite | Level 6

The two limitations with my project are:

 

- Use python to run the pre existing sas code and that's where saspy comes into picture.

- All our analytics code runs ni azure databricks notebooks and we don't want to use any other IDE.

 

Are you saying there isn't any support for connectivity between sas oda and azure databricks ?

SASKiwi
PROC Star

I don't use SAS On-Demand for Academics but I suspect it does not support connecting to external databases. Perhaps someone with more ODA experience can provide more definitive answers. 

sastpw
SAS Employee

I think what you're trying to do is get to SAS ODA from Databricks, right?

 

Here’s how I am able to get at ODA from databricks. I expect you can do the same, at least;

I don’t have a real account to really store files (sascfg_personal.py, ...) or install packages,

like SASPy, that stay there. If you can do that it might be easier than the following.

 

Submit these in their own cells:

 

# this is to install saspy, because it’s not there by default

pip.main(['install','saspy'])

 

# now import and see that it’s there

import saspy

saspy

 

# I submit this and then delete the cell; so user/pw aren’t visible in the notebook

import os

os.environ['omruser'] = ‘userid’

os.environ['omrpw']  = ‘pw’

 

# create a dict for the Configuration Definition (could just code these on SASsession, but this is cleaner looking)

oda      = {'java'      : '/usr/bin/java',

            'iomhost'   : ['odaws01-usw2.oda.sas.com','odaws01-usw2.oda.sas.com','odaws01-usw2.oda.sas.com'],

            'iomport'   : 8591,

            'omruser'   : os.environ['omruser'],

            'omrpw'     : os.environ['omrpw'],

            }

 

# this will use ‘default’ in the example config, but you overrode everything so it will use IOM remote

sas = saspy.SASsession(**oda, display='databricks')

sas

 

 

When trying to render HTML, you have to use their ‘displayHTML()’ function in each cell you want to

render html output. They provide no way for saspy to render it for you, which stinks, but I can’t control that.

See issue https://github.com/sassoftware/saspy/issues/462 for more info on that.

 

So for submitting code that returns html output, you would do this:

 

res = sas.submit(sascode)

print(res[‘LOG’])

displayHTML(res[‘LST’])

 

Hope this works for you,

Tom

 

pdwivedi08
Fluorite | Level 6

Thank you for your response @sastpw  !!

 

I'm however not able to establish connection following the code which you provided. I replaced the userid and password with my userid and password of SAS ODA and while running the below connection step, I get the error, could you please provide your inputs.

 

I'm getting error in below step:

 

# this will use ‘default’ in the example config, but you overrode everything (i overrode already) so it will use IOM remote

sas = saspy.SASsession(**oda, display='databricks')

sas

 

This is the error which i get:

 

Using SAS Config named: default We failed in getConnection The application could not log on to the server "odaws04-usw2.oda.sas.com:8591". The load balancing object spawner timed out. SAS process has terminated unexpectedly. Pid State= (6908, 64000) SAS Connection failed. No connection established. Double check your settings in sascfg_personal.py file. Attempted to run program /usr/bin/java with the following parameters:['/usr/bin/java', '-classpath', '/databricks/python/lib/python3.8/site-packages/saspy/java/saspyiom.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/iomclient/log4j-1.2-api-2.12.4.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/iomclient/log4j-api-2.12.4.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/iomclient/log4j-core-2.12.4.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/iomclient/sas.security.sspi.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/iomclient/sas.core.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/iomclient/sas.svc.connection.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/iomclient/sas.rutil.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/iomclient/sas.rutil.nls.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/iomclient/sastpj.rutil.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/thirdparty/glassfish-corba-internal-api.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/thirdparty/glassfish-corba-omgapi.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/thirdparty/glassfish-corba-orb.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/thirdparty/pfl-basic.jar:/databricks/python/lib/python3.8/site-packages/saspy/java/thirdparty/pfl-tf.jar', 'pyiom.saspy2j', '-host', 'localhost', '-stdinport', '46581', '-stdoutport', '46951', '-stderrport', '55619', '-iomhost', 'odaws01-usw2.oda.sas.com;odaws02-usw2.oda.sas.com;odaws03-usw2.oda.sas.com;odaws04-usw2.oda.sas.com', '-iomport', '8591', '-user', 'pdwivedi08@gmail.com', '-lrecl', '1048576', ''] No SAS process attached. SAS process has terminated unexpectedly. Invalid response from SAS on inital submission. printing the SASLOG as diagnostic
SASIOConnectionError: Failure establishing SASsession.
 
 
Do I've to do anything in sascfg_personal.py ? And as of now, this file is stored in my PC but if I've to connect to Azure Databricks do I've to store it some other place or the one in my PC will work ? I have sascfg_personal.py in 'C:\\Program Files (x86)\\Java\\jre1.8.0_221\\bin\\java.exe' and I tried changing java location to this as well but getting same error. 
 
Also, I had changed 'iomhost' : ['odaws01-usw2.oda.sas.com','odaws02-usw2.oda.sas.com','odaws03-usw2.oda.sas.com','odaws04-usw2.oda.sas.com'] to run but it's not working. 
 
Kindly advise anything else which I can try.
pdwivedi08
Fluorite | Level 6

@sastpw 

 

I was able to setup the connection using below details:

 

oda = {'java' : '/usr/bin/java',

'iomhost' : ['odaws01-usw2-2.oda.sas.com','odaws02-usw2-2.oda.sas.com'],

'iomport' : 8591,

'omruser' : os.environ['omruser'],

'omrpw' : os.environ['omrpw'],
'encoding' : 'utf-8'

}

 

I'm going to run some code and hopefully it works.

 

Thank you very much everyone for all your help so far. 

 

sastpw
SAS Employee

Well, even though you're not connecting, that looks like you have it mostly correct. It shows that java was found and worked, i see that my iomhost had the same 3 hosts; don't know where I copied that wrong from; at least the host was right and it connects for me. The hosts have to be the right one for your region, based upon the doc here: https://support.sas.com/ondemand/saspy.html, and it looks like you changed it to be the US region based on that doc. I see there are now 2 us regions (I only remember there being one). Did you try the other?

 

So, the error you got is:
`We failed in getConnection The application could not log on to the server "odaws04-usw2.oda.sas.com:8591". The load balancing object spawner timed out. SAS process has terminated unexpectedly.`

which, again, means that you are using the correct configuration, and that java was found and the IOM client (what saspy used java for) was working. The error shows that it just couldn't start up a workspace server for you. I just tried each of the 2 US region lists of iomhost hostnames, and I connected with the one you're trying, yet I got the same error about the object spawner timing out when trying the other set of hosts. So, be sure which region you are supposed to be using, and try each of them to see if you can connect to either. And, if you know you're on the right one, try again if it couldn't connect the first time; could just be a resource/timing issue.

 

Also, as for the sascfg_personal.py; you're not using that with the code provided; you have the config you are passing in dynamically to the SASsession by passing in that oda dictionary. Python on databricks certainly wont be accessing any files from your PCs C drive. So that file there has no part in this. 

 

If the US region is what you use, try each of the two (even just combine the sets of hosts into one list in your config), since if it fails for one it tries the next in the list - won't hurt to have all of them in case one works.

```

oda = {'java' : '/usr/bin/java',
'iomhost' : ['odaws01-usw2-2.oda.sas.com','odaws02-usw2-2.oda.sas.com',
'odaws01-usw2.oda.sas.com','odaws02-usw2.oda.sas.com','odaws03-usw2.oda.sas.com','odaws04-usw2.oda.sas.com'],
'iomport' : 8591,

...

```

.

sastpw
SAS Employee

haha, that's great. You tried what I was typing in before I could post it. Awesome! Yep, I can't connect with

those 2 hosts, but I can with the 4 in the first list. So good deal; just gotta get to the right set for your id it seems.

Great!

 

Tom

pdwivedi08
Fluorite | Level 6

Thank you @sastpw  !!

 

I've another set of errors now and I think I do see some posts from you on git on those databricks error (I'm not 100% sure if those were your posts though 🙂

 

I'm getting below errors while running these set of codes and not sure if it's limitation of saspy presently or if there is any workaround for that:

 

1. Code:

 

%%SAS sas
proc print data=sashelp.cars ;
run;

 

Error: UsageError: Cell magic `%%SAS` not found.

 

2. Code:

sas.submitLST("proc print data=sashelp.cars; run;", method='listorlog')

 

Error: NameError: name 'displayHTML' is not defined

 

 

Any hint on the resolution will be very helpful.

sastpw
SAS Employee

Hey, yes %%SAS is a jupyter notebook thing, and doesn't exist on databricks. The link to the saspy issue I posted in the first response was that submitLST() doesn't work on this platform since databricks doesn't have their displayHTML implemented in a way that it can be called from executing code (unlike how jupyter and Zeppelin, at least) work. On those platforms, the submitLST() method will render the output for you. Since that can't be done on databricks, uset the submit() method and then type in their displayHTML() with the LST output from the submit call.

 

So for submitting code that returns html output, you would do this:

 

res = sas.submit(sascode)

print(res[‘LOG’])

displayHTML(res[‘LST’])

 

pdwivedi08
Fluorite | Level 6

Thank you very much @sastpw !!

 

The solutions suggested by you worked for me.

johnagalloway0
Fluorite | Level 6

Attached document sas-setup.docx contains Databricks notepad to run saspy from it’s Community Edition which is free for personal non-commercial use – there is a link in the notebook/word document

 

I start the document with my sascfg_personal.py configuration, then include a copy of the notebook, which I have called sas-setup

 

This builds on previous use of saspy in Jupyter notebooks and - after a couple of days fine tuning - doesn't require any software on local machine - just sign up for Databricks Community Edition and SAS OnDemand for Academics (or modify the sascfg_personal.py to use with other SAS connections)

 

Enjoy!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 2252 views
  • 0 likes
  • 4 in conversation