
Single Sign-On (SSO) Access from the SAS Viya Platform to Azure Databricks

Started 10-07-2024
Modified 10-07-2024

The SAS Viya platform supports Single Sign-On (SSO) authentication to the Azure Databricks environment with the CDATA and Databricks JDBC drivers. SSO is supported for both SPARK LIBNAME and SPARK CASLIB statements against the Databricks database. The LIBNAME or CASLIB statement must include the AUTH=OAUTH2 option to use an SSO-authenticated connection to Databricks. The CDATA JDBC driver is distributed with the SAS Viya software deployment and is the default driver for SAS connections to the Databricks database.

 

The SAS Viya platform can be configured with Microsoft Entra ID as an OIDC provider for initial user authentication. The steps to configure the SAS Viya platform to use OIDC with Microsoft Entra ID are listed in the SAS documentation.

 

In this post, I discuss the critical configuration details of the Entra ID OIDC application, the Databricks workspace, and user permissions at Databricks when using SSO-based access to the Databricks database.

 

Entra ID OIDC application configuration:

 

When creating an Entra ID OIDC application for Databricks access, apart from the standard API permissions, you must include the “Azure Databricks” API permission with the user_impersonation delegated role.

 

The Azure tenant admin must consent to these roles, and the status must show a green check with text like “Granted for” your domain/organization.

 

01_UK_SASViya_SSO_Access_to_Azure_DataBricks_1.png
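If you prefer to script this step rather than use the Azure portal, it can be done with the Azure CLI. The sketch below is a hypothetical illustration, not from the article: the AzureDatabricks resource ID shown is the well-known application ID for that service, while the app client ID and the user_impersonation scope GUID are placeholders you must look up in your own tenant.

```shell
# Hypothetical sketch: add the "Azure Databricks" user_impersonation delegated
# permission to an existing app registration, then grant admin consent.
APP_ID="<your-oidc-app-client-id>"                      # placeholder
DATABRICKS_API="2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"   # well-known AzureDatabricks resource ID
SCOPE_ID="<user_impersonation-scope-guid>"              # placeholder; find it on the Databricks service principal

az ad app permission add --id "$APP_ID" \
  --api "$DATABRICKS_API" \
  --api-permissions "${SCOPE_ID}=Scope"

az ad app permission admin-consent --id "$APP_ID"
```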


 

 

Databricks Workspace Configuration:

 

The Personal Access Token (PAT) option must be enabled in the Databricks workspace for the PAT-, SSO-, and Service Principal (SP)-based access paths to work. User processes that connect to Azure Databricks with SSO or SP authentication through the JDBC driver initiate a PAT token at the Databricks instance behind the scenes, even though the connection itself is authenticated with SSO or SP credentials.

 

02_UK_SASViya_SSO_Access_to_Azure_DataBricks_2.png
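The PAT setting can also be checked outside the UI through the Databricks workspace configuration REST API. The sketch below is an untested illustration with placeholder values for the workspace URL and an admin token:

```shell
# Hypothetical sketch: query whether Personal Access Tokens are enabled,
# using the Databricks workspace-conf REST API.
# Replace the <> placeholders with your workspace URL and an admin's token.
curl -s -H "Authorization: Bearer <admin-token>" \
  "https://<workspace-url>/api/2.0/workspace-conf?keys=enableTokens"
# Expected shape of the response: {"enableTokens":"true"}
```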

 

The Databricks admin user can add additional Entra ID users to the Databricks environment for Databricks workspace and database table access.

 

03_UK_SASViya_SSO_Access_to_Azure_DataBricks_3.png

 

All Entra ID users added to the Databricks workspace must be part of the admin group for SSO authentication to work. Behind the scenes, the OAuth process creates a Databricks PAT token for SSO users, and under the Databricks configuration shown here only admin group users can create and manage PAT tokens.

 

04_UK_SASViya_SSO_Access_to_Azure_DataBricks_4.png
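Group membership can be verified programmatically as well. This is a hypothetical sketch using the Databricks SCIM Groups API, with placeholder values to replace; it lists the admins group so you can confirm the SSO users appear among its members:

```shell
# Hypothetical sketch: confirm the SSO users are members of the Databricks
# "admins" group via the SCIM Groups API. Replace the <> placeholders.
curl -s -G -H "Authorization: Bearer <admin-token>" \
  --data-urlencode 'filter=displayName eq "admins"' \
  "https://<workspace-url>/api/2.0/preview/scim/v2/Groups"
```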

 

 

SSO-Based Access from SAS Compute Server to Azure Databricks

 

When the SAS Viya platform is configured with the Entra ID OIDC application for initial user authentication and the Databricks workspace is configured with Entra ID user access permissions, SAS users can access the Azure Databricks database from the SAS Compute Server over SSO using the SPARK LIBNAME engine with the AUTH=OAUTH2 option.

 

The following code demonstrates SSO-based access from the SAS Compute Server to the Azure Databricks database. Notice that there is no user ID or password; instead, the AUTH=OAUTH2 option appears in the LIBNAME statement.

 

Code:

 

%let MYDBRICKS=adb-3916581250425199.19.azuredatabricks.net;
%let MYHTTPPATH=sql/protocolv1/o/3916581250425199/0927-154850-ssq0x1nk;

%let MYDRIVERCLASS="cdata.jdbc.databricks.DatabricksDriver";
%let MYCATALOG=hive_metastore;
%let MYSCHEMA=default;

libname CdtSpark spark platform=databricks
  driverClass=&MYDRIVERCLASS
  auth=oauth2
  server="&MYDBRICKS"
  database="&MYSCHEMA"
  httpPath="&MYHTTPPATH"
  port=443
  bulkload=no
  character_multiplier=1
  dbmax_text=50
  properties="Catalog=&MYCATALOG;Other=ConnectRetryWaitTime=20;DefaultColumnSize=1024;"
;

Proc SQL outobs=5;
select * from CdtSpark.baseball_prqt ;
run;quit;

data CdtSpark.prdsal2_sas2 ;
set sashelp.prdsal2 ;
run;

 

Log:

 

80   %let MYDBRICKS=adb-3916581250425199.19.azuredatabricks.net;
81   %let MYHTTPPATH=sql/protocolv1/o/3916581250425199/0927-154850-ssq0x1nk;
82   
83   %let MYDRIVERCLASS="cdata.jdbc.databricks.DatabricksDriver";
84   %let MYCATALOG=hive_metastore;
85   %let MYSCHEMA=default;
86   
87   libname CdtSpark spark platform=databricks
88     driverClass=&MYDRIVERCLASS
89     auth=oauth2
90     server="&MYDBRICKS"
91     database="&MYSCHEMA"
92     httpPath="&MYHTTPPATH"
93     port=443
94     bulkload=no
95     character_multiplier=1
96     dbmax_text=50
97     properties="Catalog=&MYCATALOG;Other=ConnectRetryWaitTime=20;DefaultColumnSize=1024;"
98   ;
NOTE: Libref CDTSPARK was successfully assigned as follows: 
      Engine:        SPARK 
      Physical Name: 
      jdbc:cdata:databricks:Server=adb-3916581250425199.19.azuredatabricks.net;Database=default;HTTPPath=sql/protocolv1/o/3916581250
      425199/0927-154850-ssq0x1nk;QueryPassthrough=true;UseCloudFetch=true;InitiateOAuth=OFF;AuthScheme=AzureAD;OAuthAccessToken=***
      ***;Catalog=hive_metastore;Other=ConnectRetryWaitTime=20;DefaultColumnSize=1024;
99   
100  Proc SQL outobs=5;
101  select * from CdtSpark.baseball_prqt ;
WARNING: Statement terminated early due to OUTOBS=5 option.
102  run;quit;
NOTE: PROC SQL statements are executed immediately; The RUN statement has no effect.
NOTE: The PROCEDURE SQL printed page 1.
NOTE: PROCEDURE SQL used (Total process time):
      real time           1.66 seconds
      cpu time            0.07 seconds
      
103  
104  data CdtSpark.prdsal2_sas2 ;
105  set sashelp.prdsal2 ;
106  run;
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 23040 observations read from the data set SASHELP.PRDSAL2.
NOTE: The data set CDTSPARK.PRDSAL2_SAS2 has 23040 observations and 11 variables.      
107  

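To make the log easier to read, note how the “Physical Name” is assembled from the LIBNAME options. The following is an illustrative Python helper, not part of SAS or the CDATA driver, that reconstructs the CDATA JDBC URL reported in the log from the same option values:

```python
def cdata_jdbc_url(server, database, http_path, properties):
    # Mirrors the "Physical Name" in the SAS log: fixed driver defaults plus
    # the LIBNAME server=, database=, httpPath=, and properties= options.
    # The user's Entra ID OAuth access token is injected at connect time
    # (masked as ****** in the log).
    return (
        "jdbc:cdata:databricks:"
        f"Server={server};Database={database};HTTPPath={http_path};"
        "QueryPassthrough=true;UseCloudFetch=true;"
        "InitiateOAuth=OFF;AuthScheme=AzureAD;OAuthAccessToken=******;"
        f"{properties}"
    )

url = cdata_jdbc_url(
    server="adb-3916581250425199.19.azuredatabricks.net",
    database="default",
    http_path="sql/protocolv1/o/3916581250425199/0927-154850-ssq0x1nk",
    properties="Catalog=hive_metastore;Other=ConnectRetryWaitTime=20;DefaultColumnSize=1024;",
)
print(url)
```

Comparing this reconstructed string with the NOTE in the log shows that AuthScheme=AzureAD with InitiateOAuth=OFF is what lets SAS pass the user's own Entra ID token to the driver.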
 

Results:

 

05_UK_SASViya_SSO_Access_to_Azure_DataBricks_5.png

 

 

SSO-Based Access from CAS to Azure Databricks

 

When the SAS Viya platform is configured with the Entra ID OIDC application for initial user authentication and the Databricks workspace is configured with Entra ID user access permissions, SSO-based access can be used to load CAS from the Azure Databricks database using a SPARK CASLIB with the AUTH=OAUTH2 option.

 

The following code demonstrates SSO-based access from CAS to the Azure Databricks database. Notice that there is no user ID or password; instead, the AUTH=OAUTH2 option appears in the CASLIB statement.

 

Code:

 

%let MYDBRICKS=adb-3916581250425199.19.azuredatabricks.net;
%let MYHTTPPATH=sql/protocolv1/o/3916581250425199/0927-154850-ssq0x1nk;

%let MYDRIVERCLASS=cdata.jdbc.databricks.DatabricksDriver;
%let MYCATALOG=hive_metastore;
%let MYSCHEMA=default;

CAS mySession  SESSOPTS=( CASLIB=casuser TIMEOUT=99 LOCALE="en_US" metrics=true);

/* SSO based CASLIB  to Databricks */
caslib Cdtspkcaslib datasource=(srctype='spark',
              platform=databricks,
              auth=oauth2,
              schema="&MYSCHEMA",
              server="&MYDBRICKS",
              httpPath="&MYHTTPPATH",
              driverclass="&MYDRIVERCLASS",
              bulkload=no,
              port=443,
              useSsl=yes,
              charMultiplier=1,
              dbmaxText=50,
              properties="Catalog=&MYCATALOG;DefaultColumnSize=255;Other=ConnectRetryWaitTime=20"
            );

/* Load CAS from DataBricks database table */
proc casutil outcaslib="Cdtspkcaslib" incaslib="Cdtspkcaslib" ;
    load casdata="iot_device" casout="iot_device" replace;
    list tables;
quit;

/* Save CAS data to DataBricks database table */
proc casutil outcaslib="Cdtspkcaslib" incaslib="Cdtspkcaslib";
load data=sashelp.cars casout="cars" replace;
save casdata="cars" casout="cars_sas"  replace;
list files;
quit;

CAS mySession  TERMINATE;

 

Log:

 

90   /* SSO based CASLIB  to Databricks */
91   caslib Cdtspkcaslib datasource=(srctype='spark',
92                 platform=databricks,
93                 auth=oauth2,
94                 schema="&MYSCHEMA",
95                 server="&MYDBRICKS",
96                 httpPath="&MYHTTPPATH",
97                 driverclass="&MYDRIVERCLASS",
98                 bulkload=no,
99                 port=443,
100                useSsl=yes,
101                charMultiplier=1,
102                dbmaxText=50,
103                properties="Catalog=&MYCATALOG;DefaultColumnSize=255;Other=ConnectRetryWaitTime=20"
104              );
NOTE: Executing action 'table.addCaslib'.
NOTE: 'CDTSPKCASLIB' is now the active caslib.
NOTE: Cloud Analytic Services added the caslib 'CDTSPKCASLIB'.


105  
106  /* Load CAS from DataBricks database table */
107  proc casutil outcaslib="Cdtspkcaslib" incaslib="Cdtspkcaslib" ;
NOTE: The UUID 'c5e703f1-0786-fe4d-b8df-19f813db3445' is connected using session MYSESSION.
108      load casdata="iot_device" casout="iot_device" replace;
NOTE: Executing action 'table.loadTable'.
NOTE: Performing serial LoadTable action using SAS Data Connector to Spark.
NOTE: Cloud Analytic Services made the external data from iot_device available as table IOT_DEVICE in caslib Cdtspkcaslib.
NOTE: Action 'table.loadTable' used (Total process time):


111  
112  /* Save CAS data to DataBricks database table */
113  proc casutil outcaslib="Cdtspkcaslib" incaslib="Cdtspkcaslib";
NOTE: The UUID 'c5e703f1-0786-fe4d-b8df-19f813db3445' is connected using session MYSESSION.
114  load data=sashelp.cars casout="cars" replace;
NOTE: The INCASLIB= option is ignored when using the DATA= option in the LOAD statement.
NOTE: Executing action 'table.addTable'.
NOTE: Action 'table.addTable' used (Total process time):
NOTE: SASHELP.CARS was successfully added to the "CDTSPKCASLIB" caslib as "CARS".
115  save casdata="cars" casout="cars_sas"  replace;
NOTE: Executing action 'table.save'.
NOTE: Performing serial SaveTable action using SAS Data Connector to Spark.
NOTE: Cloud Analytic Services saved the file cars_sas in caslib CDTSPKCASLIB.
NOTE: Action 'table.save' used (Total process time):

 

Results:

 

06_UK_SASViya_SSO_Access_to_Azure_DataBricks_6.png

 

Important Links:

 

Authenticate to Databricks on Microsoft Azure by Using Single Sign-On

Scenario: OIDC with Microsoft Entra ID (Azure Active Directory)

LIBNAME Statement for Spark

 

 

Find more articles from SAS Global Enablement and Learning here.

