
Direct Access to JupyterHub through SASDrive

Started 07-24-2020
Modified 01-10-2022

With the shift toward the cloud, federated identity management and authentication standards such as SAML and OpenID Connect are becoming increasingly common. With these standards, a trusted identity provider validates a user’s credentials, so the credentials themselves are never available to the application. SAS Viya supports these authentication standards natively through the SAS Logon Manager (SASLogon) microservice, and Viya web clients can make secondary connections to CAS by using the OAuth token they receive from SASLogon during initial authentication.

However, open source programming clients that use the Viya SWAT libraries do not interact directly with the SASLogon microservice. Without a user’s credentials (a password or Kerberos ticket), connecting to CAS from these clients can be troublesome.

This blog post explores a solution to provide a Python/R SWAT interface to CAS using a Jupyter Notebook programming environment launched directly from SAS Drive, without the need for direct user credentials.

 

Jupyter Notebook & JupyterHub

 

Jupyter Notebook is an open-source web-based interactive programming environment allowing a user to create and share documents containing live code, equations, and visualizations. Jupyter Notebooks are a great way to leverage the power of SAS Viya/CAS using the Python or R SWAT packages. A SAS Kernel for Jupyter Notebooks also exists allowing a user to write, document, and submit SAS programming statements to SAS 9 workspaces or Viya compute sessions.

The JupyterHub project provides a way to scale a notebook to multiple users through a Hub that spawns, manages, and proxies multiple instances of a single-user Jupyter Notebook.

[Figure: JupyterHub architecture]

At a high level JupyterHub consists of the following pieces:

  • Proxy: The public-facing part of JupyterHub, which uses a dynamic proxy to route HTTP requests to the hub and single-user notebook servers.
  • Hub (Python/Tornado): Manages user accounts, authentication, and coordination of single-user notebooks.
  • Database: A database (SQLite by default) containing all of the state of the hub.
  • Authenticator: Controls access to JupyterHub. The default authenticator uses PAM, but authentication is highly extensible: a number of authenticator providers are supported, and custom authenticators can be created.
  • Spawner: Controls how JupyterHub starts the individual notebook server for each user. The default spawner starts a notebook server on the same machine, running under the user’s system username, but it is possible to launch notebooks in other ways. For example, it’s possible to launch a notebook on a Kubernetes cluster instead.
  • Single-User Notebook Server (Python/Tornado): A dedicated, single-user Jupyter Notebook server is started for each user when the user logs in.

 

Solution Overview

 

To provide access to CAS from Python/R SWAT without explicit user credentials, we can take advantage of JupyterHub’s extensibility and use an OAuth token to authenticate to CAS. SASLogon is an OAuth 2.0 identity provider, and JupyterHub can be extended with a custom OAuth authenticator. The authenticator can make the OAuth access and refresh tokens that SASLogon generates during authentication available to the end user’s notebook session as environment variables. In turn, these tokens are used with the OAuth authentication mechanism that the R and Python SWAT packages support on the binary connection to CAS.

At a high level the steps to implement this solution follow:

  1. Register JupyterHub as a SASLogon OAuth client supporting the authorization_code and refresh_token grant types.
  2. Enable CORS support in Viya and add JupyterHub as an allowed origin.
  3. Install the Python/R SWAT packages for JupyterHub.
  4. Install and configure an OAuth Authenticator for JupyterHub.
  5. Create a Quick Access link in SAS Drive to JupyterHub.
  6. Log into JupyterHub by clicking the Quick Access Link and connect to CAS by providing an OAuth token environment variable to the password field in the binary protocol connection constructor from your Jupyter Notebook session.

Note: This blog post does not cover the steps required to set up a JupyterHub server from scratch. We assume that a hub already exists and is configured with the default Authenticator.

 

Register JupyterHub as a SASLogon OAuth client application

 

Use the SAS Viya Consul token to obtain a SASLogon access token in order to register a new application:

As a user with sudo privileges, run the following commands on the server where Consul is running. Update VIYA_BASE_URL with the base URL used to access the Viya web applications.

export CONSUL_TOKEN=`sudo cat /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tokens/consul/default/client.token`
 
export VIYA_BASE_URL=https://viya.demo.sas.com
 
curl -k -X POST "$VIYA_BASE_URL/SASLogon/oauth/clients/consul?callback=false&serviceId=app" \
     -H "X-Consul-Token: $CONSUL_TOKEN"

 

The cURL command returns JSON similar to the following:

{"access_token":"eyJhbGciOiJSUzI1NiIsIm...","token_type":"bearer","expires_in":35999,"scope":"uaa.admin","jti":"de81c7f3cca645ac807f18dc0d186331"}

 

To make the token easier to use later, create an environment variable from the access_token value returned in the JSON.

export ACCESS_TOKEN="eyJhbGciOiJSUzI1NiIsIm..."
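Copying the token by hand is error-prone. As a minimal sketch, the access_token value can also be pulled out of the response with Python's json module. The response text below is simply the sample response shown above (including its truncated token), and the variable names are illustrative.

```python
import json

# Sample response copied from the cURL output above. The token is truncated
# there, so this string is for illustration only.
response_text = (
    '{"access_token":"eyJhbGciOiJSUzI1NiIsIm...","token_type":"bearer",'
    '"expires_in":35999,"scope":"uaa.admin",'
    '"jti":"de81c7f3cca645ac807f18dc0d186331"}'
)

response = json.loads(response_text)
access_token = response["access_token"]

# Build a line that can be pasted into the shell session.
export_line = f'export ACCESS_TOKEN="{access_token}"'
print(export_line)
```

In practice the response text would come from the curl command itself, for example by piping its output into a small script like this one.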

 

Use the access token to register a new client with the authorization_code and refresh_token grant types. Update JUPYTERHUB_BASE_URL with the base URL used to access JupyterHub, and replace client_id and client_secret with your own values.

export JUPYTERHUB_BASE_URL=https://jupyterhub.demo.sas.com
 
curl -k -X POST "$VIYA_BASE_URL/SASLogon/oauth/clients" \
       -H "Content-Type: application/json" \
       -H "Authorization: Bearer $ACCESS_TOKEN" \
       -d "{
        \"client_id\": \"your_client_id\", 
        \"client_secret\": \"your_client_secret\",
        \"scope\": [\"openid\"],
        \"authorized_grant_types\": [\"authorization_code\",\"refresh_token\"],
        \"redirect_uri\": \"$JUPYTERHUB_BASE_URL/hub/oauth_callback\",
        \"access_token_validity\": 1296000,
        \"autoapprove\": true
       }"
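The escaped quotes in the curl payload are easy to get wrong. The following is a minimal sketch of the same registration request built in Python, where json.dumps takes care of the quoting. The function name build_client_registration is hypothetical, the host names and client values are the placeholders used above, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_client_registration(viya_base_url, hub_base_url, access_token,
                              client_id, client_secret):
    """Build (but do not send) the SASLogon client registration request."""
    payload = {
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": ["openid"],
        "authorized_grant_types": ["authorization_code", "refresh_token"],
        "redirect_uri": f"{hub_base_url}/hub/oauth_callback",
        "access_token_validity": 1296000,  # seconds (15 days)
        "autoapprove": True,
    }
    return urllib.request.Request(
        url=f"{viya_base_url}/SASLogon/oauth/clients",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


request = build_client_registration(
    "https://viya.demo.sas.com",
    "https://jupyterhub.demo.sas.com",
    "eyJhbGciOiJSUzI1NiIsIm...",  # truncated sample token from above
    "your_client_id",
    "your_client_secret",
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it. Note that urlopen verifies the server certificate by default, whereas the curl commands above use -k to skip verification.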

 

Enable CORS support and add JupyterHub to allowedOrigins
 

  1. Log into SAS Environment Manager and assume administrator privileges. Note that you must have an account authorized as a SAS Administrator to perform these actions.
  2. Select Configuration on the main SAS Environment Manager page.
  3. Select Definitions from the View menu on the Configuration page.
  4. Select sas.commons.web.security.cors from the list of configuration definitions.
    • Click the New Configuration button if you are setting the CORS options for the first time. Or click the pencil icon to edit an existing CORS definition.
  5. Make sure that allowCredentials is enabled.
  6. Add the JupyterHub URL (e.g. https://jupyterhub.demo.sas.com) to allowedOrigins (or use * to accept all origins). If there is an existing value, append the new URL to the list, separated by a comma.
  7. Click Save to complete the configuration.
  8. Restart the SASLogon microservice. (sudo systemctl restart sas-viya-saslogon-default)
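A browser's Origin header contains only the scheme, host, and optional port, so the value added to the CORS allowed-origins list should not include a path or trailing slash. The helpers below are a small sketch for deriving that value from a JupyterHub URL and appending it to an existing comma-separated list. The function names and the other.demo.sas.com host are hypothetical, and this is not how Viya itself parses the setting.

```python
from urllib.parse import urlsplit


def to_origin(url):
    """Reduce a URL to its origin: scheme://host[:port], with no path."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}"


def join_origins(existing, new_url):
    """Append the origin of new_url to a comma-separated list, only once."""
    origins = [o.strip() for o in existing.split(",") if o.strip()]
    origin = to_origin(new_url)
    if origin not in origins:
        origins.append(origin)
    return ",".join(origins)


# Hypothetical existing value plus the JupyterHub URL from this post.
print(join_origins("https://other.demo.sas.com",
                   "https://jupyterhub.demo.sas.com/hub/home"))
```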

 

Install Python/R SWAT packages for JupyterHub

 

Installation instructions for the SWAT packages are available on each package’s GitHub page (the python-swat and R-swat repositories under the sassoftware organization).

To make packages available to JupyterHub users, you generally install packages system-wide or in a shared environment, depending on how your Hub is configured. See the JupyterHub documentation for details specific to your configuration:  https://jupyterhub.readthedocs.io/

As an example, if you installed JupyterHub at /opt/jupyterhub/ as root, installing the Python SWAT package is as simple as the following command:

sudo /opt/jupyterhub/bin/pip3 install swat

 

Install/Configure an OAuth Authenticator for JupyterHub

 

Install the JupyterHub OAuthenticator package.

sudo /opt/jupyterhub/bin/pip3 install oauthenticator

 

The package is available from GitHub at https://github.com/jupyterhub/oauthenticator and is installed with pip. The full documentation is available at https://oauthenticator.readthedocs.io/en/latest/. We will use the generic authenticator (the GenericOAuthenticator class) to integrate with SASLogon.

Next, add the code located on this sascommunities GitHub page to jupyterhub_config.py. Place the code under the comment section starting with “## Class for authenticating users”, taking care to preserve indentation. The code configures the GenericOAuthenticator class to use SASLogon as its identity provider.

This step also enables authentication state (auth state), which allows the Authenticator to persist authentication-related state in the internal JupyterHub database. Because auth state can contain sensitive information, it is encrypted before being stored, so an encryption key must be set (JupyterHub uses Fernet from the Python cryptography package for encryption).
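As a sketch of one way to produce such a key, the snippet below generates 32 random bytes, hex-encoded, which is the form JupyterHub accepts in its JUPYTERHUB_CRYPT_KEY environment variable. Whether the referenced configuration code sets the key this way or through another mechanism is not shown here, so treat this as an assumption rather than the exact method used.

```python
import secrets

# 32 random bytes, hex-encoded (64 hexadecimal characters).
crypt_key = secrets.token_hex(32)

# Export this in the environment of the JupyterHub process.
print(f"export JUPYTERHUB_CRYPT_KEY={crypt_key}")

# In jupyterhub_config.py, auth state persistence is switched on with:
#   c.Authenticator.enable_auth_state = True
```

Generate the key once and keep it stable, since auth state that was encrypted with a previous key can no longer be decrypted after the key changes.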

The pre_spawn_start method is a hook called before spawning a user’s notebook to pass state information. This method retrieves the OAuth access token from the auth state and passes it to the spawner environment as an environment variable.
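The logic of such a hook can be sketched as follows. This is not the code from the referenced GitHub page. It only illustrates the idea, using JupyterHub's user.get_auth_state() coroutine and the spawner.environment dictionary. FakeUser and FakeSpawner are hypothetical stand-ins so that the sketch runs without a JupyterHub installation, and the ACCESS_TOKEN variable name matches the connection examples later in this post.

```python
import asyncio


async def pass_token_to_spawner(user, spawner):
    """Logic for the body of Authenticator.pre_spawn_start(user, spawner)."""
    auth_state = await user.get_auth_state()
    if not auth_state:
        # Auth state is disabled, or nothing was stored at login.
        return
    # Expose the access token to the single-user notebook server.
    spawner.environment["ACCESS_TOKEN"] = auth_state["access_token"]


class FakeUser:
    """Hypothetical stand-in for a JupyterHub user object."""

    def __init__(self, auth_state):
        self._auth_state = auth_state

    async def get_auth_state(self):
        return self._auth_state


class FakeSpawner:
    """Hypothetical stand-in for a JupyterHub spawner object."""

    def __init__(self):
        self.environment = {}


spawner = FakeSpawner()
asyncio.run(pass_token_to_spawner(FakeUser({"access_token": "sample-token"}), spawner))
print(spawner.environment)
```

In a real configuration this logic would live in a pre_spawn_start method on a subclass of the authenticator class.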

The refresh_user method refreshes the user’s OAuth tokens and updates the auth state if the token has expired. The method runs before a notebook is spawned (because refresh_pre_spawn = True) and every 6 hours (auth_refresh_age = 21600). However, the tokens from the auth state are passed to the notebook server only once, through the pre_spawn_start method, when the spawner launches the server. Therefore, the token in a notebook session can expire if the notebook server runs longer than the access token validity period.
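For context, refreshing a token in OAuth 2.0 is a POST to the token endpoint with grant_type=refresh_token, authenticated with the client ID and secret. The sketch below builds such a request against the SASLogon token endpoint without sending it. It illustrates the standard refresh grant, not necessarily what the referenced refresh_user code does, and the function name and sample values are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_refresh_request(viya_base_url, client_id, client_secret, refresh_token):
    """Build (but do not send) an OAuth 2.0 refresh_token grant request."""
    body = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode("ascii")
    # The client authenticates with HTTP Basic: base64("client_id:client_secret").
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url=f"{viya_base_url}/SASLogon/oauth/token",
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


refresh_request = build_refresh_request(
    "https://viya.demo.sas.com",
    "your_client_id",
    "your_client_secret",
    "sample-refresh-token",  # placeholder, not a real token
)
print(refresh_request.full_url)
print(refresh_request.data.decode("ascii"))
```

The JSON response to a successful request of this kind carries a new access_token, which the authenticator would then store back into the auth state.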

When the JupyterHub client was registered via curl, the access token validity was set to 1296000 seconds (15 days). Implement a cull_idle_servers script to make sure that users do not leave behind notebook servers older than 15 days.
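The arithmetic behind that limit, and the age check a culling job would need, can be sketched as follows. This is not the cull_idle_servers script itself, and the function name outlived_token is hypothetical. It only shows how a server's start time compares with the registered token validity.

```python
from datetime import datetime, timedelta, timezone

ACCESS_TOKEN_VALIDITY = 1296000  # seconds, as registered with SASLogon

# 1296000 seconds / 86400 seconds per day = 15 days
validity_days = timedelta(seconds=ACCESS_TOKEN_VALIDITY).days
print(validity_days)


def outlived_token(started, now, validity=ACCESS_TOKEN_VALIDITY):
    """Return True if a server started at `started` has run for at least
    `validity` seconds, so the token it was given at spawn has expired."""
    return (now - started) >= timedelta(seconds=validity)


now = datetime(2022, 1, 20, tzinfo=timezone.utc)
print(outlived_token(datetime(2022, 1, 1, tzinfo=timezone.utc), now))   # 19 days old
print(outlived_token(datetime(2022, 1, 10, tzinfo=timezone.utc), now))  # 10 days old
```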

Make sure to replace the references of viya.demo.sas.com with the FQDN used to access SAS Viya and jupyterhub.demo.sas.com with the FQDN for JupyterHub. Also, update the client_id and client_secret with the values you registered earlier. Otherwise the code can remain unchanged.

If you receive a CERTIFICATE_VERIFY_FAILED error, it most likely means that the CA certificate used to sign the Viya HTTP server certificate is not available to Python. You can either uncomment the “c.GenericOAuthenticator.tls_verify = False” line in the code, or add the CA certificate to the JupyterHub server’s truststore.

 

Create Quick Access Link in SAS Drive to JupyterHub

[Figure: JupyterHub console]

 

 

  1. Log into SAS Drive and assume the SAS Administrator role.
  2. In the top left-hand corner click New, then Link.
  3. Create the link
    1. Add the JupyterHub server to the URL, e.g. https://jupyterhub.demo.sas.com.
    2. Add a meaningful label, like JupyterHub.
    3. Choose a location to save the link. By default, “My Folder” is selected. You can change this to a location under SAS Content that is available to other users. Be aware of the authorization rules at the location you choose to save the link. If saved in Public, then all users have access to edit or delete the link by default.
    4. Optionally, you can upload an image, like the Jupyter Logo, that displays in the created tile.
    5. Click OK to create the link.
  4. Right-click the newly created link and select Administer, then Quick Access…
  5. From here, select any users or groups in your environment. Clicking the person icon brings up a menu to search for and select existing users and groups.
  6. After selecting users or groups, click Update Quick Access. The link becomes available to the selected users and to members of the selected groups. Note that if users are added to a selected group at a later time, the link is not distributed to those users automatically.

 

If you saved the link to a location in SAS Content that is available to other users, they can navigate to that location from the “All” tab in SAS Drive. From there, they can pin the link to their own Quick Access area, which is useful if you do not wish to administer every user’s Quick Access links.

 

Log into JupyterHub and connect to CAS

 

Authentication to JupyterHub is automatic after clicking the Quick Access link from SAS Drive.

You may also log into JupyterHub directly, without going through SAS Drive. In that case, if you are not already authenticated to SAS Viya, you are redirected to SASLogon (and then to any downstream identity providers, if applicable). Once authenticated to SAS Viya, you are redirected back to JupyterHub.
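Under the hood, this redirect is the start of the OAuth 2.0 authorization code flow. As a rough sketch, the URL that JupyterHub sends the browser to looks like the one built below. The function name is hypothetical, the state value is a placeholder for the random value the authenticator generates, and the exact set of query parameters sent by the authenticator may differ.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorize_url(viya_base_url, hub_base_url, client_id, state):
    """Build an authorization code request URL for SASLogon."""
    query = urlencode({
        "response_type": "code",  # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": f"{hub_base_url}/hub/oauth_callback",
        "state": state,  # echoed back so the hub can match the response
    })
    return f"{viya_base_url}/SASLogon/oauth/authorize?{query}"


authorize_url = build_authorize_url(
    "https://viya.demo.sas.com",
    "https://jupyterhub.demo.sas.com",
    "your_client_id",
    "placeholder-state",
)
print(authorize_url)
```

After the user authenticates, SASLogon redirects the browser to the redirect_uri with a code parameter, which the hub exchanges for the access and refresh tokens.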

To connect to CAS, supply the OAuth token environment variable from the pre_spawn_start method to the password field in the binary protocol connection constructor.

 

Python Example

import os, swat
conn = swat.CAS('cas.demo.sas.com', 5570, password=os.environ.get('ACCESS_TOKEN'))
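Because the token is handed to the notebook server only once, at spawn time, it can be useful to check it before connecting. The sketch below assumes the access token is a JWT (the sample token earlier in this post has the typical eyJ... prefix) and reads its exp claim without verifying the signature. The helper names are hypothetical, and swat is imported lazily so the checks can run on their own.

```python
import base64
import json
import os
import time


def jwt_expiry(token):
    """Return the 'exp' claim (Unix seconds) of a JWT, without verifying it."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"]


def get_cas_token(environ=os.environ, now=None):
    """Return the token from ACCESS_TOKEN, or raise a descriptive error."""
    token = environ.get("ACCESS_TOKEN")
    if not token:
        raise RuntimeError("ACCESS_TOKEN is not set in this notebook session")
    now = time.time() if now is None else now
    if jwt_expiry(token) <= now:
        raise RuntimeError("ACCESS_TOKEN has expired; restart the notebook server")
    return token


def connect(host="cas.demo.sas.com", port=5570):
    """Connect to CAS with the token, as in the example above."""
    import swat  # imported here so the helpers above work without swat
    return swat.CAS(host, port, password=get_cas_token())
```

Calling connect() in a notebook then either returns a CAS connection or fails with a message that points at the token rather than at CAS.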

 

R Example

library(swat)
conn <- swat::CAS('cas.demo.sas.com', port=5570, password=Sys.getenv('ACCESS_TOKEN'))

 

If you have the CASCLIENTDEBUG environment variable set, you should see messages similar to the following in the log:

NOTE: Client is using the oauth identity provider
NOTE: Sent challenge length 1212
NOTE: Received response length 55
NOTE: User viyademo connected to CAS using OAuth 2.

 

Conclusion

In cloud-based environments with federated authentication mechanisms, users may not have credentials available to authenticate to the CAS server from SWAT clients. Using OAuth tokens generated by SASLogon may be the only option available. The JupyterHub authenticator relieves the programmer of having to generate an OAuth token.

In addition, integrating JupyterHub with SASLogon and SAS Drive is a great way to provide a portal to a programming environment for users who wish to use the power of the CAS analytic engine from a more familiar programming language, in the same area where all of their SAS content and reports are available and organized.

 

 
