How to Configure a SAS/ACCESS Interface in SAS Viya 4

As you probably know if you have started working with the new SAS Viya (2020.1 and later), configuring the platform in this version is often very different from what you knew in Viya 3.

Many configuration tasks involve changes in various places of the Viya Kubernetes manifest (site.yaml). This is true of the simplest, "standard" tasks (changing the default SASWORK or CAS Disk Cache location, increasing the number of CAS workers, and so on) as well as product-specific ones (configuring a SAS/ACCESS Interface, the QKB, or MAS to support Analytic Stores, enabling the Embedded Process, and so on).

The currently supported way to implement these changes is to use Kustomize.

Many of these configuration tasks also imply system or infrastructure changes (storage, network configuration, environment variables). While these changes were quite natural in a bare OS environment (if you had basic Linux system skills), they can appear much more complex in a Kubernetes world, especially if you are new to it.

Configuring a SAS/ACCESS engine is a great illustration of that. So, let’s take a closer look at this specific configuration (with an explanation and even a demo video).

Basics of the SAS/ACCESS configuration

As a reminder, with the new SAS Viya (2020.1 and later) there is no need to ensure that your order contains the individual SAS/ACCESS engines: they are no longer licensed separately and are included in offerings as applicable.

The diagram below shows, at a high level, an example of how data can be loaded into CAS from a remote relational database (such as Oracle or DB2).


The CAS containers are running in Pods inside the Kubernetes cluster.

As we can see in this case of CAS serial loading, the database clients must be available "somewhere" for the CAS Controller pod. The CAS controller pod also needs to be able to reach the IP address of the remote data store that is “outside” of the cluster.

Note: This diagram only shows the case of CAS serial loading. For CAS multi-node loading with the Data Connector, or for a legacy SAS/ACCESS-based connection to the data from the Compute server, we would need to give more nodes in the Kubernetes cluster access to the storage and to the remote database.

So, for most of the SAS/ACCESS Interface engines, to make our connection to the remote data source work we’ll have to:

Provision persistent storage to hold our database client and config files, ODBC drivers, etc.
Set up some environment variables (like ORACLE_HOME or ODBC_INST)
Ensure that the containers running in the pod can resolve the database server hostname on the network
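
To make the first item more concrete, here is a minimal sketch of a Kustomize PatchTransformer that mounts an NFS share holding the database client into the CAS pods. It assumes that the CAS server is described by a CASDeployment custom resource and that its pod template lives under /spec/controllerTemplate. The NFS server name, export path, mount path, and volume name are all hypothetical. The sample files referenced in the README remain the authoritative version.

```yaml
# Sketch only: every name, path, and server below is a hypothetical placeholder.
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-db-client-mount            # hypothetical transformer name
patch: |-
  # Declare the volume that holds the database client files
  - op: add
    path: /spec/controllerTemplate/spec/volumes/-
    value:
      name: db-client-access           # hypothetical volume name
      nfs:
        server: nfs.example.com        # hypothetical NFS server
        path: /export/access-clients   # hypothetical export path
  # Mount that volume into the first container of the CAS pod
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/volumeMounts/-
    value:
      name: db-client-access
      mountPath: /access-clients       # hypothetical mount point
target:
  # Assumption: CAS is managed through this custom resource
  group: viya.sas.com
  version: v1alpha1
  kind: CASDeployment
  name: .*
```

If SAS/ACCESS is also used from SAS programs, the Compute server pods would need an equivalent mount, which the README covers with its own patch.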

The table below shows how we would do that in Bare OS and how we need to do the same thing in a Kubernetes environment.

Note: The little "K" icon signals that Kustomize is used to perform the configuration task.

Hopefully, by now you have a good idea of the kind of work that is required to configure your SAS/ACCESS engine for SAS Viya (2020.1 and later). Let’s see how it is done for a specific SAS/ACCESS Interface.

(*) See this article from Rob Collum for additional information on HostAliases.
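
As an illustration of the HostAliases approach, here is a hedged sketch of a PatchTransformer that adds a static host entry to the CAS pods, so that the database hostname resolves even when it is not registered in DNS. The hostAliases field itself is standard Kubernetes pod spec. The target resource and patch path are assumptions about how the CAS pod template is exposed, and the IP address and hostname are made up.

```yaml
# Sketch only: the IP address and hostname are hypothetical.
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-db-host-aliases            # hypothetical transformer name
patch: |-
  # Adds an /etc/hosts style entry to the pods created from this template
  - op: add
    path: /spec/controllerTemplate/spec/hostAliases
    value:
      - ip: "10.0.0.10"                # hypothetical database server IP
        hostnames:
          - "oradb.example.com"        # hypothetical database hostname
target:
  # Assumption: CAS is managed through this custom resource
  group: viya.sas.com
  version: v1alpha1
  kind: CASDeployment
  name: .*
```

Note that this "add" operation sets the whole hostAliases list, so it would replace any entries already defined at that path.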

Find the configuration instructions in the doc

When you need to configure something in Viya (whether it is a SAS/ACCESS Interface or something else), you must first determine whether the change is required at the Kubernetes level or at the Viya server, service, or application level.

There are various sources of documentation:

Deployment Guide
README files (either in markdown or HTML format in $deploy/sas-bases/README.md or $deploy/sas-bases/docs/index.htm)
Administration Guide

Some "Kubernetes level" configuration (or customization) instructions are provided directly in the Deployment Guide; others can be found in the README files (included in the Deployment Assets .tgz file).

For "application level" configurations, you can consult the Administration Guide. For example, if you want to know how to configure an autoexec script for the Compute server, you’ll find the instructions there. Instead of using Kustomize, you will directly edit a value in SAS Environment Manager or use the Command-Line Interface (CLI).

So, if you are not too sure of the "level" of your change, it is probably better to consult and search in both documentation sources.

Regarding the configuration of the SAS/ACCESS engine, you can find the detailed information in the README files, or more precisely in sas-bases/docs/configuring_sasaccess_and_data_connectors_for_sas_viya_4.html if you want to see the HTML version.

As you can see, you can find all the instructions corresponding to the specific SAS/ACCESS engine that you want to configure.

"How to": configure SAS/ACCESS to ORACLE

Here is a demo video showing the steps of the SAS/ACCESS configuration and validation: SAS Demo | How to Configure a SAS/ACCESS Interface in SAS Viya 4

What if I want to configure SAS/ACCESS to Hadoop?

The process to configure SAS/ACCESS to Hadoop is quite similar. The common steps are:

1. Collect the database clients and place them in a shared persistent storage.
2. Make sure we can contact the Database.
3. Create Kustomize PatchTransformers to reference the database client location.
4. Set the environment variables (create and reference ConfigMaps with Kustomize).
5. Optionally, configure database Host resolution for the pods (reference HostAliases with Kustomize).
6. Restart CAS, and if needed force the Compute server users to restart their sessions.
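
The Kustomize steps in this list come together in the kustomization.yaml file. The fragment below is a sketch of what the relevant entries could look like for an Oracle client. It assumes that the base manifests already define a ConfigMap named sas-access-config (hence behavior: merge), and that the PatchTransformers were saved in the two hypothetical files listed under transformers. The ORACLE_HOME value is a made-up path on the mounted volume. Take the real names and paths from the README.

```yaml
# kustomization.yaml (fragment to merge into your existing file, not a complete one)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - sas-bases/base                     # existing entries omitted

configMapGenerator:
  # Assumption: a ConfigMap with this name already exists in the base manifests,
  # so the values below are merged into it rather than creating a new one.
  - name: sas-access-config
    behavior: merge
    literals:
      - ORACLE_HOME=/access-clients/oracle/instantclient   # hypothetical path

transformers:
  # Hypothetical files holding the volume mount and HostAliases patches
  - site-config/data-access/cas-db-client-mount.yaml
  - site-config/data-access/cas-db-host-aliases.yaml
```

After editing the file, the manifest is rebuilt (for example with kustomize build -o site.yaml) and applied to the cluster before restarting CAS.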

The main differences for the connection to Hadoop are:

You need to download the Hadoop tracer from the SAS Support FTP Server and run it inside the Hadoop cluster to collect the database clients (Hadoop JARs and config files).
You don’t need to fiddle with the ConfigMapGenerator in Kustomize, as the required environment variables (SAS_HADOOP_JAR_PATH and SAS_HADOOP_CONFIG_PATH) can be set directly in the SAS program (or an autoexec), so you can skip the environment variable step (step 4).
If you need to implement the HostAliases because the Hadoop server names are not registered in the corporate DNS, you have to do it for both the Hive Server and the HDFS Server.

Conclusion

Whatever the specific SAS/ACCESS Interface you are configuring, the main steps mostly remain the same, sometimes with a few variations (as we have seen for SAS/ACCESS to Hadoop).

For example, several SAS/ACCESS engines won’t require the setup of a separate volume for the database clients or the addition of a ConfigMap, because they use Progress DataDirect ODBC drivers that are already included in the Viya installation (refer to the README files for additional details).

We have seen, in this article, a good example of how a specific configuration is done in SAS Viya (2020.1 and later). Sometimes you “only” have to use Environment Manager or use the CLI to change the value of a property in the SAS Configuration Server (Consul).

But quite often, you will also have to implement changes at the Kubernetes level, such as creating a new ConfigMap, provisioning a new persistent volume and attaching it to the Viya pods, creating a new load-balancer service, etc.

That's why Installation Engineers and Administrators who have the opportunity to ramp up their Kubernetes and Kustomize skills will feel much more comfortable with these kinds of configurations in SAS Viya (2020.1 and later).

Thanks for reading!