By now SAS customers are aware of the announcements made at SGF 2016 regarding SAS Viya. As customers acquire the new software, questions will naturally arise about how to integrate the new Viya platform with their existing environments. To connect these distinct environments, SAS introduced the Viya bridge. In this blog we will review the steps required to establish and validate this bridge using SAS/CONNECT.
Those who have used SAS/CONNECT in the past will find most of the following content to be "old news". But if you have minimal experience or haven't tested with SAS/CONNECT in a while, you may find it useful to refresh your skills.
Since SAS/CONNECT is the mechanism used to communicate between environments, it stands to reason that SAS/CONNECT is required in both environments. SAS/CONNECT is not included with SAS Visual Data Mining and Machine Learning, so it will have to be licensed. If licensed, it will be included in the Ansible playbook and installed during deployment. Many current SAS 9.4 customers may already have licensed SAS/CONNECT in their existing environments. Those who have not will have to license SAS/CONNECT to use the bridge.
If the customer's existing SAS 9.4 environment does not contain SAS/CONNECT, they will need to add it by following these steps.
It should be noted that adding SAS/CONNECT to an existing customer environment will likely incur additional licensing fees.
Customers should also be aware of the time required to add SAS/CONNECT to an environment. This time should include requesting the order update, downloading the depot, updating the plan file and planning for the update of their current environment, including the downtime required and validation testing.
At this point SAS/CONNECT should exist in both environments and a SAS/CONNECT spawner should be running in both environments. Our test environment consisted of the following, where SAS/CONNECT was added to the SAS 9.4 environment.
You may have seen videos about the bridge in which the testing was done with DI Studio. This is one way to test if it is available, but since not every customer will have DI Studio, testing in this blog uses SAS Studio and code submitted directly. The following diagram shows the key components used during testing. Solutions based on the SAS 9.4 environment have many more components, which are omitted here for clarity. The LASR server is shown to indicate that it is the predecessor to CAS, but it was not used as part of testing.
The following tests were performed to ensure that the bridge is working properly.
It is a good idea to ensure the SAS/CONNECT server works within the VDMML environment before testing across environments. As noted earlier, our testing originates within SAS Studio. When SAS Studio starts, it requests a SAS workspace server via the object spawner. Once the workspace server has started, the code used to test SAS/CONNECT is executed there, and the workspace server in turn contacts the SAS/CONNECT spawner to initiate a SAS/CONNECT server using the signon statement. Once the SAS/CONNECT server is running, a portion of code is submitted to the server using rsubmit. This flow is shown in the diagram below.
Sample code used to test this flow is shown here.
This code simply performs a PROC CONTENTS of the SASHELP.CARS dataset, runs a simple DATA step to subset it and then uses PROC PRINT to print the output dataset. This obviously isn't too exciting. But it is a step that should be taken in a new VDMML deployment to validate the SAS/CONNECT server. Notice that this code looks the same as code that would be submitted to a SAS 9.4 environment.
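A minimal sketch of a test along the lines described above might look like the following. The host name, port, user ID, password, and the subsetting condition are placeholders chosen for illustration, not values from the original test; 17551 is assumed here as the SAS/CONNECT spawner port on the Viya machine and should be replaced with the port used in your deployment.

```sas
/* Hypothetical connection details: replace with your own values.  */
/* The macro variable name (viya) doubles as the remote session ID. */
%let viya=viyahost.example.com 17551;
%let conn_user=myuser;      /* placeholder user ID  */
%let conn_pw=XXXXXXXX;      /* placeholder password */

/* Ask the SAS/CONNECT spawner to start a SAS/CONNECT server */
signon viya user="&conn_user" password="&conn_pw";

/* Everything between RSUBMIT and ENDRSUBMIT runs in the SAS/CONNECT server */
rsubmit viya;
   proc contents data=sashelp.cars;
   run;

   /* Simple subset; the WHERE condition is an arbitrary example */
   data work.cars_suv;
      set sashelp.cars;
      where type = 'SUV';
   run;

   proc print data=work.cars_suv(obs=10);
   run;
endrsubmit;

/* End the SAS/CONNECT session */
signoff viya;
```

Nothing in this sketch is specific to Viya, which is the point of the test: only the host and port identify which environment is being exercised.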
If the connection is successful, you will see the following messages in the SAS log on the Viya machine. Note the SAS release number is V.03.00.
Now that we've verified the SAS/CONNECT server on Viya is working, we can test across environments by submitting code from SAS 9.4 to the VDMML machine to load data into CAS and perform some basic tasks on the data. In this scenario we start with SAS Studio on the SAS 9.4 environment, which initiates a connection to a workspace server via the object spawner. Then code submitted in the workspace server requests a connection on the VDMML machine via the SAS/CONNECT spawner. Once the SAS/CONNECT server is started a connection is established across machines and code may be submitted. Code submitted via rsubmit then requests to create a CAS library via a CAS session, uploads a table from SAS 9.4, and performs some basic analytics on the CAS table. The following diagram shows the servers involved and the initial flow of communications.
The sample code for this test can be found here.
The rsubmit portion of the code creates a CAS library, uploads the SASHELP.HEART dataset to the CAS server, and then executes PROC MDSUMMARY to perform some basic statistics on the CAS table. The output of this PROC is written back to the CAS server. Finally a PROC DATASETS is performed on the CAS library to show the datasets loaded in prior steps.
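The steps just described can be sketched roughly as follows, run from the SAS 9.4 workspace server. This is an illustration under stated assumptions rather than the code used in the original test: the host name, port, credentials, session name (mysess), libref (mycas), output table name, and the analysis variables are all placeholders, and the sketch assumes the CAS host and port are already configured for the SAS/CONNECT server on the Viya machine.

```sas
/* Hypothetical connection details for the Viya SAS/CONNECT spawner */
%let viya=viyahost.example.com 17551;
%let conn_user=myuser;      /* placeholder user ID  */
%let conn_pw=XXXXXXXX;      /* placeholder password */

signon viya user="&conn_user" password="&conn_pw";

/* Copy HEART to the local WORK library on the SAS 9.4 side */
data work.heart;
   set sashelp.heart;
run;

rsubmit viya;
   /* Start a CAS session and assign a CAS engine libref.          */
   /* If CAS host/port are not preconfigured, set them first with  */
   /* OPTIONS CASHOST= and CASPORT= (values depend on your site).  */
   cas mysess;
   libname mycas cas sessref=mysess;

   /* DATA= refers to the client (SAS 9.4) session, OUT= to the server side */
   proc upload data=work.heart out=mycas.heart;
   run;

   /* Basic statistics computed in CAS; results are written back to CAS */
   proc mdsummary data=mycas.heart;
      var cholesterol systolic diastolic;
      output out=mycas.heart_summary;
   run;

   /* List the tables in the CAS library */
   proc datasets lib=mycas;
   quit;
endrsubmit;

signoff viya;
```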
There are two key parts of this code to note: verifying that we can upload a SAS dataset from the local workspace session, and then performing CAS procedures on the data. For clarity, the HEART dataset was copied to the local WORK library. Here we show the log from the data upload.
Once the data is loaded into CAS, PROC MDSUMMARY is used to perform simple statistics on several fields and results are sent back to CAS. Notice the message indicating the CAS server processing time.
If we look at the CAS library we should see two datasets: one that was uploaded and one that was created via PROC MDSUMMARY.
By default you won't be able to view the uploaded dataset from a SAS Studio session on Viya. This is expected with uploaded datasets and is related to the scope of the dataset. The scope can be either global or session, and the default is session. Session-level scope means only the SAS/CONNECT session is aware of the uploaded dataset. To view the dataset from the Viya environment, you will need to add the PROMOTE= option to the OUT= table of PROC UPLOAD in the SAS 9.4 environment and then refresh the library in the Viya environment. A similar option is available with PROC CASUTIL as well. The code for PROC UPLOAD would look like this.
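As a hedged sketch, the upload with promotion could be written as below. It assumes an active SAS/CONNECT session named viya and a CAS engine libref named mycas already assigned on the Viya side; both names are placeholders. It also assumes the target caslib itself is visible to other sessions, since promoting a table does not change the scope of the caslib that holds it.

```sas
rsubmit viya;
   /* PROMOTE=YES is given as a data set option on the OUT= table,  */
   /* asking CAS to give the uploaded table global scope            */
   proc upload data=work.heart out=mycas.heart(promote=yes);
   run;
endrsubmit;
```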
For more information on the scope of sessions, please see Gerry Nelson's blog here.
If the prior tests worked, then it is expected that tests originating in the VDMML environment will be successful as well. As in earlier examples, testing begins in SAS Studio. SAS Studio initiates a connection to the object spawner, which starts a workspace server, and SAS Studio then interacts with that workspace server. Code submitted from SAS Studio to the workspace server then attempts to sign on to a SAS/CONNECT server in the SAS 9.4 environment. It also allocates a CAS library on the local CAS server and subsequently downloads data directly from the SAS/CONNECT server in the SAS 9.4 environment.
The sample code for this test can be found here.
In this test we ensure that we can download data from the SAS 9.4 environment into CAS and then perform some basic statistics using a CAS-based PROC. Similar to the prior test, the download that is performed writes directly to the CAS library. Once the data is loaded into CAS it is available for in-memory analytics. However, because the CAS library is local to the workspace server, analytic code interacting with the CAS datasets runs outside of the code submitted using rsubmit to the SAS/CONNECT server on the SAS 9.4 system. If we look at the output of PROC DATASETS we see the downloaded dataset and the summary dataset in the CAS library.
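A rough sketch of this reverse test, run from the Viya workspace server, is shown below. As with the earlier sketches, the host name, port (7551 is assumed for the SAS 9.4 spawner), credentials, session name, libref, table names, and analysis variables are placeholders rather than values from the original test.

```sas
/* Hypothetical connection details for the SAS 9.4 SAS/CONNECT spawner */
%let sas94=sas94host.example.com 7551;
%let conn_user=myuser;      /* placeholder user ID  */
%let conn_pw=XXXXXXXX;      /* placeholder password */

signon sas94 user="&conn_user" password="&conn_pw";

/* The CAS session and libref live in the local (Viya) workspace server */
cas mysess;
libname mycas cas sessref=mysess;

rsubmit sas94;
   /* DATA= refers to the SAS 9.4 server session, OUT= to the Viya client, */
   /* so the download writes directly into the CAS library                 */
   proc download data=sashelp.heart out=mycas.heart;
   run;
endrsubmit;

/* Analytics run locally, outside the RSUBMIT block */
proc mdsummary data=mycas.heart;
   var cholesterol systolic diastolic;
   output out=mycas.heart_summary;
run;

proc datasets lib=mycas;
quit;

signoff sas94;
cas mysess terminate;
```

Note the contrast with the earlier upload test: there the CAS statements sat inside the rsubmit block because CAS was on the remote side, whereas here CAS is local and only the download runs remotely.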
As a result we have successfully tested the bridge from a Viya environment to a SAS 9.4 environment.
As you can see, there is little that is new here. The mechanics of connecting to a remote machine are the same for SAS 9.4 and SAS Viya. Some of the technology behind the scenes may have changed for SAS Viya, but end users will see little difference. Becoming familiar with the new CAS statements and language will take some time, but most everything else should look familiar. Also remember that SAS/CONNECT must exist in both environments, so if it is not currently licensed in one of them, plan accordingly.