Many customers with SAS Visual Analytics 7.x deployments have expressed interest in loading data from other existing SAS environments. Portions of this topic have been covered in a YouTube video found here. This video explained that the SASIOLA engine can be used to remotely load data. However, a recent exchange with an international team working on a POC revealed some nuances that may not have been covered in the video or may not have been clear.
The Visual Analytics POC had a requirement that multiple users from an existing source environment be permitted to add tables from an EG or Studio session to a non-distributed LASR server. The implementation team ran into issues when users other than the account that started the LASR server attempted to load data. So this blog will review this scenario as well as look at an alternative. Some of this is old hat at this point, but hopefully there will be a few nuggets along the way.
First a brief recap of the setup methods.
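There are two approaches to authorizing access to the LASR server when using the SASIOLA engine: LASR authorization, which is managed through metadata, and LASR file permissions, which are managed through signature files on the server.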
As you might expect, if you go the LASR authorization route, the account used to access the LASR server must have proper permissions set on the LASR server, LASR library and related folder in metadata to perform updates. If these are set properly, then the user must set the metadata connection information in their program prior to allocating the LASR library using the SASIOLA engine.
When using LASR file permissions, the user must have appropriate server file permissions to load and access data in LASR. These permissions are managed using LASR signature files. The default server permission settings are determined by the umask setting for the account starting the LASR server. If these are set properly, then passwordless SSH keys must be established between the source and the target (LASR) system. The target can be either a distributed or a non-distributed LASR server.
The examples in the video referenced earlier were performed using the LASR file permissions method, and the attempts to load data used the same account that started the LASR server. This is the simple case.
For this POC the customer wanted to load data from multiple accounts running in their EG sessions using LASR file permissions. To this end, they established passwordless SSH for each account from the machine where the EG workspace server ran to the machine where the LASR server resided. They confirmed passwordless SSH was working and attempted to load data but it failed with the following misleading error. (The following is a recreation.)
Interestingly this is the same error that is displayed if SSH keys are not established. Notice that it appears that it wrote to the LASR library, but a PROC DATASETS reveals otherwise.
The first thing to check was whether the account attempting to write the dataset had write permission at the LASR signature directory level (e.g. _T_E875B271_7FFFF0987AF8). It was confirmed that the user could write to the directory. It was also confirmed that the user was in the same group (sasusers) as the lasradm account.
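For readers who want to repeat these two checks from a shell on the LASR host, here is a minimal sketch. It is not the procedure the implementation team used. The directory is a throwaway stand-in for the real signature directory, and the group is simply the current account's primary group standing in for sasusers; substitute your own values.

```shell
#!/bin/sh
# Hypothetical stand-ins: replace SIG_DIR with the real LASR signature
# directory and LASR_GROUP with the group of the account that started LASR.
SIG_DIR=$(mktemp -d)
LASR_GROUP=$(id -gn)

# Check 1: is the current account a member of the group?
if id -Gn | tr ' ' '\n' | grep -qx "$LASR_GROUP"; then
    in_group=yes
else
    in_group=no
fi

# Check 2: can the current account create files in the directory?
# Creating a file needs both write and search (execute) permission.
if [ -w "$SIG_DIR" ] && [ -x "$SIG_DIR" ]; then
    can_write=yes
else
    can_write=no
fi

echo "member of $LASR_GROUP: $in_group"
echo "can write to $SIG_DIR: $can_write"
```

Both checks passing only rules out the directory and the group membership. As the rest of this post shows, the server-level signature files still have to grant the group write access.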
And although not shown here, the LASR01 library allocated properly and it was possible to display the datasets. So intuition says it is related to permissions, even though the message doesn't really help.
A review of permissions on the LASR server signature files revealed they are set as follows.
The three LASR server signature file entries relate to read, write and administer at the server level respectively. The read bit, read vertically and highlighted below, is used to determine the permissions for actions for the owner, group and everyone.
So in this scenario the owner, lasradm, has permission to read, write and start/stop LASR (read vertically). However, the group (sasusers) and all other users do not have server write permission. Since the account attempting the load is in the same group as the owner (sasusers), the group read bit on the write signature file must be turned on before users in that group can write to the LASR server.
Enable the read bit of group permissions for the write signature file using the chmod command shown below. Got that? Then display the permissions to ensure it was enabled.
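Since the screenshot does not reproduce well in text, here is a rough sketch of the same idea in a scratch directory. The three file names are hypothetical placeholders, not the real signature file names, and the starting permissions are an assumption based on the description above (only the owner holds the write and administer bits).

```shell
#!/bin/sh
# Work in a throwaway directory so no real signature files are touched.
SIG_DIR=$(mktemp -d)

# Hypothetical names for the read, write and administer signature files.
READ_SIG="$SIG_DIR/sig_read"
WRITE_SIG="$SIG_DIR/sig_write"
ADMIN_SIG="$SIG_DIR/sig_admin"
touch "$READ_SIG" "$WRITE_SIG" "$ADMIN_SIG"

# Assumed starting point: the group can read from the server,
# but only the owner can write to it or administer it.
chmod 440 "$READ_SIG"
chmod 400 "$WRITE_SIG" "$ADMIN_SIG"

# Turn on the group read bit of the write signature file.
chmod g+r "$WRITE_SIG"

# Display the permissions to confirm the change.
ls -l "$SIG_DIR"
write_mode=$(ls -l "$WRITE_SIG" | cut -c1-10)
admin_mode=$(ls -l "$ADMIN_SIG" | cut -c1-10)
echo "write signature file is now: $write_mode"
```

Using the symbolic form `g+r` rather than an absolute mode changes only the one bit of interest and leaves the owner and other bits alone.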
Now if we attempt to load the dataset again it is successful. And upon review of the available signature files we see that three new signature files have been created for the dataset that was loaded.
And if PROC DATASETS is run, then we "see" that it is available.
In this example the write permission was enabled manually using the chmod command. Obviously this is not the preferred method as it will be reset the next time the LASR server is restarted. In order to modify these settings permanently it is necessary to change the umask for the account starting the LASR server. In a distributed LASR deployment this can be accomplished in the resource.settings file. For non-distributed LASR servers one way to modify the umask is to set it in the WorkspaceServer_usermods.sh script.
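To see why the umask matters, the sketch below shows the general effect a umask has on newly created files. It demonstrates ordinary shell behavior only. The values 077 and 027 are illustrative assumptions, not recommended settings, and how the LASR server maps the umask onto each signature file should be confirmed in the SAS documentation.

```shell
#!/bin/sh
# Scratch directory for the demonstration.
DEMO_DIR=$(mktemp -d)

# Each umask is set in a subshell so it does not leak into the caller.
(
    umask 077                      # mask off all group and other bits
    touch "$DEMO_DIR/restrictive"
)
(
    umask 027                      # leave the group read bit available
    touch "$DEMO_DIR/group_readable"
)

restrictive_mode=$(ls -l "$DEMO_DIR/restrictive" | cut -c1-10)
group_mode=$(ls -l "$DEMO_DIR/group_readable" | cut -c1-10)

echo "umask 077 gives: $restrictive_mode"
echo "umask 027 gives: $group_mode"
```

A file created under the more restrictive mask has no group bits at all, which is the same symptom that the manual chmod worked around. Setting a more permissive umask for the account that starts the LASR server addresses the cause instead.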
One other scenario the customer wanted to test was to use stored processes to update LASR. This is possible, but remember that the stored process server runs under the sassrv account. Therefore it will be necessary to ensure sassrv is in the same group as the account that starts the LASR server and that passwordless SSH is established for the sassrv account. If it doesn't work initially, try restarting the object spawner. Keep in mind that enabling this method opens the door for anyone who can run stored processes to add and delete LASR tables. Probably not an ideal situation.
Tables loaded in this manner are not registered in metadata. To use the table in Visual Analytics, register the table manually or write code to automate the process. And remember that loads are serial. They will only be as fast as the network pipe between the source and target environments.
The customer POC did not choose LASR authorization. And previous testing in the aforementioned video did not cover LASR authorization using the SIGNER option of the LIBNAME statement. So it seemed like a good time to test it.
Since this blog is already fairly long I will keep this short.
If each user is defined in metadata on the target VA system with appropriate authorization, one can simply load data using an example similar to the following. This example was executed on my workstation and the target system was a public LASR server in my test environment. The important point here is that SSH keys were not required.
Of course placing credentials in open code is not advisable. However, obfuscation or elimination of credentials in code is beyond the scope of this blog. The following screenshots show that the table BASEBALL was added to the remote LASR server.
Only brief testing was completed using LASR authorization. Additional testing is left to the reader. This method will likely be preferred when loading data remotely using the SASIOLA engine as it allows the customer to manage access via metadata and doesn't require passwordless SSH. However, like the file permissions method, this method does not register tables and as a result the user must do so manually to use the tables in VA.
Hopefully these examples highlighted some of the differences between the two methods of loading data remotely to a LASR server and revealed a few related tweaks. If you have comments or similar experiences from which others may benefit please add them below.
Security Scenario for SAS® Visual Analytics - SGF paper by Dawn Schrader
SAS Note 56996 - Tips for using the SAS® LASR™ Analytic Server Access Tools