
How to resolve failures related to SSH authentication in the SAS® HPA Environment

by SAS Employee alexal ‎07-14-2017 08:28 AM - edited ‎07-14-2017 08:30 AM (1,278 Views)

Symptoms

 

The SAS High-Performance Analytics Environment provides the framework for the distributed SAS® LASR™ Analytic Server, as well as SAS® High-Performance procedures running in distributed mode, such as PROC HPDS2, PROC HPSUMMARY, and PROC HPSAMPLE. The SAS High-Performance Analytics Environment software is contained in the TKGrid directory on each node in the distributed environment. Some SAS administrators might be more familiar with the directory name TKGrid than with the proper name SAS High-Performance Analytics Environment.

 

The processes that make up a distributed SAS LASR Analytic Server or high-performance procedure are started on each node using Secure Shell (SSH). The SSH connections must be established without the need for the user to enter a password. This is often referred to as passwordless SSH. There are many ways to facilitate secure passwordless SSH logins, such as using public-private key pairs or GSSAPIAuthentication (based on Kerberos).

 

If SAS LASR Analytic Server or high-performance procedures are run by users who cannot perform passwordless SSH logins, both from SAS to the SAS High-Performance Analytics Environment and between all nodes in that environment, errors appear in the log files. The examples below show typical errors.

 

Example 1: Problem occurs when starting LASR with PROC LASR CREATE or when executing a high-performance procedure such as PROC HPDS2

 

ERROR: Failed to enumerate available compute nodes in the distributed computing environment.
ERROR: Failed to open TKGrid library.
ERROR: The bridge for SAS High-Performance Analytics encountered an internal error.


Example 2: Problem occurs when attempting to load data to an already started SAS LASR Analytic Server with PROC LASR ADD


ERROR: Failed to load analytic extension for the distributed computing environment.
ERROR: Unable to send machine list to: hpa.example.com

 

Example 3: Problem occurs when starting the SAS® Visual Analytics LASR Monitor


[EXCEPTION: class java.io.IOException: null]
ERROR: Monitor thread failed to start due to configuration errors.
ERROR: No I/O produced by the grid monitor, check your SSH key configuration.

 

Example 4: Problem occurs when using PROC IMSTAT or LIBNAME SASIOLA


ERROR: Failed to load the SAS LASR Analytic Server access extension in the distributed computing environment.
ERROR: The server-side process that communicates with the LASR Analytic Server could not be established. It is not possible to add tables through this LIBNAME.

Diagnosis

 

When a SAS procedure (such as PROC LASR or high-performance procedures like PROC HPDS2) runs in distributed mode, SAS uses SSH to connect to the SAS High-Performance Analytics/TKGrid head node. The SAS High-Performance Analytics/TKGrid head node then uses SSH to connect to the worker nodes. Finally, one random node uses SSH to connect to the other nodes to start processes needed for the SAS procedure.


Because of this randomly chosen node, we need passwordless SSH authentication between all SAS High-Performance Analytics/TKGrid nodes.


Do Not Rely on Passwordless SSH Authentication from Head Node to All Worker Nodes

 

It is often assumed that passwordless SSH authentication from the head node to all worker nodes is enough to run distributed SAS procedures. It is not. Every node must be able to make a passwordless connection to every other node.
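A single hop can be checked from a script without the risk of hanging at a password prompt. The following is a minimal sketch, assuming OpenSSH is the SSH client. The function name check_passwordless and the SSH_BIN override are illustrative only and are not part of SAS or TKGrid.

```shell
# check_passwordless HOST
# Succeeds only if an SSH login to HOST completes without any prompt.
# SSH_BIN is a hypothetical override for the client (defaults to "ssh").
check_passwordless() {
    # BatchMode=yes makes OpenSSH fail instead of asking for a password
    # or passphrase; ConnectTimeout bounds the wait for unreachable hosts.
    ${SSH_BIN:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example (not run here):
#   check_passwordless node0.example.com && echo "passwordless login OK"
```

A nonzero exit status means that the login needed a prompt or failed for another reason, so this check only narrows the problem down to one hop.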


Use the steps below to help you determine whether passwordless SSH authentication is working as needed between all nodes.

 

  • Open a command line on the SAS compute server (the system where SAS code is to be executed, such as the Base SAS® system or the WorkspaceServer system). Change to the user that is encountering errors of the sort described above.
  • Use SSH to log on to the SAS High-Performance Analytics/TKGrid head node. Do this even if the head node is the same system as the SAS compute server. SAS performs an SSH login regardless of whether TKGrid is co-located with the compute server system.

    ssh node0.example.com

    You should authenticate without any prompt for a password.
  • Now, on the SAS High-Performance Analytics/TKGrid head node, run the command below. Edit only the path to TKGrid as necessary for your environment. The command calls simsh twice: the outer simsh runs the inner simsh on every node, and the inner simsh runs the hostname command on every node. The result is a nested SSH loop that connects from every TKGrid node to every TKGrid node.

    /opt/TKGrid/bin/simsh /opt/TKGrid/bin/simsh hostname

    This also needs to complete without any errors or password prompts. Successful output looks similar to this:

    node0: node2: node2.example.com
    node0: node3: node3.example.com
    node0: node1: node1.example.com
    node0: node0: node0.example.com
    node1: node2: node2.example.com
    node1: node3: node3.example.com
    node1: node1: node1.example.com
    node1: node0: node0.example.com
    node2: node2: node2.example.com
    node2: node3: node3.example.com
    node2: node1: node1.example.com
    node2: node0: node0.example.com
    node3: node2: node2.example.com
    node3: node3: node3.example.com
    node3: node1: node1.example.com
    node3: node0: node0.example.com

In each line, the node in the first column connected to the node in the second column, and the third column is the output of the hostname command on the node that was connected to. This is a quick way to confirm that SSH can connect from each node to every other node without supplying a password.
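On larger grids, scanning this output by eye is error-prone. The sketch below reads nested simsh output on standard input and prints every source and destination pair that did not report a hostname. It assumes that the first two columns carry the same short node names that you pass as arguments, as in the sample output above. missing_pairs is a hypothetical helper, not a TKGrid tool.

```shell
# missing_pairs NODE...
# Reads lines such as "node0: node2: node2.example.com" on stdin and
# prints "src -> dst" for each expected pair that is absent.
missing_pairs() {
    awk -v nodes="$*" '
        BEGIN { n = split(nodes, list, " ") }
        # Record a pair when the first two fields both end with a colon.
        NF >= 3 && $1 ~ /:$/ && $2 ~ /:$/ {
            src = $1; dst = $2
            sub(/:$/, "", src); sub(/:$/, "", dst)
            seen[src SUBSEP dst] = 1
        }
        END {
            for (i = 1; i <= n; i++)
                for (j = 1; j <= n; j++) {
                    key = list[i] SUBSEP list[j]
                    if (!(key in seen)) print list[i] " -> " list[j]
                }
        }'
}

# Example (not run here):
#   /opt/TKGrid/bin/simsh /opt/TKGrid/bin/simsh hostname |
#       missing_pairs node0 node1 node2 node3
```

No output means that every listed pair appeared. Any pair that is printed is a connection to test by hand.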


If you are prompted for a password or see other errors while executing these steps, proceed to the steps below.

Solution

As noted above, passwordless SSH authentication can be set up in various ways, but SSH key pairs are the most common. The steps below show one example of setting up keys for passwordless SSH login.

 

  • Run this command to create an SSH key pair with an empty passphrase (the -N "" option) and store it in your home directory.

    ssh-keygen -q -t rsa -N "" -f ~/.ssh/id_rsa

 

  • Copy the public key to the authorized_keys file.

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

  • Ensure that only the user has access to the authorized_keys file.

    chmod 600 ~/.ssh/authorized_keys

 

  • Copy the ~/.ssh directory to all nodes. This example assumes that your nodes are listed in /etc/gridhosts and the user's home directory is in /home/$USER. Edit as necessary.

    for i in $(cat /etc/gridhosts); do scp -r ~/.ssh "$i:/home/$USER"; done

    Enter the user's password each time scp prompts for it, once for each node listed in /etc/gridhosts.

 

  • If your SAS compute server is on a separate system from the TKGrid head node, copy the ~/.ssh directory there as well so that SAS can make the initial connection to the TKGrid head node.
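The authorized_keys steps above append the key every time they are run. The following is a minimal sketch of a version that is safe to re-run, assuming a POSIX shell and a standard grep. install_pubkey is a hypothetical helper name. It also sets the .ssh directory to mode 700, which the steps above do not do. That is a conventional setting, added here as an assumption, because sshd can reject key logins when these paths are writable by other users.

```shell
# install_pubkey PUBKEY_FILE SSH_DIR
# Appends the public key to SSH_DIR/authorized_keys only if it is not
# already present, and tightens permissions on the directory and file.
install_pubkey() {
    pub=$1
    dir=$2
    mkdir -p "$dir" && chmod 700 "$dir" || return 1
    touch "$dir/authorized_keys" && chmod 600 "$dir/authorized_keys" || return 1
    # -F fixed string, -x whole-line match, -f read the pattern from the file
    grep -qxF -f "$pub" "$dir/authorized_keys" ||
        cat "$pub" >> "$dir/authorized_keys"
}

# Example (not run here):
#   install_pubkey ~/.ssh/id_rsa.pub ~/.ssh
```

Running the helper twice leaves a single copy of the key, which keeps authorized_keys tidy when the setup is repeated.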

After these steps are complete, the steps in the "Diagnosis" section above should succeed without any password prompts or errors. Once this is confirmed, retry the previously failing SAS LASR Analytic Server or SAS High-Performance Analytics code or task.

 

Using GRIDRSHCOMMAND to Specify an External SSH Executable


In some environments, errors might continue even after you have verified successful passwordless SSH authentication between all nodes. This is most likely to occur in environments in which non-RSA SSH key types are used, key pairs are not stored in $HOME/.ssh, or SSH authentication is performed using GSSAPIAuthentication (based on Kerberos). In these cases, the environment variable GRIDRSHCOMMAND might be needed to specify an external SSH executable, instead of using the SAS built-in SSH module. The environment variable should point to the SSH executable that you want to use and pass options that suppress banner messages and warnings about host key validation, as shown below.


Example of setting GRIDRSHCOMMAND with an OPTIONS statement in SAS code:


options set=GRIDRSHCOMMAND="/usr/bin/ssh -q -o StrictHostKeyChecking=no";

 

Example of setting GRIDRSHCOMMAND in SASFoundation/9.4/sasv9_local.cfg (useful for applying the variable globally and for environments where SAS code is auto-generated, such as SAS Visual Analytics):


-SET GRIDRSHCOMMAND "/usr/bin/ssh -q -o StrictHostKeyChecking=no"


If GRIDRSHCOMMAND is needed with the SAS Visual Analytics LASR Monitor, export the variable in LevX/Applications/SASVisualAnalytics/HighPerformanceConfiguration/LASRMonitor.sh:


export GRIDRSHCOMMAND="/usr/bin/ssh -q -o StrictHostKeyChecking=no"
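Because GRIDRSHCOMMAND can be set in several places, a mistyped path is easy to miss. The small sketch below assumes that the value has the form shown above, an executable path followed by options, and checks only that the first word names an executable file. gridrsh_binary_ok is a hypothetical helper, and it does not test authentication.

```shell
# gridrsh_binary_ok
# Succeeds if the first word of $GRIDRSHCOMMAND is an executable file.
gridrsh_binary_ok() {
    # Word-split the value; the first word is the SSH executable.
    # These positional parameters are local to the function.
    set -- ${GRIDRSHCOMMAND:-}
    [ -n "${1:-}" ] && [ -x "$1" ]
}

# Example (not run here):
#   export GRIDRSHCOMMAND="/usr/bin/ssh -q -o StrictHostKeyChecking=no"
#   gridrsh_binary_ok || echo "GRIDRSHCOMMAND does not point to an executable"
```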
