msjhicks
Calcite | Level 5

I'm trying to get the LASR Server started (version 7.3 on RHEL). We did have it running a month or so ago when it was first installed, but something has changed since then. I get the error: ERROR: Failed to load analytic extension for the distributed computing environment. I am starting it as the user sas, which has passwordless SSH keys set up between the main server and the 3 distributed servers for VA.

 

I found this article: http://support.sas.com/kb/60/126.html and was able to verify that SSH works for sas between nodes (/opt/TKGrid/bin/simsh /opt/TKGrid/bin/simsh hostname). Following the info at the bottom of the article, I confirmed that I'm using RSA keys and that the keys are stored in $HOME/.ssh. I think this may be true for us: "SSH authentication is performed using GSSAPIAuthentication (based on Kerberos)", so I tried the next tip (setting GRIDRSHCOMMAND) in sasv9_local.cfg. This seemed to hang the start process (rather than return an error message). Any ideas for debugging are welcome. I also have a ticket open.

Accepted Solutions
alexal
SAS Employee

Update:

 

The problem has been identified and resolved. A hot fix has been pushed to SAS 9.4 M5. If you need a fix for SAS 9.4 M3 or M4, please open a track and let me know the number.

 

Special thanks go to @msjhicks and her colleagues for help with debugging the problem!

20 REPLIES
alexal
SAS Employee

@msjhicks,

 

Please send the output from the following commands. Do not forget to replace <PATH_TO> and <FQDN_OF_LASR_HEAD_NODE>.

 

export TKPATH=/<PATH_TO>/TKGrid/lib:/<PATH_TO>/TKGrid/bin
export GRIDHOST=<FQDN_OF_LASR_HEAD_NODE>
export GRIDINSTALLLOC=/<PATH_TO>/TKGrid
export GRIDRSHCOMMAND="/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no"
/<PATH_TO>/TKGrid/bin/checknodes /<PATH_TO>/TKGrid/grid.hosts
/<PATH_TO>/TKGrid/bin/tkgridperf
/<PATH_TO>/TKGrid/bin/tkgridmon
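
Before running these tools, it may help to confirm that all four variables are actually exported in the current shell. The following is only a minimal sketch and not part of TKGrid; the function name check_grid_env is made up for illustration, and it relies on nothing beyond the standard printenv utility.

```shell
# Hypothetical helper: report which of the variables used above are exported.
# printenv only sees exported variables, which is what child processes inherit.
check_grid_env() {
    missing=0
    for name in TKPATH GRIDHOST GRIDINSTALLLOC GRIDRSHCOMMAND; do
        if value=$(printenv "$name") && [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set or is empty\n' "$name" >&2
            missing=1
        fi
    done
    # Return 0 only when every variable was found.
    return "$missing"
}
```

Calling check_grid_env in the same shell session prints each value it finds and returns a non-zero status if any variable is missing.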

msjhicks
Calcite | Level 5

I believe I set all the env vars okay (including GRIDHOST, which I didn't include in the output below):

env | grep GRID

GRIDRSHCOMMAND=/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no
GRIDINSTALLLOC=/opt/sasva/TKGrid

 

env | grep TKPATH
TKPATH=/opt/sasva/TKGrid/lib:/opt/sasva/TKGrid/bin

 

But when I try to run checknodes, I get the following messages (4 times; I'm just showing the first "set") and a failure with RC 255:

 

/opt/sasva/TKGrid/bin/checknodes /opt/sasva/TKGrid/grid.hosts

unknown option --
usage: ssh [-1246AaCfgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
           [-D [bind_address:]port] [-E log_file] [-e escape_char]
           [-F configfile] [-I pkcs11] [-i identity_file]
           [-L [bind_address:]port:host:hostport] [-l login_name] [-m mac_spec]
           [-O ctl_cmd] [-o option] [-p port]
           [-Q cipher | cipher-auth | mac | kex | key]
           [-R [bind_address:]port:host:hostport] [-S ctl_path] [-W host:port]
           [-w local_tun[:remote_tun]] [user@]hostname [command]

 

The other 2 commands just hang.

Janette

msjhicks
Calcite | Level 5

Also, had a question after talking to someone here more knowledgeable.  In our main server we have in /etc/ssh/sshd_config:

KerberosAuthentication yes

GSSAPIAuthentication yes
GSSAPICleanupCredentials no

 

We are using Kerberos and authenticating back to an LDAP server (using PAM and other processes).

 

However, on our 3 nodes (for VA High Perf. Analytics) we have all Kerberos options commented out but still have:

GSSAPIAuthentication yes
GSSAPICleanupCredentials no

 

Should that GSSAPIAuthentication on those nodes be "no"?

Thanks.

Janette

alexal
SAS Employee

@msjhicks,

 

The value for GRIDRSHCOMMAND must be in double quotes.
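
For reference, here is how the shell itself treats the two quoting styles. This sketch only illustrates shell behaviour; it does not establish what TKGrid expects to find in the variable, and the shortened ssh command is just a placeholder.

```shell
# Double quotes alone: the shell removes them, so the stored value has no quotes.
export GRIDRSHCOMMAND="/usr/bin/ssh -q -o PasswordAuthentication=no"
plain=$(printenv GRIDRSHCOMMAND)
printf 'plain:   %s\n' "$plain"

# Single quotes around the double quotes: the double quotes become part of the value.
export GRIDRSHCOMMAND='"/usr/bin/ssh -q -o PasswordAuthentication=no"'
literal=$(printenv GRIDRSHCOMMAND)
printf 'literal: %s\n' "$literal"
```

The first form is why env | grep shows the value without any quotes even though the export line was typed with them.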

 

>> Should that GSSAPIAuthentication on those nodes be "no"?

No.

msjhicks
Calcite | Level 5

Hi.  I did the export as you had typed it - should I do something different?  Sorry if I'm missing what you meant.  Thank you,

Janette

 

export GRIDRSHCOMMAND="/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no"

 

env | grep GRIDRSH
GRIDRSHCOMMAND=/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no

msjhicks
Calcite | Level 5

OK, I thought I'd try putting single quotes around the double-quoted value so that the double quotes were kept. This time I get:

 

/opt/sasva/TKGrid/bin/checknodes /opt/sasva/TKGrid/grid.hosts
Machine busas.binghamton.edu responded with failure. RC: 20
Machine sasproc01.binghamton.edu responded with failure. RC: 20
Machine sasproc02.binghamton.edu responded with failure. RC: 20
Machine sasproc03.binghamton.edu responded with failure. RC: 20
Num Returned: 4
Num Failed: 4

 

/opt/sasva/TKGrid/bin/tkgridperf
Unable to enumerate grid.

 

/opt/sasva/TKGrid/bin/tkgridmon
Unable to enumerate grid.
ERROR: Failed to execute command: "/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no" busas.binghamton.edu export TKMPI_INFO=""; /opt/sasva/TKGrid/tkmpirsh.sh -np 1 /opt/sasva/TKGrid/tkmpinodelib.sh busas.binghamton.edu 65206 tkegenum
Timeout waiting for Grid connection.

alexal
SAS Employee

@msjhicks,

 

Please export these variables once again and run these commands:

 

strace -fv -s 1000 -o /tmp/tkgridperf.strace.log /opt/sasva/TKGrid/bin/tkgridperf
strace -fv -s 1000 -o /tmp/tkgridmon.strace.log /opt/sasva/TKGrid/bin/tkgridmon

 

 

Attach the log files for further investigation. Also, make sure that you have an entry for localhost in /etc/hosts on each machine in your TKGrid cluster.
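
One possible way to check for the localhost entry on each machine is sketched below. The function name has_localhost_entry is hypothetical, and the check simply looks for the name localhost on any non-comment line of the hosts file; it does not validate the address it maps to.

```shell
# Hypothetical helper: succeed if the hosts file names "localhost" on an active line.
has_localhost_entry() {
    hosts_file=${1:-/etc/hosts}
    awk '
        /^[[:space:]]*#/ { next }            # skip lines that are entirely comments
        {
            for (i = 2; i <= NF; i++) {      # field 1 is the address, the rest are names
                if ($i ~ /^#/) break         # stop at an inline comment
                if ($i == "localhost") found = 1
            }
        }
        END { exit !found }
    ' "$hosts_file"
}
```

Running has_localhost_entry with no argument checks /etc/hosts; a path can be passed to check a copy of the file taken from another node.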

 

Also, what happens if you run this command?

 

/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no busas.binghamton.edu export TKMPI_INFO=""; /opt/sasva/TKGrid/tkmpirsh.sh -np 1 /opt/sasva/TKGrid/tkmpinodelib.sh busas.binghamton.edu 65206 tkegenum
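
A general shell note that may matter when pasting a command like this into an interactive prompt: an unquoted semicolon ends the ssh command locally, so whatever follows it runs on the local machine rather than on the remote host. The sketch below shows the difference using a stand-in function, show_remote_args, which is purely hypothetical and only prints what ssh would have received; somehost is a placeholder.

```shell
# Stand-in for ssh: print the arguments that would be sent to the remote side.
show_remote_args() {
    printf 'remote command: %s\n' "$*"
}

# Unquoted: the local shell splits at ';' and runs the echo itself.
unquoted=$(show_remote_args somehost export TKMPI_INFO=""; echo "runs locally")

# Quoted: the whole string, including ';', is passed through as one argument.
quoted=$(show_remote_args somehost 'export TKMPI_INFO=""; echo "runs remotely"')

printf '%s\n---\n%s\n' "$unquoted" "$quoted"
```

Whether this affects the result here depends on how the command was entered, which the thread does not show.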
msjhicks
Calcite | Level 5

Hi. 

We do have a localhost entry in /etc/hosts on each machine.

 

When I run this I get errno 111:

/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no busas.binghamton.edu export TKMPI_INFO=""; /opt/sasva/TKGrid/tkmpirsh.sh -np 1 /opt/sasva/TKGrid/tkmpinodelib.sh busas.binghamton.edu 65206 tkegenum


Failed to connect to 'busas.binghamton.edu', errno: 111

 

I emailed you privately about attaching the logs - let me know.

Thanks,

Janette

alexal
SAS Employee

@msjhicks,

 

Error 111 means connection refused. Are you sure that the sshd daemon is up and running on busas.binghamton.edu? Send these log files to the track and state in the message that they are for Alex. Also, send /var/log/secure and /var/log/messages from busas.binghamton.edu.
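
A quick way to see whether anything is listening on a given port is sketched below, assuming bash and the coreutils timeout command are available. The function name port_open is hypothetical, and port 22 is only the conventional sshd default; the thread does not say which port the refused connection was aimed at.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Uses bash's /dev/tcp redirection; port defaults to 22 (an assumption).
port_open() {
    host=$1
    port=${2:-22}
    # $0 and $1 inside the bash -c script are the host and port passed after it.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, port_open busas.binghamton.edu 22 && echo "port 22 reachable" would report on sshd, and the same function can be pointed at any other port of interest.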

dave_foster
Fluorite | Level 6

Hello,

 

We have been following this thread and are experiencing the same results.  What was the final resolution?

alexal
SAS Employee

@dave_foster,

 

We are still working on it. To work around the problem, you can remove a DNS entry from /etc/resolv.conf.

dave_foster
Fluorite | Level 6

Thanks. I believe there are a number of entries in this file. Am I looking for something specific?

 

Also, could you send me the SAS ticket number for this, so I can have my customer's SAS AE keep tabs on it?

alexal
SAS Employee

@dave_foster,

 

A host entry. I will have another debug session next week, so I will keep everyone posted. 
