Trying to Start LASR Server get Failed to load analytic extension for the distributed computing env.

Occasional Contributor
Posts: 6


I'm trying to get the LASR Server started (version 7.3 on RHEL).  We did have it running a month or so ago when it was first installed, but something has changed since then.  I get the error: ERROR: Failed to load analytic extension for the distributed computing environment.  I am starting it as the user sas, which has passwordless SSH keys set up between the main server and the 3 distributed servers for VA.

 

I found this article: http://support.sas.com/kb/60/126.html and was able to verify that SSH works for sas between nodes (/opt/TKGrid/bin/simsh hostname).  Following the information at the bottom of the article, I confirmed that I'm using RSA keys and that the keys are stored in $HOME/.ssh.  I think this condition from the article may apply: "SSH authentication is performed using GSSAPIAuthentication (based on Kerberos)", so I tried the next tip (setting GRIDRSHCOMMAND) in sasv9_local.cfg.  This seemed to hang the start process (rather than return an error message).  Any ideas for debugging are welcome.  I also have a ticket open.

SAS Employee
Posts: 285

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

@msjhicks,

 

Please send the output of the following commands. Do not forget to replace <PATH_TO> and <FQDN_OF_LASR_HEAD_NODE>.

 

export TKPATH=/<PATH_TO>/TKGrid/lib:/<PATH_TO>/TKGrid/bin
export GRIDHOST=<FQDN_OF_LASR_HEAD_NODE>
export GRIDINSTALLLOC=/<PATH_TO>/TKGrid
export GRIDRSHCOMMAND="/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no"
/<PATH_TO>/TKGrid/bin/checknodes /<PATH_TO>/TKGrid/grid.hosts
/<PATH_TO>/TKGrid/bin/tkgridperf
/<PATH_TO>/TKGrid/bin/tkgridmon

Occasional Contributor
Posts: 6

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

I believe I set all the environment variables correctly (including GRIDHOST, which isn't shown below):

env | grep GRID

GRIDRSHCOMMAND=/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDntials=yes -o RSAAuthentication=no
GRIDINSTALLLOC=/opt/sasva/TKGrid

 

env | grep TKPATH
TKPATH=/opt/sasva/TKGrid/lib:/opt/sasva/TKGrid/bin

 

But when I try to run checknodes, I get the following messages (four times; I'm showing just the first set) and a failure with RC 255:

 

/opt/sasva/TKGrid/bin/checknodes /opt/sasva/TKGrid/grid.hosts

unknown option --
usage: ssh [-1246AaCfgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
           [-D [bind_address:]port] [-E log_file] [-e escape_char]
           [-F configfile] [-I pkcs11] [-i identity_file]
           [-L [bind_address:]port:host:hostport] [-l login_name] [-m mac_spec]
           [-O ctl_cmd] [-o option] [-p port]
           [-Q cipher | cipher-auth | mac | kex | key]
           [-R [bind_address:]port:host:hostport] [-S ctl_path] [-W host:port]
           [-w local_tun[:remote_tun]] [user@]hostname [command]

 

The other two commands just hang.

Janette

Occasional Contributor
Posts: 6

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

Also, I had a question after talking to someone here who is more knowledgeable.  On our main server we have the following in /etc/ssh/sshd_config:

KerberosAuthentication yes

GSSAPIAuthentication yes
GSSAPICleanupCredentials no

 

We are using Kerberos and authenticating back to an LDAP server (using PAM and other processes).

 

However, on our 3 nodes (for VA High Perf. Analytics) we have all Kerberos options commented out but still have:

GSSAPIAuthentication yes
GSSAPICleanupCredentials no

 

Should GSSAPIAuthentication on those nodes be "no"?

Thanks.

Janette

SAS Employee
Posts: 285

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

@msjhicks,

 

The value for GRIDRSHCOMMAND must be in double quotes.

 

>> Should GSSAPIAuthentication on those nodes be "no"?

No.
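
For what it's worth, here is a minimal sketch of how the shell handles the quotes in such an export. DEMO_CMD and DEMO_CMD_LITERAL are hypothetical names standing in for GRIDRSHCOMMAND, and the shortened ssh command is only for illustration. It shows shell behavior only; it does not confirm which form TKGrid expects.

```shell
#!/bin/sh
# DEMO_CMD and DEMO_CMD_LITERAL are hypothetical stand-ins for GRIDRSHCOMMAND.

# Double quotes group the words into one value; the shell removes the
# quote characters themselves before storing the value.
export DEMO_CMD="/usr/bin/ssh -q -o PasswordAuthentication=no"
printf '%s\n' "$DEMO_CMD"

# Wrapping the double quotes in single quotes keeps them as literal
# characters inside the stored value.
export DEMO_CMD_LITERAL='"/usr/bin/ssh -q -o PasswordAuthentication=no"'
printf '%s\n' "$DEMO_CMD_LITERAL"
```

The first form is why env | grep prints the value without quotes even when the export was typed with them.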

Occasional Contributor
Posts: 6

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

Hi.  I did the export as you had typed it - should I do something different?  Sorry if I'm missing what you meant.  Thank you,

Janette

 

export GRIDRSHCOMMAND="/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no"

 

env | grep GRIDRSH
GRIDRSHCOMMAND=/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no

Occasional Contributor
Posts: 6

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

OK, I thought I'd try putting single quotes around the value so that it kept the double quotes.  This time I get:

 

/opt/sasva/TKGrid/bin/checknodes /opt/sasva/TKGrid/grid.hosts
Machine busas.binghamton.edu responded with failure. RC: 20
Machine sasproc01.binghamton.edu responded with failure. RC: 20
Machine sasproc02.binghamton.edu responded with failure. RC: 20
Machine sasproc03.binghamton.edu responded with failure. RC: 20
Num Returned: 4
Num Failed: 4

 

/opt/sasva/TKGrid/bin/tkgridperf
Unable to enumerate grid.

 

/opt/sasva/TKGrid/bin/tkgridmon
Unable to enumerate grid.
ERROR: Failed to execute command: "/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no" busas.binghamton.edu export TKMPI_INFO=""; /opt/sasva/TKGrid/tkmpirsh.sh -np 1 /opt/sasva/TKGrid/tkmpinodelib.sh busas.binghamton.edu 65206 tkegenum
Timeout waiting for Grid connection.

SAS Employee
Posts: 285

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

@msjhicks,

 

Please export these variables once again and run these commands:

 

strace -fv -s 1000 -o /tmp/tkgridperf.strace.log /opt/sasva/TKGrid/bin/tkgridperf
strace -fv -s 1000 -o /tmp/tkgridmon.strace.log /opt/sasva/TKGrid/bin/tkgridmon

 

 

Please attach the resulting log files for further investigation. Also, make sure that there is an entry for localhost in /etc/hosts on each machine in your TKGrid cluster.
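
A quick way to check for the localhost entry might look like the sketch below. The function name has_localhost_entry is made up for this example, and it only inspects a hosts file; it does not test name resolution itself. It would need to be run on each machine in the cluster.

```shell
#!/bin/sh
# has_localhost_entry [FILE]
# Succeeds if FILE (default: /etc/hosts) has a non-comment line that
# lists "localhost" as one of its host names. (Hypothetical helper.)
has_localhost_entry() {
    hosts_file="${1:-/etc/hosts}"
    awk '
        { sub(/#.*/, "") }                 # drop comments
        NF >= 2 {                          # address followed by names
            for (i = 2; i <= NF; i++)
                if ($i == "localhost") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$hosts_file"
}

if has_localhost_entry /etc/hosts; then
    echo "localhost entry found"
else
    echo "no localhost entry"
fi
```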

 

Also, what happens if you run this command?

 

/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no busas.binghamton.edu export TKMPI_INFO=""; /opt/sasva/TKGrid/tkmpirsh.sh -np 1 /opt/sasva/TKGrid/tkmpinodelib.sh busas.binghamton.edu 65206 tkegenum
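
One thing to keep in mind when typing a command line like this at a prompt: the local shell treats an unquoted ; as the end of the ssh command, so whatever follows it runs on the local machine. The sketch below illustrates that with run_remote, a hypothetical stand-in for ssh that runs its arguments in a child shell. No real ssh connection is made, and whether this has any bearing on the failure in this thread is not established.

```shell
#!/bin/sh
# run_remote is a hypothetical stand-in for `ssh host`: it joins its
# arguments into one command line and runs it in a child shell with
# WHERE=remote, so the output shows which shell ran each part.
run_remote() {
    WHERE=remote sh -c "$*"
}
WHERE=local

# Unquoted ';' ends the first command, so the second echo runs locally.
unquoted=$(run_remote echo 'first:$WHERE' ; echo "second:$WHERE")

# Quoting the whole command line sends both parts to the child shell.
quoted=$(run_remote 'echo first:$WHERE; echo second:$WHERE')

printf '%s\n' "$unquoted" "$quoted"
```

In the unquoted case the second part reports the local shell; in the quoted case both parts report the child shell.
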
Occasional Contributor
Posts: 6

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

Hi. 

We do have a localhost entry in /etc/hosts on each machine.

 

When I run this I get errno 111:

/usr/bin/ssh -q -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes -o RSAAuthentication=no busas.binghamton.edu export TKMPI_INFO=""; /opt/sasva/TKGrid/tkmpirsh.sh -np 1 /opt/sasva/TKGrid/tkmpinodelib.sh busas.binghamton.edu 65206 tkegenum


Failed to connect to 'busas.binghamton.edu', errno: 111

 

I emailed you privately about attaching the logs - let me know.

Thanks,

Janette

SAS Employee
Posts: 285

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

@msjhicks,

 

Error 111 means connection refused. Are you sure that the sshd daemon is up and running on busas.binghamton.edu? Send these log files to the track and state in the message that they are for Alex. Also, send /var/log/secure and /var/log/messages from busas.binghamton.edu.
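
To see whether anything is listening on a given port, a sketch like the following may help. It assumes the ss utility (from iproute2) is installed and that sshd uses its default port 22; listens_on is a hypothetical helper name. It would be run on busas.binghamton.edu itself.

```shell
#!/bin/sh
# listens_on PORT
# Reads `ss -ltn` style output on stdin and succeeds if any listening
# socket's local address ends in :PORT. (Hypothetical helper.)
listens_on() {
    awk -v port="$1" '
        NR > 1 {                       # skip the header line
            n = split($4, parts, ":")  # $4 is the Local Address:Port column
            if (parts[n] == port) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Port 22 is the sshd default; adjust if sshd_config sets another Port.
if command -v ss >/dev/null 2>&1; then
    if ss -ltn | listens_on 22; then
        echo "something is listening on port 22"
    else
        echo "nothing is listening on port 22"
    fi
fi
```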

SAS Employee
Posts: 285

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

@msjhicks,

 

I just sent you a message.

New Contributor
Posts: 3

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

Hello,

 

We have been following this thread and are experiencing the same results.  What was the final resolution?

SAS Employee
Posts: 285

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

Posted in reply to dave_foster

@dave_foster,

 

We are still working on it. To work around the problem, you can remove a DNS entry from /etc/resolv.conf.

New Contributor
Posts: 3

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

Thanks.  I believe there are a number of entries in this file. Am I looking for something specific?

 

Also, could you send me the SAS ticket number for this, so I can have my customer's SAS AE keep tabs on it?

SAS Employee
Posts: 285

Re: Trying to Start LASR Server get Failed to load analytic extension for the distributed computing

Posted in reply to dave_foster

@dave_foster,

 

A host entry. I will have another debug session next week, so I will keep everyone posted. 
