SAS Grid Validation error

Senior User
Posts: 1


Getting an error: 

 

"Objspawn was unable to launch the server SASApp - Workspace Server due to the server launch exceeding the specified wait time. 

Failed to start the server"

 

This is a grid environment, and the newly created application servers (for example SASAppVS, created for Visual Statistics) work fine. I am new to grid environments, so any help with this would be appreciated.

Super User
Posts: 5,437

Re: SAS Grid Validation error

Please describe your environment in more detail.

Is it an existing grid, and are you just trying to add another node?

What is the relationship with your VA/VS deployment?

Your object spawner works, so try testing the workspace server while bypassing the grid.

To me, it sounds like some parameters are missing or wrong.

Grid implementations can be quite complex to troubleshoot, so try to contact the person who initially implemented it, or SAS Tech Support if you can't proceed on your own.

Data never sleeps
Super User
Posts: 3,260

Re: SAS Grid Validation error

Also check your object spawner logs, as they may provide more details about what's going wrong.

 

You will find it in a directory similar to this: \SAS\Config\Lev1\ObjectSpawner\Logs
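
As a rough sketch of how to search those logs on a Linux node, the shell function below prints every log line that mentions the launch timeout from the original error message. The function name, the `*.log` file pattern, and the directory you pass in are assumptions for illustration; adjust them to your own configuration directory.

```shell
# Print object spawner log lines that mention the launch timeout.
# Assumptions: logs are plain text files ending in .log inside the given
# directory; the search string is taken from the error text quoted in
# the first post of this thread.
spawner_timeouts() {
    log_dir="$1"
    # -H prefixes each matching line with the name of the file it came from
    grep -H "exceeding the specified wait time" "$log_dir"/*.log 2>/dev/null
}
```

You would call it as `spawner_timeouts /path/to/Lev1/ObjectSpawner/Logs`, replacing the path with wherever your spawner logs actually live.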

Contributor SDV
Contributor
Posts: 23

Re: SAS Grid Validation error

You might need to turn on Workspace Server logging too. I have found instances where the object spawner issued the request to LSF to run the workspace server, but the workspace server was failing to run due to an issue that only showed up in its own log.

 

Assuming you are using load balancing and grid launched workspace servers, you can run these LSF commands from one of the servers immediately after you submit your program:

 

bjobs -u all -a                        (shows you all jobs LSF has run for about the last hour)

 

bjobs -l <job ID>                    (substitute the LSF job ID of the job requesting to start a workspace server - look for the owner that matches the EG user's ID)

 

You should see the command that the object spawner sent to LSF, which should be workspaceserver.sh (Linux) or .bat (Windows), along with an error code.
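
To pick the relevant job IDs out of that listing, a small filter along these lines may help. It is only a sketch: the function name is made up, and it assumes the default `bjobs` column layout, where the first column is the job ID and the second is the user. If user names are shortened in your listing, the match would need adjusting.

```shell
# Print the LSF job IDs owned by one user, reading `bjobs` output on stdin.
# Assumption: default bjobs layout (JOBID in column 1, USER in column 2),
# with a single header line that is skipped.
jobs_for_user() {
    awk -v user="$1" 'NR > 1 && $2 == user { print $1 }'
}
```

For example, `bjobs -u all -a | jobs_for_user someuser` would list that user's job IDs, each of which can then be passed to `bjobs -l`.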

Regular Contributor
Posts: 174

Re: SAS Grid Validation error

Sireesha,

 

Has this been resolved? I know I experienced a similar issue when we first installed and configured our grid environment.

 

How is your workspace server load-balancing configured, if at all?  Are they grid-launched?  You can tell by going to "properties" of SASApp - Logical Workspace Server:

 

[Screenshot: Capture.PNG]

 

If I remember correctly, I also had to modify the ObjectSpawner_usermods.sh script, adding a specific port to be used. This is the line in my ObjectSpawner_usermods.sh script:

 

USERMODS=" -dnsmatch $HOSTNAME.mydomain.com -conversationport 8599 $JREOPTIONS"

 

I had to add "-conversationport 8599". I believe the default port range for object spawners is 8591-8599, so I just put it at the top of the range, but the port may be different for you if 8599 is already being used.

I think once you make these changes you just need to restart the object spawners and then try to validate your Logical Workspace Server.

 

Occasional Contributor
Posts: 16

Re: SAS Grid Validation error

Posted in reply to Timmy2383

Hi Timmy,

 

We are using SAS 9.4 Grid TS1M2 and have started receiving a bunch of the errors below every day (our customers mainly use SAS EG).

 

Objspawn was unable to launch the server SASApp - Workspace Server (A5UCJJ3G.AY00000C) due to the server launch exceeding the specified wait time

 

I then tried to validate the application server using a user that is not unrestricted and got the error below.

 

[2/9/16 11:58 AM] INFO: Starting extended validation for Workspace server (level 1) - Making a connection
[2/9/16 11:59 AM] SEVERE: The launch of server SASApp - Workspace Server for user failed.
[2/9/16 11:59 AM] SEVERE: Failed to start the server.
[2/9/16 11:59 AM] SEVERE: The application could not connect to any server in the cluster

 

I also checked the ports starting with 859, and below is the output I got (most of the connections are established). Can you please suggest which port I can use?

 

[sas@localhost ObjectSpawner]$ netstat -anp | grep 859
tcp 0 0 :::8591 :::* LISTEN 7296/objspawn
tcp 0 0 :::8593 :::* LISTEN 7296/objspawn
tcp 0 0 :::8594 :::* LISTEN 7296/objspawn
tcp 0 0 :::8595 :::* LISTEN 7296/objspawn
tcp 0 0 :::8596 :::* LISTEN 7296/objspawn
tcp 0 0 ::ffff:10.1.6.34:8591 ::ffff:144.131.10.135:59241 TIME_WAIT -
tcp 0 0 ::ffff:10.1.6.34:8596 ::ffff:144.131.24.62:34525 ESTABLISHED -
tcp 0 0 ::ffff:10.1.6.34:8596 ::ffff:144.131.24.62:34657 ESTABLISHED -
tcp 0 0 ::ffff:10.1.6.34:8596 ::ffff:144.131.24.75:55006 ESTABLISHED -
tcp 0 0 ::ffff:10.1.6.34:8596 ::ffff:144.131.24.75:53706 ESTABLISHED -
tcp 0 0 ::ffff:10.1.6.34:8596 ::ffff:10.47.26.27:54600 ESTABLISHED -

 

 

Based on your note, to avoid this problem I should add the line below to ObjectSpawner_usermods.sh. Can you please suggest which port from the above output I can use instead of 8599?

 

USERMODS=" -dnsmatch $HOSTNAME.mydomain.com -conversationport 8599 $JREOPTIONS"

 

Thanks

Madhu

 

Trusted Advisor
Posts: 1,326

Re: SAS Grid Validation error

Hi Sireesha,

 

When validating the grid, you might see this kind of error for several different reasons.

The first thing to do is indeed to enable extended logging, but let me drop some tips here.

- Ensure that you are not validating the grid workspace server (or any workspace server) with an unrestricted account (sasadm@saspw or any other unrestricted user).

- Ensure that the grid server (just the grid server) has its authentication domain set to None; otherwise, the configuration might be a bit more complex.

- You might want to restart the object spawners for this grid with RTM (just because it would make life simpler for you).

If, after all of this, even the extended logging does not help, then SAS Technical Support would be your next step, unless you have already contacted them :)

Occasional Contributor
Posts: 16

Re: SAS Grid Validation error

Hi Sireesha,

 

Did you get around this problem? Can you please post the solution that worked for you?

 

Thanks

Madhu

Regular Contributor
Posts: 174

Re: SAS Grid Validation error

Unless 8599 is specifically being used by some other service, 8599 should work. The object spawner itself typically listens on port 8591 by default, as shown in your grep output, but I think once the object spawner receives a request there is additional communication between LSF and the other object spawners until it decides on which grid node to start the workspace server. All that to say, I think any port between 8592 and 8599 should work unless you have other services using those ports.
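
One way to see which ports in that range have nothing listening on them is to filter the netstat output. The function below is a sketch under a few assumptions: the name is made up, the 8591-8599 range is the one discussed in this thread, and the column positions match the Linux netstat output posted earlier in this thread.

```shell
# List ports in the 8591-8599 range that have no LISTEN socket,
# reading `netstat -an` style output on stdin.
# Assumption: column 4 is the local address and column 6 is the state,
# as in the Linux netstat output posted earlier in this thread.
free_spawner_ports() {
    awk '
        $6 == "LISTEN" {
            n = split($4, parts, ":")   # port is the text after the last colon
            used[parts[n]] = 1
        }
        END {
            for (p = 8591; p <= 8599; p++)
                if (!(p in used)) print p
        }
    '
}
```

Run as `netstat -an | free_spawner_ports`. Against the output posted earlier, where 8591 and 8593 through 8596 are in LISTEN state, this would print 8592, 8597, 8598 and 8599. A port with no listener on one node is not guaranteed to be free on every node, so it is worth repeating the check on each grid node.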
Occasional Contributor
Posts: 16

Re: SAS Grid Validation error

Posted in reply to Timmy2383

Thanks Timmy,

I do have one more question: since we are on a grid, do we need to add $HOSTNAME.mydomain.com?

Currently we have 4 compute nodes. If I add this parameter, it resolves to the first node itself. After adding this parameter with port 8599, I ran a job multiple times using EG, and every time the job ran only on grid node1.

 

USERMODS=" -dnsmatch $HOSTNAME.mydomain.com -conversationport 8599 $JREOPTIONS"

 

Earlier we had only the entry below in this file.

USERMODS=$JREOPTIONS

 

I am a little confused now about whether or not to add the $HOSTNAME parameter. Can you please help me and give me some more details?

 

 

Regular Contributor
Posts: 174

Re: SAS Grid Validation error

Posted in reply to MadhuKiran1

I'm not sure if you need the "$HOSTNAME". I don't recall if we added this or not.  

 

Are your workspace servers configured to be grid-launched? You may not actually have a problem. You should be able to check the object spawner logs to see if it's attempting to spawn servers on the other nodes. Depending on your load-balancing settings, the servers may not spawn on other nodes until a certain threshold is met.

 

 

Occasional Contributor
Posts: 16

Re: SAS Grid Validation error

Posted in reply to Timmy2383
Yes, all workspace servers are grid-launched. I checked the object spawner logs and it seems (not confirmed yet) that the actual processes are going to different nodes.



We also have the configuration set up to create a separate log file for each grid node, and all of these node log files have the entries below, which suggests that the port we added has been activated on all nodes.



2016-02-09T14:14:36,807 INFO [00000031] :sas - Reserved IPv6 port 8599 for launched server connect back listen (connection 15).

2016-02-09T14:14:36,807 INFO [00000031] :sas - Activated listen on IPv6 port 8599 (connection 15).
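
Checking every node's log by hand gets tedious, so something like the following sketch could do it in one pass. The function name is hypothetical, and it assumes one log file per grid node and the "Activated listen on ... port" message format shown in the entries above.

```shell
# Report, for each log file given, whether it contains the
# "Activated listen on ... port <port>" message shown above.
# Assumption: one log file per grid node, passed as arguments.
check_port_activated() {
    port="$1"
    shift
    for log_file in "$@"; do
        # The trailing space keeps 8599 from also matching e.g. 85990
        if grep -q "Activated listen on .* port $port " "$log_file"; then
            echo "$log_file: activated"
        else
            echo "$log_file: not found"
        fi
    done
}
```

Usage would be `check_port_activated 8599 node1.log node2.log`, with the file names replaced by your actual per-node spawner logs.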


Regular Contributor
Posts: 174

Re: SAS Grid Validation error

I'll have to check when I get back to work tomorrow, but these are the options that SAS Tech Support provided me when I had a similar issue.
Regular Contributor
Posts: 174

Re: SAS Grid Validation error

Posted in reply to Timmy2383

It sounds like it's working properly. Can you validate the Workspace server in SMC?

Also, I'm checking my Object Spawner files and here's what I found...

The original "ObjectSpawner.sh" script was configured with the actual hostname of the server on which we performed the installation/configuration of the app server (i.e. compute01.mydomain.com). However, our SAS consultant/installer had me make a backup of this file and then modify ObjectSpawner.sh. We replaced every reference to the actual server name with "$HOSTNAME". Then we updated "ObjectSpawner_usermods.sh" so that it had this line:

USERMODS=" -dnsmatch $HOSTNAME.mydomain.com -conversationport 8599 $JREOPTIONS"

Note that "mydomain.com" should be changed to whatever is applicable for your organization.

Also note that this works for us because we are using a shared file system between all the nodes. Our config directory is shared.
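
To illustrate why the shared script works on every node, here is a small sketch of how that one line expands per host. The function is hypothetical and exists only for illustration; "mydomain.com" and port 8599 are the placeholders from this thread, and the node name you pass in stands for whatever $HOSTNAME resolves to on each machine.

```shell
# Build the spawner options string for a given node.
# Assumptions: the function itself is hypothetical; it only shows how
# the same USERMODS line expands differently on each node. JREOPTIONS
# is taken from the environment if set, otherwise left empty.
spawner_usermods() {
    node="$1"
    domain="$2"
    port="$3"
    echo " -dnsmatch $node.$domain -conversationport $port ${JREOPTIONS:-}"
}
```

Calling `spawner_usermods compute01 mydomain.com 8599` on one node and the same with a different node name on another shows that only the -dnsmatch value changes, which is why a single file in a shared config directory can serve all nodes.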

Hope this helps.

Occasional Contributor
Posts: 16

Re: SAS Grid Validation error

Posted in reply to Timmy2383
Thanks Timmy. Yes, it looks like it is working properly. I validated the workspace server through SMC (after an object spawner restart) and it was successful.



In our current configuration we don't have any hard-coded host names either; it is already configured with $HOSTNAME, and the config directory is shared across all nodes for us too.



Looks like we are good to go. The remaining behavior I see (all jobs going to grid-node1) may be because the grid-node1 machine is always free, since this is a test environment, and so it always accepts the connections.

