tim_acton
Fluorite | Level 6

We're having a strange problem with EG 7.1 and SAS 9.4 M2 with grid. Users who launch grid jobs through SAS EG occasionally get the following error, and their grid job fails. We're using grid-launched workspace servers, and we have no issues with those. It's almost as if EG loses its connection to the grid. So far I haven't been able to find anything helpful on this anywhere. Any ideas?

 

NOTE: Remote session ID WK0 will use the grid service _ALL_.

NOTE: Remote signon to WK0 commencing (SAS Release 9.04.01M2P072314).

ERROR: A communication subsystem partner link setup request failure has occurred.

ERROR: Could not start grid job or grid job failed.

ERROR: Remote signon to WK0 canceled.

ERROR: A link must be established by executing the SIGNON command before you can communicate with WK0.

ERROR: A link must be established by executing the SIGNON command before you can communicate with WK0.

ERROR: A link must be established by executing the SIGNON command before you can communicate with WK0.

ERROR: A link must be established by executing the SIGNON command before you can communicate with WK0.

MPRINT(WKLY):   RSUBMIT CONNECTWAIT=NO CONNECTPERSIST=NO;

ERROR: A link must be established by executing the SIGNON command before you can communicate with WK0.

NOTE: Subsequent lines will be ignored until ENDRSUBMIT.

 

 

The "Remote signon to WK0 canceled" line is the interesting part. Normally it would show a hostname (gridserver03.local or something) instead of WK0.
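
For anyone who needs to find out how often this happens across many job logs, here is a minimal Python sketch that scans a SAS log for the signon messages quoted above. The function name and the shape of the returned dictionary are made up for illustration; only the message texts come from the log in this post.

```python
import re

# Message texts are taken from the log in this post; the session ID
# (WK0 in the example) is captured rather than hard-coded.
SIGNON_START = re.compile(r"NOTE: Remote signon to (\S+) commencing")
SIGNON_CANCELED = re.compile(r"ERROR: Remote signon to (\S+) canceled\.")
NO_LINK = re.compile(
    r"ERROR: A link must be established .* communicate with (\S+)\."
)


def summarize_signon_failures(log_text):
    """Return a dict keyed by remote session ID.

    Each value records whether a signon was started, whether it was
    canceled, and how many "link must be established" errors followed.
    """
    summary = {}

    def entry(session_id):
        return summary.setdefault(
            session_id,
            {"started": False, "canceled": False, "no_link_errors": 0},
        )

    for line in log_text.splitlines():
        m = SIGNON_START.search(line)
        if m:
            entry(m.group(1))["started"] = True
            continue
        m = SIGNON_CANCELED.search(line)
        if m:
            entry(m.group(1))["canceled"] = True
            continue
        m = NO_LINK.search(line)
        if m:
            entry(m.group(1))["no_link_errors"] += 1
    return summary


if __name__ == "__main__":
    sample = (
        "NOTE: Remote signon to WK0 commencing (SAS Release 9.04.01M2P072314).\n"
        "ERROR: Remote signon to WK0 canceled.\n"
    )
    print(summarize_signon_failures(sample))
```

A session that shows up as started and canceled matches the failure described here. The sketch only reads the log text and does not talk to the grid or the spawner.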

1 ACCEPTED SOLUTION

Accepted Solutions
tim_acton
Fluorite | Level 6

Just in case anyone circles back to this topic, we basically found the same thing. It's some kind of issue with the Object Spawner. We found that our analysts could run their jobs right after the Object Spawner was restarted, but after a few days they'd start running into this issue again. You're probably going to assume some risk in reloading the Object Spawner, so proceed with some caution.


16 REPLIES
GyaniBaba
Obsidian | Level 7
  1. Are you able to validate the Connect Server from SAS MC?
  2. Is the Connect Spawner up and running fine?
tim_acton
Fluorite | Level 6

The object spawners are fine. Not sure what you mean by connect server, but lsf/grid seem fine. We have several hundred users using EG/grid right now with no problems. 

GyaniBaba
Obsidian | Level 7

Under the /sas/config/LevN/ directory, do you see the Connect Spawner? Can you check whether it is started?

In SAS MC, under Server Manager, under the SASApp application server, do you see the Connect Server? Are you able to validate it?

JuanS_OCS
Amethyst | Level 16

Hi,

 

Are your EG clients running on Citrix or other virtualized desktops? I can easily imagine a firewall (F5?) occasionally blocking some connections.

 

You might want to check your Windows Logs, or extend the logging to a DEBUG level.

 

Best regards,

Juan

GyaniBaba
Obsidian | Level 7

Juan, I don't think Grid will work if the Connect Server is not working. It could be a firewall issue, or it could be that the Connect Spawner is not started.

tim_acton
Fluorite | Level 6

If by connect spawner you mean Object Spawner, then yes it's started.

GyaniBaba
Obsidian | Level 7

No, I don't mean object spawner by connect spawner.

 

Could you expand Server Manager in SMC and provide a screenshot? What operating system are the SAS services running on?

Do you see a ConnectSpawner directory inside LevN?

DanielKaiser
Pyrite | Level 9

Hi 🙂
Did you solve your problem?

We see the same thing with our Connect sessions.

Besides the errors in the EG job log, we found several errors in the Object Spawner logs. The list below is a collection of them.
They appear in all 4 of our environments and in every application server context.

We have 4 Grid Computing Servers and 4 Grid Servers on which the Metadata Server runs. Our SAS version is SAS 9.4 TS1M3.

svcsasit1 - The specified uuid 1CBF12F1-9C22-9E40-B0D2-D481F0758D1E did not match any process managed by this spawner.

sy071 - The launch of server SASITRM - Workspace Server for user svcsastuit failed.

svcsasit1 - /gpfs/sasconf/sasit/SASHome/SASFoundation/9.4/sasexe/tkiomsvc.so(tktracex+0x2e) [0x7f79babc3e3e]

svcsasit1 - Load Balancing interface call failed with exception <?xml version="1.0" ?><Exceptions><Exception><SASMessage severity="Error">(A51P1HY1.AZ00000C_!A51P1HY1.AY000004_@sap00782.lan.de) cannot be found in the metadata.</SASMessage></Exception></Exceptions>

svcsasit1 - The load balancing processor could not send update to peer (A51P1HY1.AY000004_@sap00783.lan.de)

svcsasit1 - New client connection (64) rejected from server port 17591 for user sy053@!*(generatedpassworddomain)*!. Peer IP address and port are [::ffff:10.132.137.127]:55632 for APPNAME=SAS Enterprise Guide.

The credentials specified for the SASBIIB - Pooled Workspace Server (A5QFW5AJ.AY00000B) server definition failed to authenticate. Therefore this server definition will not be included.

The SASBIIB - Logical Pooled Workspace Server (A5QFW5AJ.AW000006) cluster does not contain any valid server definitions. Therefore this cluster definition will not be included.

Load Balancing interface call failed with exception <?xml version="1.0" ?><Exceptions><Exception><SASMessage severity="Error">(A5QFW5AJ.AZ00000C_!A5QFW5AJ.AY000004_@sap00782.lan.de) cannot be found in the metadata.</SASMessage></Exception></Exceptions>.

The load balancing processor could not send update to peer (A5QFW5AJ.AY000004_@sap00783.lan.de)
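
To see which of these messages dominate in a large Object Spawner log, the lines can be grouped by message type. This is a rough Python sketch; the category names are invented for illustration, and the patterns are taken only from the messages listed in this post, so other spawner messages fall into "other".

```python
import re
from collections import Counter

# Category names are hypothetical labels; each pattern is a fragment of a
# message quoted in this post. The first matching pattern wins.
PATTERNS = {
    "unknown_uuid": re.compile(r"The specified uuid \S+ did not match any process"),
    "launch_failed": re.compile(r"The launch of server .+ for user \S+ failed"),
    "metadata_lookup": re.compile(r"cannot be found in the metadata"),
    "peer_update": re.compile(r"could not send update to peer"),
    "connection_rejected": re.compile(r"New client connection \(\d+\) rejected"),
    "auth_failed": re.compile(r"failed to authenticate"),
    "empty_cluster": re.compile(r"does not contain any valid server definitions"),
}


def classify(line):
    """Return the category label for one log line, or "other"."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return "other"


def count_categories(lines):
    """Count non-blank log lines per category."""
    return Counter(classify(line) for line in lines if line.strip())


if __name__ == "__main__":
    demo = [
        "svcsasit1 - The specified uuid 1CBF12F1-9C22-9E40-B0D2-D481F0758D1E "
        "did not match any process managed by this spawner.",
        "sy071 - The launch of server SASITRM - Workspace Server for user "
        "svcsastuit failed.",
    ]
    for category, count in count_categories(demo).items():
        print(category, count)
```

Comparing the counts from shortly after a spawner restart with the counts a few days later might show which message starts appearing when the problem sets in.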

rg
Fluorite | Level 6
Hi All,

Any update on this issue?

I have a similar issue; however, I am able to validate the Connect Server but not the Workspace Server:
[10/26/16 2:18 PM] INFO: Starting extended validation for Workspace server (level 1) - Making a connection
[10/26/16 2:19 PM] SEVERE: Unknown error in grid provider module.
[10/26/16 2:19 PM] SEVERE: The launch of server SASApp - Workspace Server for user gurramr failed.
[10/26/16 2:19 PM] SEVERE: Failed to start the server.
[10/26/16 2:19 PM] SEVERE: The application could not connect to any server in the cluster "(lexvpsas02:8591,lexvpsas03:8591,lexvpsas05:8591,lexvpsas04:8591)".
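
One quick thing to rule out is basic network reachability of the hosts named in that last message. The Python sketch below pulls the host:port pairs out of the cluster string and tries a plain TCP connection to each. The function names are made up for illustration. Note that an open port only shows that something is listening; as other replies in this thread suggest, a spawner can be up and still fail to launch servers.

```python
import re
import socket

# Matches the quoted, parenthesized host list in the validation message.
CLUSTER = re.compile(r'cluster "\(([^)]*)\)"')


def parse_cluster(message):
    """Return a list of (host, port) tuples found in the cluster message."""
    m = CLUSTER.search(message)
    if not m:
        return []
    endpoints = []
    for item in m.group(1).split(","):
        host, _, port = item.strip().rpartition(":")
        if host and port.isdigit():
            endpoints.append((host, int(port)))
    return endpoints


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    msg = (
        'SEVERE: The application could not connect to any server in the '
        'cluster "(lexvpsas02:8591,lexvpsas03:8591,lexvpsas05:8591,lexvpsas04:8591)".'
    )
    for host, port in parse_cluster(msg):
        print(host, port, is_reachable(host, port))
```

Run it from the machine where the validation was attempted, since firewall rules can differ between client and server networks.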
anja
SAS Employee

Hi,

 

there has been a lot of great input. Let me try to summarize and add some info, so you can go through this list and verify. This might help to narrow down the problem a bit further:

 

1) In SASMC, go to Server Manager, select the Connect Server and the Connect Spawner, then right-click and choose Validate or Connect.

    You should get a prompt to enter a user ID. If you have the password, enter the user gurramr.
    If not, enter a user ID that is associated with an OS account.

    How does that work? Successful or error?

Note: After each server validation, go to File and choose Clear Credentials Cache. Even though you might use the same user ID,
I'd like to make sure you enter "fresh" credentials for each validation.

 

2) Is "gurramr" a regular user who can work in EG w/out problem outside the Grid? Is this the only user experiencing problems?

 

3) Do you have any error messages in  the Object Spawner log and Metadata Server log?

 

4) Does the problem occur for all users, or only for certain ones? Does it happen in one environment but not the other, and if so,
    what is the difference between the environments?

5) Could you please confirm whether this problem occurs sporadically or on a regular basis?

 

6) When this problem occurs, have any servers/services been restarted/paused and resumed, right before this problem occurs?

 

Hopefully the answers will help us to further assist you.

 

Thanks

Anja

rg
Fluorite | Level 6
Hi All,



gurramr is a regular ID that had no problems whatsoever before. I overcame this problem by simply bouncing my Object Spawners. The only change made before this issue arose was adding a macro to the autoexec file.


MadhuKiran1
Obsidian | Level 7

I had the same problem in my environment, and it was solved after adding more roles and capabilities in metadata through SMC.

Earlier I had only the SAS Admin role assigned, since I am an admin. Then I added the user to one of the business groups, and it worked fine.

kevind
Obsidian | Level 7

I resolved this by bouncing the Object Spawners. I knew this would be the resolution, but it's something I'm reluctant to do with 140 connections running. Only one user was having the issue, and I first tried removing and re-adding them with SASMC. I'm maintaining a SAS 9.4M3 Grid.

