As you have already discerned (I think), there are two connections at work here: EG to metadata and EG to workspace.
When a workspace is chugging away processing a job for EG, EG pesters the workspace server every so often (let's say 30 seconds) to ask: "got some SAS log for me? How about now? Okay, how about now?" (This reminds me of my 5-year-old asking to watch television.)
But EG never talks back to the metadata server, because we've got all of the metadata we need for the moment, thank you very much. After an hour of silence, the metadata connection takes this inattention personally and hangs up.
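To make the timing concrete, here is a toy simulation of the two connections during a long-running job. It is only a sketch, not EG or SAS server code: the class and function names are made up for illustration, and it assumes a one-hour idle timeout on both connections and a 30-second poll interval (the "let's say" figure above).

```python
class IdleTimeoutConnection:
    """Toy model of a connection that the server drops after an idle period."""

    def __init__(self, name, idle_timeout_s):
        self.name = name
        self.idle_timeout_s = idle_timeout_s  # assumed value, not a documented default
        self.last_activity_s = 0

    def is_alive(self, now_s):
        # The connection survives only while the idle time stays under the timeout.
        return now_s - self.last_activity_s < self.idle_timeout_s

    def touch(self, now_s):
        # Client traffic resets the idle clock, but only on a live connection.
        if self.is_alive(now_s):
            self.last_activity_s = now_s


def simulate_job(job_length_s, poll_interval_s=30, idle_timeout_s=3600):
    """Return (workspace_alive, metadata_alive) at the end of the job."""
    workspace = IdleTimeoutConnection("workspace", idle_timeout_s)
    metadata = IdleTimeoutConnection("metadata", idle_timeout_s)
    for now_s in range(poll_interval_s, job_length_s + 1, poll_interval_s):
        workspace.touch(now_s)  # "got some SAS log for me?"
        # Nothing is sent on the metadata connection while the job runs.
    return workspace.is_alive(job_length_s), metadata.is_alive(job_length_s)


print(simulate_job(1800))  # 30-minute job: (True, True)
print(simulate_job(5400))  # 90-minute job: (True, False)
```

The polling keeps the workspace connection busy for the whole job, while the metadata connection sits idle and is gone by the time a job longer than the timeout finishes.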
Now we get to the crux of the problem: EG 4.1 cannot recover from this very well. Eventually EG will need metadata information again -- your session cannot continue without it. But EG is not trained to reestablish this connection in mid-session.
This is a limitation that we have fixed in 4.2. We did not add keep-alive logic, but we did add the ability to reconnect to metadata. Unlike a workspace session, the metadata session is stateless, so an interruption like this should have no adverse effects, provided EG can successfully reconnect.
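The reconnect-on-demand idea (as opposed to keep-alive) can be sketched roughly as follows. This is not how EG 4.2 is actually implemented; `FakeMetadataServer`, `MetadataClient`, and their methods are hypothetical names invented for the illustration. The point is that a stateless session can simply be reopened and the request retried, with nothing to restore.

```python
class FakeMetadataServer:
    """Hypothetical stand-in for a metadata server that can drop sessions."""

    def __init__(self):
        self.open_sessions = set()
        self.next_id = 1

    def connect(self):
        session_id = self.next_id
        self.next_id += 1
        self.open_sessions.add(session_id)
        return session_id

    def drop_all(self):
        # Simulates the server hanging up on idle clients.
        self.open_sessions.clear()

    def query(self, session_id, name):
        if session_id not in self.open_sessions:
            raise ConnectionError("session closed by server")
        return f"metadata for {name}"


class MetadataClient:
    """Hypothetical client that reconnects when a request finds the session closed."""

    def __init__(self, server):
        self.server = server
        self.session_id = server.connect()
        self.reconnects = 0

    def query(self, name):
        try:
            return self.server.query(self.session_id, name)
        except ConnectionError:
            # No keep-alive: open a fresh session only when one is needed.
            # Safe because the session is stateless, so nothing must be restored.
            self.session_id = self.server.connect()
            self.reconnects += 1
            return self.server.query(self.session_id, name)


server = FakeMetadataServer()
client = MetadataClient(server)
client.query("SASHELP")           # works on the original session
server.drop_all()                 # idle timeout: the server hangs up
print(client.query("SASHELP"))    # metadata for SASHELP
print(client.reconnects)          # 1
```

The same trick would not be safe for a workspace session, which carries state (assigned libraries, work data sets, macro variables) that a fresh connection would not have.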
Hopefully, the information shared in this thread (by everyone) will provide some ideas for workarounds in the meantime.
Chris