Re: Visual Analytics Web Applications Service Temporarily Unavailable

MarkPeskir · Posted 07-06-2016 11:55 AM

We have been running SAS VA 7.2 non-distributed on a Windows 2012 R2 server for several months now. Suddenly, all of the web applications (VA, Web Administration, Environment Manager, SAS Studio) running on this box now get "Service Temporarily Unavailable". Rebooting the box grants about 10 minutes of uptime, then it goes down again.

The Windows server itself shows all expected SAS services running, and Management Console connects just fine and shows all configuration settings. The issue seems to be confined to the web server (or web app server).

I have a support ticket submitted, but now we've been down for days and I'm recruiting all the help I can get.

dpage · Posted 07-06-2016 11:59 AM

What's in LevN/Web/WebAppServer/SASServer12_1/logs/server.log (possibly SASServer1 or SASServer if you're on a single managed server)? That error message means the web server is okay but the web app server is not: the reverse proxy can't communicate with the backend server. Typically there is additional information either in a particular application log under LevN/Web/Logs/SASServerN/ or in the aforementioned server.log. The likely culprit is the JVM exiting for some reason, or thrashing in garbage collection to the point that it becomes unresponsive.

MarkPeskir · Posted 07-06-2016 12:20 PM

Most recent error message, written shortly after a reboot: (internal hostname obfuscated by me)

2016-07-06 11:19:45,694 ERROR (localhost-startStop-6) [org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/SASDeploymentBackup]] Exception sending context initialized event to listener instance of class org.springframework.web.context.ContextLoaderListener
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'identityCache': Invocation of init method failed; nested exception is com.gemstone.gemfire.distributed.DistributedSystemDisconnectedException: GemFire on [[INTERNAL HOSTNAME]]<v482>:30329/59464 started at Wed Jul 06 11:11:20 EDT 2016: Message distribution has terminated

[[followed by java stack trace]]
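
For anyone sifting through a large server.log for entries like the one above, here is a minimal Python sketch that pulls out the ERROR lines. The regular expression is only an assumption based on the layout of the single excerpt quoted here (timestamp, level, thread in parentheses, logger in brackets, message); the names `LINE_RE` and `error_entries` are made up for illustration, and other lines in the file may not follow this layout.

```python
import re

# Layout assumed from the excerpt above:
# "<date> <time>,<ms> <LEVEL> (<thread>) [<logger>] <message>"
# The logger name can itself contain brackets, so it is matched lazily
# up to the first "] " (closing bracket followed by a space).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+\((?P<thread>[^)]*)\) "
    r"\[(?P<logger>.*?)\] (?P<msg>.*)$"
)


def error_entries(lines, level="ERROR"):
    """Return a list of dicts for lines that match the layout at the given level."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match and match.group("level") == level:
            entries.append(match.groupdict())
    return entries


# The log line quoted in this post, plus a continuation line that is skipped.
sample = [
    "2016-07-06 11:19:45,694 ERROR (localhost-startStop-6) "
    "[org.apache.catalina.core.ContainerBase.[Catalina].[localhost]."
    "[/SASDeploymentBackup]] Exception sending context initialized event "
    "to listener instance of class "
    "org.springframework.web.context.ContextLoaderListener",
    "org.springframework.beans.factory.BeanCreationException: ...",
]

for entry in error_entries(sample):
    print(entry["ts"], entry["thread"], entry["msg"][:40])
```

To run it against a real file, pass an open file object instead of `sample`; stack trace lines do not match the pattern and are ignored.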

MarkPeskir · Posted 07-06-2016 12:44 PM

Also, SAS support helped us enable GC logging, and I see this behavior [abridged for clarity]:

{Heap before GC invocations=34 (full 0):
par new generation total 419392K, used 404067K
eden space 372800K, 100% used
from space 46592K, 67% used
to space 46592K, 0% used
2016-07-06T12:36:51 [Times: user=0.05 sys=0.02, real=0.02 secs]
Heap after GC invocations=35 (full 0):
par new generation total 419392K, used 26187K
eden space 372800K, 0% used
from space 46592K, 56% used
to space 46592K, 0% used

372 of our 512 GB memory goes from 0 to 100% in two seconds? This goes back and forth over and over several times, in very short intervals. I'm not fluent in log files, but this seems not good.

dpage · Posted 07-06-2016 01:19 PM

Sounds like support has you on the right track. None of that seems especially bad, though (your eden space isn't the entire heap; see http://www.oracle.com/technetwork/tutorials/tutorials-1876574.html#t2 for a decent summary).
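
To put numbers on the eden figure: the K suffix in those GC lines is kilobytes, so the young generation here is a few hundred megabytes, not gigabytes. A small Python sketch of the arithmetic follows, using only the sizes quoted in the abridged log above; the regular expression and the `sizes_mb` helper are illustrative assumptions, not part of any SAS or JVM tooling.

```python
import re

# Matches the region-size lines from the abridged GC log quoted above,
# e.g. "eden space 372800K, 100% used".
SIZE_RE = re.compile(
    r"(?P<name>par new generation|eden space|from space|to space)"
    r"\s+(?:total\s+)?(?P<kb>\d+)K"
)


def sizes_mb(text):
    """Return {region name: size in MB}, keeping the first occurrence of each."""
    result = {}
    for match in SIZE_RE.finditer(text):
        result.setdefault(match.group("name"), int(match.group("kb")) / 1024)
    return result


snippet = """par new generation total 419392K, used 404067K
eden space 372800K, 100% used
from space 46592K, 67% used
to space 46592K, 0% used"""

sizes = sizes_mb(snippet)
for name, mb in sizes.items():
    print(f"{name}: {mb:.1f} MB")
# par new generation: 409.6 MB
# eden space: 364.1 MB
# from space: 45.5 MB
# to space: 45.5 MB
```

Eden plus one survivor space (372800K + 46592K) adds up to the 419392K reported for the young generation, so eden filling and emptying at short intervals is the young collector doing its normal job on a roughly 364 MB region.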