Hi,
We are using SAS 9.4 M6 with a cluster of 3 Meta nodes, 6 Compute nodes and 2 Web nodes. All of these servers run Linux.
We are facing the two issues below, which seem to be related, and we are looking for a solution to them.
1. Whenever a SAS program hits an error, either the programmer closes EG and starts a new session or SAS EG closes by itself, but the processes belonging to the old session keep running in the background and become orphaned. The same happens when the faulty program is run from the command line: it reports an error, but the orphaned process keeps running in the background. These orphaned processes keep consuming resources and drive CPU usage to 100% or more. How can we resolve this?
2. Because of the high resource usage, the affected compute node stops accepting new load, which is understandable. However, the whole Grid also stops accepting new load or user sessions on the remaining compute nodes for that App server. Load distribution across the Grid hangs, and no new sessions can connect until we restart the Spawner on the node where the process has reached 100% CPU. We do not understand why SAS Grid does not distribute the load to the other compute nodes, which are working fine.
Can someone please help with a solution to these problems? Also, if the Grid node where user processes have reached 100% CPU usage is also running the WIP services, access to SAS Studio as a whole is sometimes affected as well.
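For anyone trying to reproduce issue 1, the orphaned processes can be listed on a compute node with something like the sketch below. It assumes that an orphan shows up with parent PID 1, which may not hold on systemd hosts where a per-user manager adopts orphans instead, and that the process name contains "sas". Both NAME_PATTERN and list_orphans are illustrative names, not part of SAS. Legitimate daemons also have parent PID 1, so the output needs a manual check before anything is killed.

```shell
#!/bin/sh
# List processes whose parent is PID 1 (a common sign of an orphan),
# sorted by CPU usage, highest first.
# NAME_PATTERN is an assumption: adjust it to the SAS process names on your nodes.
NAME_PATTERN="${NAME_PATTERN:-sas}"

list_orphans() {
    # "=" after each column name suppresses the ps header line.
    ps -eo pid=,ppid=,pcpu=,etime=,user=,comm= |
        awk -v pat="$NAME_PATTERN" '$2 == 1 && $6 ~ pat' |
        sort -k3,3 -nr
}

list_orphans
```

The columns are PID, parent PID, CPU %, elapsed time, owner and command name. The elapsed time helps to tell a session that was abandoned hours ago from one that has only just started.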
Thanks