Issue in WIP Data services due to high availability

sandeeppajni1 · Posted 07-23-2021 03:44 AM

Hi Folks,

We are using SAS 9.4M6 with a multi-tier architecture (3 metadata, 6 compute and 2 web nodes, all on Linux). We frequently face an issue where the Web Infrastructure Platform (WIP) services go down again and again: because of High Availability, the system tries to start the services on a compute node other than the one where they are already running (possibly because a network glitch keeps the server from reporting the correct status of the service's availability). Once it starts the WIP services on another compute node with a new PID, the WIP database gets locked/corrupted, as the new PID does not match the existing PID recorded in the postmaster.pid file. So, every time, we have to run pg_resetxlog to fix the problem.

Is there any suggestion for how this issue can be fixed, so that we do not need to reset the lock file every time to get the WIP services, and hence the web services (SAS Studio), working?

Sajid01 · Posted 07-23-2021 09:34 AM

I think you must take this up with SAS tech support.
Have a thorough look at the logs and take steps to rule out the network glitch that you suspect.

sandeeppajni1 · Posted 07-23-2021 10:37 AM

Hi @Sajid01

This case has already been raised with SAS tech support. It is still being investigated.

gwootton · Posted 07-23-2021 10:48 AM

Configuring the WIP Data Server as a highly available service shouldn't trigger a failover unless the original service goes down, but this does rely on successful communication between the hosts. An issue we sometimes see is that the server is started outside of the HA system (i.e. starting with sas.servers), so the HA system is unable to start the service and eventually fails over.

--
Greg Wootton | Principal Systems Technical Support Engineer

sandeeppajni1 · Posted 07-25-2021 08:28 AM

Hi @gwootton,

This does not seem to be the case in this particular issue. The services are being started as per the process (i.e. starting with sgmg.sh, not sas.servers). In our case, the WIP services suddenly start on another node in the middle of a weekday. SAS tech support is still investigating the issue.
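
For anyone trying to see what state the lock is in before resetting anything, here is a minimal Python sketch (not a SAS-provided tool) that reads the PID on the first line of postmaster.pid and checks whether that process is still alive. The function names and the data directory argument are hypothetical, so point it at your own WIP Data Server data directory. Note that it can only check processes on the host where it runs: if the data directory is shared and the postmaster was started on a different node, the PID will look stale locally even though the server is running elsewhere.

```python
import os
from pathlib import Path


def read_postmaster_pid(data_dir):
    """Return the PID on the first line of postmaster.pid, or None if unavailable."""
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        first_line = pid_file.read_text().splitlines()[0].strip()
        return int(first_line)
    except (FileNotFoundError, IndexError, ValueError):
        return None


def pid_is_alive(pid):
    """Check whether a process with this PID exists on the local host."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def describe_lock(data_dir):
    """Summarise the lock state as seen from this host only."""
    pid = read_postmaster_pid(data_dir)
    if pid is None:
        return "no usable postmaster.pid"
    if pid_is_alive(pid):
        return f"PID {pid} is running on this host"
    return f"PID {pid} is not running on this host (stale, or started on another node)"
```

Running describe_lock on each compute node against the same data directory should show which node, if any, actually owns the running postmaster.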
