Hi,
I've recently installed Platform Suite for SAS 9.1 and SAS 9.4 in a RHEL 6.5 environment. I'm trying to configure Process Manager failover through RTM (2.0.8) and I'm getting errors.
Can you please help me resolve the issue?
Here is some info about the environment and errors.
Please let me know if you need any additional info.
Version Info:
IBM Platform LSF Standard 9.1.3.0
IBM Platform Process Manager 9.1
Platform Grid Management Service 8.0.1
Environment overview:
2 Grid Control servers, 3 metadata servers, and 3 compute nodes
js.conf
--------------------
JS_ADMINS=lsfadmin,root
JS_FAILOVER=true
JS_FAILOVER_HOST=control2
fod.conf
----------
FOD_ADMIN = lsfadmin
Begin Hosts
HOSTNAME
control1.fqdn
control2
End Hosts
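One thing visible in the fod.conf above is that the Hosts section lists one fully qualified name (control1.fqdn) and one short name (control2). Whether that matters for failover is only an assumption on my part, not something the logs confirm. In case it helps with checking, here is a minimal sketch that parses a Hosts section and flags mixed naming styles. The function names parse_fod_hosts and mixed_name_styles are hypothetical helpers, not part of any Platform tool.

```python
def parse_fod_hosts(text):
    """Return the host names listed between 'Begin Hosts' and 'End Hosts'."""
    hosts = []
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.lower() == "begin hosts":
            inside = True
        elif line.lower() == "end hosts":
            inside = False
        elif inside and line and not line.startswith("#"):
            # Skip the column header line of the Hosts section.
            if line.upper() != "HOSTNAME":
                hosts.append(line.split()[0])
    return hosts


def mixed_name_styles(hosts):
    """True if some hosts are short names and others contain a dot."""
    styles = {"." in h for h in hosts}
    return len(styles) > 1


# Sample input copied from the fod.conf shown above.
FOD_CONF = """FOD_ADMIN = lsfadmin
Begin Hosts
HOSTNAME
control1.fqdn
control2
End Hosts
"""

hosts = parse_fod_hosts(FOD_CONF)
print(hosts)
print(mixed_name_styles(hosts))
```

The check is purely string-based: it does no DNS lookups and only reports whether the listed names follow one style.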
pem.log
-------------
<35510>, command </lsf/gms/bin/gaadmin start>, sig <9>
2015-10-18 10:05:39.000 EDT ERROR [35858:139637277837056] updateContainerStatus: kill pid <42886> failed. pgid <42886>, command </lsf/gms/bin/gaadmin start>, sig <9>
2015-10-18 10:17:32.000 EDT ERROR [35858:139637277837056] updateContainerStatus: kill pid <59046> failed. pgid <59046>, command </lsf/gms/bin/gaadmin start>, sig <9>
esc.log
--------------------------
2015-10-18 10:19:35.000 EDT WARN [36037:140702542190336] Service instance <PM:1> started or restarted on host control1.fqdn in cluster <sasgrid>
2015-10-18 10:19:35.000 EDT WARN [36037:140702542190336] Service <PM> started or restarted in cluster <sasgrid>, USAGE INFORMATION: the max instances for this service is <1>.
2015-10-18 10:19:35.000 EDT WARN [36037:140702542190336] do_containerStateChange(): on host control1.fqdn, the container< 435> belongs to instance <1> of service <PM> terminated, reason <None>, status <1>
2015-10-18 10:19:35.000 EDT WARN [36037:140702542190336] Service instance <PM:1> went down on host control1.fqdn in cluster <sasgrid>
2015-10-18 10:19:35.000 EDT WARN [36037:140702542190336] Service <PM> stopped in cluster <sasgrid>, USAGE INFORMATION: the max instances for this service is <1>