The lsf.conf and ego.conf files each have a parameter for the list of master hosts (LSF_MASTER_LIST and EGO_MASTER_LIST, respectively). The value of each parameter is a space-separated list of host names in order of priority.
If the failed master host is the only host listed in these parameters, LSF has no other candidate to fail over to.
For example, if you have EGO_MASTER_LIST="myhost1.mydomain.com", then when myhost1 goes down, LSF has no host to promote to master. If, on the other hand, you have EGO_MASTER_LIST="myhost1.mydomain.com myhost2.mydomain.com", then when myhost1 goes down, myhost2 becomes the master.
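
To make the priority-order behavior concrete, here is a minimal Python sketch. It only illustrates the idea described above and is not how LSF itself selects a master; the helper names parse_master_list and pick_master are hypothetical, and the parsing assumes a simple NAME="value" line format.

```python
def parse_master_list(conf_text, key="EGO_MASTER_LIST"):
    """Return the host names assigned to `key`, in priority order.

    Assumes simple NAME="host1 host2" lines; lines starting with '#'
    are treated as comments. Returns [] if the key is not found.
    """
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop surrounding quotes, then split on whitespace.
            return value.strip().strip("\"'").split()
    return []


def pick_master(candidates, down_hosts):
    """Return the first candidate that is not down, or None if none is left."""
    for host in candidates:
        if host not in down_hosts:
            return host
    return None


single = parse_master_list('EGO_MASTER_LIST="myhost1.mydomain.com"')
double = parse_master_list(
    'EGO_MASTER_LIST="myhost1.mydomain.com myhost2.mydomain.com"'
)
down = {"myhost1.mydomain.com"}

print(pick_master(single, down))  # no candidate remains
print(pick_master(double, down))  # the second host takes over
```

With a single-host list, nothing is left to choose once that host is down, which is why listing at least two hosts is what makes failover possible.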