I noticed this thread on the VMTN community which discussed a time-out during a cluster election process. The one thing all scenarios described in the thread have in common is that the hosts were upgraded from 4.1 to 5.0, or from base 5.0 to a higher patch level. Marc Sevigny posted in the same thread that it is a known issue which the HA team is currently investigating…
After an upgrade, under conditions we’re still investigating, an error is occurring when issuing a start request of the HA service on the upgraded host. When that fails, HA then tries to re-install HA, and the re-install does nothing because the service is already there (and the right version) but we’re left without an HA service running.
This is the way to fix it if you are experiencing this issue. If you do experience it, please report it to VMware and submit log files, as that will help the HA team fix the problem.
- Place host into Maintenance Mode
- Take a copy of /opt/vmware/uninstallers/VMware-fdm-uninstall.sh (we copied to /tmp)
- From the location where you placed the copy, run it (./VMware-fdm-uninstall.sh)
- You should see a short pause before it gets back to the prompt (you’ll see why I mention this below)
- Take the host out of Maintenance Mode; within the “Recent Tasks” area you should see the client being pulled from vCenter and installing
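
The copy-and-run steps in the middle of the list could be scripted roughly as follows. This is only a minimal sketch, assuming a POSIX shell on the host and that the host is already in Maintenance Mode. The function name run_fdm_uninstaller and its two optional arguments are my own for illustration and are not part of any VMware tooling.

```shell
#!/bin/sh
# Sketch only: copy the HA (FDM) uninstaller to a scratch location and run it there.
# Assumption: the host has already been placed into Maintenance Mode.
# run_fdm_uninstaller and its arguments are hypothetical names, not VMware tooling.

run_fdm_uninstaller() {
    # Defaults follow the steps above: uninstaller path and /tmp as the copy target.
    src="${1:-/opt/vmware/uninstallers/VMware-fdm-uninstall.sh}"
    workdir="${2:-/tmp}"
    name="$(basename "$src")"

    # Stop early if the uninstaller is not where we expect it.
    if [ ! -f "$src" ]; then
        echo "uninstaller not found: $src" >&2
        return 1
    fi

    # Take a copy and make sure the copy is executable.
    cp "$src" "$workdir/" || return 1
    chmod +x "$workdir/$name" || return 1

    # Run the copy from the directory it was copied to.
    (cd "$workdir" && "./$name")
}
```

Calling run_fdm_uninstaller with no arguments uses the uninstaller path and the /tmp location from the steps above; after it returns, take the host out of Maintenance Mode as described.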