I was just cleaning up our Cloud Lab and noticed HA wasn’t enabled. I enabled it and immediately it threw the following error at me:

Error Message: Configuration Issues – HA agent on esx4.mgm.local in cluster ams-hadrs-01 in Lab2 has an error : Error while running health check script.

When experiencing HA configuration issues there are a couple of steps I usually take to try to fix the experienced issues:

  • Click “reconfigure for VMware HA” and see if the issue is still there, if so:
    • Is DNS configured and does it actually work? If not, fix and reconfigure for HA.
    • Is the gateway reachable? If not, fix and reconfigure for HA.

This usually solves 75% of the issues. If it hasn’t been fixed the next step I usually take is unloading the agent and restarting the management services. Although it is pretty rigurous it is the fastest way of fixing HA issues.  In my case I am using ESXi and this is what I needed to do to clean up the host:

  • Disable HA on the cluster
  • /opt/vmware/aam/VMware-aam-ha-uninstall.sh
  • /sbin/services.sh restart
  • Enable HA on the cluster

This solved the issue I had with HA,