So I’ve been collecting some HA best practices lately. I just wanted to have them all in one place so I can use them myself for the VMTN forum and/or customers. The first two are obvious in my opinion but still often overlooked:

  1. Your ESX hostnames should be in lowercase and use FQDNs
  2. Provide Service Console redundancy
  3. If you add an isolation validation address with “das.isolationaddress”, add an additional 5000 to “das.failuredetectiontime”
  4. If your Service Console network is set up with “active / standby” redundancy then your “das.failuredetectiontime” needs to be set to 60000
  5. If you ensured Service Console redundancy by adding a secondary Service Console then “das.failuredetectiontime” needs to be set to 20000 and you need to set up an additional “das.isolationaddress”
  6. If you set up a secondary Service Console, use a different subnet and vSwitch than your primary uses
  7. If you don’t want to use your default gateway as an isolation validation address, or can’t use it because it’s a non-pingable device, then disable its usage by setting “das.usedefaultisolationaddress” to false and add a pingable “das.isolationaddress”
  8. Change default isolation response to “power off vm” and set restart priorities for your AD/DNS/VC/SQL servers
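
To make rules 3, 4, 5 and 7 a bit more concrete, here is a minimal Python sketch that turns them into a set of advanced options. The function name `ha_advanced_options` and its parameters are hypothetical, made up for illustration; it does not talk to VirtualCenter or any VMware API, it only builds a dictionary. Applying rule 3 on top of rule 4 (60000 + 5000) is my reading of the list, and I take the 20000 from rule 5 as-is because that rule already mentions the extra isolation address.

```python
from typing import Dict, Optional, Union


def ha_advanced_options(
    redundancy: str,
    isolation_address: Optional[str] = None,
    use_default_gateway: bool = True,
) -> Dict[str, Union[int, str]]:
    """Build HA advanced options from the rules in the list above.

    redundancy is "active_standby" (rule 4) or "secondary_console" (rule 5).
    This is an illustrative helper, not part of any VMware tooling.
    """
    if redundancy == "active_standby":
        # Rule 4: active / standby Service Console network.
        detection_time = 60000
        if isolation_address is not None:
            # Rule 3 (assumed to stack on rule 4): extra address adds 5000.
            detection_time += 5000
    elif redundancy == "secondary_console":
        # Rule 5: a secondary Service Console needs an extra isolation address.
        if isolation_address is None:
            raise ValueError("rule 5 requires an additional isolation address")
        detection_time = 20000
    else:
        raise ValueError("unknown redundancy mode: " + redundancy)

    options: Dict[str, Union[int, str]] = {
        "das.failuredetectiontime": detection_time,
    }

    if not use_default_gateway:
        # Rule 7: without the default gateway, a pingable address is needed.
        if isolation_address is None:
            raise ValueError("rule 7 requires a pingable isolation address")
        options["das.usedefaultisolationaddress"] = "false"

    if isolation_address is not None:
        options["das.isolationaddress"] = isolation_address

    return options


if __name__ == "__main__":
    # Example address is a placeholder, not a recommendation.
    for key, value in ha_advanced_options(
        "secondary_console", isolation_address="10.0.1.1"
    ).items():
        print(key, "=", value)
```

The helper raises an error instead of guessing when a rule asks for an isolation address and none is given, since a missing address is exactly the kind of thing that gets overlooked.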
So if you’ve got more, add them in the comments and I will update the list!