I had a question today around the restart of VMs with no access to storage by HA. The question was if HA would try to restart the VM and time out after 5 times. With the follow up question, if HA would try again when the storage would return for duty.
By default HA will try to restart a VM up to 5 times in roughly 30 minutes. If the master does not exceed it will stop trying. On top of that HA manages a “compatibility list”. This list will contain the details around which VM can be restarted and where. In other words; which hosts have access to the datastores and network portgroup required for this VM to successfully power-on. Now if for whatever reason there are no compatible hosts available for the restart then HA will not try to restart the VM.
But what if the problem is resolved? As soon as the problem is resolved, and reported as such, the compatibility list will be updated. When the list is updated HA will continue with the restarts again.
It might also be good to know that if for whatever reason the master fails, a new master will continue trying to restart the VM. It will start with 5 new attempts and not take the number of restart attempts that the previous master did into account.
** Disclaimer: This article contains references to the words master and/or slave. I recognize these as exclusionary words. The words are used in this article for consistency because it’s currently the words that appear in the software, in the UI, and in the log files. When the software is updated to remove the words, this article will be updated to be in alignment. **