The number of retries is configurable as of vCenter 2.5 U4 with the advanced option “das.maxvmrestartcount”. My colleague Hugo Strydom wrote about this a while ago, and after a short discussion with one of our developers I realised Hugo’s article was not 100% correct. The default value is 5. Before vCenter 2.5 U4, HA would keep retrying forever, which could lead to serious problems as described in KB article 1009625 (http://kb.vmware.com/kb/1009625), where multiple virtual machines would be registered on multiple hosts simultaneously, leading to a confusing and inconsistent state.
It is important to note that HA will try to start the virtual machine on one of the hosts in the affected cluster; if this is unsuccessful on that host, the restart count is increased by 1. The first restart retry will then occur after two minutes. If that one fails, the next will occur after 4 minutes, and if that one fails, every following retry will occur after 8 minutes until the “das.maxvmrestartcount” has been reached.
To make it clearer, look at the following, where each value is the number of minutes after the previous attempt:
- T+0 – Restart
- T+2 – Restart retry 1
- T+4 – Restart retry 2
- T+8 – Restart retry 3
- T+8 – Restart retry 4
- T+8 – Restart retry 5
In other words, it could take up to 30 minutes before a successful restart is initiated when using the default maximum of “5” restarts. If you increase that number, each following retry will also be “T+8” again.
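The arithmetic behind the 30 minutes can be checked with a small Python sketch. It only models the schedule as described in this post (a 2-minute first delay that doubles until it reaches 8 minutes); the function names and parameters are made up for illustration and have nothing to do with how HA is actually implemented.

```python
def retry_delays(max_restart_count=5, first_delay=2, cap=8):
    """Return the wait (in minutes) before each retry.

    Assumption taken from the post: the delay starts at 2 minutes,
    doubles after every failed retry, and never exceeds 8 minutes.
    max_restart_count mirrors das.maxvmrestartcount (default 5).
    """
    delays = []
    delay = first_delay
    for _ in range(max_restart_count):
        delays.append(delay)
        delay = min(delay * 2, cap)  # double, but stay at the cap once reached
    return delays


def total_wait(max_restart_count=5):
    """Worst-case minutes between the initial restart and the last retry."""
    return sum(retry_delays(max_restart_count))


if __name__ == "__main__":
    print(retry_delays())   # [2, 4, 8, 8, 8]
    print(total_wait())     # 30
    print(total_wait(7))    # 46
```

With the default of 5 this gives 2 + 4 + 8 + 8 + 8 = 30 minutes; every extra retry you allow adds another 8 minutes to the worst case.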