As of VMware vCenter 2.5 Update 2, the HA default isolation response changed from “Power Off” to “Leave powered on”. A lot of people liked this new default because it lowers the chance of downtime due to a “false positive”. I’ve never been a fan, though: I just don’t like using degraded hardware, or a degraded ESX host for that matter.
Those who did like the change should take note that vSphere comes with a new default isolation response: “Shut down”.
Note that this change applies only to new clusters; if you upgrade(d) your vCenter, the selected isolation response will remain. For those of you who never looked into the “Shut down” setting: it uses VMware Tools to initiate a guest shutdown. If the shutdown does not complete within five minutes, the VM will be powered off. These five minutes are configurable; if you want to increase or decrease the timeout, add the advanced option das.isolationShutdownTimeout with the new value in seconds.
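To make the timeout behaviour concrete, here is a minimal Python sketch. It does not talk to vCenter; the function names and the key/value dictionary shape are my own assumptions for illustration. Only the option name das.isolationShutdownTimeout, the unit (seconds) and the five-minute default come from the text above.

```python
from typing import Optional

# Default stated in the post: five minutes, expressed in seconds.
DEFAULT_TIMEOUT_SECONDS = 300


def isolation_shutdown_option(timeout_seconds: int = DEFAULT_TIMEOUT_SECONDS) -> dict:
    """Build a key/value pair for the HA advanced option (hypothetical helper).

    The dictionary shape is an assumption; enter the same key and value
    wherever you normally set HA advanced options.
    """
    if timeout_seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return {"key": "das.isolationShutdownTimeout", "value": str(timeout_seconds)}


def isolation_outcome(guest_shutdown_seconds: Optional[int],
                      timeout_seconds: int = DEFAULT_TIMEOUT_SECONDS) -> str:
    """Model the "Shut down" isolation response described above.

    guest_shutdown_seconds is how long the guest needs to shut down via
    VMware Tools, or None if the shutdown never completes.
    """
    if guest_shutdown_seconds is not None and guest_shutdown_seconds <= timeout_seconds:
        return "guest shut down"
    # The shutdown did not complete within the timeout, so the VM is powered off.
    return "powered off"
```

For example, isolation_shutdown_option(600) gives a ten-minute window, and isolation_outcome(400) reports “powered off” under the default because 400 seconds exceeds the 300-second limit.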
bitsorbytes says
“False positives” and cluster shutdowns can be a pain when networks are working on switches etc., causing our VLANs to be totally off the air for a few seconds.
We increased the das.isolationShutdownTimeout to 60 seconds. While this seems like a long time, it has stopped any false positives we were seeing from network outages causing farms to shut down!
Also, I would highly recommend using shut down over power off for the isolation response.
solgae says
@bitsorbytes: Don’t you mean decreased…? Default for das.isolationShutdownTimeout is 5 minutes = 300 seconds.
rsj says
I think bitsorbytes means das.failureInterval, which still defaults to 15 seconds,
which again (I think) is still on the low side. 30 or 60 seconds might be more suitable, even when PortFast, for example, is enabled.
For the most part 30 or 60 does the trick for me, if that is what bitsorbytes meant 🙂
rsj says
And by das.failureInterval I meant das.failuredetectiontime 🙂