I got a question about VM Monitoring (aka virtual machine level HA) this week. A customer wanted to test whether VM Monitoring worked, so they disabled the NIC of the virtual machine and waited 30 seconds for the VM Monitoring response to kick in... nothing happened.
VM Monitoring restarts individual virtual machines when needed. It uses a concept similar to HA: heartbeats. If heartbeats, in this case VMware Tools heartbeats, are not received for a specific amount of time, the virtual machine will be rebooted. An example of when this happens is a Windows virtual machine showing a BSOD.
The big question of course was why didn’t this trigger a response?
The answer is simple: The VMware Tools heartbeat does not use the virtual machine NIC. This heartbeat is “caught” by hostd and passed on to vCenter. vCenter uses this to show those “green/yellow/red” alarm dots. The same heartbeat is used by VM Monitoring to detect the failure of a virtual machine. Even without any NIC attached to your virtual machine these heartbeats will still be received.
One thing to keep in mind though: heartbeats are sent out every second by default, and when they are no longer received, VM Monitoring will first check whether there is any network or storage I/O, to avoid false positives.
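To make the decision logic a bit more concrete, here is a minimal Python sketch of how I picture it. This is not VMware's implementation: the `VmState` class, the `should_restart` function and its parameter names are all hypothetical. The 30 second failure interval simply mirrors the customer's test above, and the 120 second I/O window is only an illustrative placeholder.

```python
from dataclasses import dataclass


@dataclass
class VmState:
    """Hypothetical snapshot of what the monitoring logic knows about one VM."""
    last_heartbeat: float  # time (seconds) the last VMware Tools heartbeat was seen
    last_io: float         # time (seconds) of the last observed network or storage I/O


def should_restart(vm: VmState, now: float,
                   failure_interval: float = 30.0,  # assumption: mirrors the 30s test above
                   io_window: float = 120.0) -> bool:  # assumption: illustrative value only
    """Return True only when heartbeats AND I/O have both gone quiet."""
    heartbeats_missing = (now - vm.last_heartbeat) > failure_interval
    if not heartbeats_missing:
        # Heartbeats do not travel over the VM NIC, so a disabled NIC ends up here.
        return False
    # Heartbeats stopped: look for recent I/O before acting, to avoid a false positive.
    io_seen_recently = (now - vm.last_io) <= io_window
    return not io_seen_recently
```

In the customer's test the heartbeats kept arriving, so the first check already returns `False` and no restart happens. Only a VM that stops heartbeating and also shows no recent I/O would be restarted.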
Question for you guys! One thing I have always wondered is how many people use VM Monitoring. And if you use it, do you use it on all VMs in every cluster?