Today I witnessed something weird. For some reason VirtualCenter was totally lost. There were 3 ESX 3.5 hosts in a cluster. One of them failed, and it seemed that all the VMs had failed over to the other two. VirtualCenter appeared to confirm this, because all VMs were registered on either the first or the second host. I could not double-check it on the third host, because it was impossible to run “vmware-cmd -l” on it or contact it via the VI Client.
This also meant that I did not have the opportunity to put the host in maintenance mode, because it was disconnected. Seeing all these symptoms, one would expect the host to be completely empty, so I decided to reboot it. Well, I guess that was a big mistake, because around 15 VMs got shut down. Although according to VirtualCenter they were running on a different ESX host, the third host decided to kill them.
When I restarted the machine, VirtualCenter still showed me the wrong information. So I decided to kill the cluster and recreate it. When I added the ESX hosts to the new cluster, everything functioned like it should. Anyway, it’s really tough troubleshooting when you can’t rely on the management tools. Hope this is something VMware fixes soon, or that they create a workaround like a “forced database update”….