I received a question today about partitioned clusters (in vSphere 5.0) with HA. The question was which master would own which VMs when a partition exists, and how it is determined which master can own which VM. I have already briefly explained the difference between an isolation and a partition. Now it is good to realize that when a partition exists the new master may take responsibility for VMs. I used “may” intentionally as there is no guarantee it will own any VMs at that point, so why is this and how does it take ownership of VMs?
A master takes ownership of VMs by locking a file on the datastores it is connected to. As soon as the master has an exclusive lock it owns the virtual machines that have their config file on the datastore. (Note that HA cares about the .vmx file and not the VMDK of the VMs.) The file is called “protectedlist” and is stored on each of the datastores under the .vSphere-HA folder as shown in the screenshot below.
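To make the ownership rule concrete, here is a minimal sketch in Python. It is a toy model, not HA's actual implementation: the `VM` class, the `owners` function, and the host and datastore names are all hypothetical. It only illustrates that ownership follows the datastore holding the .vmx file, and that datastores holding only VMDKs play no role.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    vmx_datastore: str  # datastore that holds the VM's .vmx config file
    vmdk_datastores: list = field(default_factory=list)  # not used for ownership


def owners(vms, lock_holders):
    """Map each VM name to the master holding the protectedlist lock on the
    datastore with the VM's config file, or None if nobody holds that lock."""
    return {vm.name: lock_holders.get(vm.vmx_datastore) for vm in vms}


# Hypothetical inventory: web01 has its .vmx on ds1, db01 has its .vmx on ds2.
vms = [
    VM("web01", "ds1", ["ds1", "ds2"]),
    VM("db01", "ds2", ["ds1"]),
]

# Hypothetical lock state: master-A has locked protectedlist on ds1 only.
lock_holders = {"ds1": "master-A"}

print(owners(vms, lock_holders))
```

In this sketch db01 has a VMDK on ds1, but because its config file lives on ds2, master-A does not own it.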
Now a master will only release the lock on that file when it is placed into maintenance mode, removed from the cluster, loses access to the datastore, is rebooted etc. If and when that happens, the other master(s) would take ownership of the VMs which are located on that datastore by locking that file. So what does that mean? That does indeed mean that in a “normal” situation, even though you have two or more masters in a cluster because the cluster is partitioned, you would still only have one master being responsible for all VMs in that cluster. Unless… unless that master is indeed placed in maintenance mode, loses access to the datastore etc.
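The lock behavior described above can be sketched as a small simulation. Again, this is only an illustration under assumptions: the `Datastore` class, its methods, and the master names are made up for the example, and a real VMFS file lock works differently under the hood.

```python
class Datastore:
    """Toy model of a datastore with a single exclusive protectedlist lock."""

    def __init__(self, name):
        self.name = name
        self.lock_owner = None  # master currently holding the lock, if any

    def try_lock(self, master):
        """Claim the lock if it is free; return True if `master` now holds it."""
        if self.lock_owner is None:
            self.lock_owner = master
        return self.lock_owner == master

    def release(self, master):
        """Drop the lock, e.g. on maintenance mode, reboot, or lost access."""
        if self.lock_owner == master:
            self.lock_owner = None


ds1 = Datastore("ds1")

# Partition: two masters exist, but only the first one gets the lock.
first = ds1.try_lock("master-A")
second = ds1.try_lock("master-B")

# master-A enters maintenance mode, so its lock goes away.
ds1.release("master-A")

# Only now can the other master claim the VMs on this datastore.
takeover = ds1.try_lock("master-B")

print(first, second, takeover)
```

The point of the sketch is that a second master in a partition owns nothing on a datastore until the current lock holder lets go.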
Hope that clarifies things,
** edit: changed the wording to clarify the behavior of the master with regards to owning VMs **
Manish Patel says
One can find out the MOID of the host using MOB to determine the master host as well.
Magnus says
Thanks Duncan, this is the information I have been looking for for quite some time.
Duncan says
Read the book :)
Magnus says
I thought I did, but obviously not if this information exists in the book :) I found information that several masters can exist and that a master in one partition can claim responsibility for a VM in another partition, but no information about when a second master in a cluster is able to take ownership of VMs which are already protected by a master (apart from the normal procedure when a new master is elected).
I need to reread the book.
Duncan says
I did not spell it out like this though :)
James says
Duncan, how would this work if you had a virtual machine that is split over multiple datastores?
Duncan Epping says
As designed :). It fails over the VM it owns which hasn’t got much to do with the separate VMDKs. That is more a matter of which host is compatible with this VM.
Keith Farkas says
Duncan, great summary about how ownership of a VM is determined during a partition. I think it would be useful to clarify that VM ownership is based on the datastore that contains a VM’s config file. If the VM is using other datastores, these are ignored when determining ownership.
Joe Mudra says
I have an interesting issue. I am doing a storage migration to better size my LUNs, and I’m giving my storage team back smaller LUNs in return for large ones.
The issue I have is that I’ve migrated all machines off of several datastores that I want to give back; however, the protectedlist file remains.
How do I tell my master that nothing is on these datastores anymore, and that it can ignore this protectedlist file… or even better, delete it?
Alfred says
I think the location of the protectedlist file depends on the (2) elected or manually assigned datastores in cluster HA and is independent of the number of VMs on that particular datastore. Manually assign different datastores in cluster HA or remove the datastore prior to unpresenting the LUN. If cluster HA is at all smart, that should trigger a new election of datastores in cluster HA.
Dan says
Hi Duncan,
So how will the master from the other partition be notified if the master who actually owns the protectedlist file has entered maintenance mode, was rebooted, or has lost contact with the datastore?
Duncan Epping says
@Dan: As soon as that happens the “lock” on the protectedlist file is gone, which means that the other master can claim it. That is how it knows the master has “disappeared”.
@Alfred: Not sure if you are referring to this, but the protectedlist has got nothing to do with the “heartbeat datastores”.
Ganesh says
How does the newly elected master receive the protected file? I mean, where is the file located?
Duncan says
It is located on VMFS datastores.