A week ago I already touched on this topic, but I wanted to get a better understanding of what could go wrong in these situations and how vSphere 4.1 solves the issue.
Before vSphere 4.1, an issue could arise when custom shares had been set on a virtual machine. When HA failed over a virtual machine, it powered the virtual machine on in the Root Resource Pool. However, the virtual machine’s shares were scaled for its place in the resource pool hierarchy, not for the Root Resource Pool. As a result, the virtual machine could receive either too many or too few resources relative to its entitlement.
A scenario in which this can occur is the following:
VM1 has 1000 shares and Resource Pool A has 2000 shares. Resource Pool A, however, contains two VMs (VM2 and VM3), and each of them is entitled to 50% of those “2000” shares.
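To make the starting point concrete, here is a minimal sketch in Python of how proportional shares divide contended resources down a resource pool hierarchy. It is a simplified model and not VMware's actual scheduler: it ignores reservations, limits, and real demand. The function name `entitlements` and the nested dictionary layout are made up for illustration. The two VMs in Resource Pool A are given equal shares (10000 each, the custom value used in this scenario).

```python
from fractions import Fraction

def entitlements(node, fraction=Fraction(1)):
    """Return {vm_name: fraction of the root's contended resources}.

    `node` maps a child name to (shares, children); `children` is None
    for a VM and another dict of the same shape for a resource pool.
    """
    total = sum(shares for shares, _ in node.values())
    result = {}
    for name, (shares, children) in node.items():
        # Siblings split their parent's slice in proportion to their shares.
        part = fraction * Fraction(shares, total)
        if children is None:
            result[name] = part
        else:
            result.update(entitlements(children, part))
    return result

# Hypothetical model of the hierarchy described above.
root = {
    "VM1": (1000, None),
    "Resource Pool A": (2000, {"VM2": (10000, None), "VM3": (10000, None)}),
}

for vm, part in entitlements(root).items():
    print(f"{vm}: {float(part):.1%}")
```

In this model each of the three VMs ends up with a third of the resources under contention: VM1 gets 1000 of 3000 at the root, and VM2 and VM3 each get half of the pool's 2000 of 3000, regardless of how large their own share values are.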
When the host fails, both VM2 and VM3 end up on the same level as VM1. However, because a custom shares value of 10000 was specified on both VM2 and VM3, they will completely blow away VM1 in times of contention. This is depicted in the following diagram:
This situation would persist until the next invocation of DRS re-parented the virtual machine to its original Resource Pool. To address this issue, as of vSphere 4.1 DRS flattens the virtual machine’s shares and limits before fail-over. This flattening process ensures that the VM gets the resources it would have received had it been failed over to the correct Resource Pool. This scenario is depicted in the following diagram. Note that both VM2 and VM3 are placed under the Root Resource Pool with a shares value of 1000.
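The arithmetic behind both outcomes can be sketched in a few lines of Python. This is only an illustration of the numbers in this scenario, assuming a simple proportional model: it covers shares only (not limits), and the helper names `flatten_shares` and `split` are hypothetical, not part of any VMware API.

```python
from fractions import Fraction

def flatten_shares(pool_shares, vm_shares):
    """Scale each VM's shares to its slice of the parent pool's shares."""
    total = sum(vm_shares.values())
    return {vm: Fraction(pool_shares * s, total) for vm, s in vm_shares.items()}

def split(siblings):
    """Fraction of contended resources each sibling gets at one level."""
    total = sum(siblings.values())
    return {name: Fraction(s) / total for name, s in siblings.items()}

pool_a = {"VM2": 10000, "VM3": 10000}  # custom shares inside Resource Pool A

# Pre-4.1: VM2 and VM3 land in the Root Resource Pool with their raw shares.
unflattened = split({"VM1": 1000, **pool_a})

# 4.1: shares are flattened against Resource Pool A's 2000 shares first.
flattened = split({"VM1": 1000, **flatten_shares(2000, pool_a)})

for label, result in (("unflattened", unflattened), ("flattened", flattened)):
    print(label, {vm: f"{float(part):.1%}" for vm, part in result.items()})
```

Without flattening, VM1 is left with 1000 of 21000 shares (about 4.8%) while VM2 and VM3 take about 47.6% each. With flattening, VM2 and VM3 each carry 1000 shares at the root level, so all three VMs get a third, which matches what they were entitled to inside the original hierarchy.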
Of course, when DRS is invoked, both VM2 and VM3 will be re-parented under Resource Pool A and will again receive the shares they were originally assigned. I hope this makes it a bit clearer what this “flattened shares” mechanism actually does.
Scott Lowe says
Great post, Duncan; this is VERY valuable information. Is this documented anywhere in the official documentation? I’d love to be able to direct customers to some sort of KB article or an appropriate section in the official PDFs. If not, I’ll just direct them here! 🙂
Duncan Epping says
Nope, not at this level of detail at the moment…..