I’ve had multiple discussions about Resource Pool level shares in vCloud Director over the last two years, so I figured I would write an article about it. It is a lot easier to point people to an article, and it also allows me to gather feedback on this topic. If you feel I am completely off, please comment… I am going to quote a question which was raised recently.
One aspect of “noisy neighbor” that seems to never be discussed within vCloud is the allocation of shares. An organization with a single VM has better CPU resource access per VM than an organization that has 100 VMs. The organization resource pools have an equal number of shares, so each VM gets a smaller and smaller allocation of shares as the VM count in an organization virtual data center increases.
Before I explain the rationale behind the design decision around shares behavior in a vCloud environment, it is important to understand some of the basics. An Org vDC is nothing more than a resource pool. The chosen “allocation model” for your Org vDC and the specified characteristics determine what your resource pool will look like. I wrote a fairly lengthy article about this a while back; if you don’t understand allocation models, take a look at it.
When an Org vDC is created, a resource pool is created at the vSphere layer, and it will typically have the following characteristics. In this example I will use the “Allocation Pool” allocation model, as it is the most commonly used:
Org vDC Characteristics –> Resource Pool Characteristics
- Total amount of resources –> Limit set to Y
- Percentage of resources guaranteed –> Reservation set to X
On top of that, each resource pool has a fixed number of shares. The difference between the limit and the reservation is often referred to as the “burst space”. Typically each VM will also have a reservation set: if 80% of your memory resources are guaranteed, this will result in an 80% memory reservation on your VM as well. This means that when you start deploying new VMs into that resource pool, you can keep creating VMs until the limit is reached. In other words:
A 10GHz/10GB Allocation Pool Org vDC with 80% of resources guaranteed = a resource pool with a 10GHz/10GB limit and an 8GHz/8GB reservation. In this pool you can create as many VMs as you like until you hit those limits. Resources are guaranteed up to 8GHz/8GB!
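The mapping above can be sketched in a few lines of code. This is an illustrative model only, not vCloud Director API code; the function name and parameters are my own.

```python
def resource_pool_settings(allocation_ghz, allocation_gb, guarantee_pct):
    """Model how an Allocation Pool Org vDC maps to a vSphere resource pool.

    Returns ((cpu_limit, cpu_reservation), (mem_limit, mem_reservation)).
    The limit equals the total allocation; the reservation is the
    guaranteed percentage of that allocation.
    """
    cpu = (allocation_ghz, allocation_ghz * guarantee_pct)
    mem = (allocation_gb, allocation_gb * guarantee_pct)
    return cpu, mem

# The example from the article: 10GHz/10GB with 80% guaranteed.
cpu, mem = resource_pool_settings(10, 10, 0.80)
print(cpu)  # (10, 8.0) -> 10GHz limit, 8GHz reservation
print(mem)  # (10, 8.0) -> 10GB limit, 8GB reservation
```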
Now what about those shares? The question is: will the Org vDC with 100 VMs have less resource access per VM than the Org vDC with only 10 VMs? Let’s use that previous example again:
A 10GHz/10GB Allocation Pool with 80% of resources guaranteed. This results in a resource pool with a 10GHz/10GB limit and an 8GHz/8GB reservation.
Two Org vDCs are deployed, each with the exact same characteristics. In “Org VDC – 1” 10 VMs are provisioned, while in “Org VDC – 2” 100 VMs are provisioned. It should be pointed out that the provider charges these customers for their Org vDC. As both decided to have 8GHz/8GB guaranteed, that is what they pay for, and when they exceed that “guarantee” they are charged for it on top. Both are capped at 10GHz/10GB, however.
If there is contention, then shares come into play. But when is that exactly? After the 8GHz/8GB of guaranteed resources has been used. So in that case the Org vDCs will be fighting over:
limit - reservation
In this scenario that is “10GHz/10GB – 8GHz/8GB = 2GHz/2GB”. Is Org VDC 2 entitled to more resource access than Org VDC 1? No, it is not. Let me repeat that: no, Org VDC 2 is not entitled to more resources.
Both Org VDC 1 and Org VDC 2 bought the exact same amount of resources. The only difference is that Org VDC 2 chose to deploy more VMs. Does that mean Org VDC 1’s VMs should receive less access to these resources just because it has fewer VMs? No, they should not! A provider cannot, in any shape or form, decide which Org vDC is entitled to more resources in that burst space, and especially not based on the number of VMs deployed, as that gives absolutely no indication of the importance of those workloads. Org VDC 2 should buy more resources to ensure its VMs get what they are demanding.
Org VDC 1 cannot suffer because Org VDC 2 decided to overcommit. Both are paying for an equal slice of the pie, and it is up to them to determine how to carve that slice up. If they notice their slice of the pie is not big enough, they should buy a bigger or an extra slice!
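A rough model of the division of the burst space, assuming both resource pools receive identical share values (as described above) and both are fully contending. The function and the share value of 4000 are illustrative, not taken from any API:

```python
def divide_burst(available_burst, org_vdcs):
    """Split contended burst space between resource pools by their shares.

    org_vdcs: dict of pool name -> (share_value, vm_count).
    Returns pool name -> (burst for the pool, burst per VM).
    """
    total_shares = sum(shares for shares, _ in org_vdcs.values())
    result = {}
    for name, (shares, vms) in org_vdcs.items():
        pool_burst = available_burst * shares / total_shares
        result[name] = (pool_burst, pool_burst / vms)
    return result

# Two identical Org vDCs (equal shares) fighting over 2GHz of burst space:
result = divide_burst(2.0, {"Org VDC - 1": (4000, 10),
                            "Org VDC - 2": (4000, 100)})
print(result["Org VDC - 1"])  # (1.0, 0.1)  -> 1GHz for the pool, 0.1GHz per VM
print(result["Org VDC - 2"])  # (1.0, 0.01) -> same 1GHz, spread over 100 VMs
```

Note that both pools get the same 1GHz of burst; only the per-VM slice differs, which is exactly the point: the customer, not the provider, decides how thinly the slice is spread.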
However, there is a scenario where shares can cause a “problem”… If you use “Pay As You Go” and remove all “guarantees” (reservations), then under contention each resource pool will get the same access to the resources. If you have resource pools (Org vDCs) with 500 VMs and resource pools with 10 VMs, this could indeed lead to a problem for the larger resource pools. Keep in mind that there is a reason these “guarantees” were introduced in the first place, and overcommitting to the point where resources are completely depleted is most definitely not a best practice.
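To illustrate that last scenario: with no reservations at all, the entire contended capacity is divided by the (equal) resource pool shares, so per-VM access shrinks with VM count. A minimal sketch, with made-up pool names and a hypothetical 100GHz cluster:

```python
def per_vm_access(cluster_ghz, vm_counts):
    """Per-VM resource access under full contention with no reservations.

    vm_counts: dict of pool name -> number of VMs.
    Equal shares per resource pool are assumed, so each pool gets an
    equal slice of the cluster regardless of its VM count.
    """
    per_pool = cluster_ghz / len(vm_counts)
    return {name: per_pool / vms for name, vms in vm_counts.items()}

print(per_vm_access(100.0, {"small-org": 10, "large-org": 500}))
# small-org VMs get 5.0 GHz each; large-org VMs get only 0.1 GHz each
```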
David Hill says
Nice article Duncan.
This is the exact reason why the vCAT recommends using separate clusters per Provider vDC if different allocation models are used in the Org vDCs.
Chad King says
Nice article. In my world we use no resource pool reservations/limitations. It’s all Pay As You Go. But you raise an interesting point about that model. Ideally though, one wouldn’t outperform another with this model as they are all sharing the same resources in essence, right? Because there are no reservations (limits are set by VCD though), they can expand and grow as needed. I can see where over-allocation could be a problem if best practice isn’t followed. The reality is that customers are going to get what they want :(. How do you avoid those scenarios? Curious to hear your feedback.
Jake says
Somewhat related…http://kb.vmware.com/kb/2006684
Rob Farnell says
Hi Duncan,
This situation caused me a massive headache around December/January time with a major datacenter who are providing a mixture of vCloud Allocation and Reservation Pools. As it is enterprise level, the majority of the allocations are 100% to each customer (from what I understand). At the time there were at least 6 hosts in the datacenter cluster.

In our reservation, our VMs kept having performance problems, yet we weren’t particularly close to the limits of the CPU. The provider wasn’t seeing any issues on the host/s that had the VMs we were particularly complaining about, with CPU ~30%. Previously, we didn’t have reservations set against these VMs and the majority of the other VMs were set on the allocation model, so a number of VMs basically strangled the other VMs despite the fact they were from a completely different resource pool.

The only way to alleviate it was for us to set sensible reservations on a per-VM basis, but that reduced the amount that the rest of the VMs could use in our pool and cost us whether we utilised the CPU or not. I saw the possibility where we would effectively create an arms race where each customer in a reservation mode would up the share level across their pool so that at the host level they were prioritised compared to the other VMs.
Milos says
Hi Duncan,
Maybe I am missing something, but the problem is not when two customers buy the same amount and deploy different number of VMs. The problem is when two customers buy ovDCs of different sizes.
For example, Cust-1 buys an ovDC with 10 GHz/10 GB and Cust-2 buys an ovDC with 100 GHz/100 GB. Assume that the reservation percentage is the same (e.g., 80%). As far as I know, the shares are still set to exactly the same values for the RPs of these two customers (4000/163840). Does this not mean that when it comes to contention they will get absolutely equal shares of the non-reserved resources? Assume this is happening in a 90 GHz/90 GB cluster (ignore HA and overheads, and I know that this sizing would be ridiculous). Then if both customers run flat out they will each get their reserved amounts: Cust-1 will get 8 (I am going to drop the units for the sake of brevity) and Cust-2 will get 80. That leaves 90 – 88 = 2 left to share. The two resource pools will get 1 each, since they have the same share settings. That would mean that Cust-1 would get (1/(10-8) = 0.5) 50% of what her workloads are asking for above reservations, and Cust-2 would get (1/(100-80) = 0.05) 5% of what he is asking for above reservations.
Yet, Cust-1 is paying 1/10th of what Cust-2 is paying (excluding volume discounts 🙂 ).
Please correct my reasoning if it is not accurate.
Cheers,
Milos
P.S. Has any of this changed in 5.1 with new allocation model mechanics?
DanMan says
Hey Duncan,
I’ve not been able to find any good, solid articles on how to remove existing resource pools. Can you kindly share any of your experiences in this regard… Thanks!
Tarun says
Hi Duncan,
Can an Org vDC have 2 resource pools?
Tarun