In the last couple of weeks I have had various discussions around creating imbalanced clusters: imbalanced from a CPU, memory, or even a storage point of view. This typically comes up when someone wants to bring larger scale to their cluster and adds hosts with more resources of any of the aforementioned types, or when licensing costs need to be limited and people want to restrict certain VMs to run on a specific set of hosts. The latter comes up often when people are starting to look at virtualizing Oracle. (Andrew Mitchell published this excellent article on the topic of Oracle Licensing and soft vs hard partitioning, which is worth reading!)
Why am I not a fan of imbalanced clusters when it comes to compute or storage resources? Why am I not a fan of purposely crippling your environment to ensure your VMs will only run on a subset of vSphere hosts? The reason is simple: the problems I have seen and experienced, and the inefficiency it introduces in certain scenarios. Let’s look at some examples:
Let’s assume I have 4 hosts, each with 128GB of memory. I need more memory in my cluster, so I add a host with 256GB of memory. You just went from 512GB to 768GB, which is a huge increase. However, this is only true when you don’t do any form of admission control or resource management. When you do proper resource management or admission control, you need to make sure that all of your virtual machines can run in the case of a failure, and preferably run with equal performance before and after the failure has occurred. If that 256GB of memory is being used and the host containing it goes down, your virtual machines could be impacted: they might not restart, and if they do restart they may not get the same amount of resources as they received before the failure. The same scenario also applies to CPU if you create an imbalance there.
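To make the arithmetic concrete, here is a minimal sketch (plain Python, not any VMware tooling; the host values are simply the numbers from the example above) of what is left when you size for the worst-case failure, i.e. losing the largest host:

```python
# A minimal sketch: usable memory under admission control, sized for the
# worst case of losing the *largest* host.

def usable_capacity_gb(host_memory_gb):
    """Raw capacity minus the biggest host: what you can safely commit
    if that host fails and its VMs must restart elsewhere."""
    return sum(host_memory_gb) - max(host_memory_gb)

balanced   = [128, 128, 128, 128]        # the original cluster
imbalanced = [128, 128, 128, 128, 256]   # after adding the big host

print(usable_capacity_gb(balanced))    # 384 (out of 512 raw)
print(usable_capacity_gb(imbalanced))  # 512 (out of 768 raw)
```

In other words, the 256GB host adds 256GB of raw capacity but only 128GB of capacity you can actually protect; a fifth 128GB host would have bought you exactly the same.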
Another one I encountered recently was presenting a LUN to a limited set of hosts; in this case a LUN was only presented to 2 hosts out of the 20 hosts in that cluster… Guess what, when those two hosts die… so do your VMs. Not optimal, right? Especially when they are running an Oracle database, for instance. On top of that, I have seen people pitching a VSAN cluster of 16 nodes with only 3 hosts contributing storage. Yes, you can do that, but again… when things go bad, they will go horribly bad. Just imagine one host fails: how will you rebuild the components that were impacted? What is the performance impact? It is very difficult to predict how it will affect your workload, so just keep it simple. Sure, there is a cost overhead associated with separating workloads and creating dedicated clusters, but it will be easier to manage and more predictable in failure scenarios.
I guess in summary: If you want predictability in terms of availability and recoverability of your virtual machines go for a balanced environment, don’t create a Frankencluster!
Brian Suhr says
A couple of good examples. I see this too often in customers’ existing environments.
daunce says
I’ve seen this too, but some SMBs only have the budget to buy 1 host every 12 months, and a newer/bigger host has a sweeter price point. Although it’s not usually double the size of the previous host.
That’s the thing about virtualisation. VMs can be configured with only the resources they need, and changed as they grow, yet it’s recommended that the hosts be the same, which requires several years’ worth of hosts to be purchased upfront.
James Hess says
How about this basic rule for SMBs: within a cluster, always present the same LUNs to every host. Always provide the same standard set of NICs, of the same type, plugged into the same PCI slots. Stick with the same hardware vendor, form factor, onboard storage, and network I/O ports for new servers.
Add more hosts with whatever extra memory or CPU if you want.
To be safe when evaluating cluster availability in case of a failure, penalize yourself until you upgrade the weakest host to match: take the host with the LOWEST amount of resource X (say CPU) in your entire cluster and multiply by the number of hosts, N, to determine raw available resources.
If you’re supposed to survive 2 failures of an operational cluster, or to survive 1 failure of a cluster while a single host is in maintenance mode, then:
(1 – 2/N) * ( (resources of smallest node) x N )
Repeat for each system resource: amount of system RAM, etc.
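As a minimal sketch of this rule of thumb (plain Python, nothing VMware-specific; the host values are hypothetical):

```python
# Conservative usable capacity for one resource (CPU MHz, RAM GB, ...):
# pretend every host is as small as the weakest one, then keep headroom
# for the failures you must survive.

def safe_capacity(host_resources, failures_to_survive):
    n = len(host_resources)
    raw = min(host_resources) * n       # penalize down to the weakest host
    return (1 - failures_to_survive / n) * raw

hosts_ram_gb = [128, 128, 128, 128, 256]
# survive 2 failures, or 1 failure while one host is in maintenance mode:
print(safe_capacity(hosts_ram_gb, 2))   # (1 - 2/5) * 128 * 5 = 384.0
```

Note how the 256GB host contributes nothing under this rule: the imbalanced capacity is simply wasted until the weaker hosts are upgraded to match.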
divnull says
Thanks for that statement. To my customers – or their CFO – I must probably sound like a broken record: “Do not imbalance your cluster!” Unfortunately, budget plans often only allow partial replacement of servers. I also do not recommend pinning VMs to hardware resources for the sake of licensing savings. That linked article explained it pretty well.
Gabrie van Zanten says
I’m still struggling with HA admission control and reservations. I have your (and Frank’s) Clustering Deepdive books and understand the way the HA calculations are made, but how should we use reservations at the VM level? We have discussed this before, and I suggested just putting a 75% reservation (or any other number) on each VM to start with, and going to 100% reservations for the very important VMs. But you told me that is not the way to go: reservations should be used to guarantee resources for VMs, not to make HA admission control work. To me this currently boils down to using almost no reservations at all, except for some very important VMs.
Maybe an idea for a blog post: explaining how you come up with the reservations for VMs? Some examples of large-scale environments?
Brett says
Reservations are for SLAs and mission critical workloads only.
James Hess says
All my production workloads are mission-critical as far as the application owner, the customer buying the VPS/Hosted VM, and management are concerned.
That is… some workloads are more important than others, but ALL of the workloads are considered mission-critical; otherwise they would be powered off, right?
I allow, in general, a maximum of a 50% reservation on memory and a 5% reservation on CPU. To be allowed reservations, the VM must be providing a customer-facing real-time service, such as a hosted virtual machine, e-mail, websites, DNS, or Tier 1 database servers.
These reservations get implemented at the resource pool level, though, and are shared by other VMs related to the specific service: never on the settings of an individual VM, since managing those would become a nightmare.
Brett says
If a reservation isn’t set on a VM, then you don’t have reservations beyond the VM overhead. A resource pool is a logical abstraction of host resources, not a resource consumer (a VM). Setting a reservation on the pool serves no purpose unless you have disabled the resource pool’s expandable reservation (not the default).
If you give VMs reservations and they are in default resource pools, then the reservations are granted as long as the host can still provide the resource, because the RP passes the request up to the hypervisor due to the expandable reservation.
A reservation on a pool of VMs without reservations does nothing.
You can argue every VM is mission-critical, but that’s like asking a user to act as your monitoring system. Bad idea. Furthermore, it leads to far too many reservations, and ultimately may lead to contention when idle VMs maintain their reservations because they needed the resource a week ago but haven’t used it since.
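To illustrate the pass-through behaviour Brett describes, here is a toy model (a deliberately simplified sketch in plain Python, not the actual admission-control logic or any VMware API):

```python
# Toy model of an expandable resource pool passing reservation
# requests up to its parent when it cannot satisfy them itself.

class Pool:
    def __init__(self, reservation, expandable=True, parent=None):
        self.reservation = reservation   # capacity reserved at this level
        self.used = 0                    # already granted to children
        self.expandable = expandable
        self.parent = parent

    def reserve(self, amount):
        """Try to satisfy a reservation here, then pass the shortfall up."""
        available = self.reservation - self.used
        if amount <= available:
            self.used += amount
            return True
        if self.expandable and self.parent is not None:
            shortfall = amount - max(available, 0)
            if self.parent.reserve(shortfall):
                self.used += amount
                return True
        return False

# Root "pool" standing in for what the host itself can back:
host = Pool(reservation=512, expandable=False)
# A default-style resource pool with no reservation of its own:
rp = Pool(reservation=0, parent=host)

print(rp.reserve(32))   # True: the expandable pool borrows from the host
print(host.used)        # 32: the request was really granted by the host
```

The takeaway matches the point above: while the pool is expandable, a VM’s reservation is effectively backed by the parent (ultimately the host), so the pool’s own reservation value does little on its own.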
Tree Dude says
Had a frankencluster for a while out of necessity. It sucked. Taking the largest host (96GB of RAM vs the 32GB in the other hosts) down for updates meant maxing out the memory on the other hosts and taking non-essential VMs down. Not to mention that the CPU difference made some of the VMs get a little pokey if moved hot.
When we planned an upgrade this year I pushed hard to replace all 4 hosts in the cluster. After a lot of resistance from my boss he finally conceded and let me replace them all. I could not be happier. The servers even have half their slots free, so when memory becomes an issue we won’t have to add a host.