In the last couple of weeks I have had various discussions around creating imbalanced clusters. Imbalanced from either CPU, memory and even a storage point of view. This typically comes up in discussions where either someone wants to bring larger scale to their cluster and they want to add hosts with more resources of any of the before mentioned types. Or also when licensing costs need to be limited and people want to restrict certain VMs to run a specific set of hosts. Something that comes up often when people are starting to look at virtualizing Oracle. (Andrew Mitchell published this excellent article on the topic of Oracle Licensing and soft vs hard partitioning which is worth reading!)
Why am I not a fan of imbalanced clusters when it comes to compute or storage resources? Why am I not a fan of crippling your environment purposely to ensure your VMs will only run on a subset of vSphere hosts? The reason is simple, the problems I have seen and experienced and the inefficiency in certain scenarios. Lets look at some examples:
Lets assume I have 4 hosts with each 128GB of memory. I need more memory in my cluster and I add a host with 256GB of memory. Now you just went from 512Gb to 768GB which is a huge increase. However, this is only true when you don’t do any form of admission control and resource management. When you do proper resource management or admission control than you would need to make sure that all of your virtual machines can run in the case of a failure, and preferably run with equal performance before and after the failure has occured. If you added 256GB of memory and this is being used and that host containing 256GB goes down your virtual machines could potentially be impacted. They might not restart, and if they restart they may not get the same amount of resources as they received before the failure. This scenario also applies to CPU, if you create an imbalance .
Another one I encountered recently was presenting a LUN to a limited set of hosts, in this case a LUN was only presented to 2 hosts out of the 20 hosts in that cluster… Guess what, when those two hosts die… so do your VMs. Not optimal right when they are running an Oracle database for instance. On top of that I have seen people pitching a VSAN cluster of 16 nodes with only 3 hosts contributing storage. Yes you can do that, but again… when things go bad, they will go horribly bad. Just imagine 1 host fails, how will you rebuild your components that were impacted? What is the performance impact? Very difficult to predict how it will impact your workload, so just keep it simple. Sure there is a cost overhead associated with separating workloads and creating dedicated clusters, but it will be easier to manage and more predictable in failure scenarios.
I guess in summary: If you want predictability in terms of availability and recoverability of your virtual machines go for a balanced environment, don’t create a Frankencluster!