At the end of 2010 I wrote an article about cluster sizes… ever since it has been a popular article and I figured that it was time to update it. vSphere 5 changed the game when it comes to sizing/scaling of your clusters and this is an excellent opportunity to emphasize that. The key take-away of my 2010 article was the following:
I am not advocating to go big…. but neither am I advocating to have a limited cluster size for reasons that might not even apply to your environment. Write down the requirements of your customer or your environment and don’t limit yourself to design considerations around Compute alone. Think about storage, networking, update management, max config limits, DRS & DPM, HA, resource and operational overhead.
We all know that HA used to be a constraint for your cluster size… However these times are long gone. I still occasionally see people referring to old “max config limits” around the number of VMs per cluster when exceeding 8 hosts… This is not a concern anymore. I also still see people referring to the max 5 primary node limit… Again not a concern anymore. Generalizing from the 2010 article and applying it to vSphere 5, I guess we can come to the following conclusions:
- HA does not limit the number of hosts in a cluster anymore! Using more hosts in a cluster results in less overhead. (N+1 for 8 hosts vs N+1 for 32 hosts)
- DRS loves big clusters! More hosts equals more scheduling opportunities.
- SCSI Locking? Hopefully all of you are using VAAI capable arrays by now… This should not be a concern. Even if you are not using VAAI, optimistic locking should have relieved this for almost all environments!
- Max number of hosts accessing a file = 8! This is a constraint in an environment using linked clones, like View.
- Max values in general (256 LUNs, 1024 Paths, 512 VMs per host, 3000 VMs per cluster)
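
To put a number on the N+1 point above: with one spare host, an 8 host cluster reserves 1/8 (12.5%) of its capacity for failover, while a 32 host cluster reserves 1/32 (roughly 3.1%). A minimal sketch of that arithmetic, assuming identically sized hosts (the function name is just an illustration, not anything from vSphere):

```python
def failover_overhead(hosts: int, spare_hosts: int = 1) -> float:
    """Fraction of cluster capacity set aside for failover.

    Assumes all hosts are the same size, so N+1 simply means
    one host's worth of capacity out of `hosts`.
    """
    if hosts <= spare_hosts:
        raise ValueError("cluster must have more hosts than spares")
    return spare_hosts / hosts


if __name__ == "__main__":
    for size in (8, 32):
        pct = 100 * failover_overhead(size)
        print(f"N+1 on {size} hosts reserves {pct:g}% of capacity")
```

Real clusters with mixed host sizes or a different admission control policy will not follow this exactly, but the trend holds: the bigger the cluster, the smaller the share you set aside.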
Once again, I am not advocating to scale-up or scale-out. I am merely showing that there are hardly any limiting factors anymore at this point in time. One of the few constraints that is still valid is the max of 8 hosts in a cluster using linked clones. Or better said, a max of 8 hosts accessing a file concurrently. (Yes we are working on fixing this…)
I would like to know from you guys what cluster sizes you are using, and if you are constrained somehow… what those constraints are… chip in!
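
If you want to sanity check a design against the few limits that are left, something like the sketch below would do. The numbers are the ones quoted in this post; the function and its parameters are hypothetical, and the even spread of VMs across hosts is a simplifying assumption, so treat it as a starting point rather than a sizing tool.

```python
import math

# Limits as quoted in this post for vSphere 5; check the
# configuration maximums for the version you actually run.
MAX_VMS_PER_HOST = 512
MAX_VMS_PER_CLUSTER = 3000
MAX_HOSTS_SHARING_A_FILE = 8  # applies to linked clone environments


def check_cluster(hosts: int, vms: int, linked_clones: bool = False) -> list[str]:
    """Return a list of limits this (hypothetical) design would hit."""
    problems = []
    if linked_clones and hosts > MAX_HOSTS_SHARING_A_FILE:
        problems.append("more than 8 hosts with linked clones")
    if vms > MAX_VMS_PER_CLUSTER:
        problems.append("more than 3000 VMs in the cluster")
    # Assumption: VMs are spread evenly across all hosts.
    if math.ceil(vms / hosts) > MAX_VMS_PER_HOST:
        problems.append("more than 512 VMs per host")
    return problems


if __name__ == "__main__":
    print(check_cluster(hosts=32, vms=2500))
    print(check_cluster(hosts=12, vms=1000, linked_clones=True))
```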
