Lately I have been seeing more and more people recommend limiting clusters to eight hosts. I guess I might be more or less responsible for this “myth”, unintentionally of course, as I would never make a recommendation like that.
My article was based on the maximum number of VMs per host in an HA cluster with 9 hosts or more. The current limit is 40 VMs per host when there are 9 hosts or more in a cluster, with a maximum of 1,280 VMs per cluster (32 hosts × 40 VMs).
So why this post? I want to stress that you don’t need to limit your cluster based on these “limitations”. Just think about it for a second: how many environments do you know where they have 40+ VMs running on every single host? I don’t know many environments that exceed these limits; I guess the exceptions are VDI environments…
So why would you want to “risk” exceeding these limits? Simple answer: TCO. Having two clusters is more expensive than a single cluster. For those who don’t understand what I am trying to say: N+1. In the case of a single cluster you will have 1 spare host. In the case of two clusters you will have two spare hosts in total.
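To make that N+1 arithmetic concrete, here is a minimal sketch (illustrative only, not a VMware formula): if you reserve one host per cluster for failover, splitting the same hardware into two clusters costs you one extra spare.

```python
def usable_hosts(total_hosts: int, clusters: int, spares_per_cluster: int = 1) -> int:
    """Hosts left for running workloads after reserving N+1
    spare capacity in each cluster."""
    return total_hosts - clusters * spares_per_cluster

# 16 hosts as a single cluster vs. split into two clusters:
print(usable_hosts(16, clusters=1))  # 15 usable hosts
print(usable_hosts(16, clusters=2))  # 14 usable hosts
```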
Another justification for a single cluster is DRS. More hosts in a cluster leads to more opportunities for DRS to balance the cluster. A positive “side effect” is also that the chances of resource congestion are reduced because there are more VM placement combinations possible.
Is there a recommendation? What is the VMware best practice? There simply isn’t one that dictates the cluster size. Although the maximums should be taken into consideration for support, you should calculate your cluster size based on customer requirements and not on a maximum-config sheet.
Jason Boche says
Our soft limit is 6 hosts per cluster.
With over 250 hosts total, we have many hosts running more than 40 VMs.
With an environment this size, I favor the higher density. Keeping up with patching the hosts is like painting the Golden Gate Bridge. At this point, the fewer hosts the better, while maintaining N+1 or N+2. ESX can scale to the sky on today’s dense hardware and do its job.
wpatton says
Duncan,
Thank you. I have been seeing this more and more and also don’t agree with always limiting your clusters to 8 hosts. We are even getting some people recommending this in our environment.
We have over 500 hosts, and nowhere near 40+ VMs per host. There are limitations with density for VDI, as you mention, or with having too many hosts per LUN, but at that size you should probably be using more than 1 LUN per 8 hosts.
As always, thank you for clearly articulating complex issues.
Brian Knudtson says
I do a lot of implementations involving blades. In these cases I do try to limit cluster sizes to 8, due to HA, when they only have two enclosures.
Sander Bierman says
Unfortunately I know of an environment with 2 HA clusters (12 ESX 3.5 U4 hosts) where the maximum is an issue right now. But I read your information on how to increase the maximum to 80.
Furthermore, I read that when we use vSphere 4.0 U1 we can host a maximum of 160 VMs per host when we create a cluster of 8 hosts.
I can imagine that infrastructure is a hot issue at the moment; companies are saving money on IT infrastructure. I think a lot of big companies will reach the limits of their VMware ESX infrastructure.
Craig says
You can also achieve a higher consolidation ratio with Cisco UCS in place :)
With ESX hosts with high I/O and memory capacity in place, scaling to more than 50:1 is actually quite common today.
Duncan says
I don’t know how many environments you have actually seen where they do 50:1, but many enterprise environments can’t risk 50 VMs going down at once. Again, I know there are exceptions; that’s not what my article is about.
Forbes Guthrie says
“Keeping up with patching the hosts is like painting the Golden Gate Bridge.”
LOL, sounds familiar :)
Forbes.
daVikes says
Other than reaching the max cluster size, are there other reasons for having separate clusters? One may be different CPU types, or ESX clusters versus ESXi clusters. I’m sure there are others, but it seems you would want to scale a cluster out as much as possible to make the most of DRS. Getting back to the internal-cloud concept, you want that internal cloud to be as easy as possible: need more capacity, just add another host to your internal cloud “cluster”, which really should be able to expand to an unlimited number of hosts. So what’s the holdup on the max number of hosts per cluster? Storage? How do we get past that?
Stefan Nguyen says
Well, if you happen to work for a web host or a company that hosts hundreds of small websites for business units, consolidating 50:1 is a good solution for TCO, since these small web servers don’t require that much CPU/RAM. But from an architecture standpoint, having that many eggs in a single basket is very risky, and most mid/high-end enterprise applications won’t exceed 15 VMs per host. And if using HP blades, HP recommends 4 hosts per cluster, but that’s too low to take advantage of DRS. With vSphere 4.0 & View 4.0 + Nehalem, consolidation ratios have doubled compared to the ESX 3.5 days. I used to consolidate 64:1 on VDI with no problems, but you could architect for 128:1 now, assuming you have the latest hardware/RAM to support it.
chuck8773 says
We arrived at an 8 host cluster limit due to low iSCSI connection limits imposed by our SAN. It is always interesting to see the limits that people set and the reasons behind them.
Thanks Duncan
dconvery says
I’m with @Brian Knudtson. Most of my designs have been centered around HP blade servers. Because of the primaries in HA, I try to limit it to 4 members per chassis if they only have two chassis. In larger environments, where there are numerous chassis, I recommend no more than 16: that is, 4 hosts per HA cluster per chassis, 4 chassis per rack. If they have multiple racks, I will try to spread between racks as well. As I start getting into designing around the UCS, I will most likely keep the same numbers unless there is a compelling reason to change.
I have seen only two instances where a customer has had more than 40 VMs per host. These were DL580 servers at one site and R900s at the other. The hosts had 128–256 GB RAM.
I guess if the customer is willing to take the risk, has sufficient resources for HA AND has sufficient I/O and bandwidth to the hosts, then it may be OK. I usually tend to take a conservative approach and make availability a high priority.
Dave
Stefan Nguyen says
Dave,
That is exactly what I’m doing for our HP blades & c7000 enclosures, with a 4×4 standard to be safe. We will have to deploy approximately 200 hosts, and that may spike to 300 if surge season kicks in. If I had the option, using Cisco UCS would be awesome too.
Happy New Year Everyone!
clarke thomas says
I can’t speak to the number of hosts per cluster, and I’m unsure why you’d limit it anyhow. Too much load on the SAN?
As far as VMs go, the load can go well above 40+ VMs per host, mostly thanks to balancing your loads. For instance, you could have 40+ VDIs which have very little I/O, but a few servers which have greater I/O needs. Why waste hardware on low-usage VDIs? Also, with pools you could separate them as needed or let VC do it for you.
Craig says
One important consideration is to avoid putting everything in one bucket. You may need to consider the tiered architecture of the applications you deploy, and ensure they are split across multiple physical ESX hosts to achieve better availability. That way, when one host fails, it does not bring down the entire environment.
Harley Stagner says
I believe with VMware View, the max cluster size of 8 is a little more concrete. If you want to use linked clones, the max supported size is 8. This is due to the limit of 8 simultaneous read locks on a given VMFS volume, if I remember correctly.
Preetam Zare says
Hello Duncan
One question came to mind, and that is how I’m going to configure failover capacity here. If we configure an HA and DRS cluster, we will have a limit of 160 VMs per host.
1. So each host will hold 1280/8 = 160 VMs, assuming 8 nodes.
2. If one node goes down, 160 VMs are gone, as the other hosts are already at their full capacity.
3. So you cannot configure any failover capacity once you reach the 160-VM limit on 8 nodes.
You will have to go to more than 8 nodes, which automatically reduces the per-host limit to 40. Is this conclusion right?
I would really appreciate your reply.
I fully understand we are never going to reach this limit, but as a design consideration I should be aware of it.
Duncan says
Theoretically yes, that is correct. It’s all theoretical as my guess is HA will just restart them as long as you have enough capacity.
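The arithmetic behind this question can be sketched quickly, using the limits discussed in this thread (160 VMs per host with 8 or fewer hosts, 40 with 9 or more). This is illustrative only; real HA admission control reasons in terms of slots and reservations, not raw VM counts.

```python
def max_vms_per_host(cluster_hosts: int) -> int:
    """HA-supported VMs per host for the vSphere 4.0 U1 limits
    discussed in this thread: 160 with 8 or fewer hosts, 40 with 9+."""
    return 160 if cluster_hosts <= 8 else 40

def can_restart_after_failure(hosts: int, vms: int, failed: int = 1) -> bool:
    """Can the surviving hosts restart all VMs without exceeding
    the per-host limit for the configured cluster size?"""
    limit = max_vms_per_host(hosts)
    return vms <= (hosts - failed) * limit

# 8 hosts fully loaded at 160 VMs each leave no headroom for a failure:
print(can_restart_after_failure(8, 8 * 160))  # False

# A 9th host drops the limit to 40/host, so 8 x 40 VMs still fit after one failure:
print(can_restart_after_failure(9, 8 * 40))   # True
```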
Dan says
Hi,
Does anyone know if it is possible to enable multiple DRS clusters on the same vCenter machine in parallel? What is the max number of clusters in this case? Also, where does the limitation of 32 hosts per cluster come from?
Dan
Duncan Epping says
Yes, you can have multiple clusters of 32 hosts. The 32-host limit is just a “limitation” from a DRS/HA standpoint. It probably has to do with QA.
Dan says
Can you run more than one cluster of 32 hosts on the same vCenter?
Dan says
Thanks, Duncan.
In a 1,000-host scenario, what are the best practices in terms of the number of vCenter instances required in linked mode? Does the DRS cluster size impact this number? I.e., if it is split into 32 clusters of 32 or 100 clusters of 10, is there a difference?
Dan
Kayser Soze says
What is the recommended MINIMUM number of hosts per HA/DRS cluster?
Duncan says
I would say 3. In that case, even when you are doing maintenance you will have resiliency. But if a customer feels that is unneeded, then 2 is an option. It depends, basically.
Scott Rosenblatt says
View: we weren’t even thinking about View or View linked clones when we set up our 12-host vSphere cluster. Now we’ve got a ticket to install and configure View and we are screwed, lol. We will have to see if there is enough demand for View to justify a budget for a separate View cluster of fewer than 8 hosts. Argh.
Taj says
Hi Duncan,
I was told that there is a limitation of 16 hosts per cluster if each host is configured with 2 HBA ports in vSphere 5.1, and that if I wanted to reach the host maximum, which is 32 hosts per cluster, I would need to configure each host with a single HBA!
I was also told that the VMware PSO team recommends 12 hosts per cluster, as with 16 hosts there would be a lot of paths to manage, hence more management overhead for the hypervisor, which would lead to performance issues.
It doesn’t make sense, since the maximum number of LUNs per host is 256, the total number of paths per server is 1024, and the number of paths to a LUN is 32. Why would my cluster be limited to 16? Can you comment on this please?
Duncan Epping says
I don’t know who exactly recommended this, but I don’t know of any best practice or recommendation to limit your cluster size to 12 or 16, to be honest. I would say it is a myth, or an outdated recommendation that is still floating around.