vSphere 4.1, VMware HA New maximums and DRS integration will make our life easier

Duncan Epping · Jul 14, 2010

I guess there are a couple of key points I will need to stress for those creating designs:

New HA maximums

32 host clusters
320 virtual machines per host
3,000 virtual machines per cluster

In other words:

you can have 10 hosts with 300 VMs each
or 20 hosts with 150 VMs each
or 32 hosts with 93 VMs each

as long as you don’t go beyond 320 per host or 3000 per cluster you are fine!
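For those who want to sanity-check a design, here is a minimal Python sketch of that rule. It only encodes the three limits listed above; the function and constant names are my own, and it assumes VMs are spread evenly across hosts, which a real cluster will rarely do.

```python
# Limits quoted in this post for VMware HA in vSphere 4.1.
MAX_HOSTS_PER_CLUSTER = 32
MAX_VMS_PER_HOST = 320
MAX_VMS_PER_CLUSTER = 3000


def check_ha_design(hosts, vms_per_host):
    """Return the list of limits a design exceeds (empty list if it fits).

    Hypothetical helper: assumes every host runs the same number of VMs.
    """
    problems = []
    if hosts > MAX_HOSTS_PER_CLUSTER:
        problems.append("too many hosts")
    if vms_per_host > MAX_VMS_PER_HOST:
        problems.append("too many VMs per host")
    if hosts * vms_per_host > MAX_VMS_PER_CLUSTER:
        problems.append("too many VMs per cluster")
    return problems


# The three examples from the post, plus one that goes just over the cluster limit.
for hosts, vms in [(10, 300), (20, 150), (32, 93), (32, 94)]:
    print(hosts, vms, check_ha_design(hosts, vms) or "ok")
```

The last case shows why 93 is the ceiling for a full 32 host cluster: 32 x 94 is 3,008 VMs, which is over the cluster maximum even though each host is far below 320.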

DRS Integration

HA integrates on multiple levels with DRS as of vSphere 4.1. It is a huge improvement and, in my opinion, something everyone should know about.

Resource Fragmentation

As of vSphere 4.1 HA is closely integrated with DRS. When a failover occurs HA will first check if there are resources available on the host selected for the failover. If resources are not available HA will ask DRS to accommodate for these where possible. Think about a VM with a huge reservation and fragmented resources throughout your cluster as described in my HA Deepdive. HA, as of 4.1, will be able to request a defragmentation of resources to accommodate for this VM's resource requirements. How cool is that?! One thing to note though is that HA will request it, but a guarantee cannot be given, so you should still be cautious when it comes to resource fragmentation.
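To make "fragmented" concrete, here is a toy Python sketch. It is not how HA or DRS actually decide anything; it only looks at memory, the numbers are made up, and the function names are hypothetical. The point is simply that a cluster can have enough free capacity in total while no single host can satisfy the reservation.

```python
def fits_on_single_host(free_mb_per_host, reservation_mb):
    """True if at least one host has enough free memory for the reservation."""
    return any(free >= reservation_mb for free in free_mb_per_host)


def is_fragmented(free_mb_per_host, reservation_mb):
    """True if the cluster has enough free memory in total,
    but no single host can hold the reservation on its own."""
    enough_in_total = sum(free_mb_per_host) >= reservation_mb
    return enough_in_total and not fits_on_single_host(free_mb_per_host, reservation_mb)


# Three hosts with 2 GB free each, and a VM with a 4 GB memory reservation:
# 6 GB is free in total, yet the VM cannot be restarted anywhere as things stand.
print(is_fragmented([2048, 2048, 2048], 4096))

# If VMs were migrated so that one host ends up with 4 GB free, the VM fits.
print(is_fragmented([4096, 1024, 1024], 4096))
```

The second call is the situation HA is hoping DRS can create by moving virtual machines around. As said, there is no guarantee it will succeed.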

DPM

In the past there was barely any integration between DRS/DPM and HA. Especially when DPM was enabled this could lead to some weird behaviour when resources were scarce and an HA failover would need to happen. With vSphere 4.1 this has changed. In such cases, VMware HA will use DRS to try to adjust the cluster (for example, by bringing hosts out of standby mode or migrating virtual machines to defragment the cluster resources) so that HA can perform the failovers.

Shares

I didn’t even find out about this one until I read the Availability Guide again. Prior to vSphere 4.1, an HA failed over virtual machine could be granted more resource shares than it should, causing resource starvation until DRS balanced the load. As of vSphere 4.1 HA calculates normalized shares for a virtual machine when it is powered on after an isolation event!

Comments

Harold says

14 July, 2010 at 15:51

Hi Duncan, another nice blog on vSphere 4.1.
Not surprised it’s focusing on HA/DRS, as you are the author 😉
Would you agree that prior to vSphere 4.1, if you enabled HA Admission Control, you would not run into the problem you described in the DPM paragraph?

Sketch says

14 July, 2010 at 16:29

Is there still a different limit on clusters with 8 or more hosts?

Duncan Epping says

14 July, 2010 at 16:34

@sketch, no there is not.

Sketch says

14 July, 2010 at 17:18

sorry, clusters with 9+ hosts. So that limitation has been removed… cool. I guess I could always look at the maximums for 4.1… http://www.vmware.com/pdf/vsphere4/r41/vsp_41_config_max.pdf

Duncan Epping says

14 July, 2010 at 20:26

@sketch: yes

AK says

14 July, 2010 at 20:28

in the config maximum, there is a notation of this:
“Maximum concurrent host HA failover – 4”

Does this mean, only 4 nodes in a cluster can fail at once? Otherwise HA will not work at all?

In our case, where we split our clusters between datacenters (50/50) This would be a problem if we lost a datacenter. (i.e. in a 10 node cluster with 5 on each side, when 5 nodes go offline, no HA will happen???)

Sketch says

14 July, 2010 at 21:18

See Duncan’s HA Deepdive – very good…
http://www.yellow-bricks.com/vmware-high-availability-deepdiv/

The first 5 hosts that join the VMware HA cluster are automatically selected as primary nodes. All the others are automatically selected as secondary nodes… HA needs at least one primary host to restart VMs. This is why you can only take four host failures into account when configuring the “host failures” HA admission control policy. (Remember 5 primaries…)

Ben says

15 July, 2010 at 06:32

Nice Post

DP says

15 July, 2010 at 14:06

VMware’s continuing work on DPM makes me nervous…will it become another VCB that will have its plug pulled on it, or is the intention to make it a legitimate technology for larger organizations beyond the SMB space?

Jason Boche says

15 July, 2010 at 20:47

I welcome these changes – one can never have too much flexibility as far as I am concerned.

I’m glad VMware brought the VM scalability numbers more into alignment, much easier to remember now without all the DRS/HA/# of host clauses. DRS as a service for HA restart is also interesting, although the amount of time for VMs to restart, if DRS has work to do first, would seem to be rather unpredictable as it’s going to vary from cluster to cluster.

AFidel says

16 July, 2010 at 19:54

DP,
They are working to make it work much better for large datacenters. I have no timeline but talking to one of the guys responsible for the strategic vision for DPM, they want to enable a large datacenter to shut down entire rows of servers AND their associated environmental equipment like HVAC and UPSs to get some real cost savings. Again, no idea when that would make it into the product, but that is their driving direction.

pedro says

14 June, 2012 at 11:08

From now on, I’ll try to follow your blog in order to learn more and more. Anyway, your book is in my Amazon wish list.

Greetings from Spain!
