vSphere 4.1, VMware HA New maximums and DRS integration will make our life easier

I guess there are a couple of key points I need to stress for those creating designs:

New HA maximums

  • 32 host clusters
  • 320 virtual machines per host
  • 3,000 virtual machines per cluster

In other words:

  • you can have 10 hosts with 300 VMs each
  • or 20 hosts with 150 VMs each
  • or 32 hosts with 93 VMs each

As long as you don’t go beyond 320 per host or 3,000 per cluster you are fine!
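To make the arithmetic explicit, here is a minimal sketch that checks a proposed cluster layout against the three maximums listed above. The function name and the assumption that every host runs the same number of VMs are mine, purely for illustration; this is not a VMware tool or API.

```python
# vSphere 4.1 HA maximums as listed in this post.
MAX_HOSTS_PER_CLUSTER = 32
MAX_VMS_PER_HOST = 320
MAX_VMS_PER_CLUSTER = 3000


def check_ha_design(hosts, vms_per_host):
    """Return a list of violated maximums (empty list means the design fits).

    Hypothetical helper: assumes VMs are spread evenly across hosts.
    """
    problems = []
    if hosts > MAX_HOSTS_PER_CLUSTER:
        problems.append(f"{hosts} hosts exceeds {MAX_HOSTS_PER_CLUSTER} per cluster")
    if vms_per_host > MAX_VMS_PER_HOST:
        problems.append(f"{vms_per_host} VMs per host exceeds {MAX_VMS_PER_HOST}")
    total_vms = hosts * vms_per_host
    if total_vms > MAX_VMS_PER_CLUSTER:
        problems.append(f"{total_vms} VMs in cluster exceeds {MAX_VMS_PER_CLUSTER}")
    return problems


if __name__ == "__main__":
    # The three layouts from the post, plus one that is over the cluster limit.
    for hosts, vms in [(10, 300), (20, 150), (32, 93), (10, 320)]:
        result = check_ha_design(hosts, vms)
        print(hosts, vms, "OK" if not result else result)
```

Note that the per-host and per-cluster limits interact: 10 hosts at the per-host maximum of 320 would already be 3,200 VMs, which is over the cluster limit.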

DRS Integration

HA integrates on multiple levels with DRS as of vSphere 4.1. It is a huge improvement and, in my opinion, something everyone should know about.

Resource Fragmentation

As of vSphere 4.1 HA is closely integrated with DRS. When a failover occurs, HA will first check whether resources are available on the target host for the failover. If resources are not available, HA will ask DRS to accommodate for these where possible. Think about a VM with a huge reservation and fragmented resources throughout your cluster, as described in my HA Deepdive. HA, as of 4.1, will be able to request a defragmentation of resources to accommodate this VM’s resource requirements. How cool is that?! One thing to note though is that HA will request it, but a guarantee cannot be given, so you should still be cautious when it comes to resource fragmentation.
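To illustrate what "fragmented" means here, the sketch below uses a deliberately simplified toy model: each host is reduced to a single number (unreserved memory in MB) and a VM to its memory reservation. The function names and numbers are hypothetical, and this is not how HA or DRS actually evaluate placement; it only shows the situation in which a defragmentation request would be useful.

```python
def fits_on_single_host(free_mb_per_host, reservation_mb):
    """True if at least one host can satisfy the reservation on its own."""
    return any(free >= reservation_mb for free in free_mb_per_host)


def is_fragmented(free_mb_per_host, reservation_mb):
    """True if the cluster has enough free capacity in total,
    but no single host has enough to power on the VM."""
    enough_in_total = sum(free_mb_per_host) >= reservation_mb
    return enough_in_total and not fits_on_single_host(free_mb_per_host, reservation_mb)


if __name__ == "__main__":
    # Hypothetical cluster: four hosts with 2 GB unreserved each (8 GB total).
    free = [2048, 2048, 2048, 2048]
    # A VM with a 4 GB reservation cannot be restarted anywhere as-is,
    # even though the cluster as a whole has room for it.
    print(is_fragmented(free, 4096))
    # After migrating other VMs away from one host, the restart becomes possible.
    defragmented = [4096, 2048, 1024, 1024]
    print(fits_on_single_host(defragmented, 4096))
```

In this toy model the total free capacity is unchanged by the migrations; only its distribution across hosts changes, which is what the defragmentation request is after.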

DPM

In the past there was barely any integration between DRS/DPM and HA. Especially when DPM was enabled, this could lead to some weird behaviour when resources were scarce and an HA failover needed to happen. With vSphere 4.1 this has changed. In such cases, VMware HA will use DRS to try to adjust the cluster (for example, by bringing hosts out of standby mode or migrating virtual machines to defragment the cluster resources) so that HA can perform the failovers.

Shares

I didn’t even find out about this one until I read the Availability Guide again. Prior to vSphere 4.1, a virtual machine failed over by HA could be granted more resource shares than it should, causing resource starvation until DRS balanced the load. As of vSphere 4.1 HA calculates normalized shares for a virtual machine when it is powered on after an isolation event!


11 Responses to “vSphere 4.1, VMware HA New maximums and DRS integration will make our life easier”

  1. Harold says:

    Hi Duncan, another nice blog on vSphere 4.1.
    Not surprised it’s focusing on HA/DRS, as you are the author ;-)
    Would you agree that prior to vSphere 4.1, if you enable HA Admission Control that you would not run into the problem you described in the DPM paragraph?

  2. Sketch says:

    Is there still a different limit on clusters with 8 or more hosts?

  3. @sketch, no there is not.

  4. Sketch says:

    sorry, clusters with 9+ hosts. So that limitation has been removed… cool. I guess I could always look at the maximums for 4.1… http://www.vmware.com/pdf/vsphere4/r41/vsp_41_config_max.pdf

  5. AK says:

    in the config maximum, there is a notation of this:
    “Maximum concurrent host HA failover – 4”

    Does this mean, only 4 nodes in a cluster can fail at once? Otherwise HA will not work at all?

    In our case, where we split our clusters between datacenters (50/50) This would be a problem if we lost a datacenter. (i.e. in a 10 node cluster with 5 on each side, when 5 nodes go offline, no HA will happen???)

  6. Sketch says:

    See Duncan’s HA Deepdive – very good…
    http://www.yellow-bricks.com/vmware-high-availability-deepdiv/

    The first 5 hosts that join the VMware HA cluster are automatically selected as primary nodes. All the others are automatically selected as secondary nodes… HA needs at least one primary host to restart VMs. This is why you can only take four host failures in account when configuring the “host failures” HA admission control policy. (Remember 5 primaries…)

  7. DP says:

    VMware’s continuing work on DPM makes me nervous…will it become another VCB that will have its plug pulled on it, or is the intention to make it a legitimate technology for larger organizations beyond the SMB space?

  8. Jason Boche says:

    I welcome these changes – one can never have too much flexibility as far as I am concerned.

    I’m glad VMware brought the VM scalability numbers more into alignment; they are much easier to remember now without all the DRS/HA/# of host clauses. DRS as a service for HA restart is also interesting, although the amount of time for VMs to restart, if DRS has work to do first, would seem to be rather unpredictable, as it’s going to vary from cluster to cluster.

  9. AFidel says:

    DP,
    They are working to make it work much better for large datacenters. I have no timeline, but from talking to one of the guys responsible for the strategic vision for DPM, they want to enable a large datacenter to shut down entire rows of servers AND their associated environmental equipment, like HVAC and UPSs, to get some real cost savings. Again, no idea when that would make it into the product, but that is their driving direction.
