This has always been a hot topic: HA and slot sizes / Admission Control. One of the most extensive (non-VMware) articles on it is by Chad Sakac, aka Virtual Geek, but of course a couple of things have changed since then. Chad asked in a comment on my HA Deepdive if I could address this topic, so here you go Chad.
Let's start with the basics.
What is a slot?
A slot is a logical representation of the memory and CPU resources that satisfy the requirements for any powered-on virtual machine in the cluster.
In other words, the slot size is the worst-case CPU and memory reservation scenario in a cluster. This leads directly to the first "gotcha":
HA uses the highest CPU reservation of any given VM and the highest memory reservation of any given VM.
If VM1 has 2GHz and 1024MB reserved and VM2 has 1GHz and 2048MB reserved, the slot size for memory will be 2048MB plus the memory overhead, and the slot size for CPU will be 2GHz.
Now how does HA calculate how many slots are available per host?
Of course we first need to know what the slot size for memory and CPU is. We then divide the total available CPU resources of a host by the CPU slot size, and the total available memory resources of a host by the memory slot size. This leaves us with a number of CPU slots and a number of memory slots; the most restrictive of the two is the number of slots for that host. If you have 25 CPU slots but only 5 memory slots, the number of available slots for this host will be 5.
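The calculation above can be sketched as follows; the VM reservations mirror the earlier example, and the host capacity and memory overhead values are made up for illustration:

```python
# Hypothetical per-VM reservations (MHz, MB); overhead value assumed for the sketch.
vms = [
    {"cpu_mhz": 2000, "mem_mb": 1024},  # VM1
    {"cpu_mhz": 1000, "mem_mb": 2048},  # VM2
]
mem_overhead_mb = 100  # per-VM memory overhead, illustrative value

# Slot size = worst-case reservation across all powered-on VMs
cpu_slot = max(vm["cpu_mhz"] for vm in vms)                   # 2000 MHz
mem_slot = max(vm["mem_mb"] for vm in vms) + mem_overhead_mb  # 2148 MB

def slots_per_host(host_cpu_mhz, host_mem_mb):
    # The most restrictive resource determines the host's slot count
    return min(host_cpu_mhz // cpu_slot, host_mem_mb // mem_slot)

print(slots_per_host(50000, 10740))  # 25 CPU slots vs 5 memory slots -> 5
```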
As you can see, this can lead to very conservative consolidation ratios. With vSphere this is configurable. If you have just one VM with a really high reservation, you can set the following advanced settings to lower the slot size used during these calculations: das.slotCpuInMHz or das.slotMemInMB. To ensure the VM with the high reservation can still be powered on, it will take up multiple slots. Keep in mind that when you are low on resources, this could mean that you are not able to power on this high-reservation VM, as resources may be fragmented throughout the cluster instead of available on a single host.
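As a sketch of what happens when you cap the slot size with those advanced settings: a VM whose reservation exceeds the capped slot size simply consumes multiple slots. All the values below are illustrative, not defaults:

```python
import math

das_slot_cpu_mhz = 500  # das.slotCpuInMHz, value chosen for illustration
das_slot_mem_mb = 512   # das.slotMemInMB, value chosen for illustration

def slots_needed(vm_cpu_mhz, vm_mem_mb):
    # A high-reservation VM takes up as many slots as needed to cover it
    return max(math.ceil(vm_cpu_mhz / das_slot_cpu_mhz),
               math.ceil(vm_mem_mb / das_slot_mem_mb))

print(slots_needed(2000, 2048))  # the one large-reservation VM needs 4 slots
```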
Now what happens if you set the number of allowed host failures to 1?
The host with the most slots will be taken out of the equation. If you have 8 hosts with 90 slots in total, where 7 hosts have 10 slots each and one host has 20, that single host will not be taken into account. Worst-case scenario! In other words, the 7 remaining hosts should be able to provide enough resources for the cluster when a failure of the "20 slot" host occurs.
And of course if you set it to 2, the next host taken out of the equation is the host with the second most slots, and so on.
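A minimal sketch of this admission control logic, using the 8-host example above (the function name is mine, not an API):

```python
def failover_slots(host_slots, host_failures_tolerated):
    # Worst case: the largest host(s) fail, so discard them from capacity
    surviving = sorted(host_slots)[:len(host_slots) - host_failures_tolerated]
    return sum(surviving)

# 7 hosts with 10 slots each plus one with 20: the 20-slot host is discarded
print(failover_slots([10] * 7 + [20], 1))  # -> 70 usable slots
```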
One thing worth mentioning: as Chad stated, with vCenter 2.5 the number of vCPUs of any given VM was also taken into account. This led to very conservative and restrictive admission control. This behavior was modified with vCenter 2.5 U2; the number of vCPUs is no longer taken into account.
Frank Wegner says
If you do not want to tweak advanced parameters you could also check if you really need the large exceptional reservations. Reducing the reservation will also increase the HA slot size. One example I know: a single VM with 40 GB memory reservation crippled HA (failover capacity = 0), setting the reservation to 10 GB helped a lot.
Frank Wegner says
oops, not “increase slot size” but “increase failover capacity”.
So the best case is when no VM in the cluster has a reservation?
It's not the best, but the least restrictive. The best is when you use a realistic reservation on at least one machine so that the slots are correctly sized.
If you pool servers by service / application and give them no direct reservation, but instead make a reservation at the resource pool level, will this avoid admission control problems?
Suttoi said, August 17th, 2009 at 08:12: If you pool servers by service / application and give them no direct reservation, but instead make a reservation at the resource pool level, will this avoid admission control problems?
can someone answer the question?
Duncan Epping says
Yes it will avoid these issues.
so reservations at resource pool level do not affect failover capacity?
Duncan Epping says
no they do not.
Jarrod Sturdivant says
If you have no CPU or Memory reservations in the virtual infrastructure, is the slot size calculated using the highest configured CPU and RAM for a virtual machine? If you have a single virtual machine that has significantly more RAM configured than your other virtual machines, how can you keep it from inflating your slot size without a reservation?
No, it is calculated using a default: 256MHz for CPU (4.1 and prior) or 32MHz for CPU (5.0 and above), and the "memory overhead" for memory.
Alan Wilson says
Are available ports/portgroups taken into consideration in the slot calculation?
Alan Wilson says
Didn't think so, but I have a maxed-out cluster (View 5, 600+ VMs) with no available slots, which won't allow any more powered-on VMs. Plenty of CPU and memory available – most of the powered-on VMs are idle, waiting for users to connect. I'm getting these errors in the vCenter log, which triggered the question.
2013-03-13T11:46:49.932Z [04240 verbose 'Default' opID=a917af28] [VpxdMoVm::PowerOnInt] PowerOnIntImpl failed on VM /vpx/vm/#18815/ for reason : vim.fault.InsufficientFailoverResourcesFault
2013-03-13T11:46:49.932Z [04240 verbose 'Default' opID=a917af28] [MoDVSwitch::CheckForEagerPortAssignment] vm [COMP1] has no late binding portgroups in datacenter [VDI1] to connect to, moving on
2013-03-13T11:46:49.932Z [04240 error 'Default' opID=a917af28] (Log recursion level 2) vim.fault.InsufficientFailoverResourcesFault
Thanks for your quick response. will investigate further 🙂
Hold on, there is a difference between having available resources and the slot size.
Slot size = based on reservations / memory overhead etc.
Used resources = what the VMs are actually using, which says nothing about what is reserved.
Do you have a reservation set somewhere? Which version of vSphere are you using? Why are you using the “Host failures” admission control and not the “Percentage based”?
Alan Wilson says
vCenter 5.0.0, ESXi 5.0.0, 702118, View 5.0.1
Why are you using the “Host failures” admission control and not the “Percentage based”?
Good question: I’ve joined this project late and just starting to pick it apart. 😉 The original VMware design doc did specify 17% for the VDI clusters but they’ve all been set to 1 host failover. Just trying to find out why.
What may have happened to trigger the errors is that some pools were recently changed to have 2-vCPU VMs rather than 1, and HA, using the worst-case scenario, changed the slot size from 1 to 2 vCPUs in response. I could always set the slot size manually, I suppose, but I'd prefer to let HA do it, in case things change in the future.
No reservations set anywhere that I can see. How often is the slot size calculated?
I have a question for this old blog/Q&A. Above, Jarrod asked, "If you have no CPU or Memory reservations in the virtual infrastructure, is the slot size calculated using the highest configured CPU and RAM for a virtual machine?" Duncan replied, "No it is calculated using a default: 256MHz for CPU (4.1 and prior) or 32MHz for CPU (5.0 and above) and the memory overhead". We're preparing for a 4.1 to 5.5 upgrade, including a restructuring of how guests are arranged on hosts/clusters. We're trying to eliminate the rare instances of reservations so we can plan/predict the slot sizes for the new cluster configurations. If I can collect a spreadsheet of the VMs we'll have on the hosts in a cluster, how do I predict the slot size including this memory overhead?
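A rough sketch of that prediction: with no CPU or memory reservations, the memory slot size reduces to the largest per-VM memory overhead, which depends on each VM's configured memory and vCPU count. The overhead figures themselves must be looked up in VMware's documented per-version overhead tables; the lookup function below is a placeholder, not real numbers:

```python
# Placeholder overhead estimate: real values come from the per-version
# "Overhead Memory on Virtual Machines" tables in VMware's documentation.
def mem_overhead_mb(configured_mem_mb, num_vcpus):
    return 120 + configured_mem_mb * 0.01 + num_vcpus * 30  # illustrative only

# Spreadsheet rows: (reservation_mb, configured_mem_mb, vcpus)
vms = [(0, 4096, 2), (0, 8192, 4), (1024, 2048, 1)]

# Memory slot size = largest (reservation + overhead) across all VMs
mem_slot_mb = max(res + mem_overhead_mb(cfg, vcpu) for res, cfg, vcpu in vms)
print(round(mem_slot_mb))
```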