I had a question this week from one of my colleagues which had me dazzled for a while. A customer had an HA enabled cluster and used “Host Failures Cluster Tolerates” as the admission control policy. As you hopefully all know it uses a slot algorithm, in short:
HA uses the highest CPU reservation of any given VM and the highest memory reservation of any given VM. If no reservations of higher than 256Mhz are set HA will use a default of 256Mhz for CPU and a default of 0MB+memory overhead for memory.
In their case they ended up with a slot size of 405MB. However after validating the overhead of all machines they found that the largest memory overhead was 149MB. So where did this 405MB come from? Luckily one of the engineers responded to the email thread and managed to clear things up. With vCenter 2.5 we also used a default slotsize of 256MB for memory. This default slotsize is configured in “vpxd.cfg” and unfortunately after upgrading from 2.5 to vCenter 4.0 this setting is not reset. For this customer that meant that the result was:
256 (default slotsize) + 149 (dynamic memory overhead) = 405MB
Although a minor issue, definitely something to keep in mind when troubleshooting HA slotsize issues. Always check the vpxd.cfg and check if there are any values defined for “<slotMemMinMB>”.