I was reading a whitepaper by VKernel and it mentioned the following: “a failover host for these VMs requires sufficient idle resources”. The whitepaper discusses how Monster VMs pose challenges for both HA and DRS. As I had a similar question last week at VMworld, I figured I would post this, also because it is fundamental to understanding HA. Now the thing is, I agree that there is no point in creating large VMs just because you can. Without a doubt, Monster VMs pose challenges with regards to managing resources. However, I do want to point out that technically speaking the statement is incorrect.
To power-on a VM you need unreserved memory capacity! The unreserved memory capacity needs to be at least equal to the memory reservation of the VM plus its memory overhead! In other words, if you set no memory reservation you can power-on multiple 96GB VMs on a 48GB host, simply because the memory overhead of each VM is much lower than 48GB. Now this doesn’t mean it is a best practice, or something I would recommend, but it does mean that when HA handles a failover it will be able to accommodate the restart of these virtual machines. This also means that with regards to HA Admission Control, the chances of not being able to power-on your virtual machine because of insufficient resources are fairly slim. I bet that if you over-commit to such an extent that a power-on operation is impossible you have a lot more challenges to begin with!
Frank Denneman wrote a nice article about this a while back; it explains perfectly what the impact of a memory reservation is.
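To make the power-on rule concrete, here is a minimal sketch in Python. It is not how vSphere implements the check; it only models the rule stated above (unreserved capacity must cover reservation plus overhead). The `VM` class, the function names, and especially the 1GB overhead figure are assumptions made up for illustration; real memory overhead depends on the VM's configuration.

```python
from dataclasses import dataclass

MB_PER_GB = 1024


@dataclass
class VM:
    name: str
    configured_mb: int   # configured memory size; not part of the power-on check
    reservation_mb: int  # memory reservation set on the VM
    overhead_mb: int     # memory overhead (hypothetical value for illustration)


def required_mb(vm: VM) -> int:
    """Unreserved memory needed to power on: reservation plus overhead."""
    return vm.reservation_mb + vm.overhead_mb


def can_power_on(unreserved_mb: int, vm: VM) -> bool:
    """True if the host has enough unreserved memory for this VM."""
    return unreserved_mb >= required_mb(vm)


def power_on_all(unreserved_mb: int, vms: list[VM]) -> tuple[list[str], int]:
    """Try to power on each VM in order; return the names that started
    and the unreserved memory left on the host."""
    remaining = unreserved_mb
    started = []
    for vm in vms:
        if can_power_on(remaining, vm):
            remaining -= required_mb(vm)
            started.append(vm.name)
    return started, remaining


if __name__ == "__main__":
    host_unreserved = 48 * MB_PER_GB  # 48GB host, nothing reserved yet

    # Two 96GB VMs without a reservation; the 1GB overhead is an assumption.
    monsters = [
        VM("monster-1", 96 * MB_PER_GB, 0, 1 * MB_PER_GB),
        VM("monster-2", 96 * MB_PER_GB, 0, 1 * MB_PER_GB),
    ]
    print(power_on_all(host_unreserved, monsters))

    # The same VM with a full 96GB reservation.
    reserved = VM("monster-3", 96 * MB_PER_GB, 96 * MB_PER_GB, 1 * MB_PER_GB)
    print(can_power_on(host_unreserved, reserved))
```

In this model the configured size never enters the check, which is the whole point: only the reservation and the overhead count against unreserved capacity. Whether those VMs will perform well once powered on is a different question.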
Yes, and HA is not subject to admission control. I think this wasn’t mentioned during the Q&A session.
We have this more with CPU. We have one host with 48 cores, the others only have 8. So our 12 vCPU VMs need to be re-configured when powering on after a failure.