When I was looking into vSphere Flash Read Cache (part of vSphere 5.5) there was one thing that caught my interest: how does it interact with vSphere HA and DRS, and more specifically, are there any caveats? It all started with the question: what are the requirements for a virtual machine to be successfully restarted by vSphere HA?
The answer was simple: when you define a vSphere Flash Read Cache size for a virtual disk on a virtual machine, the amount of cache capacity defined for that virtual disk needs to be available on a local flash resource in order for the VM to be restarted / powered on. So what does this mean? Well, it means that when you set the flash read cache for a given virtual disk to 4GB, that 4GB needs to be available on the local flash resource of the host where the VM will be powered on. But what about an HA-initiated restart? Will HA ignore this requirement during restarts, or will it try to guarantee the same performance?
This is a caveat when it comes to vSphere Flash Read Cache… there is no “HA-level admission control”. What does that mean? Well, vSphere Flash Read Cache offers a “resource” to your virtual machine; in this case we are talking about a read cache resource, but it could be compared to the CPU or memory resources that are being offered. Now, with CPU and memory you have HA Admission Control, which allows / disallows new VMs to be powered on based on the amount of resources used while taking a “failure scenario” into account. All of you have probably set a reservation on a VM at some point and noticed vCenter started yelling “out of resources” when powering on a new VM because of that reservation… So here are the key things to remember:
- When defining a vSphere Flash Read Cache size, please note that in the current release this is a “hard reservation”.
- There is no integration between vSphere Flash Read Cache and vSphere HA Admission Control.
- When vSphere HA needs to restart a VM that has a vSphere Flash Read Cache resource assigned, it can only do so if sufficient flash resources are available on a single host (see the sketch after this list).
- HA will not restart the VM when flash resources cannot be provided by any host in the cluster. You either need to disable vSphere Flash Read Cache on the VM or decrease the size to the point where it can be restarted.
- HA does not have the ability to “defragment vSphere Flash Read Cache resources” at the moment, so you will need to manually move VMs when resources are fragmented.
- Each VM using vSphere Flash Read Cache resources also requires the host to use additional memory. The amount of extra memory used is based on the block size chosen and the amount of Flash Read Cache resources assigned to the respective VMs. HA will not restart the VM when there are not enough memory resources available.
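To make that “single host” constraint concrete, below is a minimal Python sketch of the check HA effectively needs to pass. The data structures, names, and numbers are hypothetical (this is not a VMware API); it just models the logic described above: the full “hard reservation” has to fit on one host.

```python
# Hypothetical model of the HA restart constraint for vFRC:
# the full cache reservation must fit on ONE host, it cannot be
# split across the free flash capacity of multiple hosts.

def can_restart(reservation_gb, free_flash_per_host_gb):
    """Return the index of the first host with enough free flash, else None."""
    for host, free_gb in enumerate(free_flash_per_host_gb):
        if free_gb >= reservation_gb:
            return host
    return None  # no single host can provide the reservation, so no restart

# VM with a 4GB vFRC reservation, three surviving hosts:
print(can_restart(4, [2, 3, 5]))  # -> 2 (the third host has 5GB free)
print(can_restart(4, [2, 3, 3]))  # -> None: 8GB free in total, but the VM
                                  #    stays powered off as no host has 4GB
```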
So what does this mean? It means that from a vSphere Flash Read Cache perspective you need to make sure you don’t “over-provision” your flash cache resources. So if you have 5 hosts in a cluster with 1TB of flash resources in total and want to take 1 host failure into account, then you will need to make sure you don’t provision more than 800GB. Personally I would play it safe and provision even less, as there is some overhead for metadata, and you will also want to make sure vSphere HA can always power on the virtual machine with the largest vSphere Flash Read Cache “hard reservation”. Maybe a recommended practice should be:
(Total flash resource - 10% metadata overhead) - (capacity of N host failures)
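As a quick back-of-the-napkin check, here is that recommended practice applied to the 5-host / 1TB example above, as plain Python arithmetic (the 10% metadata overhead is the assumption from the formula):

```python
# Recommended practice from above:
# (total flash resource - 10%) - capacity of N host failures

total_flash_gb = 1024            # 1TB of flash resources in the cluster
hosts = 5
n_host_failures = 1
per_host_gb = total_flash_gb / hosts          # 204.8GB of flash per host

usable_gb = total_flash_gb * 0.90             # keep ~10% aside for metadata
safe_to_provision_gb = usable_gb - n_host_failures * per_host_gb

print(safe_to_provision_gb)  # 716.8GB, comfortably below the 800GB ceiling
```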
Does that make sense? One other thing to note is the potential impact vSphere Flash Read Cache can have on vSphere DRS. Not a lot of people realize this, but there are some caveats when it comes to vSphere DRS and vSphere Flash Read Cache:
- DRS Initial Placement and Balancing – DRS will take vSphere Flash Read Cache into account for initial placement, trying to find an optimal place for CPU/memory and vSphere Flash Read Cache. However, it will not take vSphere Flash Read Cache into account for balancing, so this could lead to a form of fragmentation. In other words, if you enable vSphere Flash Read Cache on a virtual machine, this virtual machine will not be migrated except when the host is placed in maintenance mode, to comply with affinity/anti-affinity rules, or when there is extreme resource contention.
- Fragmentation – If you set the vSphere Flash Read Cache size of a particular VM to, let’s say, 10GB and every host in a 32-host cluster has 2GB of free space on the flash resource, then DRS will not be able to move that VM around (see the sketch after this list). Remember, the 10GB needs to be available on one of the hosts. There is no “defragmentation” currently for vSphere Flash Read Cache; this is a manual effort.
- Maintenance Mode – If you evict a host, then entering Maintenance Mode can take a while… The reason for this is that the cache migrates with the virtual machine (X-vMotion is used), so the larger the cache… the longer it takes before the host has entered maintenance mode. Of course you can still do a manual vMotion of the VMs with vSphere Flash Read Cache enabled first and select the option to discard the cache; this should speed up the process.
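The fragmentation caveat in particular is easy to underestimate, because at the cluster level everything can look fine. A small sketch using the hypothetical 32-host example from the list above (plain Python, not a VMware API):

```python
# Fragmentation: aggregate free flash vs. what any single host can offer.
# 32 hosts, each with 2GB of free flash; the VM reserves 10GB of cache.

free_flash_per_host_gb = [2] * 32
reservation_gb = 10

total_free_gb = sum(free_flash_per_host_gb)       # 64GB free in aggregate
largest_single_gb = max(free_flash_per_host_gb)   # only 2GB on any one host

print(total_free_gb >= reservation_gb)      # True  - cluster view looks fine
print(largest_single_gb >= reservation_gb)  # False - VM cannot be placed

# The only remedy today is manual "defragmentation": vMotion other VMs
# around until one host has 10GB of free flash resource available.
```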
Just a couple of things that are useful to know from an operational and architectural perspective, if you ask me… Note that this is stuff I realized/observed while working with vSphere Flash Read Cache over the last couple of months. These are my recommendations and opinions, not necessarily those of VMware.
PS: there are some more caveats documented in the release notes; make sure to read those.
s24sean says
Is there a direction to allow vFlash to simply be enabled on a per-VMDK basis and utilize whatever resources are available? Or VMware could automagically split the amount of read cache based on the number of VMs that have it enabled.
As an administrator I would like to simply select VMDKs X, Y, and Z and enable read cache on them as a “best effort” service. A hard reservation could always be an option, similar to CPU allocation.
Does this make sense?
Rawlinson says
Nope, in the current release the only option you have is to define a working set per VMDK in order to utilize vFRC. Dynamic allocation is something that could come in a future release.
Duncan Epping says
Yes, that makes sense; and no, we do not have that option unfortunately.
divnull says
It’s time for the next issue of vSphere 5.5 DeepDive. 😉