I was reading up on vMotion today and stumbled on this excellent article by my colleague Kyle Gleed and noticed something that hardly anyone has blogged about…. Quick Resume. Quick Resume is a feature that allows you to vMotion a virtual machine which has a high memory page change rate. Basically when the change rate of your memory pages exceeds the capabilities of your network infrastructure you could end up in a scenario where vMotioning a virtual machine would fail as the change rate would make a switch-over impossible. With Quick Resume this has changed.
Quick Resume enables the source virtual machines to be stunned while starting the destination virtual machine before all pages have copied. However, as the virtual machine is already running at the destination it could possibly attempt to touch (read or write) a page which hasn’t been copied yet. In that case Quick Resume requests the page from the source to allow the guest to complete the action while continuously copying the remaining memory pages until all pages are migrated. But what if the network would fail at that point, wouldn’t you end up with a destination virtual machine which cannot access certain memory pages anymore as they are “living” remotely? Just like Storage IO Control, vMotion leverages shared storage. A special file would be created in the case Quick Resume is used and this file is basically used as a backup buffer. In the case the network would fail this file would allow for the migration to complete. This file is typically in the order of just a couple MBs. Besides being used as a buffer for transferring the memory pages it also enables bi-directional communication between the two hosts allowing the vMotion to complete as though the network hadn’t failed. Is that cool or what?
The typical question that arises immediately is if this will impact performance? It is good to realize that without Quick Resume vMotioning large memory active virtual machines would be difficult. The switch-over time could potentially be too large and lead to temporary loss of connection with the virtual machine. Although Quick Resume will impact performance when pages that are not copied yet are accessed, the benefits of being able to vMotion very large virtual machines with minimal impact by far outweigh this temporary increase of memory access time.
There is so many cool features and enhancements in vSphere that I just keep being amazed.
