
Yellow Bricks

by Duncan Epping



vGPUs and vMotion, why the long stun times?

Duncan Epping · Feb 7, 2020 ·

Last week one of our engineers shared something I found very interesting. I have been playing with Virtual Reality technology and NVIDIA vGPUs for two months now. One thing I noticed is that we (VMware) introduced support for vMotion of vGPU-enabled VMs in vSphere 6.7, and support for vMotion of multi-vGPU VMs in vSphere 6.7 U3. In order to enable this, you need to set an advanced setting first; William Lam described in his blog how to set this via PowerShell or the UI. Now, when you read the documentation, one thing stands out: the relatively high stun times for vGPU-enabled VMs. Just as an example, here are a few potential stun times for various vGPU frame buffer sizes:

  • 2GB – 16.5 seconds
  • 8GB – 61.3 seconds
  • 16GB – 100+ seconds (time out!)

This is all documented here for the various frame buffer sizes. There are a couple of things to know about these numbers. First of all, they were measured with an NVIDIA P40; results could be different for an RTX6000 or RTX8000, for instance. Secondly, they used a single 10GbE NIC; if you use multi-NIC vMotion or, for instance, a 25GbE NIC, the times should be lower. But more importantly, the times mentioned assume the full frame buffer memory is consumed. If you have a 16GB frame buffer and only 2GB is consumed, then of course the stun time would be lower than the 100+ seconds mentioned above.
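To get a feel for these numbers, here is a back-of-the-envelope sketch (my own illustration, not VMware's sizing model): the wire-speed copy time is the consumed frame buffer divided by the effective vMotion throughput. Note that the documented 100+ seconds for 16GB sits well above this lower bound, so the migration clearly involves more overhead than the raw copy alone.

# Rough lower bound on the copy portion of the stun time. The efficiency
# factor is an assumption covering protocol and processing overhead.
def estimated_stun_seconds(consumed_gb, link_gbps, efficiency=0.7):
    throughput_gb_per_s = link_gbps / 8 * efficiency  # Gbit/s -> GB/s
    return consumed_gb / throughput_gb_per_s

print(round(estimated_stun_seconds(16, 10), 1))  # ~18.3s on 10GbE
print(round(estimated_stun_seconds(16, 25), 1))  # ~7.3s on 25GbE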

Now, this doesn’t answer the question yet: why? Why on earth are these stun times this long? The vMotion process is described in-depth in this blog post by Niels, so I am not going to repeat it. It is also described in our Clustering Deep Dive book, which you can download here for free. The key reason vMotion can keep the “down time” (stun time) low is that it uses a pre-copy process and tracks which memory pages have changed. In other words, when vMotion is initiated we copy memory pages to the destination host, and if a page changes during that copy process we mark it as dirty and copy it again. vMotion repeats this until the amount of memory left to copy is extremely low, which results in a seamless migration. Now here is the problem: it does this for VM memory, and unfortunately this isn’t possible for vGPU frame buffer memory today.
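To make the pre-copy idea concrete, here is a minimal simulation of the iterative loop (my own simplification; the real implementation tracks dirty pages through the memory subsystem and has additional termination conditions):

def copy_pages(count):
    pass  # stand-in for sending pages over the vMotion network

def simulate_precopy(total_pages=100_000, dirty_rate=0.02, threshold=64):
    """Each pass re-sends only the pages dirtied during the previous pass."""
    to_copy = total_pages
    passes = 0
    while to_copy > threshold:
        copy_pages(to_copy)                  # the VM keeps running during this pass
        to_copy = int(to_copy * dirty_rate)  # only the dirtied pages remain
        passes += 1
    copy_pages(to_copy)  # final pass happens while the VM is briefly stunned
    return passes, to_copy

print(simulate_precopy())  # (2, 40): two live passes, then just 40 pages during the stun

Because the frame buffer cannot be dirty-tracked this way, the entire consumed vGPU memory falls into that final stunned pass.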

Okay, so what does that mean? Well, if you have a 16GB frame buffer and it is 100% consumed, the vMotion process needs to copy 16GB of frame buffer memory from the source to the destination host while the VM is stunned. Why while the VM is stunned? Simply because that is the point in time where the frame buffer memory will not change! Hence, this can unfortunately take a significant number of seconds today. Definitely something to consider when planning on using vMotion for (multi) vGPU-enabled VMs!

Disabling the frame rate limiter for your vGPU

Duncan Epping · Feb 4, 2020 ·

I have been testing Virtual Reality apps within a VM for the past few days, leveraging NVIDIA vGPU technology on vSphere 6.7 U3. I was running into some undesired behavior and was pointed to the fact that this could be due to the frame rate being limited by default (thanks, Ben!). I first checked the head-mounted display to see what kind of frame rate it was running at; by leveraging “adbLink” (for Mac) and the logcat command I could see the current frame rate hovering between 55 and 60. For virtual reality apps that leads to problems when moving your head from left to right, as you will see black screens. For those wanting to play around with this on the Quest as well: I used the following command to list the current frame rate for the NVIDIA CloudXR application (note that this is specific to this app), where “-s” filters for the keyword “VrApi”:

logcat -s VrApi

The result will be a full string, but the important bit is the following:

FPS=72,Prd=32ms,Tear=0,Early=0
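If you want to track the reported frame rate over time instead of eyeballing the log, a small parsing sketch like this works (my own helper; it assumes adb is on your PATH and the VrApi log format shown above):

import re
import subprocess

# Stream "adb logcat -s VrApi" and print every FPS value the app reports.
proc = subprocess.Popen(["adb", "logcat", "-s", "VrApi"],
                        stdout=subprocess.PIPE, text=True)
for line in proc.stdout:
    match = re.search(r"FPS=(\d+)", line)
    if match:
        print(match.group(1))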

I was digging through the NVIDIA documentation and it mentioned that a frame rate limit is applied when you use the Best Effort scheduler. I wanted to test the different schedulers anyway, so I switched over to the Equal Share scheduler, which solved the problems listed above as it indeed disables the frame rate limit. I could see the frame rate going up to between 70 and 72. Of course, I also wanted to validate the Best Effort scheduler with the frame rate limiter disabled, which I did by adding an advanced setting to the VM:

pciPassthru0.cfg.frame_rate_limiter=0

This also resulted in a better experience, with the frame rate again going up to 70-72.
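If you prefer to script this rather than edit the VM settings in the UI, a minimal pyVmomi sketch could look like the following; the vCenter address, credentials and VM name are placeholders, and the VM should be powered off when you reconfigure it:

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import ssl

# Connect to vCenter (placeholder credentials; unverified SSL for lab use only).
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)

# Find the VM by name (simple inventory walk; assumes a unique name).
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "vr-test-vm")

# Add the advanced setting that disables the vGPU frame rate limiter.
opt = vim.option.OptionValue(key="pciPassthru0.cfg.frame_rate_limiter", value="0")
vm.ReconfigVM_Task(vim.vm.ConfigSpec(extraConfig=[opt]))

Disconnect(si)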

Can I still provision VMs when a vSAN Stretched Cluster site has failed? Part II

Duncan Epping · Dec 18, 2019 ·

Three years ago I wrote the following post: Can I still provision VMs when a VSAN Stretched Cluster site has failed? Last week I received a question on this subject, and although officially I am not supposed to work on vSAN for the upcoming three months, I figured I could easily test this in the evening within 30 minutes. The question was simple: in my blog I described the failure of the Witness Host, but what if a single host fails in one of the two “data” fault domains? What if I want to create a snapshot, for instance; will this still work?

So here’s what I tested:

  • vSAN Stretched Cluster
  • 4+4+1 configuration
    • Meaning, 4 hosts in each “data site” plus a witness host, so 8 data hosts in my vSAN cluster
  • Create a VM with cross-site protection (RAID-1 across sites) and RAID-5 within the location

So I first failed a host in one of the two data sites. With the host down, the following is what happens when I create a VM with RAID-1 across sites and RAID-5 within a site:

  • Without “Force Provisioning” enabled, the creation of the VM fails
  • With “Force Provisioning” enabled, the creation of the VM succeeds, and the VM is created with a RAID-0 within one location

Okay, so this sounds similar to the scenario originally described in my 2016 blog post, where I failed the witness: vSAN creates a RAID-0 configuration for the VM. When the host returns for duty, the RAID-1 across locations and RAID-5 within each location are then automatically created. On top of that, you can snapshot VMs in this scenario; the snapshots will also be created as RAID-0. One thing to keep in mind: I would recommend removing “Force Provisioning” from the policy after the failure has been resolved! Below is a screenshot of the component layout of this scenario.

I also retried the witness-host-down scenario, and in that case you do not need to use the “Force Provisioning” option. One more thing to note: the above only happens when you request a RAID configuration that is impossible to create as a result of the failure. RAID-5 within a site requires four hosts, so with one of the four hosts down it cannot be built, while RAID-1 within a site only requires three. So if one host fails in a 4+4+1 stretched cluster and you create a VM with RAID-1 across sites and RAID-1 within sites, the VM is created with the requested RAID configuration, as demonstrated in the screenshot below.
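The underlying rule fits in a few lines; here is a sketch (my own illustration, based on the standard minimum host counts per intra-site vSAN protection level):

# Minimum hosts needed within a site for common vSAN protection levels.
HOSTS_REQUIRED = {"RAID-1": 3, "RAID-5": 4, "RAID-6": 6}

def can_provision(raid_level, healthy_hosts_in_site):
    """True if the requested intra-site RAID level still fits after a failure."""
    return healthy_hosts_in_site >= HOSTS_REQUIRED[raid_level]

print(can_provision("RAID-5", 3))  # False: needs Force Provisioning (RAID-0)
print(can_provision("RAID-1", 3))  # True: provisioned as requested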

Joined GigaOm’s David S. Linthicum on a podcast about cloud, HCI and Edge.

Duncan Epping · Oct 14, 2019 ·

A while ago I had the pleasure of joining David S. Linthicum from GigaOm on their Voices in Cloud podcast. It is a 22-minute episode in which we discuss various VMware efforts in the cloud space, edge computing and, of course, HCI. You can find the episode here, where they also have the full transcript for those who prefer reading over listening to a guy with a Dutch accent. It was a fun experience for sure; I always enjoy joining podcasts and talking tech. So if you run a podcast and are looking for a guest, don’t hesitate to reach out!

Of course you can also find Voices in Cloud on iTunes, Google Play, Spotify, Stitcher, and other platforms.

Can you move a vSAN Stretched Cluster to a different vCenter Server?

Duncan Epping · Sep 17, 2019 ·

I noticed a question today on one of our internal social platforms: can you move a vSAN Stretched Cluster to a different vCenter Server? I can be short about it; I tested it and the answer is yes! How do you do it? Well, we have a great KB that documents the process for a normal vSAN cluster, and the same applies to a stretched cluster. When you add the hosts to your new vCenter Server and into your newly created cluster, it will pull in the fault domain details (the stretched cluster configuration) from the hosts themselves, so when you go to the UI the fault domains will pop up again, as shown in the screenshot below.

What did I do? In short (but please use the KB for the exact steps):

  • Powered off all VMs
  • Placed the hosts into maintenance mode (do not forget about the Witness!)
  • Disconnected all hosts from the old vCenter Server, again, do not forget about the witness
  • Removed the hosts from the inventory
  • Connected the Witness to the new vCenter Server
  • Created a new Cluster object on the new vCenter Server
  • Added the stretched cluster hosts to the new cluster on the new vCenter Server
  • Took the Witness out of Maintenance Mode first
  • Took the other hosts out of maintenance mode

That was it, pretty straightforward. Of course, you will need to make sure the storage policies exist in both locations, and you will also need to do some extra work if you use a VDS. Nevertheless, it is straightforward and works exactly as you would expect!
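As a rough illustration of the reconnect step (the KB remains the authoritative procedure), adding a host to the new cluster with pyVmomi could look like this; all addresses, names and credentials are placeholders, and depending on your environment you may need to supply the host’s SSL thumbprint in the ConnectSpec:

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import ssl

# Connect to the NEW vCenter Server (placeholder credentials, lab-only SSL handling).
ctx = ssl._create_unverified_context()
si = SmartConnect(host="new-vcenter.example.com",
                  user="administrator@vsphere.local", pwd="password", sslContext=ctx)
content = si.RetrieveContent()

# Locate the newly created cluster object (assumes a unique cluster name).
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "stretched-cluster")

# Add a data host that was removed from the old vCenter inventory; repeat per
# host. The witness is connected to the new vCenter separately, outside this
# cluster, and each host is taken out of maintenance mode afterwards, e.g.
# host.ExitMaintenanceMode_Task(timeout=0).
spec = vim.host.ConnectSpec(hostName="esxi01.example.com",
                            userName="root", password="password", force=True)
cluster.AddHost_Task(spec=spec, asConnected=True)

Disconnect(si)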

