I was always under the impression that ESX 3.x still used “strict co-scheduling” as 2.5.x did. In other words when you have a multi vcpu vm all vcpu’s need to be scheduled and started at the same time on seperate cores/cpu’s. You can imagine that this can cause the VM to have high “ready times”, which means waiting for physical cpu’s to be ready to serve the multiple cpu task.
About a week ago and a month ago two blog’s appeared around this subject which clarifies the way ESX does vcpu scheduling as of 3.x. Read them for more in depth information. (1, 2)
So in short. Since 3.x VMware ESX uses “relaxed coscheduling”. And is this as relaxed as the name implies? Yes it is. And for a simple reason:
Idle vCPUs, vCPUs on which the guest is executing the idle loop, are detected by ESX and descheduled so that they free up a processor that can be productively utilized by some other active vCPU. Descheduled idle vCPU’s are considered as making progress in the skew detection algorithm. As a result, for co-scheduling decisions, idle vCPUs do not accumulate skew and are treated as if they were running . This optimization ensures that idle guest vCPUs don’t waste physical processor resources, which can instead be allocated to other VMs.
In other words VM’s with multiple vCPU’s don’t take up cycles anymore when these vCPU’s aren’t used by the OS. ESX checks the CPU’s for the idle proces loop and when it’s idle the CPU will be released and available for other vCPU’s. This also means that when you are using an application that will use all vCPU’s the same problems will still exists as they did in ESX 2.5, my advice don’t over do it. Only 1 vCPU is more than enough most of the times!
And what appeared to be a sidenote in the blog that deserves special attention is the following statement:
The %CSTP column in the CPU statistics panel shows the fraction of time the VCPUs of a VM spent in the “co-stopped” state, waiting to be “co-started”. This gives an indication of the coscheduling overhead incurred by the VM. If this value is low, then any performance problems should be attributed to other issues, and not to the coscheduling of the VM’s virtual cpus.
In other words, check esxtop to determine if there are coscheduling problems.
Hi,
Good catch. I’ve also read the articles you mentioned, but probably not good enough. Didn’t know that bit about relaxed co-sheduling.
Thanks
Gabrie