I was working on an ESXTOP post when Jason Boche published his blog post “ESXTOP Drilldown”. My post was similar, so I decided to dump it and start over again in a few weeks or so.
Yesterday I encountered a performance issue at a customer site. One thing I’ve learned over the last couple of years is that ESXTOP can be very useful in pinpointing performance issues, so writing this article happened sooner than I expected. The customer measured all sorts of counters within the VM, and all the symptoms led the customer to conclude that the problem was related to the virtual SCSI controller and/or the virtual hard disks (VMDKs). The symptoms were a high “Physical Disk\Avg. Disk sec/Transfer” and peaking “Physical Disk\Avg. Disk Writes/Sec”. In other words, transferring data to and from disk took too long and there wasn’t a constant stream of I/O.
The initial conclusion that it was VMDK / SCSI controller related isn’t weird at all. Looking at the values within the VM itself, I would also have suspected that this was the case. But ESXTOP revealed something totally different: there wasn’t a high disk queue or heavy I/O for this particular VM or its host.
Pressing “c” in ESXTOP shows all the CPU information, with a breakdown into worlds and groups. For more info on this, check this Community Document. The CPU screen revealed a high %RDY value. %RDY, along with all the other relevant counters, is also explained in the community document. In short, this is what %RDY represents:
A world in a run queue is waiting for CPU scheduler to let it run on a PCPU. %RDY accounts the percentage of this time. So, it is always smaller than 100%.
Best practice would be values of around 5%; an occasional peak of 10-20% won’t hurt most VMs / applications. In this case the value was at least 50% all of the time.
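To make that definition a bit more tangible, here is a minimal Python sketch (mine, not VMware code) of what %RDY expresses: the share of the sample interval a world spent in the run queue waiting for a physical CPU. The 5 second interval is simply ESXTOP’s default refresh rate, and the thresholds are the rule-of-thumb values mentioned above.

```python
# Minimal sketch of what %RDY expresses, assuming we know how long a world
# spent in the "ready" state during one esxtop refresh interval (default 5s).

def pct_ready(ready_time_s: float, interval_s: float = 5.0) -> float:
    """Percentage of the sample interval a world sat in the run queue
    waiting for the CPU scheduler to place it on a physical CPU."""
    return 100.0 * ready_time_s / interval_s

# A world that waited 2.5s out of a 5s interval shows 50% ready,
# roughly what the VMs in this case were showing all of the time.
for ready_s in (0.25, 0.75, 2.5):
    rdy = pct_ready(ready_s)
    verdict = "fine" if rdy <= 5 else "acceptable peak" if rdy <= 20 else "problem"
    print(f"{ready_s:>4}s ready out of 5s -> %RDY = {rdy:4.0f}% ({verdict})")
```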
The next question, of course, was: what causes these high ready times? This one was easy to answer/explain: overprovisioning. The host had 8 cores (2 x quad core) and was currently running 8 VMs with a total of 18 vCPUs. In other words, most VMs were provisioned with multiple vCPUs, which makes scheduling even more difficult, especially when you’ve got multiple 4 vCPU VMs running. And even when scheduling is successful, the VM still has to wait for a complete socket to become available when there are no “idle loops” running within its vCPUs.
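Just to put a number on the overprovisioning, a quick back-of-the-envelope sketch with the figures from this host (the function is only illustrative):

```python
# vCPU-to-core overcommitment on this host: 8 physical cores,
# 8 VMs with 18 vCPUs in total.

def overcommit_ratio(total_vcpus: int, physical_cores: int) -> float:
    """How many virtual CPUs compete for each physical core."""
    return total_vcpus / physical_cores

print(f"{overcommit_ratio(total_vcpus=18, physical_cores=8):.2f} vCPUs per core")  # 2.25
```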
One socket indeed; that’s based on the scheduler cell size, which is 4 by default. In other words (a small sketch follows the list below):
2 x dual core = 1 cell
1 x quad core = 1 cell
2 x quad core = 2 cells (each proc = 1 cell, VMs can’t span multiple cells)
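A minimal Python sketch of those rules, assuming the default cell size of 4 as described above:

```python
# Scheduler cell rules with the default cell size of 4 (one quad-core socket
# per cell); VMs cannot span cells.

CELL_SIZE = 4

def cells(total_cores: int, cell_size: int = CELL_SIZE) -> int:
    """Number of scheduler cells the host's cores are carved into."""
    return max(1, total_cores // cell_size)

def fits_in_one_cell(vcpus: int, cell_size: int = CELL_SIZE) -> bool:
    """A VM's vCPUs must all fit within a single cell."""
    return vcpus <= cell_size

print(cells(2 * 2))            # 2 x dual core -> 1 cell
print(cells(1 * 4))            # 1 x quad core -> 1 cell
print(cells(2 * 4))            # 2 x quad core -> 2 cells
print(fits_in_one_cell(4))     # a 4 vCPU VM claims a complete cell
```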
Each vCPU will be tied to a core, meaning that with 4 vCPUs one would need 4 cores, a complete cell, available. (For those thinking “but I’ve got 6-core processors”: there’s a way to increase the scheduler cell size. In short, set vmkernel.boot.cpucellsize to 6.)

The conclusion of this very useful day: be very careful when provisioning your VMs. Do they really need 4 vCPUs, or even 2 vCPUs? And when you do provision them with multiple vCPUs, monitor these VMs and claim the vCPUs back when they aren’t used or when you don’t get the expected performance results.
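As a closing illustration of that advice, a small sketch of the kind of check you could run over collected stats to spot reclaim candidates; the thresholds and VM names are hypothetical, purely for illustration, not VMware guidance:

```python
# Flag multi-vCPU VMs that wait a lot on the scheduler (high %RDY) while not
# actually using the extra processing power. Thresholds are illustrative only.

def reclaim_candidate(vcpus: int, pct_ready: float, avg_cpu_util: float) -> bool:
    return vcpus > 1 and pct_ready > 20.0 and avg_cpu_util < 50.0

# Hypothetical measurements: (vCPUs, %RDY, average CPU utilisation in %)
vms = {"db01": (4, 55.0, 30.0), "web02": (2, 8.0, 70.0), "app03": (4, 48.0, 25.0)}

for name, stats in vms.items():
    if reclaim_candidate(*stats):
        print(f"{name}: consider giving some vCPUs back")
```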