During my flight from Boston back to the Netherlands I listened to the VMworld esxtop session “Troubleshooting using ESXTOP for Advanced Users, TA6720”. As always, it was an excellent session with a lot of in-depth info. Most of it was already documented, but there were a couple of key points that I hadn’t covered yet. I just added those to my esxtop page and want to stress them here, as I personally believe they are very useful. The list may seem pretty random, but in my opinion it rolled up nicely into the esxtop page.
- %SYS should be less than 20. %SYS is the percentage of time spent by system services on behalf of the world. The possible system services are interrupt handlers, bottom halves, and system worlds.
- -b = batch mode; adding “-a” forces all metrics to be gathered
- Limit display to a single group (l)
  - enables you to focus on a specific VM
- Limiting the number of entities (#)
  - this enables you, for instance, to watch the top 5 worlds for
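Limiting the display to a single group requires the GID of that group, and batch mode is one place to find it because its output is not limited by the height of the terminal. Below is a minimal Python sketch, assuming a capture taken with something like `esxtop -b -n 1 > capture.csv` and assuming the header columns follow the pattern `\\host\Group Cpu(GID:name)\counter`. The host name, VM name and GID in the sample are made up for illustration, so check the pattern against your own capture before relying on it.

```python
import csv
import io
import re

# Assumed header shape: \\host\Group Cpu(GID:name)\counter
GROUP_CPU = re.compile(r"\\Group Cpu\((\d+):([^)]+)\)\\")


def gids_by_name(csv_text):
    """Map group name -> GID using only the header row of a batch capture."""
    header = next(csv.reader(io.StringIO(csv_text)))
    found = {}
    for column in header:
        match = GROUP_CPU.search(column)
        if match:
            found[match.group(2)] = int(match.group(1))
    return found


# Hypothetical two-line capture: one header row and one sample row.
sample = (
    '"(PDH-CSV 4.0) (UTC)(0)",'
    '"\\\\esx01\\Group Cpu(1:idle)\\% Used",'
    '"\\\\esx01\\Group Cpu(4321:vdi-desktop-17)\\% Used"\n'
    '"09/06/2010 10:00:00","95.10","3.20"\n'
)

print(gids_by_name(sample))
```

The GID returned for a VM name can then be entered after pressing “l” in interactive mode.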
I have also added thresholds for ZIP/s, UNZIP/s and CACHEUSD. These should of course be 0 from a performance perspective as anything larger than 0 means the host was overcommitted on memory and had to resort to memory compression.
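As a rough illustration, the thresholds mentioned in this post can be written down as a small check. This is only a sketch: it assumes the readings have already been collected into a dictionary keyed by the interactive esxtop column names, and the sample values are invented.

```python
# Thresholds as stated in this post: %SYS below 20, and the memory
# compression counters at 0. Each entry returns True when the value is fine.
LIMITS = {
    "%SYS": lambda value: value < 20,
    "ZIP/s": lambda value: value == 0,
    "UNZIP/s": lambda value: value == 0,
    "CACHEUSD": lambda value: value == 0,
}


def violations(readings):
    """Return the names of the metrics whose reading breaks its threshold."""
    return sorted(
        name
        for name, is_ok in LIMITS.items()
        if name in readings and not is_ok(readings[name])
    )


# Invented readings for a single world / host, for illustration only.
sample_readings = {"%SYS": 27.5, "ZIP/s": 0.0, "UNZIP/s": 0.4, "CACHEUSD": 12.0}

print(violations(sample_readings))
```

Metrics that are missing from the readings are simply skipped, so the same check can be used on a CPU-only or memory-only set of values.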
If anyone has more metrics/thresholds to contribute which they used in the past to troubleshoot issues let me know!
Sebastian Kayser says
Duncan, I already spoke to Haiping after the VMworld Europe session, as I _really_ like the limit option and we have a use case where I think it is hitting its limits (excuse the pun, and correct me if I am missing something).
We manage a bunch of VDI ESX hosts, each running about 80 VMs, where we sometimes need to look at a specific VM. When we pull up esxtop, it’s not even guaranteed that the VM we want to look at is visible in the list (the terminal height is limited). Thus, we can’t determine the GID from esxtop, which we would need to limit the display.
What we end up doing is using the esxtop command line parameters -export-entity and -import-entity to limit the display to the VM (or sometimes VMs) in question. We have a script snippet in place to ease the effort (http://blog.consol.de/virtualisierung/2010/09/06/vmware-esxtop-auf-diaet-teil-1/, German only, sorry).
Still, it would be super-helpful here if
a) esxtop would understand a commandline option by which one could pass it a list of VMs to look at, e.g. “esxtop -vm vm1,vm2,vm3”. This would bring up esxtop monitoring all entities related to these VMs (storage, network, CPU, memory).
or
b) the limit option from within esxtop would understand VM names instead of GIDs.
Haiping said that the names displayed within esxtop are not necessarily the ones from vCenter (a VM could have been renamed while online, and esxtop doesn’t reflect this), and that’s why they work with GIDs rather than names. A name-based option would, however, streamline the troubleshooting process, because the name is what an admin already knows about the VM.
Sebastian
P.S.: Having said all this, is there a commandline way besides esxtop to determine the GID from a VM name? That way, we could find it out despite the terminal height limitations and then feed it to esxtop.
Cherian says
Hey Duncan,
Could you please tell me how you watched the VMworld session while you were on the plane? Is there a way we can get these sessions on a CD?
I’m becoming a FAN of yours and I love your blog.
Can I ask you one more question…
We have around 100 ESX (not ESXi) hosts sitting in three data centers, running as part of different clusters. I need to present a NAS share as a datastore to all these hosts, where we are planning to store ISO files.
The issue in this case is that I need 100 IP addresses to create a VMkernel port group. I do have a vSwitch with Service Console and VMotion port groups, but the subnets used for SC and VMotion are different.
If our servers were ESXi, I could have simply used the management interface IP, since it already uses a VMkernel port group.
How can I get this done without wasting 100 IP addresses?
Cherian