Today I was watching INF-VSP1423 – esxtop for Advanced Users by Krishna Raj Raja. This is a VMworld 2012 San Francisco session. If you attended SF but missed this session, look it up and watch it. If you are going to VMworld Barcelona, schedule it. It is an excellent, deeply technical session with some great insights, presented by a very smart VMware engineer. There was one tip in there which I found very useful.
Krishna showed an example where he noticed a lot of I/O being generated on a particular LUN. How do you figure out who or what is causing this? Well, it is not as difficult as you might think:
- Open up esxtop (more details on my esxtop page)
- Go to the disk “Device” view (press lowercase “u”)
- Find the device which is causing a lot of I/O
- Press “e” and enter the “Device ID”. In my case that is an NAA identifier, so copy and paste is easiest here
- Now look up the World ID under the “path/world/partition” column
- Go back to the CPU view (press “c”) and sort on %USED (press “U”)
- Expand (press “e”) the world that is consuming a lot of CPU, as CPU is needed to drive I/O
This should enable you to figure out which world is driving the high number of I/Os. Now you can kill it or contact the user / admin causing it. Nice, right?
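If you would rather do the first part of this (finding the busy device) offline, the same idea can be sketched against an esxtop batch capture. This is only a rough sketch and not something shown in the session. It assumes the capture is the perfmon-style CSV that batch mode writes, with column names shaped like `\\host\Object(instance)\Counter`. The default object name "Physical Disk SCSI Device" and counter name "Commands/sec" are my assumptions about how the device view is labeled in that file, so check them against the header of your own capture. The function name `rank_instances` is made up for this example.

```python
import csv
import io
import re
from collections import defaultdict

# Perfmon-style column name: \\host\Object(instance)\Counter
COLUMN_RE = re.compile(
    r"^\\\\[^\\]+\\(?P<obj>[^(\\]+)\((?P<inst>.+)\)\\(?P<counter>.+)$"
)


def rank_instances(csv_text,
                   obj="Physical Disk SCSI Device",  # assumed object name
                   counter="Commands/sec"):          # assumed counter name
    """Return (instance, average value) pairs, busiest first."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, samples = rows[0], rows[1:]

    totals = defaultdict(float)
    counts = defaultdict(int)
    for idx, name in enumerate(header):
        match = COLUMN_RE.match(name)
        if not match:
            continue  # e.g. the timestamp column
        if match.group("obj") != obj or match.group("counter") != counter:
            continue
        instance = match.group("inst")
        for row in samples:
            try:
                value = float(row[idx])
            except (IndexError, ValueError):
                continue  # skip short rows and non-numeric cells
            totals[instance] += value
            counts[instance] += 1

    # Average each instance over the samples that had a usable value.
    averages = {inst: totals[inst] / counts[inst] for inst in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

To try it, capture a minute of samples with something like `esxtop -b -d 5 -n 12 > capture.csv`, copy the file off the host, and call `rank_instances(open("capture.csv").read())`. The device at the top of the list is the one to expand with “e” in the interactive device view.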
There are some more nuggets in this session around PSTATE (power state), co-sharing, Host Caching (llswp) and much more. I am not going to reveal those, as you should be attending this session or, at a minimum, watching it online.
JC says
Does VMware allow probing esxtop stats via SNMP?