
Yellow Bricks

by Duncan Epping



Storage IO Control, the movie

Duncan Epping · Jun 17, 2010 ·

Not sure why hardly anyone picked up on this cool YouTube movie about Storage IO Control (SIOC), but I figured it was worth posting. SIOC is probably one of the coolest features coming to a vSphere version in the near future. Scott Drummonds wrote a great article about it which shows the strength of SIOC when it comes to fairness. One might say there already is a tool for this, per-VM disk shares, but that's not entirely true… The following diagrams depict the current situation (without…) and the future (with…):

As the diagrams clearly show, the current version of shares works on a per-host basis. When a single VM on a host floods your storage, all other VMs on the datastore will be affected. VMs running on the same host could easily carve up the bandwidth by using shares. However, if the VM causing the load were to move to a different host, the shares would be useless. With SIOC the fairness mechanism that was introduced goes one level up: disk shares are taken into account at the cluster level.
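
To make that concrete, here is a quick worked example with made-up share values. Suppose VM1 (2000 shares) and VM2 (1000 shares) run on host A: host A's device queue is divided 2000:1000, so VM1 gets two thirds and VM2 one third. A VM3 with 500 shares on host B is not part of that calculation at all, no matter how hard it hammers the same datastore. With SIOC the ratio is applied datastore-wide, so the three VMs are weighed 2000:1000:500, which works out to roughly 57%, 29% and 14%.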

There are a couple of things to keep in mind though:

  • SIOC is enabled per datastore
  • SIOC only applies disk shares when a certain threshold (device latency, most likely 30ms) has been reached.
    • The latency value will be configurable, but changing it is not recommended for now
  • SIOC carves up the array queue, which enables a faster response for VMs doing, for instance, sequential IOs
  • SIOC will enforce limits in terms of IOPS when specified at the VM level (see the sketch after this list)
  • No reservation setting for now…
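
As a rough sketch of what per-disk settings look like, the entries below show disk shares and an IOPS limit in a VM's .vmx file. I am writing the parameter names (sched.scsi0:0.shares and sched.scsi0:0.throughputCap) from memory, so treat them as assumptions and verify them against the official documentation before using them:

    # relative disk shares for disk scsi0:0 (parameter name from memory, verify first)
    sched.scsi0:0.shares = "2000"
    # IOPS limit for disk scsi0:0 (parameter name from memory, verify first)
    sched.scsi0:0.throughputCap = "1000"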

Anyway, enough random ramblings… here’s the movie. Watch it!

For those with a VMworld account I can recommend watching TA3461.

vSphere Update 2 released

Duncan Epping · Jun 11, 2010 ·

By now the whole world has probably read that vSphere 4.0 Update 2 has been released (release notes: vCenter, ESX, ESXi). Some of you might even have started slowly upgrading your test systems. (Like I am doing at the moment…)

I will not copy the full release notes, but I do want to point out a couple of things I have been waiting for.

What’s Cool:

  • vSphere 4.0 U2 includes an enhancement of the performance monitoring utility, resxtop. The resxtop utility now provides visibility into the performance of NFS datastores in that it displays the following statistics for NFS datastores: Reads/s, writes/s, MBreads/s, MBwrtn/s, cmds/s, GAVG/s (guest latency).
  • VMware High Availability configuration might fail when the advanced HA option das.allowNetwork uses a vNetwork Distributed Switch (vDS) port group: if you specify a vDS port group by using das.allowNetwork on an HA-enabled cluster, the HA configuration on the hosts might fail. This issue is resolved in this release; starting with this release, das.allowNetwork works with vDS.
  • The esxtop and resxtop utilities did not display various logical CPU power state statistics; this issue is resolved in this release. A new Power screen is accessible with the esxtop utility (supported on ESX) and the resxtop utility (supported on ESX and ESXi) that displays logical CPU statistics. To switch to the Power screen, press y at the esxtop or resxtop screen.
  • For devices using the Round Robin PSP, the value configured for the --iops option changes after an ESX host reboot. If a device controlled by the Round Robin PSP is configured to use the --iops option, the value set is not retained if the ESX server is rebooted. This issue is resolved in this release (see the example below).
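
Since several people will wonder how that --iops value is set in the first place, here is a sketch of the command. This uses the vSphere 4 era esxcli syntax as far as I recall it, and the naa identifier is of course a placeholder:

    # configure the Round Robin PSP to switch paths after every 1000 I/Os
    # (syntax from memory for vSphere 4.x; verify against your version's documentation)
    esxcli nmp roundrobin setconfig --device naa.60a98000572d54724a34642d71325763 --type "iops" --iops 1000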

Many issues have been fixed in this release, and some new features have been added as well. For me personally, the first one in the list is the most important. Up to ESX 4.0 Update 1 you always needed to dive into vscsiStats to see the guest latency for NFS-based storage. As of Update 2 you can just run esxtop and check the statistics for your NFS datastore. This will definitely simplify troubleshooting; a single pane of glass!
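
As an illustration, checking those NFS statistics remotely could look like the lines below. The hostname is a placeholder and resxtop will prompt for a username and password:

    # connect to a host with the remote version of esxtop
    resxtop --server esx01.example.local
    # then press u for the disk device screen; the NFS datastore rows
    # should show reads/s, writes/s and GAVG/s (guest latency)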

Is this VM actively swapping? (helping @heiner_hardt)

Duncan Epping · Jun 10, 2010 ·

On Twitter @heiner_hardt asked for help with a performance-related issue he was experiencing. As I am starting to appreciate esxtop more every single day, and really enjoy solving performance problems, I decided to dive into it.

After the initial couple of questions Heiner posted a screenshot:

Heiner highlighted (red outline) a couple of metrics which indicated swapping and ballooning, as he pointed out with the text boxes. Although I can't disagree that swapping and ballooning happened at some point in time, I do disagree with the conclusion that this virtual machine is swapping. Let's break it down:

Global Statistics:

  • 1393 Free -> Currently 1393MB memory available
  • High State -> Hypervisor is not under memory pressure
  • SWAP /MB 146 Cur -> 146MB has been swapped
  • SWAP /MB 83 Target -> Target amount that needed to be swapped was 83MB
  • 0.00 r/s -> No reads from swap currently
  • 0.00 w/s -> No writes to swap currently

World Statistics:

  • MCTLSZ 1307.27 -> The amount of guest physical memory that has been reclaimed by the balloon driver is 1307.27MB
  • MCTLTGT 1307.27 -> The amount of guest physical memory to be kept in the balloon driver is 1307.27MB
  • SWCUR 146.61 -> The current amount of memory that has been swapped is 146.61MB
  • SWTGT 83.75 -> The target amount of memory that needed to be swapped was 83.75MB

Now that we know what these metrics mean and what the associated values are we can easily draw a conclusion:

At one point the host has most likely been overcommitted. However, currently there is no memory pressure (state = high, >6% free memory) as there is 1393MB of memory available. The metric "SWCUR" seems to indicate that swapping has occurred; however, currently the host is not actively reading from or writing to swap (0.00 r/s and 0.00 w/s).
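
If you want to rule out intermittent swapping, you can let esxtop run in batch mode for a while and inspect the swap read/write rates over time afterwards; the interval, iteration count and file name below are arbitrary examples:

    # capture 60 snapshots at a 5 second interval (5 minutes of data)
    esxtop -b -d 5 -n 60 > memstats.csv
    # afterwards, check the swap r/s and w/s columns over time in the CSV
    # (for example in perfmon or a spreadsheet)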

If the host is not experiencing memory pressure, why is the balloon driver still inflated (MCTLTGT 1307.27MB)? Although the host is currently in the high memory state, the amount of available memory almost equals the amount of memory claimed by the balloon driver; deflating the balloon would therefore most likely return the host to a memory-constrained state again.

My recommendation? Cut down on the memory of your VMs! The fact that memory has been granted does not necessarily mean it is actively used, and in this case it leads to serious overcommitment, which in turn leads to ballooning and, even worse, swapping.

One thing to point out though: the amount of "PSHARE" (TPS) is low compared to average environments. Might be something to explore!

PVSCSI and a 64bit OS

Duncan Epping · Jun 8, 2010 ·

Yesterday we had an internal discussion about the support of PVSCSI in combination with a 64bit OS. VMware’s documentation currently states the following:

Paravirtual SCSI adapters are supported on the following guest operating systems:

Windows Server 2008
Windows Server 2003
Red Hat Enterprise Linux (RHEL) 5

source

As we normally spell out every single detail, this KB article is kind of ambiguous in my opinion. To clarify: both the 32-bit and 64-bit versions of the listed operating systems are currently supported (vSphere 4.0). One thing to note though is that there are still limitations; for instance, booting a Linux guest from a disk attached to a PVSCSI adapter is currently not supported.
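
On a side note, the adapter type is normally selected in the vSphere Client, but in the VM's .vmx file a paravirtual SCSI controller shows up along the following lines (a sketch from memory, so verify before hand-editing any configuration files):

    # SCSI controller 0 configured as a paravirtual (PVSCSI) adapter
    scsi0.present = "TRUE"
    scsi0.virtualDev = "pvscsi"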

esxtop -l ?

Duncan Epping · Jun 2, 2010 ·

I received a couple of questions about my esxtop article yesterday, so I guess it wasn't completely clear what "locked" means. I had a difficult time understanding it myself, but I was fortunate enough that one of my colleagues (thanks Valentin) got to the bottom of it and emailed me the following explanation. I rewrote parts of it and this is the outcome; hope it clears things up:

As most of you know, esxtop takes snapshots from VSI nodes (similar to proc nodes) to capture the running entities and their states. The rate at which these snapshots are taken can be changed with the "s" command. The default setting is 5 seconds and the minimum, which most people probably use, is 2 seconds. This means that every entity (worlds, for instance a virtual machine) and the associated info is queried again every two seconds. As many of the metrics shown in esxtop are calculated based on the difference between two successive snapshots, e.g. %USED (CPU), esxtop simply rereads all the info (all entities and all values) and recalculates the values of the metrics.

As you can imagine, this can cause stress on your CPU in a very large environment. The reason for this is the amount of data that needs to be gathered for these entities and the number of calculations which need to take place. With "lock mode" enabled, however, only the changing stats for those entities will be read from the VSI nodes. The entities themselves (VMs, worlds, LUNs etc.) will be copied over from the first snapshot that was taken when esxtop was started. This does mean, however, that when a new helper world is spawned, or a virtual machine is powered on or VMotioned to the host, it will not appear within esxtop until esxtop is restarted!

Below you see an example of the entities and values that will definitely not change as long as esxtop is running in lock mode. All other stats will be updated, and you are still free to select whatever fields you want; everything will be available as if nothing happened.

Since those entities and their relations don't have to be read and recalculated every time, esxtop's CPU consumption will drop significantly. Again, please note that when a new VM is powered on, a VM is VMotioned to the host, or a new world is created, it will not show up within esxtop when "-l" is used, as the entities are locked! This also applies to starting esxtop in batch mode with -b; a combined invocation is shown below.
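
For a long-running capture on a busy host, the two can be combined; the interval, number of iterations and file name below are arbitrary examples:

    # lock mode (-l) plus batch mode (-b): low-overhead capture to CSV,
    # but entities that appear after startup (new VMs, new worlds) are not included
    esxtop -l -b -d 2 -n 1800 > locked-capture.csv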

