
Yellow Bricks

by Duncan Epping


performance

Evaluating SSDs in Virtualized Datacenters by Irfan Ahmad

Duncan Epping · Jun 3, 2013 ·

Flash-based solid-state disks (SSDs) offer impressive performance capabilities and are all the rage these days. Rightly so? Let’s find out how you can assess the performance benefit of SSDs in your own datacenter before purchasing anything and without expensive, time-consuming and usually inaccurate proofs-of-concept.

** Please note that this article was written by Irfan Ahmad. Follow him on Twitter, make sure to attend his webinar on this topic on the 5th of June, and vote for CloudPhysics in the big data startup top 10. **

I was fortunate enough to have started the very first project at VMware that optimized ESX to take advantage of Flash and SSDs. Swap to Host Cache (aka Swap-to-SSD) shipped in vSphere 5. For those customers wanting to manage their DRAM spend, this feature can be a huge cost saving. It also continues to serve as a differentiator for vSphere against competitors.

Swap-to-SSD has the distinction of being the first VMware project to fully utilize the capabilities of flash, but it is certainly not the only one. Since then, every established storage vendor has entered this area, not to mention a dozen awesome startups. Some have solutions that apply broadly to all compute infrastructures, while others have products specifically designed for the hypervisor platform.

The performance capabilities of flash are indeed impressive, but they can cost a pretty penny. Marketing machines are in full force trying to convince you that you need a shiny hardware or software solution. An important question remains: can the actual benefit keep up with the hype? The results are mixed and worth reading through.

[Read more…] about Evaluating SSDs in Virtualized Datacenters by Irfan Ahmad

Do I still need to set “HaltingIdleMsecPenalty” with vSphere 5.x?

Duncan Epping · Feb 4, 2013 ·

I received a question last week from a customer. They have a fairly big VDI environment and are researching the migration to vSphere 5.1. One of the changes they made in the 4.1 time frame was setting the advanced option "HaltingIdleMsecPenalty" in order to optimize hyper-threading fairness for their specific desktop environment. I knew that this was no longer needed but didn't have an official reference for them (there is a blog post by Tech Marketing performance guru Mark A. that mentions it, though). Today I noticed it was mentioned in a recently released whitepaper titled "The CPU Scheduler in VMware vSphere 5.1". I recommend everyone read this whitepaper, as it gives you a better understanding of how the scheduler works and how it has been improved over time.

The following section is an excerpt from that whitepaper.

Improvement in Hyper-Threading Utilization

In vSphere 4.1, a strict fairness enforcement policy on HT systems might not allow achieving full utilization of all logical processors in a situation described in KB article 1020233 [5]. This KB also provides a work-around based on an advanced ESX host attribute, “HaltingIdleMsecPenalty”. While such a situation should be rare, a recent change in the HT fairness policy described in “Policy on Hyper-Threading,” obviates the need for the work-around. Figure 8 illustrates the effectiveness of the new HT fairness policy for VDI workloads. In the experiments, the number of VDI users without violating the quality of service (QoS) requirement is measured on vSphere 4.1, vSphere 4.1 with “HaltingIdleMsecPenalty” tuning applied, and vSphere 5.1. Without the tuning, vSphere 4.1 supports 10% fewer users. On vSphere 5.1 with the default setting, it slightly exceeds the tuned performance of vSphere 4.1.
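If you want to verify whether the old work-around is still configured on your hosts before upgrading, one option is to query the advanced setting programmatically. Below is a minimal sketch using pyVmomi (the vSphere Python SDK); the vCenter name and credentials are placeholders, and the option key "Cpu.HaltingIdleMsecPenalty" is an assumption on my part, so verify it against KB 1020233 for your build.

```python
# Minimal sketch: report the HaltingIdleMsecPenalty advanced option on every host.
# Assumptions: pyVmomi is installed, and the option key is "Cpu.HaltingIdleMsecPenalty"
# (verify against KB 1020233); hostname and credentials below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

OPTION_KEY = "Cpu.HaltingIdleMsecPenalty"   # assumed key name

def report_option(vc, user, pwd):
    ctx = ssl._create_unverified_context()  # lab use only: skips certificate checks
    si = SmartConnect(host=vc, user=user, pwd=pwd, sslContext=ctx)
    try:
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.HostSystem], True)
        for host in view.view:
            try:
                for opt in host.configManager.advancedOption.QueryOptions(OPTION_KEY):
                    print(f"{host.name}: {opt.key} = {opt.value}")
            except vim.fault.InvalidName:
                print(f"{host.name}: option not present")
        view.DestroyView()
    finally:
        Disconnect(si)

if __name__ == "__main__":
    report_option("vcenter.example.local", "administrator", "secret")
```

Any host still reporting a non-default value is a candidate for resetting the option as part of the 5.1 migration, since the new HT fairness policy makes the tuning unnecessary.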

Nice advanced ESXTOP tip from #VMworld session INF-VSP1423

Duncan Epping · Sep 24, 2012 ·

Today I was watching INF-VSP1423 – esxtop for Advanced Users by Krishna Raj Raja. This is a VMworld 2012 San Francisco session; if you attended SF but did not catch this session, look it up and watch it… If you are going to VMworld Barcelona, schedule it. It is an excellent, deeply technical session with some great insights, presented by a very smart VMware engineer. There was a tip in there which I found very useful.

Krishna showed an example where he noticed a lot of I/O being generated on a particular LUN. How do you figure out who or what is causing this? Well, it is not as difficult as you might think…

  • Open up esxtop (more details on my esxtop page)
  • Go to the "Disk Device" view (press "u")
  • Find the device which is causing a lot of I/O
  • Press "e" and enter the "Device ID"; in my case that is an NAA identifier, so "copy+paste" is easiest here
  • Now look up the World ID under the “path/world/partition” column
  • Go back to the CPU view (press "c") and sort on %USED (press "U")
  • Expand (press “e”) the world that is consuming a lot of CPU, as CPU is needed to drive I/O

This should enable you to figure out which world is driving the high I/O load. Now you can kill it, or contact the user / admin causing it… nice, right?
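You can run the same investigation offline as well: esxtop's batch mode (esxtop -b) dumps all counters to CSV, and a small script can correlate the busy device with the busy worlds afterwards. Here is a rough sketch; the NAA identifier is hypothetical and the counter-name substrings are assumptions that differ per esxtop build, so inspect the header row of your own capture first.

```python
# Rough sketch: scan an esxtop batch-mode capture (e.g. esxtop -b -d 10 -n 30 > capture.csv)
# for the counters that reference a given device, then list the busiest CPU groups,
# since CPU is needed to drive I/O. Counter names vary per build -- the substring
# filters below ("Commands", "Group Cpu", "% Used") are assumptions, adjust as needed.
import csv

CAPTURE = "capture.csv"
DEVICE = "naa.600508b1001c"   # hypothetical NAA identifier, replace with your own

def peak_per_column(path, substrings):
    """Return {column_name: max_value} for columns whose name contains all substrings."""
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader)
        wanted = {i: name for i, name in enumerate(header)
                  if all(s.lower() in name.lower() for s in substrings)}
        peaks = {name: 0.0 for name in wanted.values()}
        for row in reader:
            for i, name in wanted.items():
                try:
                    peaks[name] = max(peaks[name], float(row[i]))
                except (ValueError, IndexError):
                    pass
    return peaks

# 1. Which counters on the suspect device saw the most commands?
for name, peak in sorted(peak_per_column(CAPTURE, [DEVICE, "Commands"]).items(),
                         key=lambda kv: kv[1], reverse=True)[:5]:
    print(f"{peak:10.1f}  {name}")

# 2. Which worlds/groups burned the most CPU while the device was busy?
for name, peak in sorted(peak_per_column(CAPTURE, ["Group Cpu", "% Used"]).items(),
                         key=lambda kv: kv[1], reverse=True)[:5]:
    print(f"{peak:10.1f}  {name}")
```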

There are some more nuggets in this session around PSTATE (power state), co-sharing, Host Caching (llswp) and much more… I am not going to reveal those, as you should attend this session or, at a minimum, watch it online.

VMworld session report: INF-STO2223 – Tech Preview vSphere Integration with Existing Storage

Duncan Epping · Sep 7, 2012 ·

A couple of weeks ago I posted an article about Virtual Volumes aka vVOLs. This week at VMworld, Thomas (Tom) Phelan and Vijay Ramachandran delivered a talk which again addressed this topic, but they added Virtual Flash to the mix. The session was "INF-STO2223".

For those attending Barcelona, sign up for it! It is currently scheduled once on Wednesday at 14:00.

The session started out with a clear disclaimer: this was a technology preview, and there is no guarantee whatsoever that this piece of technology will ever be released.

Tom Phelan covered Virtual Flash and Vijay covered Virtual Volumes, but as Virtual Volumes was extensively covered in my other blog post, I would like to refer back to that post for more details on that topic. This blog post will discuss the "Virtual Flash" portion of the presentation; Virtual Flash, or vFlash for short, is often also called "SSD caching".

The whole goal of the Virtual Flash project is to allow vSphere to manage SSD as a cluster resource, just like CPU and memory today. Sounds familiar to those who read the blog post about vCloud Distributed Storage, right?! The result of this project should be a framework which allows partners to plug in their caching solution and utilize SSD resources more effectively, without some of the current limitations.

Virtual Flash can be VM-transparent but also VM-aware, meaning that it should, for instance, be possible to allocate resources per virtual machine or virtual disk. Some controls that should be included are reservations, shares, and limits. On top of that, it should fully work with vMotion and integrate with DRS.

Two concepts were explained:

  1. VM-transparent caching
  2. VM-aware caching

VM-transparent caching uses a hypervisor kernel caching module which sits directly in the virtual disk's data path. It can be used in two modes: write-thru cache (read only) and write-back cache (read and write). On top of that, it will provide the ability to migrate cache content during a vMotion or discard the cache.
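To make the difference between the two modes a bit more tangible, here is a deliberately simplified, generic sketch of write-thru versus write-back semantics in front of a backing store. This is purely conceptual and does not represent the hypervisor kernel module previewed in the session; the class and method names are mine.

```python
# Conceptual sketch only: generic write-thru vs write-back cache semantics,
# not the hypervisor caching module discussed in INF-STO2223.
class CachedDisk:
    def __init__(self, backing, write_back=False):
        self.backing = backing        # dict standing in for the virtual disk
        self.cache = {}               # dict standing in for the flash cache
        self.dirty = set()            # blocks not yet flushed (write-back only)
        self.write_back = write_back

    def read(self, block):
        if block not in self.cache:               # cache miss: fetch from the disk
            self.cache[block] = self.backing.get(block)
        return self.cache[block]

    def write(self, block, data):
        self.cache[block] = data
        if self.write_back:
            self.dirty.add(block)                 # acknowledged before hitting the disk
        else:
            self.backing[block] = data            # write-thru: disk is updated immediately

    def flush(self):
        """Write dirty blocks back, e.g. before the cache is discarded for a migration."""
        for block in self.dirty:
            self.backing[block] = self.cache[block]
        self.dirty.clear()
```

The flush step is also why write-back mode needs either its cache content migrated along with the VM or a write-back pass before the cache can simply be discarded during a vMotion.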

VM-aware caching is a type of caching where the Virtual Flash resource is presented directly to the virtual machine as a device. This allows the virtual machine to control the caching algorithm. In this case the cache will automatically "follow" the virtual machine during migration. It should be pointed out that if the VM is powered off, the cache is flushed.

For those managing virtual environments, architecting them, or providing health check services… think about the most commonly faced problem; yes, that typically is storage performance related. Just imagine for a second having a caching solution at your disposal which could solve most of these problems… Indeed, that would be awesome. Hopefully we will hear more soon!

Why is %WAIT so high in esxtop?

Duncan Epping · Jul 17, 2012 ·

I got this question today around %WAIT and why it was so high for all these VMs. I grabbed a screenshot from our test environment. It shows %WAIT next to %VMWAIT.

First of all, I suggest looking at %VMWAIT; in my opinion it is more relevant than %WAIT. %VMWAIT is a derivative of %WAIT; however, it does not include %IDLE time, but it does include %SWPWT and the time the VM is blocked when a device is unavailable. That immediately reveals why %WAIT seems extremely high: it includes %IDLE! Another thing to note is that the %WAIT for a VM is multiple worlds combined into a single metric. Let me show you what I mean:

As you can see there are 5 worlds, which explains why the %WAIT time is constantly around 500% when the VM is not doing much. Hope that helps…

<edit> One of my colleagues just pointed me to this great KB article. It explains various CPU metrics in depth. The key takeaway from that article for me is the following: %WAIT + %RDY + %CSTP + %RUN = 100%. Note that this is per world! Thanks, Daniel, for pointing this out!</edit>
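To tie the two observations together with some illustrative numbers (not taken from the screenshots): if each of the 5 worlds obeys that per-world identity and the VM is mostly idle, every world's %WAIT sits close to 100%, so the group total lands near 500%.

```python
# Illustrative numbers only: five worlds of a mostly idle VM, each obeying
# %WAIT + %RDY + %CSTP + %RUN = 100 per world.
worlds = [
    # (%RUN, %RDY, %CSTP) -- %WAIT is whatever is left of 100%
    (2.0, 0.5, 0.0),
    (1.0, 0.3, 0.0),
    (0.2, 0.1, 0.0),
    (0.1, 0.1, 0.0),
    (0.1, 0.0, 0.0),
]
group_wait = sum(100.0 - (run + rdy + cstp) for run, rdy, cstp in worlds)
print(f"group %WAIT ~ {group_wait:.0f}%")   # ~496%, i.e. the ~500% seen in esxtop
```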

