
Yellow Bricks

by Duncan Epping



Re: Large Pages (@gabvirtualworld @frankdenneman @forbesguthrie)

Duncan Epping · Jan 26, 2011 ·

I was reading an article by one of my Tech Marketing colleagues, Kyle Gleed, and coincidentally Gabe published an article about the same topic, to which Frank replied, and just now Forbes Guthrie… the topic being Large Pages. I have written about this topic many times in the past, and Kyle, Gabe, Forbes, and Frank all mentioned the possible impact of large pages, so I won’t go into detail.

There appear to be a lot of concerns around the benefits of leaving Large Pages enabled and the possible downside in terms of monitoring memory usage. There are a couple of things I want to discuss, as I have the feeling that not everyone fully understands the concept.

First of all, what are Large/Small Pages? Small Pages are regular 4k memory pages and Large Pages are 2m pages; I guess the difference is pretty obvious. Now, as Frank explained, when using Large Pages there is a difference in TLB (translation lookaside buffer) entries: basically, a VM provisioned with 2GB would need roughly 1,000 TLB entries with Large Pages and 512,000 with Small Pages. Now you might wonder what this has got to do with your VM. Well, that’s easy: if you have a CPU that has EPT (Intel) or RVI (AMD) capabilities, the VMkernel will try to back ALL pages with Large Pages.

Please read that last sentence again and spot what I tried to emphasize: all pages. So, in other words, where Gabe was talking about “does your Application really benefit from” Large Pages, I would like to state that this is irrelevant. We are not merely talking about just your application, but about your VM as a whole. By backing all pages with Large Pages the chances of TLB misses are decreased, and for those who never looked into what the TLB does, I would suggest reading this excellent wikipedia page. Let me give you the conclusion, though: TLB misses will increase latency from a memory perspective.
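For the record, the exact figures behind that roughly-1,000-versus-512,000 comparison are simple arithmetic. Here is a quick sketch in Python (nothing VMware-specific, and assuming one TLB entry per page):

```python
# Exact page counts behind the rounded numbers above; pure arithmetic.
VM_MEMORY  = 2 * 1024**3   # 2GB VM in bytes
SMALL_PAGE = 4 * 1024      # 4k Small Page
LARGE_PAGE = 2 * 1024**2   # 2m Large Page

small = VM_MEMORY // SMALL_PAGE   # 524,288 pages
large = VM_MEMORY // LARGE_PAGE   # 1,024 pages

print(f"Small Pages: {small:,} TLB entries to map the whole VM")
print(f"Large Pages: {large:,} TLB entries to map the whole VM")
print(f"Factor: {small // large}x fewer entries with Large Pages")
```

So the precise numbers are 524,288 versus 1,024 entries, a 512x difference, which is exactly the ratio between a 2m and a 4k page.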

That’s not all; the other thing I wanted to share is the “impact” of breaking up the Large Pages into Small Pages when there is memory pressure. As Frank so elegantly stated, “the VMkernel will resort to share-before-swap and compress-before-swap”. There is no nicer way of expressing uber-sweetness, I guess. One thing that Frank did not mention, though, is that when the VMkernel detects that memory pressure has been relieved, it will start defragmenting Small Pages to form Large Pages again, so that the workload can once again benefit from the performance increase these bring.
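To make the order of events concrete, here is a minimal conceptual sketch of the reclamation flow described above. This is NOT VMkernel code; all names, structure, and numbers are purely illustrative:

```python
# Conceptual sketch only: break a Large Page into Small Pages under memory
# pressure, then try the cheap reclamation techniques before swapping.
LARGE, SMALL = 2 * 1024**2, 4 * 1024   # 2m and 4k pages

def try_share(page):    return page["shareable"]     # transparent page sharing
def try_compress(page): return page["compressible"]  # memory compression

def reclaim(large_page):
    """Share-before-swap and compress-before-swap: swapping is the last resort."""
    small_pages = [dict(large_page) for _ in range(LARGE // SMALL)]  # 512 x 4k
    swapped = 0
    for p in small_pages:
        if try_share(p) or try_compress(p):
            continue
        swapped += 1   # only pages that could be neither shared nor compressed
    return swapped

# Example: a page whose contents can be compressed but not shared.
print(reclaim({"shareable": False, "compressible": True}))  # 0 pages swapped
```

The point of the sketch is simply the ordering: swap is only hit once the cheaper options are exhausted, and once pressure is gone the Small Pages are defragmented back into Large Pages.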

Now the question remains what kind of performance benefits we can expect, as some appear to be under the impression that when the application doesn’t use large pages there is no benefit. I have personally conducted several tests with a XenApp workload and measured a 15% performance increase, and on top of that fewer peaks and lower response times. Now, this isn’t a guarantee that you will see the same behavior or results, but I can assure you it is beneficial for your workload regardless of what types of pages are used. Small on Large or Large on Large: all will benefit, and so will you…

I guess the conclusion is, don’t worry too much as vSphere will sort it out for you!

Cool Tool Update: RVTools 3.0

Duncan Epping · Jan 23, 2011 ·

While I was enjoying some family time yesterday, Eric Sloof stole my usual RVTools scoop. Nevertheless, I believe it is worth publishing this, as RVTools is one of the most valuable free non-vendor tools out there. Rob de Veij released a major version of RVTools. There are a couple of major improvements in this version, hence the reason it took Rob slightly longer than expected to come out with this update.

Here are the improvements in RVTools 3.0:

  • Pass-through authentication implemented. Allows you to use your logged-on Windows credentials to automatically log on.
  • All numeric columns are now formatted to make them more readable.
  • On vInfo the columns Committed, Uncommitted, and Shared, and on vSnapshot the column Size, are now formatted in MBs instead of bytes.
  • New tab page created with Service Console and VMkernel information.
  • Now using the vSphere Web Services SDK 4.1, which supports the new features available in vSphere 4.1.
  • Export to csv file now uses the Windows regional separator.
  • Using NPOI to make it possible to write directly to xls files without the need for an installed Excel version on the system.
  • New menu function to write all information to one Excel workbook, with a new worksheet for each tab page.
  • New command line options. Check the documentation!

Download it now!

Enable Storage IO Control on all Datastores!

Duncan Epping · Jan 20, 2011 ·

This week I received an email from one of my readers about some weird Storage IO Control behavior in their environment. On a regular basis he would receive an error stating that an “external I/O workload has been detected on shared datastore running Storage I/O Control (SIOC) for congestion management”. He did a quick scan of his complete environment and couldn’t find any other hosts connecting to those volumes. After exchanging a couple of emails about the environment, I managed to figure out what triggered this alert.

Now, this all sounds very logical, but it is probably one of the most commonly made mistakes… sharing spindles. Some storage platforms carve out a volume from a specific set of spindles, meaning those spindles are solely dedicated to that particular volume. Other storage platforms, however, group spindles and layer volumes across them. Simply said, they share spindles to increase performance. NetApp’s “aggregates” and HP’s “disk groups” are good examples.

This can, and probably will, cause the alarm to be triggered, as essentially an unknown workload is impacting your datastore’s performance. If you are designing your environment from the ground up, make sure that SIOC is enabled on all VMFS volumes that are backed by the same set of spindles.

In an existing environment, however, this will be difficult. Don’t worry, though: SIOC will not be overly conservative and unnecessarily throttle your virtual workload. If and when SIOC detects an external workload, it will stop throttling the virtual workload, to avoid giving the external workload more bandwidth while negatively impacting the virtual workload. From a throttling perspective that looks as follows:

32 29 28 27 25 24 22 20 (detect nonVI –> Max Qdepth )
32 31 29 28 26 25 (detect nonVI –> Max Qdepth)
32 30 29 27 25 24 (detect nonVI –> Max Qdepth)
…..

Please note that the above example depicts a scenario where SIOC notices that the latency threshold is still being exceeded, so the cycle starts again; SIOC checks latency values every 4 seconds. The question, of course, remains how SIOC knows that there is an external workload accessing the datastore. SIOC uses what we call a “self-learning algorithm”: it keeps track of historically observed latency, outstanding IOs, and window sizes. Based on that info it can identify anomalies, and that is what triggers the alarm.
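To make that throttling pattern concrete, here is a minimal, purely illustrative simulation in Python. This is not the actual SIOC algorithm (the decrements are made up, and external-workload detection is reduced to a fixed tick), though 30ms is the default congestion threshold in vSphere 4.1:

```python
import random

MAX_QDEPTH = 32
THRESHOLD_MS = 30   # default SIOC congestion threshold in vSphere 4.1

def throttle_cycle(latencies_ms, external_at):
    """One evaluation per 4-second tick: throttle the device queue while the
    latency threshold is exceeded; reset to max once a non-VI workload is
    detected, so the virtual workload is not starved."""
    qdepth, trace = MAX_QDEPTH, []
    for tick, latency in enumerate(latencies_ms):
        if tick == external_at:                  # anomaly: external workload
            trace.append("(detect nonVI -> Max Qdepth)")
            qdepth = MAX_QDEPTH                  # stop throttling
            break
        trace.append(str(qdepth))
        if latency > THRESHOLD_MS and qdepth > 4:
            qdepth -= random.randint(1, 3)       # gradually reduce queue depth
    return " ".join(trace)

# Latency stays above the threshold; an external workload shows up at tick 8:
print(throttle_cycle([35] * 10, external_at=8))
# e.g. "32 31 28 27 24 23 20 19 (detect nonVI -> Max Qdepth)"
```

Run it a few times and you get traces shaped exactly like the ones above: a gradually shrinking queue depth, then a reset to the maximum as soon as the external workload is detected, after which the cycle starts again if latency is still too high.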

To summarize:

  • Enable SIOC on all datastores that are backed by the same set of spindles
  • If you are designing a greenfield implementation, try to avoid sharing spindles between non-VMware and VMware workloads

More details about when this event could be triggered can be found in this KB article.

Re: Do You Really Need the vMA? (reply to @maishsk)

Duncan Epping · Jan 20, 2011 ·

I was going through the Planet V12n blog posts and noticed one by Maish. His question was “Do You Really Need the vMA?”, and I guess that is a valid question. The main theme of his article is PowerCLI good, vMA bad. Well, that might be a bit overdone on my part, but when reading this quote you get my drift:

So my question to you is, what do we need to keep the vMA for? If someone would tell me because of the built-in syslog server which is available – that does not sell me. The amount of API objects that are currently available for use in PowerCLI – that are not available or exposed through the Perl SDK, has continuously been rising with each new version released. What is the reason to keep the vMA around in the future? What (if anything) is the necessity for you to have a vMA? Is there anything that you cannot do today with PowerCLI that can only be done with the vMA?

As Maish points out, the things you can do with the vMA can also “easily” be done through the use of PowerCLI. Now, if you are a PowerCLI expert like Alan Renouf or Luc Dekens, that is certainly true, as they know how to deal with all the API objects; not all features are exposed through standard cmdlets. I guess I just touched on the major pain point of any scripting language out there: you need to be an expert for quick results. So let’s line up a couple of things here to make my point clear:

  • First and foremost, VMware is moving away from the Service Console; direct console access is not what an appliance-type hypervisor (ESXi) is about!
  • People use the console primarily for troubleshooting, agents, and bash scripts.

I guess that says enough. Who among you troubleshoots their environment using a scripting language? Yes, maybe William Lam, or again Luc / Alan, but “normal” people who don’t think in PowerCLI statements or Perl code will not. No, we will grab resxtop/esxtop to do performance troubleshooting. Yes, I know you can do more or less the same with PowerCLI, but let’s be honest: nothing beats flicking through those metrics after typing “resxtop”.

Or, even better, having access to the whole suite of “esxcfg-” commands. I don’t know about you, but when I want to troubleshoot my environment I need the “esxcfg-” commands at my disposal, and I might not always get direct console access in an ESXi environment.

On top of that, as a Consultant/Architect I see much value in having a single appliance running in an environment where you can store your scripts (just look at the huge archive that William Lam has created over the years) and run them whenever you please. I even used to have a vMA all set up in VMware Workstation and carried it around with all sorts of scripts; this way I didn’t need to access the customer’s “backbone”.

You can guess by now that I believe the vMA has value, and that this value lies not only in having the command-line tools you have become accustomed to over the years at your disposal, but also in the option to “convert” your bash scripts to run within the vMA.

Don’t get me wrong, I am certainly not trying to dismiss PowerCLI here. PowerCLI has proven itself over and over again and is probably the scripting language with the least steep learning curve. However, it is by no means a tool for in-depth troubleshooting, as it simply requires too much hands-on experience to extract the details you need when your (internal) customer is breathing down your neck.

Yes, the vMA is here to stay indeed.

Storage IO Control and Storage vMotion?

Duncan Epping · Jan 14, 2011 ·

I received a very good question this week to which I did not have the answer; I had a feeling, but that is not enough. The question was whether Storage vMotion would be “throttled” by Storage IO Control. As I happened to have a couple of meetings scheduled this week with the actual engineers, I asked the question, and this was their answer:

Storage IO Control can throttle Storage vMotion when the latency threshold is exceeded. The reason for this is that Storage vMotion is “billed” to the virtual machine.

This basically means that if you initiate a Storage vMotion, the “process” belongs to the VM, and as such, if the host is throttled, the Storage vMotion process might be throttled as well by the local scheduler (SFQ), depending on the amount of shares that were originally allocated to this virtual machine. Definitely something to keep in mind when doing a Storage vMotion of a large virtual machine, as it could potentially increase the amount of time it takes for the Storage vMotion to complete. Don’t get me wrong, that is not necessarily a negative thing, because at the same time it will prevent that particular Storage vMotion from consuming all available bandwidth.
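To visualize what “billed to the virtual machine” means in practice, here is a rough sketch of dividing a throttled queue depth across VMs proportionally to their disk shares. This is illustrative only, not the actual SFQ implementation; the share values and queue depth are made up:

```python
# Rough illustration: when SIOC throttles the host's device queue, each VM
# gets a slice proportional to its shares, and a Storage vMotion competes
# within the slice of the VM being moved. Not the real SFQ scheduler.

def queue_slots(throttled_qdepth, shares):
    """Divide a throttled device queue depth across VMs by their shares."""
    total = sum(shares.values())
    return {vm: max(1, throttled_qdepth * s // total)
            for vm, s in shares.items()}

# Three VMs on the datastore; "bigvm" is being Storage vMotioned, so the
# migration I/O is billed against bigvm's own slice of the queue.
shares = {"bigvm": 1000, "vm2": 1000, "vm3": 2000}
print(queue_slots(throttled_qdepth=16, shares=shares))
# {'bigvm': 4, 'vm2': 4, 'vm3': 8} -> the svMotion shares bigvm's 4 slots
```

The design choice makes sense: by billing the migration to the VM, a Storage vMotion can never consume more of a congested datastore than the VM’s shares entitle it to.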

