
Yellow Bricks

by Duncan Epping



VMware Availability Solutions and Futures (BC3425 – Banjot Chanana)

Duncan Epping · Sep 16, 2009 ·

I was just replaying Banjot Chanana’s session “VMware Availability Solutions and Futures”. Banjot is the product manager for the availability solutions HA and FT. I met Banjot in Palo Alto the week before VMworld, and we spoke about HA, present and future. Unfortunately I can’t elaborate on anything that was discussed, but I can repeat what Banjot covered during his session at VMworld.

The most exciting part of the presentation, for me at least, starts at roughly 35:40. That is where Banjot starts to elaborate on futures; especially when the 3D model gets expanded with “Stretched Clusters with FT” and “Stretched HA Clusters” it gets really interesting. Some bullet points on future developments:

  • VM Component Protection -> loss of storage / loss of VM network -> fail-over / alert
    Drives higher availability against granular outages
  • Stretched HA Clusters -> Carving up Clusters in “sub-clusters” by tagging VMs -> fail-over to other “sub-cluster” based on affinity
    Drives higher availability against site failures
  • Application Monitoring -> Application awareness / correlation between infrastructure and application events -> SLA awareness also performance by using DRS
    Drives higher availability against application / service failure
  • Host Retirement -> Host health scores would also indicate “VM readiness” of a host -> VMotion based on host health scores
    Drives higher availability by monitoring host health and taking action when thresholds are exceeded
  • Integrated Availability -> Availability Policies vs per VM settings -> Defining tiers and applying them to sets of VMs -> Based on SLA
    Decreases operational efforts and increases availability by reducing “human errors”
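To make the “Integrated Availability” bullet a bit more concrete, here’s a tiny Python sketch of what tier-based availability policies could look like. To be clear: all tier names and settings below are my own invention for illustration, not anything VMware announced during the session.

```python
# Toy model: define availability tiers once, then stamp them onto sets
# of VMs, instead of hand-tuning per-VM HA settings (and risking the
# "human errors" Banjot mentioned). Names/values are hypothetical.

TIERS = {
    "gold":   {"restart_priority": "high",   "vm_monitoring": True},
    "silver": {"restart_priority": "medium", "vm_monitoring": True},
    "bronze": {"restart_priority": "low",    "vm_monitoring": False},
}

def apply_tier(vms, tier):
    """Derive per-VM settings from a single tier definition."""
    policy = TIERS[tier]
    return {vm: dict(policy) for vm in vms}

settings = apply_tier(["sql01", "sql02"], "gold")
print(settings["sql01"]["restart_priority"])  # high
```

The point of the model is the direction of the mapping: change the tier definition once, and every VM tagged with that tier picks up the change.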

Although some people were disappointed by the lack of new product announcements, I think there are more than enough exciting features coming up if you know where to find them. Thanks Banjot for these insights!

Future HA developments… (VMworld – BC3197)

Duncan Epping · Sep 15, 2009 ·

I was just listening to “BC3197 – High Availability – Internals and Best Practices” by Marc Sevigny. Marc is one of the HA engineers and is also my primary source of information when it comes to HA. Although most information can be found on the internet it’s always good to verify your understanding with the people who actually wrote it.

During the session Marc explains, as I’ve written about in this article, that when a dual host failure occurs the global startup order is not taken into account. With the current version the startup order is processed per host. In other words, “Host A” is processed first, taking startup order into account, and then “Host B”, again taking startup order into account.
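For those who like to see the difference spelled out, here’s a small Python sketch, my own toy model rather than VMware code, contrasting per-host restart ordering (current behavior) with a cluster-wide ordering:

```python
# Toy model: lower number = higher startup priority. VM and host names
# are made up for illustration.

def per_host_restart(failed_hosts):
    """Current behavior: restart VMs host by host; priority only
    matters within a single host."""
    order = []
    for host, vms in failed_hosts.items():
        order.extend(sorted(vms, key=lambda vm: vm[1]))
    return [name for name, _ in order]

def global_restart(failed_hosts):
    """Future behavior per the session: one priority queue across
    all failed hosts in the cluster."""
    all_vms = [vm for vms in failed_hosts.values() for vm in vms]
    return [name for name, _ in sorted(all_vms, key=lambda vm: vm[1])]

failed = {
    "HostA": [("db01", 1), ("web03", 3)],
    "HostB": [("dc01", 1), ("app02", 2)],
}

print(per_host_restart(failed))  # ['db01', 'web03', 'dc01', 'app02']
print(global_restart(failed))    # ['db01', 'dc01', 'app02', 'web03']
```

Notice how with per-host ordering the low-priority "web03" restarts before the high-priority "dc01" simply because it happens to live on the host that is processed first.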

During the session, however, Marc revealed that in a future version of HA the global startup settings (cluster based) will be taken into account for any number of host failures! Great stuff. Another thing to mention is that they are also looking into an option which would enable you to pick your primary hosts. For blade environments this will be really useful. Thanks Marc for the insights!

Memory alarms triggered with AMD RVI and Intel EPT?

Duncan Epping · Sep 11, 2009 ·

I’ve reported on this twice already, but it seems a fix will be offered soon. I discovered the problem back in March during a project where we virtualized a large number of Citrix XenApp servers on an AMD platform with RVI capabilities. As hardware MMU increased performance significantly, it was enabled by default for 32-bit OSes. This is when we noticed that large pages (a side effect of enabling hardware MMU) are not TPS’ed and thus give a totally different view of resource consumption than your average cluster. When vSphere and Nehalem were released, more customers experienced this behavior, as EPT (Intel’s version of RVI) is fully supported and utilized on vSphere, as reported in this article. To be absolutely clear: large pages were never supposed to be TPS’ed, and this is not a bug but actually working as designed. However, we did discover an issue with the algorithm used to calculate Guest Active Memory, which causes the alarms to be triggered, as “kichaonline” describes in this reply.
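To illustrate why large pages change the TPS picture so dramatically, here’s a toy Python model of page sharing. It is a big simplification of what TPS really does (the real implementation hashes pages and then bit-compares candidates before collapsing them), but it shows the idea:

```python
import hashlib

PAGE_4K = 4096

def tps_sharable(pages):
    """Toy TPS: count pages whose content is identical to an earlier
    page and could therefore be collapsed into a single shared copy."""
    seen = set()
    saved = 0
    for content in pages:
        digest = hashlib.sha1(content).digest()
        if digest in seen:
            saved += 1  # duplicate page: share it instead of keeping a copy
        else:
            seen.add(digest)
    return saved

# Two guests full of identical (e.g. zeroed) 4 KB pages share heavily:
small_pages = [b"\x00" * PAGE_4K] * 8 + [b"unique" + bytes([i]) for i in range(4)]
print(tps_sharable(small_pages))  # 7: all but one zero page are redundant
```

With 2 MB large pages there is simply nothing for this loop to work on: ESX does not break large pages apart to scan them, so those zero pages stop being sharing candidates altogether, and memory consumption looks very different even though the guests haven’t changed.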

I’m not going to reiterate everything that has been reported in this VMTN Topic about the problem, but what I would like to mention is that a patch will be released soon to fix the incorrect alarms:

Several people have, understandably, asked about when this issue will be fixed. We are on track to resolving the problem in Patch 2, which is expected in mid to late September.

In the meantime, disabling large page usage as a temporary work-around is probably the best approach, but I would like to reiterate that this causes a measurable loss of performance. So once the patch becomes available, it is a good idea to go back and reenable large pages.

Also a small clarification. Someone asked if the temporary work-around would be “free” (i.e., have no performance penalty) for Win2k3 x64 which doesn’t enable large pages by default. While this may seem plausible, it is however not the case. When running a virtual machine, there are two levels of memory mapping in use: from guest linear to guest physical address and from guest physical to machine address. Large pages provide benefits at each of these levels. A guest that doesn’t enable large pages in the first level mapping, will still get performance improvements from large pages if they can be used for the second level mapping. (And, unsurprisingly, large pages provide the biggest benefits when both mappings are done with large pages.) You can read more about this in the “Memory and MMU Virtualization” section of this document:

http://www.vmware.com/resources/techresources/10036

Thanks,
Ole

Mid / late September may sound too vague for some, and that’s probably why Ole reported the following yesterday:

The problem will be fixed in Patch 02, which we currently expect to be available approximately September 30.

Thanks,
Ole
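Ole’s point about the two levels of mapping also explains why large pages matter so much with RVI/EPT in the first place. A quick back-of-the-envelope calculation, my own arithmetic using the commonly cited two-dimensional page-walk cost rather than numbers from the paper:

```python
# On a TLB miss under nested paging, each of the n guest page-table
# levels holds a guest-physical address that itself needs an m-level
# nested walk, plus one reference for the guest PTE, plus a final
# nested walk for the resulting guest-physical address:
#   total memory references = n*m + n + m

def nested_walk_refs(guest_levels, host_levels):
    n, m = guest_levels, host_levels
    return n * m + n + m

print(nested_walk_refs(4, 4))  # 24 refs with 4 KB pages at both levels
print(nested_walk_refs(3, 3))  # 15 refs when 2 MB pages drop one level each
```

Compare that to the four references of a native page walk and it becomes clear why disabling large pages, while a valid temporary workaround, carries the measurable performance cost Ole mentions.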

Mythbusters: Hyperthreading and VMware FT

Duncan Epping · Sep 10, 2009 ·

When vSphere was still in beta, one of the requirements for using FT was to have hyperthreading disabled. For most people this wasn’t an issue, as traditional hyperthreading usually did not improve performance and was therefore disabled by default. With Nehalem, however, all this changed. Of course I can’t guarantee a specific percentage of performance increase, but increases of up to 20% have been reported, which is the primary reason for having HT enabled on any Nehalem system.

As you can imagine, the HT requirement for FT has been floating around ever since and is a myth which has never been debunked. I’ve spoken with product management about it and they confirmed it’s an obsolete requirement. Hyperthreading does not have to be disabled for FT to work. Or to put it even more strongly: FT is supported on systems which have hyperthreading enabled. Product management promised me that a KB article will be created to debunk this myth, or an entry will be added to the FT FAQ KB article soon.

UPDATE: The FT FAQ KB Article has been updated and includes the following statement.

Does Fault Tolerance support Intel Hyper-Threading Technology?
Yes, Fault Tolerance does support Intel Hyper-Threading Technology on systems that have it enabled. Enabling or disabling Hyper-Threading has no impact on Fault Tolerance.

Understanding Memory Resource Management in VMware ESX Server

Duncan Epping · Sep 9, 2009 ·

VMware white papers are my primary source of information. Almost every single one of them contains valuable information. VMware just released a brand new white paper titled “Understanding Memory Resource Management in VMware ESX Server” which is most definitely worth reading.

Download:
http://www.vmware.com/files/pdf/perf-vsphere-memory_management.pdf

Description: VMware® ESX™ is a hypervisor designed to efficiently manage hardware resources including CPU, memory, storage, and network among multiple concurrent virtual machines. This paper describes the basic memory management concepts in ESX and the configuration options available, and provides results to show the performance impact of these options.


About the Author

Duncan Epping is a Chief Technologist and Distinguished Engineering Architect at Broadcom. Besides writing on Yellow-Bricks, Duncan is the co-author of the vSAN Deep Dive and the vSphere Clustering Deep Dive book series. Duncan is also the host of the Unexplored Territory Podcast.
