
Yellow Bricks

by Duncan Epping



HA: Did you know?

Duncan Epping · Sep 20, 2009 ·

Did you know that…

  • the best practice of increasing the failure detection time (das.failuredetectiontime) from 15000 to 60000 for an Active/Standby Service Console setup has been deprecated as of vSphere.
    (In other words, for Active/Standby leave it set to the default of 15000 on vSphere.)
  • the limit of 100 VMs per host is actually “100 powered-on and HA-enabled VMs”. Of course this also goes for the 40 VM limit for clusters with more than 8 hosts.
  • the limit of 100 VMs per host in an HA cluster with fewer than 9 hosts is a soft limit.
  • das.isolationaddress[0-9] is one of the most underrated advanced settings.
    It should be used as an additional safety net to rule out false positives (a scripted example follows below).

Just four little things most people don’t seem to realize or know…
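
As an aside, das.isolationaddress does not have to be set through the UI. Below is a minimal pyVmomi sketch that adds das.isolationaddress0 as an HA advanced option; the vCenter address, credentials and the cluster name “Cluster01” are placeholders, and the address you point it at should be something reliably pingable from the Service Console, such as a gateway.

```python
# Minimal pyVmomi sketch: add das.isolationaddress0 as an HA advanced option.
# vCenter address, credentials and the cluster name "Cluster01" are placeholders.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="secret")
content = si.RetrieveContent()

# Look up the cluster by name.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster01")
view.Destroy()

# Add an extra isolation address next to the default gateway check.
spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(
        option=[vim.option.OptionValue(key="das.isolationaddress0",
                                       value="10.0.0.254")]))
cluster.ReconfigureComputeResource_Task(spec, modify=True)

Disconnect(si)
```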

IO DRS – Providing Performance Isolation to VMs in Shared Storage Environments (TA3461)

Duncan Epping · Sep 16, 2009 ·

This was probably one of the coolest sessions of VMworld. Irfan Ahmad was the host of this session and some of you might know him from Project PARDA. The PARDA whitepaper describes the algorithm being used and how customers could benefit from it in terms of performance. As Irfan stated, this is still in a research phase. Although the results are above expectations, it’s still uncertain whether this will be included in a future release and, if it is, when. There are a couple of key takeaways that I want to share:

  • Congestion management at a per-datastore level -> set IOPS limits and shares per VM
  • Check the proportional allocation of the VMs to be able to identify bottlenecks.
  • With I/O DRS, throughput for tier 1 VMs will increase when demanded (more IOPS, lower latency), of course based on the limits/shares specified.
  • CPU overhead is limited -> my take: with today’s hardware I wouldn’t worry about an overhead of a couple of percent.
  • “If it’s not broken, don’t fix it” -> if latency is low for all workloads on a specific datastore, do not take action; only act above a certain threshold! (A toy sketch of this idea follows the list.)
  • I/O DRS does not take SAN congestion into account, but the SAN is less likely to be the bottleneck.
  • Researching the use of Storage VMotion to move VMDKs around when there’s congestion at the array level.
  • Interacting with queue depth throttling.
  • Deals with end-points and would co-exist with PowerPath.
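
To make the latency-threshold bullet a bit more concrete, here is a toy Python sketch of the idea as I understood it from the session: leave every VM alone while datastore latency is below a threshold, and only when it goes above divide the datastore’s queue slots in proportion to the configured shares. The names and numbers are made up for illustration; this is not the actual PARDA algorithm.

```python
# Toy illustration of the "only act above a latency threshold, then divide
# capacity by shares" idea from the session. Not the actual PARDA algorithm.

LATENCY_THRESHOLD_MS = 30    # assumed congestion threshold for the datastore
DATASTORE_QUEUE_SLOTS = 64   # assumed total outstanding I/Os to hand out

def allocate_slots(vms, observed_latency_ms):
    """vms maps a VM name to a (shares, demanded_slots) tuple."""
    if observed_latency_ms <= LATENCY_THRESHOLD_MS:
        # "If it's not broken, don't fix it": everyone simply gets its demand.
        return {name: demand for name, (shares, demand) in vms.items()}
    # Congested: entitlement is proportional to shares, capped by demand.
    total_shares = sum(shares for shares, _ in vms.values())
    return {name: min(demand, DATASTORE_QUEUE_SLOTS * shares // total_shares)
            for name, (shares, demand) in vms.items()}

# The high-share tier 1 VM keeps its demand; the low-share VM gets throttled.
print(allocate_slots({"tier1-db": (2000, 40), "test-vm": (500, 40)}, 45))
```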

That’s it for now… I just wanted to make a point: there’s a lot of cool stuff coming up. Don’t be fooled by the lack of announcements (according to some people, although I personally disagree) during the keynotes. Start watching the sessions; there’s a lot of knowledge to be gained!

VMware Availability Solutions and Futures (BC3425 – Banjot Chanana)

Duncan Epping · Sep 16, 2009 ·

I was just replaying Banjot Chanana’s session “VMware Availability Solutions and Futures”. Banjot is the product manager for the availability solutions HA and FT. I met Banjot in Palo Alto the week before VMworld and we spoke about HA, present and futures. Unfortunately I can’t elaborate on anything that was discussed, but I can repeat what Banjot spoke about during his session at VMworld.

The most exciting part of the presentation, for me at least, starts at roughly 35:40. Banjot starts to elaborate on futures, and especially when the 3D model gets expanded with “Stretched Clusters with FT” and “Stretched HA Clusters” I start to get interested. Some bullet points on future developments:

  • VM Component Protection -> loss of storage / loss of VM network -> fail-over / alert
    Drives higher availability against granular outages
  • Stretched HA Clusters -> Carving up clusters into “sub-clusters” by tagging VMs -> fail-over to the other “sub-cluster” based on affinity
    Drives higher availability against site failures
  • Application Monitoring -> Application awareness / correlation between infrastructure and application events -> SLA awareness also performance by using DRS
    Drives higher availability against application / service failure
  • Host Retirement -> Host health scores would also indicate the “VM readiness” of a host -> VMotion based on host health scores
    Drives higher availability by monitoring host health and taking action when thresholds are exceeded
  • Integrated Availability -> Availability Policies vs per VM settings -> Defining tiers and applying them to sets of VMs -> Based on SLA
    Decreases operational efforts and increases availability by reducing “human errors”

Although some people were disappointed by the lack of announcements of new products, I think there are more than enough exciting features coming up if you know where to find them. Thanks Banjot for these insights!

Future HA developments… (VMworld – BC3197)

Duncan Epping · Sep 15, 2009 ·

I was just listening to “BC3197 – High Availability – Internals and Best Practices” by Marc Sevigny. Marc is one of the HA engineers and is also my primary source of information when it comes to HA. Although most information can be found on the internet, it’s always good to verify your understanding with the people who actually wrote it.

During the session Marc explains, as I’ve also written about in this article, that when a dual host failure occurs the global startup order is not taken into account. With the current version the startup order is processed per host. In other words, “Host A” is handled first, taking startup order into account, and then “Host B”, again taking startup order into account.

During the session, however, Marc revealed that in a future version of HA the global startup settings (cluster based) will be taken into account for any number of host failures! Great stuff. Another thing to mention is that they are also looking into an option which would enable you to pick your primary hosts. For blade environments this will be really useful. Thanks Marc for the insights!

Memory alarms triggered with AMD RVI and Intel EPT?

Duncan Epping · Sep 11, 2009 ·

I’ve reported on this twice already, but it seems a fix will be offered soon. I discovered the problem back in March during a project where we virtualized a large number of Citrix XenApp servers on an AMD platform with RVI capabilities. As hardware MMU virtualization increased performance significantly, it was enabled by default for 32-bit OSes. This is when we noticed that large pages (a side effect of enabling hardware MMU) are not TPS’ed and thus give a totally different view of resource consumption than your average cluster. When vSphere and Nehalem were released more customers experienced this behavior, as EPT (Intel’s version of RVI) is fully supported and utilized on vSphere, as reported in this article. To be absolutely clear: large pages were never supposed to be TPS’ed and this is not a bug, but actually working as designed. However, we did discover an issue with the algorithm being used to calculate Guest Active Memory, which causes the alarms to be triggered as “kichaonline” describes in this reply.

I’m not going to reiterate everything that has been reported in this VMTN Topic about the problem, but what I would like to mention is that a patch will be released soon to fix the incorrect alarms:

Several people have, understandably, asked about when this issue will be fixed. We are on track to resolving the problem in Patch 2, which is expected in mid to late September.

In the meantime, disabling large page usage as a temporary work-around is probably the best approach, but I would like to reiterate that this causes a measurable loss of performance. So once the patch becomes available, it is a good idea to go back and reenable large pages.

Also a small clarification. Someone asked if the temporary work-around would be “free” (i.e., have no performance penalty) for Win2k3 x64 which doesn’t enable large pages by default. While this may seem plausible, it is however not the case. When running a virtual machine, there are two levels of memory mapping in use: from guest linear to guest physical address and from guest physical to machine address. Large pages provide benefits at each of these levels. A guest that doesn’t enable large pages in the first level mapping, will still get performance improvements from large pages if they can be used for the second level mapping. (And, unsurprisingly, large pages provide the biggest benefits when both mappings are done with large pages.) You can read more about this in the “Memory and MMU Virtualization” section of this document:

http://www.vmware.com/resources/techresources/10036

Thanks,
Ole

Mid/late September may sound too vague for some, and that’s probably why Ole reported the following yesterday:

The problem will be fixed in Patch 02, which we currently expect to be available approximately September 30.

Thanks,
Ole
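
For completeness: the temporary work-around Ole mentions (disabling large pages until Patch 02 is out) is normally applied per host by setting the advanced option Mem.AllocGuestLargePage to 0. Below is a minimal pyVmomi sketch of that change; the host address and credentials are placeholders, and as Ole points out you will want to set the option back to 1 once the patch is installed.

```python
# Minimal pyVmomi sketch: disable guest large page allocation on an ESX host
# as a temporary work-around. Host address and credentials are placeholders;
# set Mem.AllocGuestLargePage back to 1 after the patch has been applied.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="esx01.example.com", user="root", pwd="secret")
content = si.RetrieveContent()

# Grab the (first) HostSystem object; adjust the lookup for your environment.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True)
host = view.view[0]
view.Destroy()

host.configManager.advancedOption.UpdateOptions(changedValue=[
    vim.option.OptionValue(key="Mem.AllocGuestLargePage", value=0)])

Disconnect(si)
```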

