
Yellow Bricks

by Duncan Epping


ha

NetApp is now officially vMSC certified

Duncan Epping · Jul 27, 2012 ·

As many people have asked about this over the last couple of months, I figured I would share it. I just noticed that NetApp is now officially vSphere Metro Storage Cluster (vMSC) certified (see the SAN HCL). NetApp has certified their platform for the following array types:

  • NFS
  • iSCSI

Yes indeed, FC is currently not listed… But for me the great news is that NFS is listed! A KB article has been published with all the details… make sure to read it if you are looking to deploy a stretched cluster with NetApp and vSphere 5.0.

Answering some admission control questions

Duncan Epping · Jul 3, 2012 ·

I received a bunch of questions about HA admission control on this blog post, and I figured I would answer them in a new post so that everyone is able to find and read them. This was the original set of questions:

There are 4 ESXi hosts in the network and 4 VMs (same CPU and RAM reservation for all VMs) on each host. The Admission Control policy is set to 'Host failures cluster tolerates' = 1. All 12 available slots have been used by the powered-on VMs, except the 4 slots reserved for failover.
1) What happens if 2 ESXi hosts fail now? (2 * 4 VMs need to fail over.) Will HA restart only 4 VMs as it has only 4 slots available, and will the restart of the remaining 4 VMs fail?
Same scenario, but the policy is set to 'Percentage of cluster resources reserved' = 25%. All of the available 75% of resources have been utilized by the 16 VMs, except the 25% reserved for failover.
2) What happens if 2 ESXi hosts fail now? (2 * 4 VMs need to fail over.) Will HA restart only 4 VMs as they consume 25% of resources, and will the restart of the other 4 VMs fail?
3) Does HA check the VM reservation (or any other factor) at the time of restart?
4) Does HA only restart a VM if the host can guarantee the reserved resources, or does the restart fail?
5) What if no reservations are set at the VM level?
6) What does HA take into consideration when it has to restart VMs which have no reservation?
7) Will it guarantee the configured resources for each VM?
8) If not, how can HA restart 8 VMs (as per our example) when it only has reserved resources configured for just 4 VMs?
9) Will it share the reserved resources across 8 VMs and not care about the resource crunch, or is it first come first served?
10) Does Admission Control have any role at all in the event of an HA failover?

Let me tackle these questions one by one:

  1. In this scenario 4 VMs will be restarted and 4 VMs might be restarted! Note that the "slot size" policy is used and that it is based on the worst-case scenario. So if your slot is 1GB and 2GHz but your VMs require far less than that to power on, it could be that all VMs are restarted. However, HA only guarantees the restart of 4 VMs. Keep in mind that this scenario doesn't happen too often, as you would be overcommitting to the extreme here. As said, HA will restart all VMs it can; it just needs to be able to satisfy the resource reservations for memory and CPU!
  2. Again, in this scenario HA will do its best to restart all VMs. It can restart VMs until all "unreserved capacity" is used. As HA only needs to guarantee reserved resources, the chance of hitting this limit is very slim; since most people don't use reservations at the VM level, it would mean you are overcommitting extremely. (A quick back-of-the-envelope calculation for both policies follows after this list.)
  3. Yes, it will validate whether there is a host which can back the resource reservations before it tries the restart.
  4. Yes, it will only restart the VM when this can be guaranteed. If it cannot be, then HA can ask DRS to defragment resources for this VM.
  5. If there are no reservations then HA will only look at the "memory overhead" in order to place the VM.
  6. HA ensures the portgroup and datastore are available on the host.
  7. It will not guarantee configured resources; HA is about restarting virtual machines, not about resource management. DRS is about resource management and guaranteeing access to resources.
  8. HA will only be able to restart a VM if there are unreserved resources available to satisfy the VM's request.
  9. All resources required for a virtual machine need to be available on a single host! Yes, resources will be shared on that host, as long as no reservations are defined.
  10. No, Admission Control doesn't have any role in an HA failover. Admission Control happens at the vCenter level, HA failovers happen at the ESX(i) level.
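
To make answers 1 and 2 a bit more tangible, here is a quick back-of-the-envelope sketch in Python. All the numbers in it (host memory, slot size, the share of resources reserved per VM) are assumptions for illustration only; the point is just to show how the slot policy and the percentage policy derive the capacity HA can guarantee.

```python
from math import floor

# All numbers below are assumptions for illustration only; they mirror the
# shape of the questions above (4 hosts, 16 equally-reserved VMs) but are
# not meant to reproduce the exact figures.
hosts, vms = 4, 16
host_memory_gb = 64
cluster_memory_gb = hosts * host_memory_gb                 # 256 GB

# Policy 1: "Host failures the cluster tolerates" = 1 (slot based).
# The slot size is based on the largest reservation (worst case), assumed here.
slot_size_gb = 4
slots_per_host = host_memory_gb // slot_size_gb            # 16 slots per host
usable_slots = (hosts - 1) * slots_per_host                # one host's worth held back
print(f"Slot policy: {usable_slots} slots usable, "
      f"{slots_per_host} slots guaranteed for failover")

# Policy 2: "Percentage of cluster resources reserved" = 25%.
per_vm_reservation_gb = 0.75 * cluster_memory_gb / vms     # 16 VMs reserve 75% in total
reserved_failover_gb = 0.25 * cluster_memory_gb            # kept unreserved for failover
guaranteed_restarts = floor(reserved_failover_gb / per_vm_reservation_gb)
print(f"Percentage policy: {guaranteed_restarts} restarts guaranteed by reservations")

# In both cases HA only has to satisfy *reservations*; VMs without a reservation
# just need their memory overhead, which is why far more restarts usually succeed.
```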

Maximum amount of FT virtual machines per host?

Duncan Epping · Jun 29, 2012 ·

There was a discussion yesterday on our Socialcast system. The question was what the maximum number of FT virtual machines per host is and what dictates this. Of course there are many things that can be a constraint when it comes to FT (memory reservations, bandwidth, etc.), but the one thing that stands out, and which not many realize, is that the number of FT virtual machines per host is limited to 4 by default.

This is currently controlled by a vSphere HA advanced setting called "das.maxftvmsperhost". By default this setting is configured to 4. This HA advanced setting (used in combination with vSphere DRS) defines the maximum number of FT virtual machines, primary, secondary or a combination of both, that can run on a single host. So if for whatever reason you want a maximum of 6, you will need to add this advanced setting with a value of 6.
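
For those who prefer scripting this, an HA advanced option like das.maxftvmsperhost is simply a key/value pair in the cluster's HA configuration. Below is a minimal sketch using the pyVmomi SDK; the vCenter address, credentials and the cluster name "Cluster01" are placeholders, and the same change can of course be made through the vSphere Client or PowerCLI.

```python
# Minimal pyVmomi sketch: add das.maxftvmsperhost = 6 to a cluster's HA config.
# The hostname, credentials and cluster name "Cluster01" are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab only; validate certificates in production
si = SmartConnect(host="vcenter.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)

# Find the cluster object by name.
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster01")
view.Destroy()

# Build a reconfigure spec that only touches the HA (das) advanced options.
das_config = vim.cluster.DasConfigInfo(
    option=[vim.OptionValue(key="das.maxftvmsperhost", value="6")])
spec = vim.cluster.ConfigSpecEx(dasConfig=das_config)

task = cluster.ReconfigureComputeResource_Task(spec, modify=True)  # modify=True merges changes
# WaitForTask(task) from pyVim.task could be used here to block until completion.
Disconnect(si)
```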

I do not recommend changing this though; FT is a fairly heavy process and in most environments 4 is the recommended value.

VMworld 2012 here I come

Duncan Epping · Jun 27, 2012 ·

I just got the news that two of my VMworld sessions have been accepted. I wanted to share with you which ones so you can keep track (if you want):

  • BCO1159 – Architecting and Operating a vSphere Metro Storage Cluster by Lee Dilworth and Duncan Epping
    In this session Lee Dilworth and Duncan Epping will discuss the design and operational considerations for vSphere Metro Storage Cluster environments, also commonly referred to as stretched cluster environments. Best practices around implementation and design will be shared. Various failure scenarios which can occur in a stretched storage environment are discussed in depth, including how vSphere 5.x responds to these failures. We will cover the implications for your vSphere HA, DRS and Storage DRS configuration and provide recommendations on how to increase availability and simplify operations!
  • VSP1504 – Ask the Expert vBloggers with Rick Scherer, Frank Denneman, Chad Sakac, Scott Lowe and Duncan Epping
    Back by popular demand: the Ask the Expert vBloggers panel session. Show up and ask any question you like to a panel consisting of well-known community members! This was one of the highest-rated sessions last year, and with people like Frank, Rick, Scott and Chad sitting next to me I know it is going to be awesome again. Let's just hope Rick brings his buzzer again so he can buzz Chad when he starts preaching again 🙂

In a couple of weeks, when all sessions are listed, I will also create a nice "Top 20 – VMworld Sessions" article again, but for now I want to thank everyone who voted, and I hope to see all of you at VMworld.

vSphere HA in 5.0 constantly pinging my gateway?

Duncan Epping · Jun 26, 2012 ·

I had this question today and noticed someone also dropped it on the community forums. The question was whether vSphere HA constantly pings the default gateway or not. I knew HA pings the gateway on a regular basis as of vSphere 5.0, and more frequently if a ping fails, but I wasn't sure about the timing. I pointed Marc Sevigny from the HA engineering team to the thread on the community forums and he added some nice juicy details to it. I figured I would share them with you.

First of all, each ESXi host in a 5.x cluster will ping the isolation address every 5 minutes (300 seconds). Could this flood the isolation device?

There should be no “flood” of ICMP messages, and it should have little impact on network performance. The ICMP packet is 53 bytes long and sent once every 5 seconds from each of the HA hosts until the address(es) become pingable once again, at which point it returns to pinging once per hour.

If your default gateway is never pingable because of your firewall, you should open up the ports needed by HA. It is also possible to disable the isolation address monitoring of the default gateway by using an advanced option (das.useDefaultIsolationAddress = false). It is recommended to specify a different isolation address (das.isolationaddress0) when the default gateway is a non-pingable device. Note that it is highly recommended to use a device as the default gateway which is as few hops removed from your hosts as possible!
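
Just to visualize the cadence Marc described, here is a toy Python sketch of such a monitoring loop: probe the isolation address on a slow interval, and fall back to probing every few seconds once a ping fails, until the address responds again. This is purely illustrative (a plain ICMP ping from the shell, using the 5-minute and 5-second figures mentioned above as assumptions) and obviously not how the HA agent is actually implemented.

```python
# Toy illustration of the monitoring cadence described above; this is NOT
# vSphere HA's actual code, just the described behaviour in plain Python.
import subprocess
import time

NORMAL_INTERVAL = 300   # seconds between checks while the address responds (5 minutes)
RETRY_INTERVAL = 5      # seconds between checks once a ping has failed

def pingable(address: str) -> bool:
    """Return True if a single ICMP echo to the address succeeds (Linux ping flags)."""
    result = subprocess.run(["ping", "-c", "1", "-W", "2", address],
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0

def monitor(isolation_address: str) -> None:
    while True:
        if pingable(isolation_address):
            time.sleep(NORMAL_INTERVAL)        # address healthy: back off to the slow interval
        else:
            # Keep probing aggressively until the address responds again.
            while not pingable(isolation_address):
                time.sleep(RETRY_INTERVAL)

# monitor("192.168.1.1")  # e.g. the default gateway or das.isolationaddress0
```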

