ha

vSphere HA heartbeat datastores, the isolation address and vSAN

Duncan Epping · Nov 8, 2017 ·

I’ve written about vSAN and vSphere HA various times, but I don’t think this has been explicitly called out before. Cormac and I were doing some tests this week and noticed something. When we were looking at results I realized I described it in my HA book a long time ago, but it is so far hidden away that probably no one has noticed.

In a traditional environment when you enable HA you will automatically have HA heartbeat datastores selected. These heartbeat datastores are used by the HA primary host to determine what has happened to a host which is no longer reachable over the management network. In other words, when a host is isolated it will communicate this to the HA primary using the heartbeat datastores. It will also inform the HA primary which VMs were powered off as the result of this isolation event (or not powered off when the isolation response is not configured).

Now, with vSAN, the management network is not used for communication between the hosts but the vSAN network is used. Typically in a vSAN environment, there’s only vSAN storage so there are no heartbeat datastores. As such, when a host is isolated it is not possible to communicate this to the HA primary. Remember, the network is down and there is no access to the vSAN datastore so the host cannot communicate through that either. HA will still function as expected though. You can set the isolation response to power-off and then the VMs will be killed and restarted. That is, if isolation is declared.

So when is isolation declared? A host declares itself isolated when:

It is not receiving any communication from the primary
It cannot ping the isolation address

Now, if you have not set any advanced settings then the default gateway of the management network will be the isolation address. Just imagine your vSAN Network to be isolated on a given host, but for whatever reason, the Management Network is not. In that scenario isolation is not declared, the host can still ping the isolation address using the management network vmkernel interface. HOWEVER… vSphere HA will restart the VMs. The VMs have lost access to disk, as such the lock on the VMDK is lost. HA notices the hosts are gone, which must mean that the VMs are dead as the locks are lost, lets restart them.

That is when you could be in the situation where the VMs are running on the isolated hosts and also somewhere else in the cluster. Both with the same mac address and the same name / IP address. Not a good situation. Now, if you would have had datastore heartbeats enabled then this would be prevented. As the isolated host would inform the primary it is isolated, but it would also inform the primary about the state of the VMs, which would be powered-on. The primary would then decide not to restart the VMs. However, the VMs which are running on the isolated host are more or less useless as they cannot write to disk anymore.

Let’s describe what we tested and what the outcome was in a way that is a bit easier to consume a table:

Isolation Address	Datastore Heartbeats	Observed behavior
IP on vSAN Network	Not configured	Isolated host cannot ping the isolation address, isolation declared, VMs killed and VMs restarted
Management Network	Not configured	Can ping the isolation address, isolation not declared, yet rest of the cluster restarts the VMs even though they are still running on the isolated hosts
IP on vSAN Network	Configured	Isolated host cannot ping the isolation address, isolation declared, VMs killed and VMs restarted
Management Network	Configured	VMs are not powered-off and not restarted as the “isolated host” can still ping the management network and the datastore heartbeat mechanism is used to inform the master about the state. So the master knows HA network is not working, but the VMs are not powered off.

So what did we learn, what should you do when you have vSAN? Always use an isolation address that is in the same network as vSAN! This way during an isolation event the isolation is validated using the vSAN vmkernel interface. Always set the isolation response to power-off. (My personal opinion based on testing.) This would avoid the scenario of duplicate mac / ip / names on the network when you have a single network being isolated for a specific host! And if you have traditional storage, then you can enable heartbeat datastores. It doesn’t add much in terms of availability, but still it will allow the HA hosts to communicate state through the datastore.

PS1: For those who don’t know, HA is configured to automatically select a heartbeat datastore. In a vSAN only environment you can disable this by selecting “Use datastore from only the specified list” in the HA interface and then set “das.ignoreInsufficientHbDatastore = true” in the advanced HA settings.

PS2: In a non-routable vSAN network environment you could create a Switch Virtual Interface on the physical switch. This will give you an IP on the vSAN segment for the isolation address leveraging the advanced setting das.isolationaddress0.

The difference between an isolation and a partition with vSphere

Duncan Epping · Oct 10, 2017 ·

I have a lot of discussions with customers on the topic of stretched clusters, but also regular vSphere clusters. Something that often comes up is the discussion around what happens in an isolation or partition scenario. Fairly often customers (but also VMware employees) use those words interchangeably. However, a partition is not the same as an isolation. They are 2 different scenarios, and also as a result they have a different type of response associated with it. Before I explain the difference in the two responses to a situation like this, what is a partition and what is an isolation?

An isolation event is a situation where a single host cannot communicate with the rest of the cluster. Note: single host!
A partition is a situation where two (or more) hosts can communicate with each other, but no longer can communicate with the remaining two (or more) hosts in the cluster. Note: two or more!

Why is that such a big deal? Well the response in the case of these two scenarios are different. And the response/result is also determined by what types of configuration you have. Lets break down the scenarios one by one, including the type of infrastructure used (when it is relevant).

Isolation Event

When a host is isolated it will:

start an election process
- declare itself primary
ping the isolation address
declare itself isolated
power off / shut down VMs (when this is configured)
communicate through the connected datastores that it is isolated
the VMs will be restarted on the remaining hosts in the cluster

And then of course vSphere HA will be able to restart the VMs. Note that in the case of vSAN, it isn’t possible to write to the datastore when a host is isolated, so it won’t do that. Yet the workloads will still have been powered off / shutdown so it is safe for vSphere HA to restart them

Partition (traditional storage)

When two or more hosts are partitioned (they can communicate with each other) and the vSphere HA primary is not part of the partition it will:

start an election process
declare a primary in the partition
figure out what has happened to the hosts and VMs in the other partition
- restart any VMs that somehow were impacted, or appeared now to be powered off while the last known state was powered on
if all VMs are running, vSphere HA won’t try to restart any, this is the expected result!

Partition (vSAN stretched)

When the partition scenario happens in a stretched vSAN environment there’s an extra (potential) step. Along the way, vSAN will identify all VMs which have no accessible components and kill those VMs so they can be restarted in the partition which has quorum. In this scenario, you have 3 locations, two for data and 1 for the witness. If a data site loses access to the other locations then the data site is partitioned (the hosts can still communicate with each other within the site), as such the isolation response is not triggered. However, vSAN will still kill these VMs as they are rendered useless (lost access to disk).

I know it is just semantics, but nevertheless, I do feel it is important to understand the difference between an isolation and a partition, especially as the response (and who responds) is different in these situations. Hope it helps,

Can you use the management IPs as the isolation address for HA?

Duncan Epping · Aug 11, 2017 ·

There was a question on VMTN this week about the use of the management IP’s in a “smaller” cluster as the isolation address for vSphere HA. The plan was to disable the default isolation address (default gateway) and then add every management IP as an isolation address. In this case 5 or 6 IP’s would be added. I had to think this through and went through the steps of what happens in the case of an isolation event:

no traffic between secondary and primary or primary and secondary hosts (depending on whether the primary is isolated or one of the secondary hosts)
if it was a secondary which is potentially isolated then the secondary will start a “primary election process”
if it was the primary which is potentially isolated then the primary will try to ping the isolation addresses
if it was a secondary and there’s no response to the election process then the secondary host will ping the isolation address after it has elected itself as primary host
if there’s no response to any of the pings (happen in parallel) then the isolation is declared and the isolation response is triggered

Now the question is: will there be a response when the host tries to ping itself while it is isolated, as you need to add all ip-addresses to “isolation address” options for it to make sense… And that is what I tested. It will ping all isolation addresses. All but one will fail, the one that will be successful is the management IP address of the host which is isolated. (You can still ping your own IP when the NICs are disconnected even.) Leaving the VMs running as one of the isolation addresses responded.

In other words, don’t do this. The isolation address should be a reliable address outside of the ESXi host, preferably on the same network as the management.

Where’s the HA enforce VM-Host and Affinity rules option in vSphere 6.5?

Duncan Epping · Apr 25, 2017 ·

Last week on (VMware internal) Socialcast someone asked where the UI option is in vSphere 6.5 that allows you to enable the ability for vSphere HA to respect VM-Host Affinity and VM-VM Anti Affinity rules. In vSphere 6.0 there is an option in the Rules part of the UI as shown in the screenshot below.

In vSphere 6.5 that option has disappeared completely. The reason for this is because vSphere HA now respects these rules by default, as it appeared this is the behavior customers wanted anyway. Note, that if for whatever reason vSphere HA cannot respect the rule it will restart the VMs (violating the rule) as these are non-mandatory rules it chooses availability over compliance in this situation.

If you would like to disable this behavior and don’t care about these rules during a fail-over you can set either or both advanced settings:

das.respectvmvmantiaffinityrules – set to “true” by default, set to “false” if you want to disable it
das.respectvmhostsoftaffinityrules – set to “true” by default, set to “false” if you want to disable it

I hope that helps those looking to make changes to this behavior.

VM showing that HA failure response is disabled in 6.5?

Duncan Epping · Apr 11, 2017 ·

I had a customer asking me today why on each VM it was showing that all HA responses are disabled. This customer is running vSphere 6.5 and below you can see what the UI showed. Note that it still says the VM is Protected, yet none of the protection mechanisms appeared to be enabled

I asked them to show me a screenshot of their HA configuration, and the HA configuration actually had several of these response mechanisms enabled. I checked my vSphere 6.5 lab and it seems I have the same problem and there’s a UI issue on the VM level details for vSphere HA in vSphere 6.5. I verified with engineering, and this is indeed a known issue which has been identified and is fixed in vCenter Server 6.5.0b! KB Article on the topic can be found here, and in the release notes for 6.5.0b it is mentioned that it is fixed.