
Yellow Bricks

by Duncan Epping


2-node

Witness resiliency feature with a 2-node cluster

Duncan Epping · Oct 9, 2023 · Leave a Comment

A few weeks ago I had a conversation with a customer about a large vSAN ESA 2-node deployment they were planning for. One of the questions they had was whether a 2-node configuration with nested fault domains would be able to tolerate a witness failure after one of the nodes had gone down. I had tested this for a stretched cluster, but I hadn’t tested it with a 2-node configuration. Will we actually see the votes being recalculated after a host failure, and will the VM remain up and running when the witness fails after the votes have been recalculated?

Let’s just test it and use RVC to look at what happens in each case. We will look at the healthy output first, then at a host failure, followed by the witness failure.
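For those who want to reproduce this: the layouts below come straight from RVC, and a command along these lines (run from an RVC session connected to vCenter, with the path adjusted to your own inventory) should produce similar output:

    > vsan.vm_object_info /localhost/<datacenter>/vms/<vm-name>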

Healthy

    DOM Object: 71c32365-667e-0195-1521-0200ab157625 
      RAID_1
        Concatenation
          Component: 71c32365-b063-df99-2b04-0200ab157625 
            votes: 2, usage: 0.0 GB, proxy component: true
          RAID_0
            Component: 71c32365-f49e-e599-06aa-0200ab157625 
              votes: 1, usage: 0.0 GB, proxy component: true
            Component: 71c32365-681e-e799-168d-0200ab157625 
              votes: 1, usage: 0.0 GB, proxy component: true
            Component: 71c32365-06d3-e899-b3b2-0200ab157625 
              votes: 1, usage: 0.0 GB, proxy component: true
        Concatenation
          Component: 71c32365-e0cb-ea99-9c44-0200ab157625 
            votes: 1, usage: 0.0 GB, proxy component: false
          RAID_0
            Component: 71c32365-6ac2-ee99-1f6d-0200ab157625 
               votes: 1, usage: 0.0 GB, proxy component: false
            Component: 71c32365-e03f-f099-eb12-0200ab157625 
               votes: 1, usage: 0.0 GB, proxy component: false
            Component: 71c32365-6ad0-f199-a021-0200ab157625 
               votes: 1, usage: 0.0 GB, proxy component: false
      Witness: 71c32365-8c61-f399-48c9-0200ab157625 
        votes: 4, usage: 0.0 GB, proxy component: false

With 1 host down, as you can see, the votes for the witness changed, and of course the state of the components on the failed host also changed from “active” to “absent”.

    DOM Object: 71c32365-667e-0195-1521-0200ab157625 
      RAID_1
        Concatenation (state: ABSENT (6))
          Component: 71c32365-b063-df99-2b04-0200ab157625 
            votes: 1, proxy component: false
          RAID_0
            Component: 71c32365-f49e-e599-06aa-0200ab157625 
              votes: 1, proxy component: false
            Component: 71c32365-681e-e799-168d-0200ab157625 
              votes: 1, proxy component: false
            Component: 71c32365-06d3-e899-b3b2-0200ab157625 
              votes: 1, proxy component: false
        Concatenation
          Component: 71c32365-e0cb-ea99-9c44-0200ab157625 
             votes: 2, usage: 0.0 GB, proxy component: false
          RAID_0
            Component: 71c32365-6ac2-ee99-1f6d-0200ab157625 
              votes: 1, usage: 0.0 GB, proxy component: false
            Component: 71c32365-e03f-f099-eb12-0200ab157625
              votes: 1, usage: 0.0 GB, proxy component: false
            Component: 71c32365-6ad0-f199-a021-0200ab157625 
              votes: 1, usage: 0.0 GB, proxy component: false
      Witness: 71c32365-8c61-f399-48c9-0200ab157625 
        votes: 1, usage: 0.0 GB, proxy component: false

And after I failed the witness, we of course had to check whether the VM was still running and didn’t show up as inaccessible in the UI, and it did not. vSAN and the Witness Resilience feature worked as I expected. (Yes, I double-checked it through RVC as well, and the VM was “active”.)
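For those who want to run the same check in their own lab, a couple of RVC commands along these lines should do the trick; the paths are just examples and depend on your inventory. The first reports inaccessible or orphaned objects in the cluster, the second shows the component and vote layout used above:

    > vsan.check_state /localhost/<datacenter>/computers/<cluster>
    > vsan.vm_object_info /localhost/<datacenter>/vms/<vm-name>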


Nested Fault Domains on a 2-Node vSAN Stretched Cluster, is it supported?

Duncan Epping · Jun 20, 2022 · 2 Comments

I spotted a question this week on VMTN. The question was fairly basic: are nested fault domains supported on a 2-node vSAN Stretched Cluster? It sounds basic, but unfortunately it is not documented anywhere, probably because stretched 2-node configurations are not very common. For those who don’t know, with a nested fault domain on a 2-node cluster you basically provide an additional layer of resiliency by also replicating an object within each host. A VM Storage Policy for a configuration like that will look as follows.

[Screenshot: VM Storage Policy for a nested fault domain configuration on a 2-node cluster]

This does, however, mean that you also need a minimum of 3 fault domains within each host if you want this level of protection, which means a minimum of 3 disk groups in each of the two hosts. Or better said, when you configure Host Mirroring and then select the second “failures to tolerate” option, the following list shows the minimum number of disk groups per host you need:

  • Host Mirroring – 2 Node Cluster
    • No Data Redundancy – 1 disk group
    • 1 Failure – RAID1 – 3 disk groups
    • 1 Failure – RAID5 – 4 disk groups
    • 2 Failures – RAID1 – 5 disk groups
    • 2 Failures – RAID6 – 6 disk groups
    • 3 Failures – RAID1 – 7 disk groups

If you look at the list, you can imagine that additional resiliency definitely comes at a cost. But anyway, back to the question: is it supported when your 2-node configuration happens to be stretched across locations? The answer is yes, VMware supports this.
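As a side note, if you want to check how many disk groups a host currently has before applying a policy like this, one quick way is to count the unique disk group UUIDs in the vSAN storage list on each host (the host name is just an example):

    [root@esxi-01:~] esxcli vsan storage list | grep "VSAN Disk Group UUID" | sort | uniq -c

Each unique UUID in the output represents one disk group, and the count in front of it is the number of devices in that group.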

Is a crossover cable needed for vSAN 2-Node Direct Connect?

Duncan Epping · Jan 8, 2021

I had this question last week around vSAN 2-node direct connect: is a crossover cable still required, or can a regular CAT6 cable (CAT 5E works as well) be used? I knew the answer and figured this would be documented somewhere, but it doesn’t appear to be. To be honest, many websites talking about the need for crossover cables are blatantly wrong. And yes, I also spotted some incorrect recommendations in VMware’s own documentation, so I requested those entries to be updated. Just to be clear: with vSAN 2-Node Direct Connect, or vMotion, or any other service for that matter, you can use a regular CAT6 cable combined with 10GbE (or faster) NICs. This gives you great performance without the cost of a 10GbE (or faster) switch. I can’t recall having seen a NIC in the past 10 years that does not have Auto MDI/MDI-X implemented, even though it was an optional feature in the 1000Base-T standard. In other words, there’s no need to buy a crossover cable, or make one; just use a regular cable.
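On a related note, if you want to confirm that a direct-connected link came up at the expected speed and duplex, checking the NIC status on each host is the quickest way I know of, for example:

    [root@esxi-01:~] esxcli network nic list

This lists the link status, speed, and duplex for every physical NIC, so you can immediately see whether the back-to-back connection negotiated correctly.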

2 node direct connect vSAN and error “vSphere HA agent on this host could not reach isolation address”

Duncan Epping · Oct 7, 2019

I’ve had this question over a dozen times now, so I figured I would add a quick pointer to my blog. What is causing the error “vSphere HA agent on this host could not reach isolation address” to pop up on a 2-node direct connect vSAN cluster? The answer is simple: when vSAN is enabled, HA uses the vSAN network for communication. With a 2-node Direct Connect configuration the vSAN network is not connected to a switch, and there are no reachable IP addresses on that network other than the IP addresses of the vSAN VMkernel interfaces.
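If you want to see which VMkernel interfaces are tagged for vSAN traffic (and are therefore used by HA for its heartbeats), you can list them per host, for example:

    [root@esxi-01:~] esxcli vsan network list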

When HA tries to test whether the isolation address (the default gateway of the management interface) is reachable, the ping will fail as a result. You can solve this simply by disabling the isolation response, as described in this post here.

Isolation Address in a 2-node direct connect vSAN environment?

Duncan Epping · Nov 22, 2017

As most of you know by now, when vSAN is enabled vSphere HA uses the vSAN network for heartbeating. I recently wrote an article about the isolation address and its relationship with heartbeat datastores. In the comment section, Johann asked what the settings should be for 2-Node Direct Connect with vSAN. A very valid question, as an isolation is still possible, although not as likely as with a stretched cluster, considering you do not have a network switch for vSAN in this configuration. Anyway, you would still like the VMs that are impacted by the isolation to be powered off, and you would like the other, remaining host to power them back on.

So the question remains: which IP address do you select? Well, there’s no IP address to select in this particular case. As it is “direct connect”, there are probably only 2 IP addresses on that segment (one for host 1 and another for host 2). You cannot use the default gateway either, as that is the gateway for the management interface, which is the wrong network. So this is what I recommend:

  • Disable the Isolation Response >> set it to “leave powered on” or “disabled” (depending on the version used)
  • Disable the use of the default gateway by setting the following HA advanced setting:
    • das.usedefaultisolationaddress = false

That probably makes you wonder what will happen when a host is isolated from the rest of the cluster (the other host and the witness). Well, when this happens the VMs are still killed, not as a result of the isolation response kicking in, but as a result of vSAN kicking in. Here’s the process:

  • Heartbeats are not received
  • Host elects itself primary
  • Host pings the isolation address
    • If the host can’t ping the gateway of the management interface then the host declares itself isolated
    • If the host can ping the gateway of the management interface then the host doesn’t declare itself isolated
  • Either way, the isolation response is not triggered as it is set to “Leave powered on”
  • vSAN will now automatically kill all VMs which have lost access to their components
    • The isolated host will lose quorum
    • vSAN objects will become isolated
    • The advanced setting “VSAN.AutoTerminateGhostVm=1” allows vSAN to kill the “ghosted” VMs (with all components inaccessible).

In other words, don’t worry about the isolation address in a 2-node configuration; vSAN has this situation covered! Note that “VSAN.AutoTerminateGhostVm=1” only works for 2-node and Stretched vSAN configurations at this time.
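If you want to verify that this advanced setting is present and enabled on your hosts, a host-level check along these lines should show it (the option path is an assumption based on the setting name):

    [root@esxi-01:~] esxcli system settings advanced list -o /VSAN/AutoTerminateGhostVm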

UPDATE:

I triggered a failure in my lab (which is 2-node, but not direct connect), and for those who are wondering, this is what you should be seeing in your syslog.log:

syslog.log:2017-11-29T13:45:28Z killInaccessibleVms.py [INFO]: Following VMs are powered on and HA protected in this host.
syslog.log:2017-11-29T13:45:28Z killInaccessibleVms.py [INFO]: * ['vm-01', 'vm-03', 'vm-04']
syslog.log:2017-11-29T13:45:32Z killInaccessibleVms.py [INFO]: List inaccessible VMs at round 1
syslog.log:2017-11-29T13:45:32Z killInaccessibleVms.py [INFO]: * ['vim.VirtualMachine:1', 'vim.VirtualMachine:2', 'vim.VirtualMachine:3']
syslog.log:2017-11-29T13:46:06Z killInaccessibleVms.py [INFO]: List inaccessible VMs at round 2
syslog.log:2017-11-29T13:46:06Z killInaccessibleVms.py [INFO]: * ['vim.VirtualMachine:1', 'vim.VirtualMachine:2', 'vim.VirtualMachine:3']
syslog.log:2017-11-29T13:46:06Z killInaccessibleVms.py [INFO]: Following VMs are found to have all objects inaccessible, and will be terminated.
syslog.log:2017-11-29T13:46:06Z killInaccessibleVms.py [INFO]: * ['vim.VirtualMachine:1', 'vim.VirtualMachine:2', 'vim.VirtualMachine:3']
syslog.log:2017-11-29T13:46:06Z killInaccessibleVms.py [INFO]: Start terminating VMs.
syslog.log:2017-11-29T13:46:06Z killInaccessibleVms.py [INFO]: Successfully terminated inaccessible VM: vm-01
syslog.log:2017-11-29T13:46:06Z killInaccessibleVms.py [INFO]: Successfully terminated inaccessible VM: vm-03
syslog.log:2017-11-29T13:46:06Z killInaccessibleVms.py [INFO]: Successfully terminated inaccessible VM: vm-04
syslog.log:2017-11-29T13:46:06Z killInaccessibleVms.py [INFO]: Finished killing the ghost vms
