A few weeks ago I had a conversation with a customer about a large vSAN ESA 2-node deployment they were planning. One of their questions was whether a 2-node configuration with nested fault domains would be able to tolerate a witness failure after one of the nodes had gone down. I had tested this for a stretched cluster, but not for a 2-node configuration. Will we actually see the votes be recalculated after a host failure, and will the VM remain up and running when the witness fails after the votes have been recalculated?
Let’s just test it, and look in RVC at what happens in each case. Let’s look at the healthy output first, then at a host failure, followed by the witness failure:
Healthy
DOM Object: 71c32365-667e-0195-1521-0200ab157625
  RAID_1
    Concatenation
      Component: 71c32365-b063-df99-2b04-0200ab157625
        votes: 2, usage: 0.0 GB, proxy component: true
      RAID_0
        Component: 71c32365-f49e-e599-06aa-0200ab157625
          votes: 1, usage: 0.0 GB, proxy component: true
        Component: 71c32365-681e-e799-168d-0200ab157625
          votes: 1, usage: 0.0 GB, proxy component: true
        Component: 71c32365-06d3-e899-b3b2-0200ab157625
          votes: 1, usage: 0.0 GB, proxy component: true
    Concatenation
      Component: 71c32365-e0cb-ea99-9c44-0200ab157625
        votes: 1, usage: 0.0 GB, proxy component: false
      RAID_0
        Component: 71c32365-6ac2-ee99-1f6d-0200ab157625
          votes: 1, usage: 0.0 GB, proxy component: false
        Component: 71c32365-e03f-f099-eb12-0200ab157625
          votes: 1, usage: 0.0 GB, proxy component: false
        Component: 71c32365-6ad0-f199-a021-0200ab157625
          votes: 1, usage: 0.0 GB, proxy component: false
  Witness: 71c32365-8c61-f399-48c9-0200ab157625
    votes: 4, usage: 0.0 GB, proxy component: false
1 host down. As you can see, the votes for the witness changed from 4 to 1, and of course the state of the components on the failed host also changed from “active” to “absent”.
DOM Object: 71c32365-667e-0195-1521-0200ab157625
  RAID_1
    Concatenation (state: ABSENT (6)
      Component: 71c32365-b063-df99-2b04-0200ab157625
        votes: 1, proxy component: false
      RAID_0
        Component: 71c32365-f49e-e599-06aa-0200ab157625
          votes: 1, proxy component: false
        Component: 71c32365-681e-e799-168d-0200ab157625
          votes: 1, proxy component: false
        Component: 71c32365-06d3-e899-b3b2-0200ab157625
          votes: 1, proxy component: false
    Concatenation
      Component: 71c32365-e0cb-ea99-9c44-0200ab157625
        votes: 2, usage: 0.0 GB, proxy component: false
      RAID_0
        Component: 71c32365-6ac2-ee99-1f6d-0200ab157625
          votes: 1, usage: 0.0 GB, proxy component: false
        Component: 71c32365-e03f-f099-eb12-0200ab157625
          votes: 1, usage: 0.0 GB, proxy component: false
        Component: 71c32365-6ad0-f199-a021-0200ab157625
          votes: 1, usage: 0.0 GB, proxy component: false
  Witness: 71c32365-8c61-f399-48c9-0200ab157625
    votes: 1, usage: 0.0 GB, proxy component: false
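If you want to spot the vote changes without comparing the two outputs by eye, a small script can do the diff. The following is only a rough sketch and not an RVC feature: the function names parse_votes and vote_changes are made up for this example, and it assumes that each "Component:" or "Witness:" entry is followed by its own "votes:" value, as in the output above. The two input strings are abbreviated excerpts of the healthy and host-down outputs.

```python
import re

# Matches "Component: <uuid>" or "Witness: <uuid>" plus the first "votes: N" after it.
# Assumption: every entry is followed by its own "votes:" field.
ENTRY = re.compile(
    r"(Component|Witness):\s*([0-9a-f-]{36}).*?votes:\s*(\d+)", re.S
)


def parse_votes(rvc_text):
    """Return {uuid: votes} for every component and witness found in the text."""
    return {uuid: int(votes) for _kind, uuid, votes in ENTRY.findall(rvc_text)}


def vote_changes(before, after):
    """Return {uuid: (old, new)} for entries whose vote count differs."""
    return {
        uuid: (before[uuid], after[uuid])
        for uuid in before
        if uuid in after and before[uuid] != after[uuid]
    }


# Abbreviated excerpts of the two RVC outputs shown in this post.
healthy = """
Component: 71c32365-b063-df99-2b04-0200ab157625
votes: 2, usage: 0.0 GB, proxy component: true
Component: 71c32365-e0cb-ea99-9c44-0200ab157625
votes: 1, usage: 0.0 GB, proxy component: false
Witness: 71c32365-8c61-f399-48c9-0200ab157625
votes: 4, usage: 0.0 GB, proxy component: false
"""

host_down = """
Component: 71c32365-b063-df99-2b04-0200ab157625
votes: 1, proxy component: false
Component: 71c32365-e0cb-ea99-9c44-0200ab157625
votes: 2, usage: 0.0 GB, proxy component: false
Witness: 71c32365-8c61-f399-48c9-0200ab157625
votes: 1, usage: 0.0 GB, proxy component: false
"""

changes = vote_changes(parse_votes(healthy), parse_votes(host_down))
for uuid, (old, new) in changes.items():
    print(f"{uuid}: {old} -> {new}")
```

For these excerpts it reports three changes: the component on the failed host drops from 2 votes to 1, its counterpart on the surviving host goes from 1 to 2, and the witness goes from 4 to 1.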
After I failed the witness, we of course had to check whether the VM was still running and did not show up as inaccessible in the UI. It was still running and remained accessible. vSAN and the Witness Resilience feature worked as I expected. (Yes, I double-checked through RVC as well, and the VM was “active”.)

Would this still work if you are not using nested fault domains, i.e. just using PFTT=1?
You explicitly call out ESA, would this work with OSA?
Thanks
Witness resilience applies to both ESA and OSA; the feature was implemented for both. As for using just PFTT and not SFTT, this shouldn’t make a difference either, but unfortunately my lab is being rebuilt, so I cannot test this at the moment. (Sorry for the slow response, somehow this message slipped through.)