Yellow Bricks

by Duncan Epping

Stretched cluster witness failure resilience in vSAN 7.0

Duncan Epping · Mar 17, 2022

Cormac and I have been busy the past couple of weeks updating the vSAN Deep Dive to 7.0 U3. Yes, there is a lot to update and add, but we are actually going through it at a surprisingly rapid pace. I guess it helps that we had already written dozens of blog posts on the various topics we need to update or add. One of those topics is “witness failure resilience” which was introduced in vSAN 7.0 U3. I have discussed it before on this blog (here and here) but I wanted to share some of the findings with you folks as well before the book is published. (No, I do not know when the book will be available on Amazon just yet!)

In the scenario below, we failed the secondary site of our stretched cluster completely. We can examine the impact of this failure through RVC on vCenter Server, which gives us a better understanding of the situation and of how the witness failure resilience mechanism actually works. Note that the output below has been truncated for readability. Let’s take a look at the RVC output for our VM directly after the failure.

VM R1-R1:
Disk backing:
[vsanDatastore] 0b013262-0c30-a8c4-a043-005056968de9/R1-R1.vmx
RAID_1
RAID_1
Component: 0b013262-c2da-84c5-1eee-005056968de9 , host: 10.202.25.221
votes: 1, usage: 0.1 GB, proxy component: false)
Component: 0b013262-3acf-88c5-a7ff-005056968de9 , host: 10.202.25.201
votes: 1, usage: 0.1 GB, proxy component: false)
RAID_1
Component: 0b013262-a687-8bc5-7d63-005056968de9 , host: 10.202.25.238
votes: 1, usage: 0.1 GB, proxy component: true)
Component: 0b013262-3cef-8dc5-9cc1-005056968de9 , host: 10.202.25.236
votes: 1, usage: 0.1 GB, proxy component: true)
Witness: 0b013262-4aa2-90c5-9504-005056968de9 , host: 10.202.25.231
votes: 3, usage: 0.0 GB, proxy component: false)
Witness: 47123362-c8ae-5aa4-dd53-005056962c93 , host: 10.202.25.214
votes: 1, usage: 0.0 GB, proxy component: false)
Witness: 0b013262-5616-95c5-8b52-005056968de9 , host: 10.202.25.228
votes: 1, usage: 0.0 GB, proxy component: false)

As can be seen, the witness component on the witness host holds 3 votes, the components on the failed site (secondary) hold 2 votes, and the components on the surviving data site (preferred) hold 2 votes. After the full site failure has been detected, the votes are recalculated to ensure that a witness host failure does not impact the availability of the VMs. The RVC output below shows the result.

VM R1-R1:
Disk backing:
[vsanDatastore] 0b013262-0c30-a8c4-a043-005056968de9/R1-R1.vmx
RAID_1
RAID_1
Component: 0b013262-c2da-84c5-1eee-005056968de9 , host: 10.202.25.221
votes: 3, usage: 0.1 GB, proxy component: false)
Component: 0b013262-3acf-88c5-a7ff-005056968de9 , host: 10.202.25.201
votes: 3, usage: 0.1 GB, proxy component: false)
RAID_1
Component: 0b013262-a687-8bc5-7d63-005056968de9 , host: 10.202.25.238
votes: 1, usage: 0.1 GB, proxy component: false)
Component: 0b013262-3cef-8dc5-9cc1-005056968de9 , host: 10.202.25.236
votes: 1, usage: 0.1 GB, proxy component: false)
Witness: 0b013262-4aa2-90c5-9504-005056968de9 , host: 10.202.25.231
votes: 1, usage: 0.0 GB, proxy component: false)
Witness: 47123362-c8ae-5aa4-dd53-005056962c93 , host: 10.202.25.214
votes: 3, usage: 0.0 GB, proxy component: false)

As can be seen, the votes for the various components have changed. Each component on the surviving data site now has 3 votes instead of 1, the witness on the witness host went from 3 votes to 1, and the witness that is stored in the surviving fault domain now also has 3 votes instead of 1. The surviving site therefore holds 9 of the 12 votes, so quorum is not lost even if the witness component on the witness host is impacted by a failure. A very useful enhancement in vSAN 7.0 Update 3 for stretched cluster configurations, if you ask me.
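To make the vote arithmetic concrete, here is a minimal Python sketch. The vote counts are copied from the two RVC outputs above. The rule that an object stays accessible only while more than 50% of its votes are available is how vSAN quorum works in general. The grouping of components into locations is partly an assumption: the text confirms which witness sits on the witness host and which sits in the surviving fault domain, but placing the third witness (host 10.202.25.228) on the secondary site is my guess. The function and variable names are made up for this illustration.

```python
# Votes per location, copied from the RVC output above.
# Assumption: the witness on 10.202.25.228 lives on the secondary site.
votes_before = {
    "preferred": [1, 1, 1],   # two data components plus the in-site witness
    "secondary": [1, 1, 1],   # two data components plus the assumed in-site witness
    "witness_host": [3],
}
votes_after = {
    "preferred": [3, 3, 3],   # surviving site after the recalculation
    "secondary": [1, 1],      # failed site
    "witness_host": [1],
}


def has_quorum(votes, available):
    """Return True if the available locations hold more than 50% of all votes."""
    total = sum(sum(v) for v in votes.values())
    alive = sum(sum(votes[location]) for location in available)
    return alive * 2 > total


# Secondary site and witness host both unavailable, only the preferred site is left.
print(has_quorum(votes_before, {"preferred"}))  # False: 3 of 9 votes
print(has_quorum(votes_after, {"preferred"}))   # True: 9 of 12 votes
```

With the original vote distribution, losing the witness host after the site failure would leave only 3 of 9 votes. After the recalculation, the surviving site holds 9 of 12 on its own.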

Unexplored Territory #011: Providing enterprise storage services with VMware vSAN featuring Pete Koehler

Duncan Epping · Mar 7, 2022

Episode 11 of the podcast features Pete Koehler, and of course, we discuss vSAN and all the enterprise storage services it provides, like HCI Mesh, vSAN Stretched Clusters, and much more. Listen to it now via Spotify (https://spoti.fi/3twXzsF) or Apple (https://apple.co/3sN0EFR).

OneDrive stuck syncing on OSX

Duncan Epping · Feb 23, 2022

I have had this issue with OneDrive for days where it said it was syncing a file but was not making any progress whatsoever. I tried all kinds of different solutions people have recommended, from removing the OneDrive app to killing the agent to deleting the .plist file. Nothing helped solve the problem. Unfortunately, on OSX you can’t see which file is stuck either, so it is very difficult to troubleshoot. I managed to solve it as follows in the end:

Go to “Resources” under your OneDrive app folder, for me this was:

cd /Applications/OneDrive.app/Contents/Resources

Then run the following command. Note that this may (will) trigger a resync, which is annoying, but it did solve the sync issue:

./ResetOneDriveAppStandalone.command

If this doesn’t solve it, it could also be that there’s a hidden file called “.DS_Store” in the folder which is not syncing. Simply look at Finder and find the folders which are not syncing, use the Terminal to go to each folder, and look for the file “.DS_Store” by using “ls -a” (I always use “ls -lah” by default). It hopefully shows a list of files with that hidden file included. If that is the case, simply delete the file using “rm” and then restart the OneDrive agent.
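If there are many folders to check, the search for “.DS_Store” files can be scripted instead of walking each folder by hand. Below is a minimal Python sketch. The function name is made up for this example, and the folder you pass in is whatever your OneDrive folder happens to be. The script only lists the files; deleting them is left as a manual step so you can review the list first.

```python
import sys
from pathlib import Path


def find_ds_store(root):
    """Return all .DS_Store files below root, sorted for stable output."""
    return sorted(Path(root).expanduser().rglob(".DS_Store"))


if __name__ == "__main__":
    # Each command-line argument is treated as a folder to scan.
    for folder in sys.argv[1:]:
        for path in find_ds_store(folder):
            print(path)
```

Save it under a name of your choice and pass the folder that is not syncing as an argument. Every path it prints is a candidate for “rm”.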

I hope that helps others who hit the same issue.

Does the Native Key Provider require a host to have a TPM?

Duncan Epping · Feb 23, 2022

I got this question on the VMTN forum this week: does the Native Key Provider require a host to have a TPM (Trusted Platform Module)? The documentation does discuss the use of TPM 2.0 when you enable the Native Key Provider. Let’s be clear, the vCenter Server Native Key Provider does not require a TPM! If a TPM is available on each host, then it will be used by the Native Key Provider to store the secrets on. But as stated, it is not a requirement. I have asked to get the documentation updated so that it is officially documented as well. I am just posting it here so that it is indexed by Google.

Unexplored Territory #010: Terraform and declarative automation with Kyle Ruddy

Duncan Epping · Feb 22, 2022

In episode #010 of the Unexplored Territory Podcast we talk to Kyle Ruddy, Tech Marketing guru at HashiCorp. Kyle explains how HashiCorp got started, what the difference is between imperative and declarative automation, and why Terraform (and other HashiCorp products/services) should be included in every multi-cloud architecture. Listen now via Apple (https://apple.co/34H5OcV), Spotify (https://spoti.fi/3J5MPrl), or any other podcast app of your choice!

About the author

Duncan Epping is a Chief Technologist in the Office of CTO of the Cloud Platform BU at VMware. He is a VCDX (# 007), the author of the “vSAN Deep Dive” and the “vSphere Clustering Technical Deep Dive” series, and he is the host of the “In de aap gelogeerd” (Dutch) and “Unexplored Territory” (English) podcasts.

Upcoming Events

09-06-2022 – VMUG Belgium
16-06-2022 – VMUG Sweden

Copyright Yellow-Bricks.com © 2022