Episode 12 is out! This week we have Jeffrey Kusters on to discuss VMware Cloud. Jeffrey covers the use cases and the different offerings, ranging from AWS to Google. Listen now: Spotify (https://spoti.fi/3tsSm6p), Apple (https://apple.co/3ioaQyp).
Stretched cluster witness failure resilience in vSAN 7.0
Cormac and I have been busy the past couple of weeks updating the vSAN Deep Dive to 7.0 U3. Yes, there is a lot to update and add, but we are actually going through it at a surprisingly rapid pace. I guess it helps that we had already written dozens of blog posts on the various topics we need to update or add. One of those topics is “witness failure resilience” which was introduced in vSAN 7.0 U3. I have discussed it before on this blog (here and here) but I wanted to share some of the findings with you folks as well before the book is published. (No, I do not know when the book will be available on Amazon just yet!)
In the scenario below, we failed the secondary site of our stretched cluster completely. We can examine the impact of this failure through RVC on vCenter Server, which gives us a better understanding of the situation and of how the witness failure resilience mechanism actually works. Note that the output below has been truncated for readability. Let’s take a look at the RVC output for our VM directly after the failure.
VM R1-R1:
  Disk backing:
    [vsanDatastore] 0b013262-0c30-a8c4-a043-005056968de9/R1-R1.vmx
      RAID_1
        RAID_1
          Component: 0b013262-c2da-84c5-1eee-005056968de9 , host: 10.202.25.221 votes: 1, usage: 0.1 GB, proxy component: false)
          Component: 0b013262-3acf-88c5-a7ff-005056968de9 , host: 10.202.25.201 votes: 1, usage: 0.1 GB, proxy component: false)
        RAID_1
          Component: 0b013262-a687-8bc5-7d63-005056968de9 , host: 10.202.25.238 votes: 1, usage: 0.1 GB, proxy component: true)
          Component: 0b013262-3cef-8dc5-9cc1-005056968de9 , host: 10.202.25.236 votes: 1, usage: 0.1 GB, proxy component: true)
        Witness: 0b013262-4aa2-90c5-9504-005056968de9 , host: 10.202.25.231 votes: 3, usage: 0.0 GB, proxy component: false)
        Witness: 47123362-c8ae-5aa4-dd53-005056962c93 , host: 10.202.25.214 votes: 1, usage: 0.0 GB, proxy component: false)
        Witness: 0b013262-5616-95c5-8b52-005056968de9 , host: 10.202.25.228 votes: 1, usage: 0.0 GB, proxy component: false)
As can be seen, the witness component on the witness host holds 3 votes, the components on the failed site (secondary) hold 2 votes, and the components on the surviving data site (preferred) hold 2 votes. After the full site failure has been detected, the votes are recalculated to ensure that a subsequent witness host failure does not impact the availability of the VMs. The RVC output after the recalculation looks as follows.
VM R1-R1:
  Disk backing:
    [vsanDatastore] 0b013262-0c30-a8c4-a043-005056968de9/R1-R1.vmx
      RAID_1
        RAID_1
          Component: 0b013262-c2da-84c5-1eee-005056968de9 , host: 10.202.25.221 votes: 3, usage: 0.1 GB, proxy component: false)
          Component: 0b013262-3acf-88c5-a7ff-005056968de9 , host: 10.202.25.201 votes: 3, usage: 0.1 GB, proxy component: false)
        RAID_1
          Component: 0b013262-a687-8bc5-7d63-005056968de9 , host: 10.202.25.238 votes: 1, usage: 0.1 GB, proxy component: false)
          Component: 0b013262-3cef-8dc5-9cc1-005056968de9 , host: 10.202.25.236 votes: 1, usage: 0.1 GB, proxy component: false)
        Witness: 0b013262-4aa2-90c5-9504-005056968de9 , host: 10.202.25.231 votes: 1, usage: 0.0 GB, proxy component: false)
        Witness: 47123362-c8ae-5aa4-dd53-005056962c93 , host: 10.202.25.214 votes: 3, usage: 0.0 GB, proxy component: false)
As can be seen, the votes for the various components have changed. The components on the surviving data site now have 3 votes each instead of 1, the witness on the witness host went from 3 votes to 1, and on top of that, the witness stored in the surviving fault domain now also has 3 votes instead of 1. The result is that quorum would not be lost even if the witness component on the witness host is impacted by a failure. A very useful enhancement in vSAN 7.0 Update 3 for stretched cluster configurations, if you ask me.
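To make the vote arithmetic concrete, here is a minimal sketch that checks quorum with the numbers from the two RVC outputs above. It assumes an object stays accessible only while strictly more than half of its votes are available, and it assumes that hosts .221, .201 and .214 are the ones on the surviving (preferred) site, which the output itself does not state. The helper name has_quorum is made up for this example.

```shell
#!/bin/sh
# has_quorum TOTAL AVAILABLE (hypothetical helper)
# Succeeds when strictly more than half of all votes are available.
has_quorum() {
  [ $(( 2 * $2 )) -gt "$1" ]
}

# Before recalculation (first RVC output):
# four data components with 1 vote each, the witness host with 3 votes,
# and two additional witness components with 1 vote each.
total_before=$(( 1 + 1 + 1 + 1 + 3 + 1 + 1 ))
# Assumed survivors when the secondary site AND the witness host are gone:
# the components on .221 and .201 plus the witness on .214.
alive_before=$(( 1 + 1 + 1 ))

# After recalculation (second RVC output), same assumed survivors.
total_after=$(( 3 + 3 + 1 + 1 + 1 + 3 ))
alive_after=$(( 3 + 3 + 3 ))

if has_quorum "$total_before" "$alive_before"; then
  echo "before recalculation: quorum kept"
else
  echo "before recalculation: quorum lost"   # 3 of 9 votes
fi

if has_quorum "$total_after" "$alive_after"; then
  echo "after recalculation: quorum kept"    # 9 of 12 votes
else
  echo "after recalculation: quorum lost"
fi
```

With the original votes, losing the witness host on top of the secondary site would leave 3 of 9 votes, which is not a majority. With the recalculated votes the same double failure leaves 9 of 12 votes, so the object remains accessible.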
Unexplored Territory #011: Providing enterprise storage services with VMware vSAN featuring Pete Koehler
Episode 11 of the podcast features Pete Koehler, and of course, we discuss vSAN and all the enterprise storage services it provides, like HCI Mesh, vSAN Stretched Clusters, and much more. Listen to it now via Spotify (https://spoti.fi/3twXzsF), or Apple (https://apple.co/3sN0EFR), or of course via the embedded player below.
OneDrive stuck syncing on OSX
I have had this issue with OneDrive for days where it said it was syncing a file but was not making any progress whatsoever. I tried all kinds of different solutions people have recommended, from removing the OneDrive app to killing the agent to deleting the .plist file. Nothing helped solve the problem. Unfortunately, on OSX you can’t see which file is stuck either, so it is very difficult to troubleshoot. In the end I managed to solve it as follows:
Go to “Resources” under your OneDrive app folder, for me this was:
cd /Applications/OneDrive.app/Contents/Resources
Then run the following command. Note that this may (will) trigger a resync, which is annoying, but it did solve the sync issue:
./ResetOneDriveAppStandalone.command
If this doesn’t solve it, it could also be that there’s a hidden file called “.DS_Store” in the folder which is not syncing. Simply look at Finder and find the folders which are not syncing, use the Terminal to go to each folder, and look for the file “.DS_Store” by using “ls -a” (I always use “ls -lah” by default). It hopefully shows a list of files with that hidden file included. If that is the case, simply delete the file using “rm” and then restart the OneDrive agent.
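If there are many folders to check, the manual “ls -a” walk described above can be replaced with a single find call. The following is a rough sketch only: list_ds_store is a made-up helper name, and the folder path in the usage comment is an assumption, so substitute the location of your own OneDrive folder.

```shell
#!/bin/sh
# list_ds_store DIR (hypothetical helper)
# Prints every .DS_Store file found anywhere below DIR.
list_ds_store() {
  find "$1" -type f -name '.DS_Store' -print
}

# Usage (the path is an assumption, adjust it to your setup):
#   list_ds_store "$HOME/OneDrive"
#
# To delete the matches instead of only listing them:
#   find "$HOME/OneDrive" -type f -name '.DS_Store' -delete
```

Listing first and deleting second is deliberate: it lets you confirm which folders are affected before anything is removed. As described above, restart the OneDrive agent afterwards.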
I hope that helps others who hit the same issue.
Does the Native Key Provider require a host to have a TPM?
I got this question on the VMTN forum this week: does the Native Key Provider require a host to have a TPM (Trusted Platform Module)? The documentation does discuss the use of TPM 2.0 when you enable the Native Key Provider. Let’s be clear, the vCenter Server Native Key Provider does not require a TPM! If a TPM is available on each host, then it will be used by the Native Key Provider to store a secret, which enables us to encrypt and decrypt the ESXi configuration. Again, as stated, using a TPM is not a requirement. I have asked to get the documentation updated so that this is officially documented as well. I am just posting it here so that it is indexed by Google.