This was one fun episode. I just love talking to people about things that are relatively new to me, and AI/ML definitely is a relatively new topic for me. Justin Murray is VMware’s Tech Marketing guru on this topic and he had some great knowledge to share with us. On top of that, Justin just has a great way of simplifying/explaining things and always is a pleasure to see present/listen to. Either listen via the embedded player below or listen via Spotify: spoti.fi/3vtxjA6, Apple: apple.co/3ErCdSC, or anywhere else you get your podcasts!
Unexplored Territory #013 – Exploring VMware Cloud on AWS with Adrian Roberts!
In episode 013 we talk to Adrian Roberts, Head of EMEA Solution Architecture for VMware Cloud on AWS at AWS. Adrian discusses the various reasons customers are looking to utilize VMware Cloud on AWS, discusses some of the challenges, and of course the opportunities that arise when you have your VMware workloads close to native AWS services. Listen to the full episode via Spotify (spoti.fi/3DDCoJX) or Apple (apple.co/3r1qAvV)
Unexplored Territory #012: Discussing VMware Cloud offerings with ITQ’s CTO Jeffrey Kusters!
Episode 12 is out, this week we have Jeffrey Kusters to discuss VMware Cloud! Jeffrey discusses the use cases and the different offerings ranging from AWS to Google! Listen now: Spotify (https://spoti.fi/3tsSm6p), Apple (https://apple.co/3ioaQyp).
Stretched cluster witness failure resilience in vSAN 7.0
Cormac and I have been busy the past couple of weeks updating the vSAN Deep Dive to 7.0 U3. Yes, there is a lot to update and add, but we are actually going through it at a surprisingly rapid pace. I guess it helps that we had already written dozens of blog posts on the various topics we need to update or add. One of those topics is “witness failure resilience” which was introduced in vSAN 7.0 U3. I have discussed it before on this blog (here and here) but I wanted to share some of the findings with you folks as well before the book is published. (No, I do not know when the book will be available on Amazon just yet!)
In the scenario below, we failed the secondary site of our stretched cluster completely. We can examine the impact of this failure through RVC on vCenter Server. This will provide us with a better understanding of the situation and how the witness failure resilience mechanism actually works. Note that the below output has been truncated for readability reasons. Let’s take a look at the output of RVC (vsan.vm_object_info) for our VM directly after the failure.
VM R1-R1: Disk backing: [vsanDatastore] 0b013262-0c30-a8c4-a043-005056968de9/R1-R1.vmx RAID_1 RAID_1 Component: 0b013262-c2da-84c5-1eee-005056968de9 , host: 10.202.25.221 votes: 1, usage: 0.1 GB, proxy component: false) Component: 0b013262-3acf-88c5-a7ff-005056968de9 , host: 10.202.25.201 votes: 1, usage: 0.1 GB, proxy component: false) RAID_1 Component: 0b013262-a687-8bc5-7d63-005056968de9 , host: 10.202.25.238 votes: 1, usage: 0.1 GB, proxy component: true) Component: 0b013262-3cef-8dc5-9cc1-005056968de9 , host: 10.202.25.236 votes: 1, usage: 0.1 GB, proxy component: true) Witness: 0b013262-4aa2-90c5-9504-005056968de9 , host: 10.202.25.231 votes: 3, usage: 0.0 GB, proxy component: false) Witness: 47123362-c8ae-5aa4-dd53-005056962c93 , host: 10.202.25.214 votes: 1, usage: 0.0 GB, proxy component: false) Witness: 0b013262-5616-95c5-8b52-005056968de9 , host: 10.202.25.228 votes: 1, usage: 0.0 GB, proxy component: false)
As can be seen, the witness component holds 3 votes, the components on the failed site (secondary) hold 2 votes, and the components on the surviving data site (preferred) hold 2 votes. After the full site failure has been detected, the votes are recalculated to ensure that a witness host failure does not impact the availability of the VMs. Below shows the output of RVC once again.
VM R1-R1: Disk backing: [vsanDatastore] 0b013262-0c30-a8c4-a043-005056968de9/R1-R1.vmx RAID_1 RAID_1 Component: 0b013262-c2da-84c5-1eee-005056968de9 , host: 10.202.25.221 votes: 3, usage: 0.1 GB, proxy component: false) Component: 0b013262-3acf-88c5-a7ff-005056968de9 , host: 10.202.25.201 votes: 3, usage: 0.1 GB, proxy component: false) RAID_1 Component: 0b013262-a687-8bc5-7d63-005056968de9 , host: 10.202.25.238 votes: 1, usage: 0.1 GB, proxy component: false) Component: 0b013262-3cef-8dc5-9cc1-005056968de9 , host: 10.202.25.236 votes: 1, usage: 0.1 GB, proxy component: false) Witness: 0b013262-4aa2-90c5-9504-005056968de9 , host: 10.202.25.231 votes: 1, usage: 0.0 GB, proxy component: false) Witness: 47123362-c8ae-5aa4-dd53-005056962c93 , host: 10.202.25.214 votes: 3, usage: 0.0 GB, proxy component: false)
As can be seen, the votes for the various components have changed, the data site now has 3 votes per component instead of 1, the witness on the witness host went from 3 votes to 1, and on top of that, the witness that is stored in the surviving fault domain now also has 3 votes instead of 1 vote. This now results in a situation where quorum would not be lost even if the witness component on the witness host is impacted by a failure. A very useful enhancement to vSAN 7.0 Update 3 for stretched cluster configurations if you ask me.
Unexplored Territory #011: Providing enterprise storage services with VMware vSAN featuring Pete Koehler
Episode 11 of the podcast features Pete Koehler, and of course, we discuss vSAN and all the enterprise storage services it provides like HCI Mesh, vSAN Stretched Clusters and much more. Listen to it now via Spotify (https://spoti.fi/3twXzsF), or Apple (https://apple.co/3sN0EFR), or of course via the embedded player below