Yellow Bricks

by Duncan Epping

Server

vSAN Stretched Cluster failure matrix

Duncan Epping · May 30, 2023 ·

Over the last couple of weeks I was involved in an internal discussion about the different vSAN stretched cluster failure scenarios. I wrote a lengthy email about how vSAN and HA would respond in certain scenarios. I have already documented many of these on my blog over the years, but never published them as a whole.

In some of the scenarios below, I discuss a “partition”. A partition is a scenario where, for one of the locations, both the L3 connection to the witness and the inter-site / inter-switch link to the other site are down. So in the diagram above, for instance, if I say that Site B is partitioned, it means that Site A can still communicate with the witness, but Site B can communicate with neither the witness nor Site A.
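
To make the definition concrete, here is a minimal Python sketch that classifies link states according to the partition definition above. The `Links` class and the `partitioned_site` function are hypothetical names introduced purely for illustration; they are not part of any vSAN tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Links:
    """State of the three links in a stretched cluster (True = up)."""
    a_to_witness: bool  # L3 connection between Site A and the witness
    b_to_witness: bool  # L3 connection between Site B and the witness
    isl: bool           # inter-site link between Site A and Site B


def partitioned_site(links: Links) -> Optional[str]:
    """Return "A" or "B" when exactly that site is partitioned, else None.

    A site counts as partitioned when it has lost both its witness link and
    the ISL, while the other site can still reach the witness.
    """
    if links.isl:
        return None  # the sites can still talk, so neither is partitioned
    if links.a_to_witness and not links.b_to_witness:
        return "B"
    if links.b_to_witness and not links.a_to_witness:
        return "A"
    return None  # ISL-only failure, or every link is down


# Site B has lost both the witness link and the ISL.
print(partitioned_site(Links(a_to_witness=True, b_to_witness=False, isl=False)))
```

Note that an ISL failure on its own returns `None` here, which matches the table treating "ISL down" as a separate scenario from a partition.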

For all of the scenarios below the following applies: Site A is the preferred location and Site B is the secondary location. In the table, the first two columns refer to the policy settings for the VM, as shown in the screenshot below. The third column refers to the location where the VM runs from a compute perspective. The fourth describes the type of failure, and the fifth and sixth columns describe the behavior witnessed.

Time to list the various scenarios. And no, the list doesn’t include every failure that could occur, but it should cover most scenarios that are important for a stretched cluster configuration. Do note that the behavior discussed below will only be witnessed when the best practices, as documented here and here, are followed. Also note that the table has multiple pages; there are close to 30 scenarios described! If there are any questions, feel free to leave a comment. If you feel a failure scenario is missing, please leave a comment as well.

Site Disaster Tolerance | Failures to Tolerate | VM Location | Failure | vSAN behavior | HA behavior
None Preferred | No data redundancy | Site A or B | Host failure Site A | Objects are inaccessible if failed host contained one or more components of objects | VM cannot be restarted as object is inaccessible
None Preferred | RAID-1/5/6 | Site A or B | Host failure Site A | Objects are accessible as there's site local resiliency | VM does not need to be restarted, unless VM was running on failed host
None Preferred | No data redundancy / RAID-1/5/6 | Site A | Full failure Site A | Objects are inaccessible as full site failed | VM cannot be restarted in Site B, as all objects reside in Site A
None Preferred | No data redundancy / RAID-1/5/6 | Site B | Full failure Site B | Objects are accessible, as only Site A contains objects | VM can be restarted in Site A, as that is where all objects reside
None Preferred | No data redundancy / RAID-1/5/6 | Site A | Partition Site A | Objects are accessible as all objects reside in Site A | VM does not need to be restarted
None Preferred | No data redundancy / RAID-1/5/6 | Site B | Partition Site B | Objects are accessible in Site A, objects are not accessible in Site B as network is down | VM is restarted in Site A, and killed by vSAN in Site B
None Secondary | No data redundancy / RAID-1/5/6 | Site B | Partition Site B | Objects are accessible in Site B | VM resides in Site B, does not need to be restarted
None Preferred | No data redundancy / RAID-1/5/6 | Site A | Witness Host Failure | No impact, witness host is not used as data is not replicated | No impact
None Secondary | No data redundancy / RAID-1/5/6 | Site B | Witness Host Failure | No impact, witness host is not used as data is not replicated | No impact
Site Mirroring | No data redundancy | Site A or B | Host failure Site A or B | Components on failed hosts inaccessible, read and write IO across ISL as no redundancy locally, rebuild across ISL | VM does not need to be restarted, unless VM was running on failed host
Site Mirroring | RAID-1/5/6 | Site A or B | Host failure Site A or B | Components on failed hosts inaccessible, read IO locally due to RAID, rebuild locally | VM does not need to be restarted, unless VM was running on failed host
Site Mirroring | No data redundancy / RAID-1/5/6 | Site A | Full failure Site A | Objects are inaccessible in Site A as full site failed | VM restarted in Site B
Site Mirroring | No data redundancy / RAID-1/5/6 | Site A | Partition Site A | Objects are inaccessible in Site A as full site is partitioned and quorum is lost | VM restarted in Site B
Site Mirroring | No data redundancy / RAID-1/5/6 | Site A | Witness Host Failure | Witness object inaccessible, VM remains accessible | VM does not need to be restarted
Site Mirroring | No data redundancy / RAID-1/5/6 | Site B | Full failure Site A | Objects are inaccessible in Site A as full site failed | VM does not need to be restarted as it resides in Site B
Site Mirroring | No data redundancy / RAID-1/5/6 | Site B | Partition Site A | Objects are inaccessible in Site A as full site is partitioned and quorum is lost | VM does not need to be restarted as it resides in Site B
Site Mirroring | No data redundancy / RAID-1/5/6 | Site B | Witness Host Failure | Witness object inaccessible, VM remains accessible | VM does not need to be restarted
Site Mirroring | No data redundancy / RAID-1/5/6 | Site A | Network failure between Site A and B (ISL down) | Site A binds with witness, objects in Site B becomes inaccessible | VM does not need to be restarted
Site Mirroring | No data redundancy / RAID-1/5/6 | Site B | Network failure between Site A and B (ISL down) | Site A binds with witness, objects in Site B becomes inaccessible | VM restarted in Site A
Site Mirroring | No data redundancy / RAID-1/5/6 | Site A or Site B | Network failure between Witness and Site A (or B) | Witness object absent, VM remains accessible | VM does not need to be restarted
Site Mirroring | No data redundancy / RAID-1/5/6 | Site A | Full failure Site A, and simultaneous Witness Host Failure | Objects are inaccessible in Site A and Site B due to quorum being lost | VM cannot be restarted
Site Mirroring | No data redundancy / RAID-1/5/6 | Site A | Full failure Site A, followed by Witness Host Failure a few minutes later | Pre vSAN 7.0 U3: Objects are inaccessible in Site A and Site B due to quorum being lost | VM cannot be restarted
Site Mirroring | No data redundancy / RAID-1/5/6 | Site A | Full failure Site A, followed by Witness Host Failure a few minutes later | Post vSAN 7.0 U3: Objects are inaccessible in Site A, but accessible in Site B as votes have been recounted | VM restarted in Site B
Site Mirroring | No data redundancy / RAID-1/5/6 | Site B | Full failure Site B, followed by Witness Host Failure a few minutes later | Post vSAN 7.0 U3: Objects are inaccessible in Site B, but accessible in Site A as votes have been recounted | VM restarted in Site A
Site Mirroring | No data redundancy | Site A | Full failure Site A, and simultaneous host failure in Site B | Objects are inaccessible in Site A, if components reside on failed host then object is inaccessible in Site B | VM cannot be restarted
Site Mirroring | No data redundancy | Site A | Full failure Site A, and simultaneous host failure in Site B | Objects are inaccessible in Site A, if components do not reside on failed host then object is accessible in Site B | VM restarted in Site B
Site Mirroring | RAID-1/5/6 | Site A | Full failure Site A, and simultaneous host failure in Site B | Objects are inaccessible in Site A, accessible in Site B as there's site local resiliency | VM restarted in Site B
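
Many of the Site Mirroring rows above come down to which side still holds a majority of votes. The following Python sketch illustrates that majority rule under simplifying assumptions: one vote per site plus one for the witness, and an object staying accessible only while more than half of all votes are reachable. The vote numbers, the `has_quorum` helper, and the recounted vote distribution are hypothetical illustrations, not the actual values vSAN assigns.

```python
from typing import Dict, Set


def has_quorum(votes: Dict[str, int], reachable: Set[str]) -> bool:
    """True when the reachable fault domains hold more than 50% of all votes."""
    total = sum(votes.values())
    held = sum(count for domain, count in votes.items() if domain in reachable)
    return 2 * held > total


# Assumed starting point: one vote each for Site A, Site B, and the witness.
votes = {"A": 1, "B": 1, "W": 1}

# ISL down: Site A binds with the witness, Site B is on its own.
print(has_quorum(votes, {"A", "W"}))  # Site A side
print(has_quorum(votes, {"B"}))       # Site B side

# Full failure of Site A: Site B plus the witness still hold a majority.
print(has_quorum(votes, {"B", "W"}))

# Witness fails afterwards without a recount: Site B alone loses quorum.
print(has_quorum(votes, {"B"}))

# Hypothetical recount after the Site A failure that shifts votes to Site B,
# so Site B alone keeps a majority when the witness fails later.
recounted = {"A": 1, "B": 3, "W": 1}
print(has_quorum(recounted, {"B"}))
```

Under these assumptions the script prints True, False, True, False, True, which lines up with the "ISL down" rows and with the pre and post vSAN 7.0 U3 rows for a site failure followed by a witness failure.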

New book: VMware vSAN 8.0 U1 Express Storage Architecture Deep Dive!

Duncan Epping · Apr 27, 2023 ·

We already gave some hints on Twitter, and during an episode of the Unexplored Territory podcast, but here it finally is… The new book, the VMware vSAN 8.0 U1 Express Storage Architecture Deep Dive! It has been a year since we released the vSAN 7.0 U3 Deep Dive book, and with a brand new architecture being introduced in vSAN 8.0 we figured it was time to do a full overhaul of the book as well. Mind you, this new book deals purely with the Express Storage Architecture, aka vSAN ESA. This also means that features which are not supported by ESA are not discussed in this book; for those you will need to buy the vSAN 7.0 U3 Deep Dive book, which covers OSA. Another big change is that we brought in a third author: we asked our good friend Pete Koehler to contribute to the book. Pete had reviewed previous books, and considering the amount of material he produced for VMware Tech Marketing for vSAN (and ESA specifically), it made a lot of sense to bring him in!

VMware’s vSAN has rapidly proven itself in environments ranging from hospitals to oil rigs to e-commerce platforms and is the market leader in the hyperconverged space. Along the way, the world of IT has rapidly changed, not just from a software point of view, but also from a hardware perspective. With vSAN 8.0 VMware brought a new architecture to market called vSAN Express Storage Architecture (ESA). This architecture is highly optimized for today’s world of datacenter resources, be it CPU, memory, networking, or NVMe based flash storage.

The authors of the vSAN Deep Dive have thoroughly updated their definitive guide to this transformative technology. Writing for vSphere administrators, architects, and consultants, Cormac Hogan, Duncan Epping, and Pete Koehler explain what vSAN ESA is, why the architecture has changed, what it now offers, and how to gain maximum value from it. The book offers expert insight into preparation, installation, configuration, policies, provisioning, clusters, architecture, and more. You’ll also find practical guidance for using all data services, stretched clusters, two-node configurations, and cloud-native storage services.

Although we pressed publish on Tuesday, it sometimes takes a while before the book is available in all Amazon stores, but it should trickle through within the next 24-48 hours. The book is priced at 9.99 USD for the ebook and 29.99 USD for a paper copy, and is sold through Amazon only. Get it while it is hot, and we would appreciate it if you would use our referral links and leave a review when you finish it. Thanks for the support, and we hope you will enjoy it!

  • paper – 29.99 USD
  • ebook – 9.99 USD

Of course, we also have the links to other major Amazon stores:

  • United Kingdom – ebook – paper
  • Germany – ebook – paper
  • Netherlands – ebook – paper
  • Canada – ebook – paper
  • France – ebook – paper
  • Spain – ebook – paper
  • India – ebook
  • Japan – ebook – paper
  • Italy – ebook – paper
  • Mexico – ebook
  • Australia – ebook – paper
  • Brazil – ebook
  • Or just do a search in your local Amazon store!

VMUG Advantage Homelab Group Buy Discount offer 2023! (Also for renewals!)

Duncan Epping · Apr 19, 2023 ·

It is that time of the year again for many: time to renew your VMUG subscription. The minimum discount you will get is 12%, and this can go up to 15% when the number of participants reaches 300 or more, which drops the price down to 170 USD. What do you get when you sign up and buy a 12-month subscription?

  • 365-day Evaluation Licenses
    • Including vSphere 8, vSAN 8, Workstation 17 Pro, Fusion 13 Pro, NSX, vRealize, Horizon, and more!
  • 20-35% discount on training and certification
  • Access to “test drive”
  • Advantage members receive a 100 USD VMware Explore discount (not stackable)

The VMUG Advantage Program comes at a cost of 200 USD. Last year the discount was 15%, which means the price ended up being 170 USD for a full year. If you have just one training course planned per year, the VMUG Advantage Program will have already paid for itself (20% discount on a 3000+ USD training course). Yes, I have been talking about USD so far, but of course, this offer is available to all our community members globally (Europe, APJ, Africa, Middle East, etc). Now, again, the discount percentage you get will depend on the number of people signing up for this year’s promotion, but even if only 1 person signs up (you) you will immediately get a 12% discount. The ranges look as follows:

Quantity Discount Cost
1-199 12% $176
200-299 14% $172
300+ 15% $170
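
The costs in the table follow directly from the 200 USD list price and the discount percentage for each range. As a small sanity check, here is a Python sketch of that arithmetic; the function and constant names are hypothetical and only serve this illustration.

```python
LIST_PRICE_USD = 200  # list price of VMUG Advantage as stated in the post

# (minimum number of participants, discount percentage), highest tier first
TIERS = [(300, 15), (200, 14), (1, 12)]


def discount_percent(participants: int) -> int:
    """Return the group-buy discount percentage for a participant count."""
    for minimum, percent in TIERS:
        if participants >= minimum:
            return percent
    return 0  # nobody signed up, so no group discount applies


def discounted_price(participants: int) -> int:
    """Return the price in whole USD after applying the tier discount."""
    return LIST_PRICE_USD * (100 - discount_percent(participants)) // 100


for count in (1, 200, 300):
    print(count, discount_percent(count), discounted_price(count))
```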

If more than 1000 people sign up, VMUG HQ will also do a raffle and give away some cool VMUG Advantage “swag”. Can’t wait? Sign up for the discount code here, and join the program! Note that the survey is open for 2 weeks, from the 19th of April 2023 until the 3rd of May. After the survey closes, the discount code will be distributed to all those who signed up.

UPDATE: The goal has been reached, and you can get a 15% discount when using the code: ADV15OFF. Note: This promotion is only available until May 3rd, 2023!

vSphere 8.0 U1 and vSAN 8.0 U1 what’s new podcast episodes available now!

Duncan Epping · Mar 15, 2023 ·

We (the Unexplored Territory team) have just published two brand-new episodes which discuss what’s new with vSphere 8.0 U1 and vSAN 8.0 U1. You can of course listen to them using your favorite podcast app, or you can simply use the embedded players below to enjoy the content.

vSAN ESA ReadyNode configurations are more flexible than you think!

Duncan Epping · Mar 8, 2023 ·

I had a discussion at the Dutch VMUG yesterday about the ReadyNode configurations for vSAN ESA. The discussion was about how difficult it was to select a host and customize it. It was then that I realized that most people hadn’t noticed yet that there is an easier method (or lifehack as my kids would say) when it comes to selecting your server model. How does that work? Well, let me show you!

First, let’s take a look at the vSAN ESA ReadyNode Hardware Guidance Table. The table below shows you what the node capacity is for each profile from a storage, CPU, memory, and networking perspective.

Now if you look at the table you will see that as the “profile” number goes up, so does the capacity for each of the various components. This is actually what provides you with a lot of flexibility, in my opinion. If we take Dell as an example (the same applies to most vendors on the current list), select “vSAN-ESA-AF2”, and look at the list of options, we see the following:

  • PowerEdge R650
  • PowerEdge R6515
  • PowerEdge R750
  • PowerEdge R7515

Now, if we look at “vSAN-ESA-AF8” next, which is the highest profile, we see that we can only pick one server model, which happens to be the PowerEdge R750. If we then look at the difference between the hosts selected for each profile, a few things stand out:

  • vSAN-ESA-AF2 has an Intel Xeon Silver 4314, while vSAN-ESA-AF8 has a Platinum 8358
  • vSAN-ESA-AF2 has 512GB, while vSAN-ESA-AF8 has 1024GB
  • vSAN-ESA-AF2 has a 25Gbps NIC, while vSAN-ESA-AF8 has a 100Gbps NIC
  • vSAN-ESA-AF2 has five 3.2TB NVMe devices while vSAN-ESA-AF8 has twenty-four 3.2TB devices

Now if I look at the KB article which explains what you can and cannot change, something stands out: most of the components can be modified/customized. For instance, for CPU you can go to a higher core count and/or a higher base clock speed! For memory, you can go up, and the same goes for storage devices (as long as you stay within supported limits), etc.

In other words, what is the difference between a vSAN-ESA-AF2 and a vSAN-ESA-AF8? Basically the expected workload, the performance, and the capacity. This ultimately results in a different configuration. Nothing, at this point in time, stops you from selecting the “lowest” vSAN ReadyNode profile and speccing it as an “AF4”, “AF6” or “AF8” from a CPU stance, or from a storage/memory capacity point of view. If you want to have some more flexibility, try selecting a smaller profile, select the host type, and increase the resources/components where needed!
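
The idea of starting from a small profile and scaling up can be sketched in a few lines of Python. The memory, NIC, and NVMe figures below are the ones from the Dell comparison above; the `NodeSpec` class and the `meets_or_exceeds` check are hypothetical and deliberately simplified. They do not model CPU, the supported component list, or the limits described in the KB article, so treat this as an illustration of the reasoning rather than a validation tool.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NodeSpec:
    memory_gb: int
    nic_gbps: int
    nvme_devices: int
    nvme_tb_each: float


def meets_or_exceeds(baseline: NodeSpec, custom: NodeSpec) -> bool:
    """True when every resource in `custom` is at least the baseline value."""
    return (
        custom.memory_gb >= baseline.memory_gb
        and custom.nic_gbps >= baseline.nic_gbps
        and custom.nvme_devices >= baseline.nvme_devices
        and custom.nvme_tb_each >= baseline.nvme_tb_each
    )


# Values taken from the Dell AF2 / AF8 comparison in this post.
AF2 = NodeSpec(memory_gb=512, nic_gbps=25, nvme_devices=5, nvme_tb_each=3.2)
AF8 = NodeSpec(memory_gb=1024, nic_gbps=100, nvme_devices=24, nvme_tb_each=3.2)

# Start from the AF2 profile and scale memory and networking up.
custom = replace(AF2, memory_gb=1024, nic_gbps=100)

print(meets_or_exceeds(AF2, custom))  # customized node still satisfies AF2
print(meets_or_exceeds(AF8, custom))  # but it has fewer NVMe devices than AF8
```

The first check succeeds and the second fails, because the customized node still has five NVMe devices; adding devices (within supported limits) would close that gap as well.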

When you start exploring the options it may seem complex, but when you look more closely you will quickly realize that it isn’t, and that it provides you with a lot of flexibility, as long as you stick to the rules and pick supported components!

About the Author

Duncan Epping is a Chief Technologist and Distinguished Engineering Architect at Broadcom. Besides writing on Yellow-Bricks, Duncan is the co-author of the vSAN Deep Dive and the vSphere Clustering Deep Dive book series. Duncan is also the host of the Unexplored Territory Podcast.

Copyright Yellow-Bricks.com © 2025