I had a question earlier from a customer who wanted to test the vSAN Stretched Cluster functionality introduced in 9.0 called Site Maintenance. Yes, it is indeed what you would expect it to be: a new feature that allows you to place a whole site into maintenance mode at once. Very useful, but this customer was unable to find the button in the UI. Which, by the way, is not strange, as this capability (along with Manual Site Takeover) is only available through an RPQ request at the moment, and it is also only available for vSAN OSA for now, so keep that in mind when filing an RPQ through your Broadcom/VMware contact! When you get approved, you will be informed on how to get this functionality enabled, and it will then be added to the UI at the fault domain level, as shown in the screenshot below, taken from my 9.0 lab!
[Screenshot: the Site Maintenance option on a vSAN stretched cluster fault domain]
vSAN Component vote recalculation with Witness Resilience, the follow-up!
I wrote about the Witness Resilience feature a few years ago and had a question on this topic today. I did some tests and then realized I already had an article describing how it works, but as I also tested a different scenario, I figured I would write a follow-up. In this case we are particularly talking about a 2-node configuration, but this would also apply to a stretched cluster.
In a stretched cluster or 2-node configuration, when a data site goes down (or is placed into maintenance mode), a vote recalculation is automatically done for each object/component. This ensures that if the witness subsequently fails, the objects/VMs remain accessible. How that works I’ve explained here, and demonstrated for a 2-node cluster here.
But what if the Witness fails first? Well, I can explain it fairly easily: if the Witness goes down first, the votes will not be recalculated, so a subsequent host failure will leave the VMs inaccessible. Of course, I tested this, and the screenshots below demonstrate it.
This screenshot shows the witness as Absent, while both “data” components have 1 vote each. This means that if we fail one of those hosts, the object (and thus the VM) will become inaccessible. Let’s do that next and then check the UI for more details.
As you can see below, the VM is now inaccessible. This is because there is no longer quorum: 2 out of 3 votes are gone.
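To make the quorum math a bit more tangible, here is a minimal Python sketch that models the vote counting described above. To be clear, this is a simplified illustration and not vSAN code, and the vote numbers after recalculation are just example values.

```python
# Simplified model of vSAN quorum: an object stays accessible only while the
# surviving components hold a strict majority (> 50%) of the total votes.
# This is an illustration of the behavior described above, not actual vSAN code.

def is_accessible(components):
    """components: dict mapping component name -> (votes, alive)"""
    total_votes = sum(votes for votes, _ in components.values())
    alive_votes = sum(votes for votes, alive in components.values() if alive)
    return alive_votes * 2 > total_votes  # strict majority required

# Scenario 1: a data host fails first, vSAN recalculates the votes
# (example values below), and only then does the witness fail.
recalculated = {
    "data-host1": (1, False),  # failed / in maintenance mode
    "data-host2": (3, True),   # votes bumped by the recalculation (example value)
    "witness":    (1, False),  # witness fails afterwards
}
print(is_accessible(recalculated))      # True  -> VM remains accessible

# Scenario 2: the witness fails first, so no recalculation takes place,
# and then one of the data hosts fails as well.
not_recalculated = {
    "data-host1": (1, True),
    "data-host2": (1, False),  # second failure
    "witness":    (1, False),  # failed first, votes never recalculated
}
print(is_accessible(not_recalculated))  # False -> 2 of 3 votes gone, no quorum
```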
I hope that explains how this works.
Doing site maintenance in a vSAN Stretched Cluster configuration
I thought I wrote an article about this years ago, but it appears I wrote an article about doing maintenance mode with a 2-node configuration instead. As I’ve received some questions on this topic, I figured I would write a quick article that describes the concept of site maintenance. Note that in a future version of vSAN, we will have an option in the UI that helps with this, as described here.
First and foremost, you will need to validate that all data is replicated. In some cases we see customers pinning data (VMs) to a single location without replication, and those VMs will be directly impacted when a whole site is placed into maintenance mode. Those VMs will either need to be powered off, or, if they need to stay running, moved to the location that remains running. Do note that if you flip “Preferred / Secondary” and there are many VMs that are site local, this could lead to a huge amount of resync traffic. And if those VMs need to stay running, you may also want to reconsider your decision not to replicate them!
These are the steps I would take when placing a site into maintenance mode (a scripted sketch of some of these steps follows the list):
- Verify the vSAN Witness is up and running and healthy (see health checks)
- Check compliance of VMs that are replicated
- Configure DRS to “partially automated” or “Manual” instead of “Fully automated”
- Manually vMotion all VMs from Site X to Site Y
- Place each ESXi host in Site X into maintenance mode with the option “no data migration”
- Power Off all the ESXi hosts in Site X
- Enable DRS again in “fully automated” mode so that within Site Y the environment stays balanced
- Do whatever needs to be done in terms of maintenance
- Power On all the ESXi hosts in Site X
- Exit maintenance mode for each host
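For those who like to script (part of) this, below is a rough pyVmomi sketch of two of the steps above: switching DRS to partially automated and placing the Site X hosts into maintenance mode with the vSAN “no data migration” option. The vCenter address, credentials, cluster name, and the site-x-esx* host names are placeholders for illustration, so treat this as a sketch under those assumptions rather than a polished script; the manual vMotion and power off steps are left out.

```python
# Rough sketch (pyVmomi) of the DRS and maintenance mode steps from the list above.
# vCenter address, credentials, cluster name, and host names are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

SITE_X_HOSTS = ["site-x-esx01.lab.local", "site-x-esx02.lab.local"]  # placeholder names
CLUSTER_NAME = "Stretched-Cluster"                                   # placeholder name

ctx = ssl._create_unverified_context()  # lab only; use proper certificates in production
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

def find_objects(vimtype):
    """Return all inventory objects of the given type."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    return list(view.view)

# Step: switch DRS to "partially automated" so it does not move VMs around mid-maintenance
cluster = next(c for c in find_objects(vim.ClusterComputeResource) if c.name == CLUSTER_NAME)
drs_spec = vim.cluster.ConfigSpecEx(
    drsConfig=vim.cluster.DrsConfigInfo(enabled=True, defaultVmBehavior="partiallyAutomated"))
WaitForTask(cluster.ReconfigureComputeResource_Task(drs_spec, True))

# Step: place each Site X host into maintenance mode with "no data migration"
no_data_migration = vim.host.MaintenanceSpec(
    vsanMode=vim.vsan.host.DecommissionMode(objectAction="noAction"))
for host in find_objects(vim.HostSystem):
    if host.name in SITE_X_HOSTS:
        WaitForTask(host.EnterMaintenanceMode_Task(
            timeout=0, evacuatePoweredOffVms=False, maintenanceSpec=no_data_migration))

Disconnect(si)
```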
Do note that VMs will not automatically migrate back until the resync for that given VM has fully completed. DRS and vSAN are aware of the replication state! Additionally, if VMs are actively doing IO when the hosts in Site X go into maintenance mode, the data stored on the hosts within Site X will be out of date and will need to be resynced afterwards. This concern will be resolved in the future by the “site maintenance” feature discussed at the start of this article.
VCF-9 Vision for a Federated Storage View and vSAN (stretched cluster) visualizations!
As mentioned last week, the sessions at Explore Barcelona were not recorded. I still wanted to share with you what we are working on, so I decided to record and share a few demos, along with some of the slides we presented. In this video, I show our vision for a Federated Storage View for both vSAN and more traditional storage systems. This federated view will not only provide insights in terms of capacity and performance, it will also provide you with a visualization of a stretched cluster configuration. This is something I have been asking for for a while now, and it looks like it will become a reality in VCF 9 at some point. As this all revolves around visualization, I would urge you to watch the video below. And as always, if you have feedback, please leave a comment!
Doing network/ISL maintenance in a vSAN stretched cluster configuration!
I got a question earlier about maintenance of an ISL in a vSAN Stretched Cluster configuration, which had me thinking for a while. The question was what you would do with your workload during the maintenance. The easiest option, of course, is to power off all VMs and simply shut down the cluster, for which vSAN has a UI option, and there’s a KB you can follow. Now, of course, there could also be a situation where the VMs need to remain running. But how does this work when you end up losing the connection between all three locations? Normally this would lead to a situation where all VMs become “inaccessible”, as you end up losing quorum.
As said, this had me thinking: you could take advantage of the “vSAN Witness Resilience” mechanism, which was introduced in vSAN 7.0 U3. How would this work?
Well, it is actually pretty straightforward: if all hosts of one site are in maintenance mode, failed, or powered off, the votes of the witness object for each VM/object will be recalculated within roughly 3 minutes. When this recalculation has completed, the witness can go down without any impact on the VMs. We introduced this capability to increase resiliency in a double-failure scenario, but we can also (ab)use this functionality during maintenance. Of course I had to test this, so the first step I took was placing all hosts in one location into maintenance mode (no data evacuation). This resulted in all my VMs being vMotioned to the other site.
Next I checked with RVC whether the votes had been recalculated. As stated, depending on the number of VMs this can take around 3 minutes in total, but it will usually be quicker. After the recalculation had completed, I powered off the Witness, and as shown below, all VMs were still running.
Of course, I had to double-check on the command line using RVC (you can use the command “vsan.vm_object_info” to check a particular object, for instance) to ensure that the components of those VMs were indeed still “ACTIVE” instead of “ABSENT”, and there you go!
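On top of the RVC validation, here is a quick pyVmomi check you could run as well: it simply lists any VMs that are no longer in a “connected” state after the witness has been powered off. The vCenter address and credentials are placeholders, and this only checks VM accessibility at the vCenter level, it does not replace looking at the actual component state.

```python
# Quick sanity check (pyVmomi): list VMs that are not in a "connected" state,
# e.g. "inaccessible" or "orphaned". vCenter address and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
problem_vms = [vm.name for vm in view.view if vm.runtime.connectionState != "connected"]
print("VMs not in a connected state:", problem_vms if problem_vms else "none")

Disconnect(si)
```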
Now, when maintenance has been completed, you simply do the reverse: you power on the witness, and then you power on the hosts in the location that was taken down for maintenance. After the resync has completed, the VMs will be rebalanced again by DRS. Note that DRS rebalancing (or “should” rules being applied) will only happen once the resync for the VM has completed.
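And when the ISL maintenance is done, the exit path can be scripted in the same fashion. A minimal sketch, again with placeholder names and credentials, that takes the hosts out of maintenance mode once the witness is back online:

```python
# Minimal sketch (pyVmomi): exit maintenance mode on the hosts of the site that
# was taken down, once the witness is back. Names and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

SITE_HOSTS = ["site-x-esx01.lab.local", "site-x-esx02.lab.local"]  # placeholder names

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
for host in view.view:
    if host.name in SITE_HOSTS and host.runtime.inMaintenanceMode:
        WaitForTask(host.ExitMaintenanceMode_Task(timeout=0))

Disconnect(si)
```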