
Yellow Bricks

by Duncan Epping


VMware

vSAN Stretched Cluster vs Fault Domains in a “campus” setting?

Duncan Epping · Sep 25, 2025 · 2 Comments

I got this question internally recently: should we create a vSAN Stretched Cluster configuration or a vSAN Fault Domains configuration when we have multiple datacenters in close proximity on our campus? In this case, we are talking about less than 1ms RTT latency between buildings, which are maybe a few hundred meters apart at most. I think it is a very valid question, and I guess it kind of depends on what you are looking to get out of the infrastructure. I wrote down the pros and cons and wanted to share those with the rest of the world as well, as they may be useful for some of you out there. If anyone has additional pros and cons, feel free to share those in the comments!

vSAN Stretched Clusters:

  • Pro: You can replicate across fault domains AND protect additionally within a fault domain with R1/R5/R6 if required.
  • Pro: You can decide whether VMs should be stretched across Fault Domains or not, or just protected within a fault domain/site
  • Pro: Requires less than 5ms RTT latency, which is easily achievable in this scenario
  • Con/pro: you probably also need to think about DRS/HA groups (VM-to-Host)
  • Con: From an operational perspective, it introduces a witness host and sites, which may complicate things and at the very least requires a bit more thinking
  • Con: Witness needs to be hosted somewhere
  • Con: Limited to 3 Fault Domains (2x data + 1x witness)
  • Con: Limited to 20+20+1 configuration

vSAN Fault Domains:

  • Pro: Usually no real considerations around VM-to-Host rules, although you can still use them to ensure certain VMs are spread across buildings
  • Pro: No Witness Appliance to manage, update or upgrade. No overhead of running a witness somewhere
  • Pro: No design considerations around “dedicated” witness sites and “data sites”; each site has the same function
  • Pro: Can also be used with more than 3 Fault Domains or Datacenters, so could even be 6 Fault Domains, for instance
  • Pro: Theoretically can go up to 64 hosts
  • Con: No ability to protect additionally within a fault domain
  • Con: No ability to specify that you don’t want to replicate VMs across Fault Domains
  • Con/Pro: Requires sub-1ms RTT latency at all times, which is low, but usually achievable in a campus cluster (see the quick latency check sketch below this list)
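
Both lists hinge on those RTT numbers (less than 5ms for a stretched cluster, sub-1ms for fault domains), so it is worth measuring the latency between buildings before deciding. Below is a minimal Python sketch that uses TCP connect time to port 443 as a rough proxy for RTT; the hostnames are placeholders, and for a real assessment you would of course use proper network tooling (or vmkping between the vSAN vmkernel interfaces) rather than this quick check.

```python
#!/usr/bin/env python3
"""Rough RTT check from a management box to hosts in the other building(s).
Uses TCP connect time to port 443 as a latency proxy; the hostnames are
placeholders, not real systems."""
import socket
import statistics
import time

HOSTS = ["esxi-building-a.lab.local", "esxi-building-b.lab.local"]  # placeholders
SAMPLES = 20

def tcp_rtt_ms(host: str, port: int = 443, timeout: float = 2.0) -> float:
    """Return the time (in ms) it takes to complete a TCP connect to host:port."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0

for host in HOSTS:
    samples = [tcp_rtt_ms(host) for _ in range(SAMPLES)]
    print(f"{host}: median {statistics.median(samples):.2f} ms, "
          f"max {max(samples):.2f} ms")
```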

#101 – Discussing VCF 9.0 and Private AI Foundation with NVIDIA enhancements with Frank Denneman!

Duncan Epping · Aug 4, 2025 · Leave a Comment

It was time to get Frank Denneman back on the show to discuss the enhancements introduced in VCF 9 and Private AI Foundation with NVIDIA. Frank goes over all the new functionality and enhancements, such as Agent Builder. Frank also mentioned various must-attend sessions at Explore; register now, as these will fill up fast:

  • Chris Wolf, keynote and breakout
  • Shawn Kelly and Justin Murray: Accelerating AI Workloads
  • Frank’s AI/ML sessions!

You can listen to the episode on Apple Podcasts, Spotify, or anywhere else you find your podcasts! Of course, you can also simply use the embedded player below.

Does vSAN support a Franken cluster configuration?

Duncan Epping · May 28, 2025 · Leave a Comment

It is funny that this has come up a few times now, actually for the third time in a month. I had a question about whether you can mix AMD and Intel hosts in the same cluster. Although nothing stops you from doing this, and vSAN supports this configuration, you need to remember that you cannot live migrate (vMotion) between those hosts, which means that if you have DRS enabled you are seriously crippling the cluster, as it makes balancing resources much more complex.

You are creating a Franken cluster when mixing AMD and Intel. You may ask yourself, why would anyone want to do this in the first place? Well, you could do this for migration purposes, for instance. If you use the vSAN iSCSI service, this could be a way to migrate those iSCSI LUNs from the old hosts to the new hosts. How? Simply add the new hosts to the cluster, place the old hosts into maintenance mode, and make sure to migrate the storage. Do note that all the VMs (or containers) will have to be powered off and powered on again manually on the new hosts, as a result of moving from Intel to AMD (or the other way around).

If you do end up doing this for migration purposes, please ensure it is for the shortest time possible. Please avoid running with a Franken cluster for multiple days, weeks, or, god forbid, months. Nothing good will come out of it, and your VMs may become little monsters!
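
If you want to quickly spot a Franken cluster, or verify it is gone once the migration has completed, comparing the CPU models reported per host is enough. The pyVmomi sketch below is one rough way to do that; the vCenter address, credentials, and cluster name are placeholders, and certificate verification is disabled purely for lab use.

```python
#!/usr/bin/env python3
"""Sketch: group the hosts of a cluster by CPU model to spot a mixed
Intel/AMD ("Franken") cluster. Placeholders: vCenter address, credentials,
and the cluster name."""
import ssl
from collections import defaultdict

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

VCENTER, USER, PWD = "vcsa.lab.local", "administrator@vsphere.local", "secret"
CLUSTER = "vSAN-Cluster"  # placeholder cluster name

ctx = ssl._create_unverified_context()  # lab only; validate certificates in production
si = SmartConnect(host=VCENTER, user=USER, pwd=PWD, sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    models = defaultdict(list)
    for host in view.view:
        if host.parent and host.parent.name == CLUSTER:
            models[host.summary.hardware.cpuModel].append(host.name)
    for cpu_model, hosts in models.items():
        print(f"{cpu_model}: {', '.join(hosts)}")
    if len(models) > 1:
        print("Multiple CPU models found -- live migration (vMotion) between "
              "these hosts will not work.")
finally:
    Disconnect(si)
```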

Can I have an AF-4 ReadyNode for vSAN ESA with less memory?

Duncan Epping · Feb 18, 2025 · Leave a Comment

I got this question the other day, and it was about the amount of memory the AF-4 ReadyNode configuration needs to have in order for it to be supported. I can understand where the question comes from, but what most people don’t seem to realize is that there’s a set of minimum requirements, and that the ReadyNode profiles are, as the KB states, “guidance”. The listed configurations are guidance based on the anticipated resource consumption for a given set of VMs. Of course, this could be very different for your workload. That is why the article that describes the hardware guidance now clearly states the following:

To maintain a configuration supported by VMware Global Services (GS), all ReadyNodes certified for vSAN ESA must meet or exceed the resources of the smallest configuration (vSAN-ESA-AF-0 for vSAN HCI or vSAN-Max-XS for vSAN Max).

This not only applies to memory, but also to other components, as long as you meet the specified minimums.
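
If you want to sanity-check a proposed node against that rule, it comes down to a per-component “meet or exceed” comparison with the smallest certified configuration. The little Python sketch below illustrates the idea; the minimum values in it are made-up placeholders, so replace them with the actual vSAN-ESA-AF-0 (or vSAN-Max-XS) numbers from the KB.

```python
#!/usr/bin/env python3
"""Sketch: per-component "meet or exceed" check of a proposed node against
a minimum profile. The minimums below are placeholders, NOT the real
vSAN-ESA-AF-0 values -- look those up in the KB."""

MINIMUM_PROFILE = {          # placeholder values for illustration only
    "cpu_cores": 16,
    "memory_gb": 128,
    "nic_gbps": 10,
    "storage_devices": 2,
}

proposed_node = {            # the configuration you are considering
    "cpu_cores": 32,
    "memory_gb": 256,
    "nic_gbps": 25,
    "storage_devices": 4,
}

shortfalls = {component: (proposed_node[component], minimum)
              for component, minimum in MINIMUM_PROFILE.items()
              if proposed_node[component] < minimum}

if shortfalls:
    for component, (have, need) in shortfalls.items():
        print(f"{component}: {have} is below the minimum of {need}")
else:
    print("The proposed node meets or exceeds every minimum in the profile.")
```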


Can I disable the vSAN service if the cluster is running production workloads?

Duncan Epping · Feb 7, 2025 · Leave a Comment

I just had a discussion with someone who had to disable the vSAN service while the cluster was running a production workload. They had all their VMs running on third-party storage, so vSAN was empty, but when they went to the vSAN Configuration UI, the “Turn Off” option was grayed out. The reason this option is grayed out is that vSphere HA was enabled, which is the case for most customers. (Probably 99.9%.) If you need to turn off vSAN, make sure to temporarily disable vSphere HA first, and of course enable it again after you have turned off vSAN! This ensures that HA is reconfigured to use the Management Network instead of the vSAN Network.

Another thing to consider: if you manually configured the “HA Isolation Address” for the vSAN Network, make sure to change that to an IP address on the Management Network as well. Lastly, if there’s still anything stored on vSAN, it will become inaccessible when you disable the vSAN service. Of course, if nothing is running on vSAN, there will be no impact to the workload.
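
For anyone who prefers to script the HA toggle around this change, the rough pyVmomi sketch below disables vSphere HA on the cluster, leaves a placeholder for turning off the vSAN service, and re-enables HA afterwards. The vCenter details and cluster name are placeholders, and doing the whole thing through the UI works just as well.

```python
#!/usr/bin/env python3
"""Sketch: temporarily disable vSphere HA so the vSAN "Turn Off" option is
no longer grayed out, then re-enable HA afterwards. Placeholders: vCenter
address, credentials, and the cluster name."""
import ssl

from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

VCENTER, USER, PWD = "vcsa.lab.local", "administrator@vsphere.local", "secret"
CLUSTER = "Production-Cluster"  # placeholder cluster name

def set_ha(cluster, enabled):
    """Reconfigure the cluster with vSphere HA enabled or disabled."""
    spec = vim.cluster.ConfigSpecEx(
        dasConfig=vim.cluster.DasConfigInfo(enabled=enabled))
    WaitForTask(cluster.ReconfigureComputeResource_Task(spec, True))

ctx = ssl._create_unverified_context()  # lab only; validate certificates in production
si = SmartConnect(host=VCENTER, user=USER, pwd=PWD, sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == CLUSTER)

    set_ha(cluster, False)  # HA off: the vSAN "Turn Off" option becomes available
    # ... turn off the vSAN service here (UI or the vSAN management API) ...
    set_ha(cluster, True)   # HA back on, so it reconfigures for the Management Network
finally:
    Disconnect(si)
```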

