
Yellow Bricks

by Duncan Epping



My favorite VMware Explore 2025 sessions!

Duncan Epping · Jul 3, 2025 · Leave a Comment

Yes, it is that time of the year again… VMware Explore season! As I write this, I am in the middle of developing my content for Explore, as I have two sessions approved myself. I created a so-called targeted agenda, so if you want to attend any of the sessions below, just go here.

The two sessions I am presenting can be found here. But for those who don’t want to click, they are:

  • Three Times the Performance, Half the Latency: VMware vSAN Express Storage Architecture Deep Dive for VMware Geeks [CLOB1067LV] Featuring Pete Koehler and Duncan Epping
  • Six Innovations Redefining Storage and Disaster Recovery for VMware Cloud Foundation [CLOB1028LV] Featuring Rakesh Radhakrishnan and Duncan Epping

If you are attending Explore, and are planning on attending those sessions, make sure to register as soon as you can as they were “sold out” in previous years!

Now, for the rest of the content catalog, these are the sessions I hope to be able to attend:

  • Deploying Minimal VMware Cloud Foundation 9.0 Lab [CLOB1201LV] hosted by Alan Renouf and William Lam
  • The Isolated Clean Room Blueprint for On-Premises Based Cyber Recoveries [CLOB1267LV] hosted by Michael McLaughlin
  • A Deep Dive into Memory Tiering with NVMe [CLOB1122LV] hosted by Dave Morera
  • Bridging the Gap: Managing Virtual Machines in a Kubernetes World [CLOB1938LV] hosted by Kat Brookfield
  • Design and Architect: Multi-cluster Management for Kubernetes at Scale with VMware Cloud Foundation 9.0 [CLOB1471LV] hosted by Kris Inglis and Hugo Phan
  • 10 Amazing New Things with VMware Live Recovery [CLOB1943LV] hosted by Jatin Jindal and Nabil Quadri
  • Building Secure Private AI Deep Dive [INVB1432LV] hosted by Chris McCain
  • 5 Key Capabilities of Next-Gen Data Protection and Ransomware Recovery with vSAN for All VCF Workloads [CLOB1265LV] hosted by Rakesh Radhakrishnan and Sazzala Reddy
  • Design and Architect: Best Practices for Deploying VMware Cloud Foundation 9.0 [CLOB1427LV] hosted by Prak Kalra and Sushil Suvarna
  • Design and Architect: Managing and Operating at Scale with VCF 9.0 [CLOB1487LV] hosted by Ivaylo Ivanov and Michael Kolos
  • Real-World Lessons in Rightsizing VMware Cloud Foundation for On-Premises AI Workloads [INVB1300LV] hosted by Frank Denneman and Johan van Amersfoort

If you feel that a session is missing, feel free to leave a comment!

Introducing vSAN 9.0!

Duncan Epping · Jun 18, 2025 · 8 Comments

As most have probably seen, Broadcom has just announced VMware Cloud Foundation 9.0. Of course, this means that there’s also a shiny brand-new version of vSAN available, namely vSAN 9.0. Most of the new functionality was already previewed at VMware Explore by Rakesh and me, but I feel it is worth going over some of the key new functionality anyway. I am not going to cover every single item, as I know folks like Pete Koehler will do that on the VMware Blog anyway.

The first big item worth discussing is vSAN ESA Global Deduplication, which is probably the main feature of the release. Now, I have to be fair: at this stage it is released as “limited availability” only, which basically means you need to request access to the feature. The reason is that in this first release it is not yet supported in combination with, for instance, stretched clustering, so Broadcom/VMware wants to make sure you meet the requirements before the feature is enabled. Hopefully that will change soon!


Now, what is so special about this feature compared to vSAN OSA Deduplication? First of all, vSAN OSA deduplicates on a per-disk-group basis, whereas vSAN ESA deduplicates globally across the cluster. This should result in a much higher deduplication ratio, as the chances of finding duplicates are simply much higher across hosts than within a single disk group. Dedupe is also post-process, which avoids the potential performance impact of inline deduplication. On top of that, the data is laid out in such a way that vSAN ESA can still efficiently read large contiguous blocks of data, even when they are deduplicated.
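To make the per-disk-group versus global distinction concrete, here is a tiny illustrative model in plain Python. This is not vSAN code; the block names and group layout are entirely made up. It simply shows why a global pool finds duplicates that per-group deduplication misses:

```python
# Illustrative model only (not vSAN logic): compare deduplication savings
# when duplicates exist across groups rather than within a single group.

def dedup_ratio(groups):
    """Logical blocks divided by unique blocks, deduplicating within each group."""
    logical = sum(len(g) for g in groups)
    unique = sum(len(set(g)) for g in groups)
    return logical / unique

# Three disk groups; block "A" is duplicated across groups, never within one.
disk_groups = [["A", "B"], ["A", "C"], ["A", "D"]]

# Per-group (OSA-style): no group contains a duplicate, so no savings (1.0).
per_group = dedup_ratio(disk_groups)

# Global (ESA-style): pool all blocks together, so the copies of "A" dedupe
# against each other (6 logical blocks / 4 unique blocks = 1.5).
global_pool = dedup_ratio([sum(disk_groups, [])])

print(per_group, global_pool)
```

The real gain obviously depends on the workload, but the mechanism is the same: the wider the pool you deduplicate over, the more duplicates you can find.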

The next feature worth discussing was introduced specifically for vSAN Storage Clusters (the architecture formerly known as vSAN Max) and is all about network separation. This new capability allows you to differentiate between client traffic and server traffic for a vSAN Storage Cluster. This means you could, for instance, send east-west traffic within a rack to a top-of-rack 100GbE switch, but send north-south traffic to connecting clusters via a 10GbE switch, or any other speed. Not only does this provide a huge amount of flexibility, but by isolating these traffic streams from each other it also improves efficiency, performance, and security at the same time.

Then, the next big-ticket item isn’t necessarily a vSAN feature, but rather a vSAN Data Protection and VMware Live Recovery feature. Starting with 9.0, it is possible to replicate VMs between clusters using the snapshot technology that is part of vSAN ESA. This provides the big benefit of being able to go up to 200 snapshots deep with no significant (single-digit) performance loss. On top of that, vSAN Data Protection can do this at a 1-minute RPO, and it leverages the already familiar UI and protection group capabilities that were introduced in 8.x. One big difference, though: you no longer have to download the vSAN Data Protection appliance, as everything is now available as part of the VMware Live Recovery (VLR) appliance.

The next thing I want to discuss is the vSAN Stretched Cluster functionality we introduced. I’ve already discussed this previously as a preview, but now it is available for stretched cluster customers to test out (note: you do have to file an RPQ for both features). vSAN Stretched Cluster Site Maintenance Mode is available starting with vSAN 9 for OSA via RPQ, and it allows you to place a whole site into maintenance mode while maintaining data consistency within the site. This solves a major operational hurdle, as previously customers had to place a site into maintenance mode one host at a time. If you had a 10+10+1 configuration, that indeed meant placing 10 hosts into maintenance mode sequentially. This is now solved via a simple UI button!

Lastly, also for vSAN Stretched Clustering, we are introducing, in “limited availability,” the vSAN Stretched Cluster Manual Take Over functionality. This will help customers who have lost a site that was placed into maintenance to regain access to their data. The idea is that, over time, this feature will also help customers regain access to data when a data site and the witness fail simultaneously. It is a fairly delicate and complicated process, so as you can imagine, this is “limited availability” for now, as it requires some education on how it works and what the potential impact is of running the manual takeover command.

I hope that provides an overview of some of the key functionality. I am also recording a podcast with Pete Koehler where we will discuss these capabilities soon; I will add the links to the podcast and to the videos when they are released.

#098 – VMware Cloud Foundation 5.x in a single box for homelabs featuring William Lam!

Duncan Epping · Jun 3, 2025 · Leave a Comment

I noticed an excellent blog by William Lam not too long ago, which discussed how to bring up VMware Cloud Foundation 5.x on a single box for home labs. William created a GitHub page that goes over the whole process and provides all the tweaks and scripts needed to get it done. I wanted to discuss this process with William, as I believe many folks in the VMware/Broadcom community will be interested in deploying this at home, or at work, to go through the full VCF experience without needing a larger lab environment. You can listen to the episode on Spotify (bit.ly/43WFSpA), Apple (bit.ly/4jqbYyx), or any other podcasting platform. Or simply use the embedded player below! Thanks, William, for a fantastic episode.

Does vSAN support a Franken cluster configuration?

Duncan Epping · May 28, 2025 · Leave a Comment

It is funny that this has come up a few times now, actually for the third time in a month. I got a question about whether you can mix AMD and Intel hosts in the same cluster. Although nothing stops you from doing this, and vSAN supports this configuration, you need to remember that you cannot live migrate (vMotion) between those hosts. This means that if you have DRS enabled, you are seriously crippling the cluster, as it makes balancing resources much more complex.

You are creating a Franken cluster when mixing AMD and Intel. You may ask yourself, why would anyone want to do this in the first place? Well, you could do this for migration purposes. If you use vSAN iSCSI Services, for instance, this could be a way to migrate those iSCSI LUNs from old hosts to new hosts. How? Simply add the new hosts to the cluster, place the old hosts into maintenance mode, and make sure to migrate the storage. Do note, all the VMs (or containers) will have to be powered off and powered on again manually on the new hosts, as a result of moving from Intel to AMD (or the other way around).

If you do end up doing this for migration purposes, please ensure it is for the shortest time possible. Please avoid running with a Franken cluster for multiple days, weeks, or, god forbid, months. Nothing good will come out of it, and your VMs may become little monsters!

vSAN Component vote recalculation with Witness Resilience, the follow up!

Duncan Epping · Mar 21, 2025 · Leave a Comment

I wrote about the Witness Resilience feature a few years ago and got a question on this topic today. I did some tests and then realized I already had an article describing how it works, but as I also tested a different scenario, I figured I would write a follow-up. In this case we are particularly talking about a 2-node configuration, but this would also apply to stretched clusters.

In a stretched cluster, or 2-node, configuration, when a data site goes down (or is placed into maintenance mode), a vote recalculation is automatically done on each object/component. This ensures that if the witness subsequently fails, the objects/VMs remain accessible. How that works I’ve explained here, and demonstrated for a 2-node cluster here.

But what if the witness fails first? That I can explain fairly easily: the votes are not recalculated in this scenario, so a subsequent host failure will leave the VMs inaccessible. Of course, I tested this, and the screenshots below demonstrate it.

This screenshot shows the witness as Absent, while both “data” components have 1 vote each. This means that if we fail one of those hosts, the component will become inaccessible. Let’s do that next and then check the UI for more details.

As you can see below, the VM is now inaccessible. This is the result of the fact that there’s no longer a quorum, as 2 out of 3 votes are dead.
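The two failure orders can be summarized with a tiny vote-counting model. This is purely illustrative Python, not actual vSAN logic, and the recalculated vote count of 3 is a made-up example of what a recalculation might produce:

```python
# Simplified quorum model (illustrative only, not vSAN code): why the order
# of failures matters for a 2-node configuration with Witness Resilience.

def accessible(votes_alive, votes_total):
    # An object stays accessible only while a strict majority of votes survives.
    return votes_alive > votes_total / 2

# Typical 2-node layout: one data component per host plus the witness, 1 vote each.
votes = {"host1": 1, "host2": 1, "witness": 1}
total = sum(votes.values())

# Scenario 1: a data host fails (or goes into maintenance) first. vSAN then
# recalculates the votes so the surviving data copy outweighs the witness
# (3 votes here is just an example of such a recalculation).
recalculated = {"host1": 3, "witness": 1}
# If the witness now also fails, 3 of 4 votes survive -> still accessible.
assert accessible(recalculated["host1"], sum(recalculated.values()))

# Scenario 2: the witness fails first, and no recalculation happens.
assert accessible(2, total)       # only the witness down: still accessible
assert not accessible(1, total)   # lose a host as well: quorum lost, VM inaccessible
```

In other words, the recalculation only protects you when the data site fails before the witness does, which is exactly what the screenshots show.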

I hope that explains how this works.



About the Author

Duncan Epping is a Chief Technologist and Distinguished Engineering Architect at Broadcom. Besides writing on Yellow-Bricks, Duncan is the co-author of the vSAN Deep Dive and the vSphere Clustering Deep Dive book series. Duncan is also the host of the Unexplored Territory Podcast.





Copyright Yellow-Bricks.com © 2025