8.0 u1

Doing network/ISL maintenance in a vSAN stretched cluster configuration!

Duncan Epping · Nov 21, 2023 ·

I got a question earlier about the maintenance of an ISL in a vSAN Stretched Cluster configuration which had me thinking for a while. The question was what you would do with your workload during maintenance. I guess the easiest option, of course, is to power off all VMs and simply shut down the cluster, for which vSAN has a UI option, and there’s a KB you can follow. Now, of course, there could also be a situation where the VMs need to remain running. But how does this work when you end up losing the connection between all three locations? Normally this would lead to a situation where all VMs become “inaccessible” as you end up losing quorum.

As said, this had me thinking: you could take advantage of the “vSAN Witness Resiliency” mechanism, which was introduced in vSAN 7.0 U3. How would this work?

Well, it is actually pretty straightforward. If all hosts of one site are in maintenance mode, failed, or powered off, the votes of the witness object for each VM/Object will be recalculated within 3 minutes. When this recalculation has completed, the witness can go down without having any impact on the VM. We introduced this capability to increase resiliency in a double-failure scenario, but we can also (ab)use this functionality during maintenance. Of course I had to test this, so the first step I took was placing all hosts in one location into maintenance mode (no data evac). This resulted in all my VMs being vMotioned to the other site.
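To make the quorum reasoning a bit more tangible, here is a minimal toy model in Python. It only assumes that an object stays accessible while more than half of its votes are available. The component names, the vote counts, and the `shift_votes_to_survivor` rule are made up for illustration and are not the actual vSAN recalculation logic.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    votes: int
    available: bool = True


def has_quorum(components):
    """Accessible only while strictly more than half of all votes are available."""
    total = sum(c.votes for c in components)
    live = sum(c.votes for c in components if c.available)
    return 2 * live > total


def shift_votes_to_survivor(components, survivor):
    """Hypothetical stand-in for the vote recalculation: give the surviving
    replica enough votes to hold a majority on its own."""
    others = sum(c.votes for c in components if c is not survivor)
    survivor.votes = others + 1


def simulate(recalculate):
    """Site A goes into maintenance, then the witness goes down."""
    site_a = Component("replica-site-a", votes=1)   # illustrative vote counts
    site_b = Component("replica-site-b", votes=1)
    witness = Component("witness", votes=1)
    obj = [site_a, site_b, witness]

    site_a.available = False            # all hosts in site A in maintenance mode
    if recalculate:
        shift_votes_to_survivor(obj, survivor=site_b)
    witness.available = False           # witness powered off
    return has_quorum(obj)


if __name__ == "__main__":
    print("with recalculation:   ", simulate(recalculate=True))
    print("without recalculation:", simulate(recalculate=False))
```

In this toy model the object keeps quorum only when the votes were shifted before the witness went away, which is why it matters to wait for the recalculation to complete before powering off the witness.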

Next, I checked with RVC whether my votes had been recalculated. As stated, depending on the number of VMs this can take around 3 minutes in total, but it will usually be quicker. After the recalculation had completed I powered off the Witness, and the result was that all VMs were still running.

Of course, I had to double-check on the command line using RVC (you can use the command “vsan.vm_object_info” to check a particular object, for instance) to ensure that the components of those VMs were indeed still “ACTIVE” instead of “ABSENT”, and there you go!

Now when maintenance has been completed, you simply do the reverse: you power on the witness, and then you power on the hosts in the other location. After the “resync” has completed, the VMs will be rebalanced again by DRS. Note that DRS rebalancing (or the application of should rules) will only happen once the resync of the VM has completed.

Scalable Snapshots demo with the vSAN 8.0 Express Storage Architecture

Duncan Epping · Sep 5, 2023 ·

Starting with vSAN 8 a brand new architecture was introduced called “Express Storage Architecture”. Over the last year or so a lot of information has been shared about ESA and the benefits of ESA. One of the things which ESA introduces is much-improved snapshot scalability.

With vSAN OSA, and with VMFS, when you create a snapshot you typically see a performance degradation immediately. This is because both VMFS and vSAN OSA still operate using the redo-log based snapshot mechanism. This means that with vSAN OSA, when you create a snapshot, a new object is created and writes are redirected to it. It also means that reads will be coming from various files if you have one or more snapshots. This mechanism is, unfortunately, not very efficient. Let me borrow a diagram that is part of a post John Nicholson wrote to demonstrate that old logic.

With vSAN 8 ESA the mechanism has changed, and vSAN, or vSphere for that matter, no longer creates an additional object. vSAN ESA handles this on a metadata level. In other words, instead of redirecting writes and traversing files for reads, vSAN now leverages a highly efficient B-Tree structure and pointers to keep track of which block is associated with which snapshot.
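The difference between the two approaches can be sketched with a small toy model. This is not how vSAN implements either mechanism: the class names are made up, and a plain Python dict stands in for the B-Tree. It only illustrates why a read through a chain of redo logs touches more layers as snapshots pile up, and why deleting a pointer-based snapshot does not need to copy data.

```python
class RedoLogDisk:
    """Base disk plus one delta layer per snapshot; writes land in the newest layer."""

    def __init__(self):
        self.layers = [{}]              # layers[0] is the base disk

    def snapshot(self):
        self.layers.append({})          # new delta object, writes are redirected to it

    def write(self, block, data):
        self.layers[-1][block] = data

    def read(self, block):
        """Return (data, layers_visited), searching the newest layer first."""
        for visited, layer in enumerate(reversed(self.layers), start=1):
            if block in layer:
                return layer[block], visited
        return None, len(self.layers)

    def delete_oldest_snapshot(self):
        """Merge the oldest delta into the base; return the number of blocks copied."""
        if len(self.layers) < 2:
            raise ValueError("no snapshot to delete")
        delta = self.layers.pop(1)
        self.layers[0].update(delta)
        return len(delta)


class PointerMapDisk:
    """One data store; a snapshot is only a saved block -> location map."""

    def __init__(self):
        self.store = []                 # append-only data locations
        self.live = {}                  # block -> index into self.store
        self.snapshots = []             # saved pointer maps (metadata only)

    def snapshot(self):
        self.snapshots.append(dict(self.live))

    def write(self, block, data):
        self.store.append(data)
        self.live[block] = len(self.store) - 1

    def read(self, block):
        """Return (data, lookups); always a single map lookup."""
        loc = self.live.get(block)
        return (None if loc is None else self.store[loc]), 1

    def delete_oldest_snapshot(self):
        """Drop the saved map; no data blocks are copied."""
        if not self.snapshots:
            raise ValueError("no snapshot to delete")
        self.snapshots.pop(0)
        return 0


def demo(disk, snapshots=5):
    """Write one block, take several snapshots, then read the old block and delete a snapshot."""
    disk.write(0, "a")
    for i in range(snapshots):
        disk.snapshot()
        disk.write(i + 1, f"v{i}")
    _, lookups = disk.read(0)
    copied = disk.delete_oldest_snapshot()
    return lookups, copied


if __name__ == "__main__":
    print("redo-log:   ", demo(RedoLogDisk()))
    print("pointer map:", demo(PointerMapDisk()))
```

With five snapshots, reading the oldest block in the redo-log model walks through all six layers, while the pointer-map model does a single lookup regardless of the number of snapshots.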

Not only is this more efficient from a capacity perspective, but more importantly it is very efficient from a performance standpoint. I ran half a dozen tests in my lab, and what I saw was a performance impact of less than 2% between a VM without a snapshot and a VM with one or multiple snapshots. I could NOT see a significant difference between the first and the fifth snapshot. I do want to point out that my lab is not officially certified to run vSAN ESA; nevertheless, I was very impressed with the results.

During the last run, I actually recorded the whole exercise. In this demo, I show the creation of one snapshot while the VM is running a benchmark (HCIBench). Now, during the testing, I created not one but various snapshots, and of course I deleted all of them as well. You have all probably experienced extensive stun times during the deletion of a snapshot, and this is where vSAN ESA shines. The stun times have been reduced by a factor of 100, and that is something I am sure each of you will appreciate. Why have they been reduced so drastically? Well, simply because we no longer have to copy data from one vSAN object to another. This makes a huge difference, not just for stun times, but also for performance in general (latency, IOPS, throughput). If you are interested, have a look at the demo!

New book: VMware vSAN 8.0 U1 Express Storage Architecture Deep Dive!

Duncan Epping · Apr 27, 2023 ·

We already gave some hints on Twitter, and during an episode of the Unexplored Territory podcast, but here it finally is… The new book, the VMware vSAN 8.0 U1 Express Storage Architecture Deep Dive! It has been a year since we released the vSAN 7.0 U3 Deep Dive book, and with this brand new vSAN architecture being introduced in vSAN 8.0 we figured it was time to do a full overhaul of the book as well. Mind you, this new book purely deals with the Express Storage Architecture, aka vSAN ESA. This also means that some of the features which are not supported by ESA are not discussed in this book. For those you will need to buy the vSAN 7.0 U3 Deep Dive book, which covers OSA. Another big change is that we brought in a third author: we asked our good friend Pete Koehler to contribute to the book. Pete had done reviews of previous books, and considering the amount of material he produced for VMware Tech Marketing for vSAN (and ESA specifically) it made a lot of sense to bring him in!

VMware’s vSAN has rapidly proven itself in environments ranging from hospitals to oil rigs to e-commerce platforms and is the market leader in the hyperconverged space. Along the way, the world of IT has rapidly changed, not just from a software point of view, but also from a hardware perspective. With vSAN 8.0 VMware brought a new architecture to market called vSAN Express Storage Architecture (ESA). This architecture is highly optimized for today’s world of datacenter resources, be it CPU, memory, networking, or NVMe based flash storage.

The authors of the vSAN Deep Dive have thoroughly updated their definitive guide to this transformative technology. Writing for vSphere administrators, architects, and consultants, Cormac Hogan, Duncan Epping, and Pete Koehler explain what vSAN ESA is, why the architecture has changed, what it now offers, and how to gain maximum value from it. The book offers expert insight into preparation, installation, configuration, policies, provisioning, clusters, architecture, and more. You’ll also find practical guidance for using all data services, stretched clusters, two-node configurations, and cloud-native storage services.

Although we pressed publish on Tuesday, sometimes it takes a while before the book is available in all Amazon stores, but it should just trickle down in the upcoming 24-48 hours. The book is priced at 9.99 USD for the ebook and 29.99 USD for a paper copy, and is sold through Amazon only. Get it while it is hot, and we would appreciate it if you would use our referral links and leave a review when you finish it. Thanks for the support, and we hope you will enjoy it!

paper – 29.99 USD
ebook – 9.99 USD

Of course, we also have the links to other major Amazon stores:

United Kingdom – ebook – paper
Germany – ebook – paper
Netherlands – ebook – paper
Canada – ebook – paper
France – ebook – paper
Spain – ebook – paper
India – ebook
Japan – ebook – paper
Italy – ebook – paper
Mexico – ebook
Australia – ebook – paper
Brazil – ebook
Or just do a search in your local Amazon store!

vSAN 8.0 U1 ESA – Auto Policy Management

Duncan Epping · Mar 28, 2023 ·

One of the features that is introduced in vSAN 8.0 U1 for ESA is Auto-Policy Management. I personally love this feature, as it will help a lot of customers make the right decision in terms of what the default policy should be on their vSAN Datastore. Now, Pete Koehler wrote a very extensive blog post, and I don’t want to copy his work and simply rewrite it, so I suggest you read his blog for the full details on this brand new feature.

I do realize that some of you are just as lazy as I am, so here’s a short summary of what Auto-Policy Management is. Auto-Policy Management, when enabled, creates a new vSAN VM storage policy based on the capabilities enabled on your cluster and the size of your cluster. After creating the policy, the policy is also assigned to the datastore as the “default policy” so that any VMs which are provisioned without the selection of a policy get this optimized policy assigned. What influences the policy characteristics? Well: the size of the cluster, stretched vs. normal, and whether host rebuild reserve is enabled or disabled. All those factors determine what kind of policy is created and associated with the datastore. If your cluster configuration changes over time, Skyline Health will inform you that changes are required to have an optimal policy again. Wonder what that looks like? Watch the demo below!

vSAN 8.0 U1 – Disaggregated Storage Enhancements!

Duncan Epping · Mar 16, 2023 ·

With vSAN 8.0 U1 a lot of new features and enhancements are introduced. There are many blog posts out there describing the long list of enhancements, but in this post, I want to focus on HCI Mesh or Disaggregated vSAN specifically. (Also read this post by Cato!) For this feature, which in the UI is referred to as “Datastore Sharing”, there are 3 key enhancements introduced in vSAN 8.0 U1. There are enhancements for both the Original Storage Architecture (OSA), as well as the Express Storage Architecture (ESA).

With vSAN 8.0 the initial version of ESA was launched, and it did not support the use of Datastore Sharing. Starting with vSAN 8.0 U1 though, vSAN ESA is now also capable of sharing its storage with other clusters in the environment. To be more precise, a vSAN ESA cluster can now mount the datastore of another vSAN ESA cluster. What we also support is a “compute only” cluster mounting the vSAN ESA datastore remotely. So for those planning on implementing vSAN ESA, I think that is a very welcome enhancement!

For OSA there are also two enhancements for Datastore Sharing. The first I want to discuss is cross-vCenter Server datastore sharing. This feature is especially useful for customers who have a larger estate and are managing multiple clusters via different vCenter Server instances. You now have the option to connect the vCenter Server instances from a storage point of view, and then you can simply select the remote datastore in the cluster managed by a different vCenter Server instance. Let me just show you how this actually works in the next demo.

The second enhancement for OSA specifically is support for Stretched Cluster configurations. Starting with vSAN 8.0 U1 it is now possible to mount a vSAN Datastore which is stretched across locations. Your “client” cluster can be “stretched”, “standard”, or even compute-only. We support all of those combinations. On top of that, the interface enables you to specify which location should be paired with which location, or fault domain. In other words, if you look at the diagram below, I can ensure that the hosts in Site A connect via the “local” network to the remote datastore as part of Site A. This avoids I/O traversing the intersite link, which can make a big difference in terms of latency and available bandwidth for other I/O, etc.
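The site pairing idea can be expressed as a very small sketch. Everything in it is hypothetical (the site names, the `LOCATION` table, and the `crosses_isl` helper); it is not a vSphere API, just a way to show that a client site paired with the server fault domain in the same physical location keeps its I/O off the intersite link.

```python
# Hypothetical physical location of each client site and server fault domain.
LOCATION = {
    "client-site-a": "dc1",
    "client-site-b": "dc2",
    "server-site-a": "dc1",
    "server-site-b": "dc2",
}


def crosses_isl(client_site, pairing, location=LOCATION):
    """True if the client site is paired with a fault domain in another location."""
    return location[client_site] != location[pairing[client_site]]


# Pairing that keeps I/O local to each location.
local_pairing = {"client-site-a": "server-site-a", "client-site-b": "server-site-b"}
# Pairing that would send every I/O across the intersite link.
swapped_pairing = {"client-site-a": "server-site-b", "client-site-b": "server-site-a"}

if __name__ == "__main__":
    for name, pairing in (("local", local_pairing), ("swapped", swapped_pairing)):
        result = {site: crosses_isl(site, pairing) for site in pairing}
        print(name, result)
```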

I can imagine that the concepts are difficult to grasp without seeing the vSphere Client, so I spent some time in the lab to create a demo for you that walks you through the steps of how to configure this. In the lab I created a vSAN Stretched Cluster and a standard cluster, and I am going to mount the vSAN stretched Datastore to the host in the standard cluster. Enjoy!