express storage architecture

Are Nested Fault Domains supported with 2-node configurations with vSAN 8.0 ESA?

Duncan Epping · Oct 28, 2022 ·

Short answer: yes, 2-node configurations with vSAN 8.0 ESA support Nested Fault Domains. This means that in a 2-node configuration you can also protect your data within each host with RAID-1, RAID-5, or RAID-6! The configuration is pretty straightforward: you create a policy with “Host Mirroring” and select the protection you want within each host. The screenshot below demonstrates this.

In the above example, I mirror the data across hosts and then have a RAID-5 configuration within each host. Now when I create a RAID-5 configuration within each host I will get the new vSAN ESA 2+1 configuration (2 data blocks, 1 parity block). If you have 6 devices or more in your host, you can also create a RAID-6 configuration, which is 4+2 (4 data blocks, 2 parity blocks). This provides a lot of flexibility and can lower the overhead compared to RAID-1 when desired (RAID-1 = 100% overhead, RAID-5 = 50% overhead for 2+1, RAID-6 = 50% overhead for 4+2). When you use RAID-5 or RAID-6 and look at the layout of the data, it will look as shown in the next two screenshots: the first shows the RAID-5 configuration, and the second the RAID-6 configuration.
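The overhead figures above are simple arithmetic: redundancy blocks divided by data blocks. The following is a minimal sketch of that calculation. The function names are made up for illustration, and it assumes that the cross-host mirror simply doubles whatever is stored within each host and ignores the performance leg.

```python
def overhead_pct(data_blocks: int, redundancy_blocks: int) -> float:
    """Capacity overhead of an in-host layout, as a percentage of usable data."""
    return 100.0 * redundancy_blocks / data_blocks


def raw_per_usable(data_blocks: int, redundancy_blocks: int, host_copies: int = 2) -> float:
    """Raw capacity consumed per unit of usable data when the in-host layout
    is mirrored across host_copies hosts (assumption: 2 for a 2-node cluster)."""
    return host_copies * (data_blocks + redundancy_blocks) / data_blocks


# In-host layouts named in the text: (data blocks, redundancy blocks)
layouts = {"RAID-1": (1, 1), "RAID-5 (2+1)": (2, 1), "RAID-6 (4+2)": (4, 2)}

for name, (data, redundancy) in layouts.items():
    print(name, overhead_pct(data, redundancy), raw_per_usable(data, redundancy))
```

Under these assumptions, RAID-5 or RAID-6 within each host consumes 3x the usable capacity across the two nodes, where RAID-1 within each host consumes 4x.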

One thing you may wonder when looking at the screenshots is why they also have a RAID-1 configuration for the VMDK object. This is the “performance leg” that vSAN ESA implements. For RAID-5, which is “FTT=1”, this means you get 2 components. For RAID-6, which is “FTT=2”, this means you will get 3 components so you can tolerate 2 failures.

I hope that helps answer some of the questions folks had on this subject!

vSAN 8.0 ESA and Compression by default?

Duncan Epping · Oct 24, 2022 ·

Starting with vSAN 8 ESA (Express Storage Architecture) how data services have been implemented has changed significantly compared to the Original Storage Architecture. In vSAN OSA compression (and deduplication) happens before the data is stored on disk on each of the hosts the data is stored on. With vSAN 8.0 ESA this has changed completely. With vSAN ESA compression actually happens all the way at the top of the architecture, as shown and explained in the diagram/slide below.

Now, the big benefit of course of this is that if you compress the data first, and you compress the data from let’s say 4KB to 2KB, then only 2KB needs to be sent over the network to all hosts where the data is being stored. Not only that, if data needs to be encrypted then only 2KB needs to be encrypted, and of course when you checksum the data then also only 2KB needs to be checksummed. So what are the savings here when encryption is enabled on ESA vs OSA?

Less data to send over the network
Less data to encrypt when encryption is enabled
Less data to checksum
Compression only takes place on the source host, and not on the destination hosts, so a lower number of CPU cycles is used for each IO
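To put rough numbers on the first and last points, here is a minimal sketch that counts bytes sent over the network and compression operations per write. It is a simplification under stated assumptions: the block compresses from 4KB to 2KB as in the example above, the data is stored on two hosts, and every copy travels over the network. The function names are hypothetical.

```python
BLOCK = 4096       # uncompressed block size in bytes (the 4KB from the text)
COMPRESSED = 2048  # compressed size in bytes (the 2KB from the text)


def write_cost(payload_bytes: int, hosts: int, compress_ops: int) -> dict:
    """Bytes sent to the hosts storing the data, and how often the block is compressed."""
    return {"network_bytes": payload_bytes * hosts, "compress_ops": compress_ops}


def osa_write(hosts: int) -> dict:
    # OSA: the uncompressed block is sent, and each host storing it compresses it.
    return write_cost(BLOCK, hosts, compress_ops=hosts)


def esa_write(hosts: int) -> dict:
    # ESA: the block is compressed once on the source host, then sent compressed.
    return write_cost(COMPRESSED, hosts, compress_ops=1)


print(osa_write(2))
print(esa_write(2))
```

In this simplified model, ESA sends half the bytes and compresses once instead of once per host.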

Also, with vSAN ESA the granularity in terms of compression is also different than with vSAN OSA. With OSA vSAN would compress from 4KB down to 2KB and that is it. If it couldn’t compress down to 2KB then it would not compress the block. With vSAN ESA that has changed. vSAN ESA will aim to always compress, but of course, it needs to make sense. No point in compressing 4KB down to 3.8KB. And yes, vSAN ESA will also go beyond 2KB if possible. As mentioned above in the screenshot, theoretically it is possible to reach an 8:1 compression ratio per 4KB block.
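The difference in granularity can be sketched as follows. Note the assumption: the 512-byte step used for ESA is inferred here from the 8:1 ratio on a 4KB block (4096 / 8 = 512) and is not stated in the text, and the function names are made up for illustration.

```python
BLOCK = 4096   # block size in bytes
SECTOR = 512   # ASSUMPTION: ESA granularity, inferred from 8:1 on a 4KB block


def osa_stored_size(compressed_bytes: int) -> int:
    """OSA: store 2KB if the block compresses to 2KB or less, else store it uncompressed."""
    return 2048 if compressed_bytes <= 2048 else BLOCK


def esa_stored_size(compressed_bytes: int) -> int:
    """ESA (sketch): round the compressed size up to the assumed 512-byte step,
    and store the block uncompressed if that saves nothing."""
    sectors = max(1, -(-compressed_bytes // SECTOR))  # ceiling division
    return min(sectors * SECTOR, BLOCK)


for size in (400, 1500, 3000, 3891):
    print(size, osa_stored_size(size), esa_stored_size(size))
```

With these assumptions, a block that compresses to roughly 3.8KB (3891 bytes) is stored as a full 4KB block in both cases, a block that compresses to 3000 bytes only saves space on ESA, and a block that compresses to 400 bytes reaches the 8:1 ratio on ESA.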

Now, the other difference is that you enable/disable compression through policy. How do you do this? When you create a policy for ESA you have the options in Storage Rules as shown in the screenshot below, “No Preference” (Default), “No space efficiency”, “Compression only”, and “Deduplication and compression”.

Now, let’s be clear, “Deduplication and compression” is not an option for ESA as “Deduplication” has not been implemented just yet. When you configure either of the other three options the outcome is as follows:

No preference – Compression enabled
No space efficiency – Compression disabled
Compression only – Compression enabled
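The mapping above is small enough to express as a lookup table. This is only an illustrative sketch, not an actual vSAN API. The function name is hypothetical.

```python
# Storage Rules option -> whether compression ends up enabled on ESA,
# as listed in the text. Keys are lower-cased for case-insensitive lookup.
ESA_COMPRESSION_BY_RULE = {
    "no preference": True,
    "no space efficiency": False,
    "compression only": True,
}


def esa_compression_enabled(rule: str) -> bool:
    """Return the resulting compression state for a Storage Rules option."""
    key = rule.strip().casefold()
    if key not in ESA_COMPRESSION_BY_RULE:
        # "Deduplication and compression" lands here: not an option for ESA.
        raise ValueError(f"{rule!r} is not a supported option for ESA")
    return ESA_COMPRESSION_BY_RULE[key]


print(esa_compression_enabled("No Preference"))
print(esa_compression_enabled("No space efficiency"))
```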

Can you validate this? Yes, you can, and of course, I did to show you how that works. I created three policies with the above options selected for each respectively. I also created three VMs, with each of them having the appropriate policy selected as you can see in the screenshot below. VM_CompDisabled has the policy “Comp_Disabled” associated with it, and of course, that policy had “No space efficiency” selected.

Now, where can you see if the object actually has compression enabled or disabled? Well, this is where it becomes a bit more complex, unfortunately (yes, I filed a feature request). If you want to validate it you will have to go to the command line and check the object itself. Simply copy the UUID you see in the screenshot above and use the command “cmmds-tool find -t DOM_OBJECT -u <UUID>”. When you run this command you will receive a lengthy output, and that output will contain the following string, ("compressionState" i1), when compression is disabled.
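Since the output is lengthy, searching it for that string can be scripted. The following is a minimal sketch that assumes the string appears in the output exactly as written above. The function names are hypothetical, and the sample text is a made-up fragment for demonstration, not real cmmds-tool output.

```python
import subprocess

# String that, per the text, appears in the output when compression is disabled.
DISABLED_MARKER = '("compressionState" i1)'


def compression_disabled(cmmds_output: str) -> bool:
    """True if the output contains the marker for disabled compression."""
    return DISABLED_MARKER in cmmds_output


def query_object(uuid: str) -> str:
    """Run the command from the text on the host and return its output."""
    result = subprocess.run(
        ["cmmds-tool", "find", "-t", "DOM_OBJECT", "-u", uuid],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Made-up fragment for illustration only.
sample = 'some attributes ("compressionState" i1) more attributes'
print(compression_disabled(sample))
```

On a host you would call compression_disabled(query_object(uuid)) with the UUID copied from the UI.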

One more thing I want to mention: compression does not happen on a “physical” disk layer, or on the “disk group layer”, as it does with vSAN OSA. As a result, if you switch between policies where compression is enabled/disabled, you will not see a massive rewrite of data occurring. When you switch from Disabled to Enabled, only newly written data will be compressed! The same applies when you switch back. Only newly written data will be impacted by the policy change.

For those who prefer to hear/see me going through the UI to disable compression on a per-VM basis, make sure to watch the below demo!

Running vSAN ESA? Change the default storage policy to RAID-5/6!

Duncan Epping · Oct 14, 2022 ·

Most of you have read all about vSAN ESA by now. If you have not, you can find my article here, and a dozen articles on core.vmware.com by the Tech Marketing team. What is going to make a huge difference with the Express Storage Architecture is that you get RAID-5 efficiency at RAID-1 performance. This is discussed in-depth by Pete Koehler in this blog post, so no point in me reiterating it. On top of that, the animated gif below demonstrates how it actually works and shows why it not only performs well, but also why it is so efficient from a capacity standpoint. As we only have a single tier of flash, the system uses it in a smart way and introduces additional layers so that both reads and writes are efficient.

Now one thing I do want to point out is that when you create your ESA cluster, you will need to verify the default storage policy assigned to the vSAN Datastore. In my case this was the regular vSAN Storage Policy, which means a RAID-1 configuration for the performance leg, and RAID-1 for the capacity leg. Now, I want to get the most out of my system from a capacity perspective and I want to test this new level of performance for RAID-5, even though I only have 4 hosts (which gives me a 2+1 RAID-5 set).

Of course you can select a policy every time you deploy a VM, but I prefer to keep things simple, so I change the default storage policy on my datastore. Simply click on the Datastore icon in the vSphere Client, then select your vSANDatastore and click on “Configure” and “General”. Next click on “Edit” where it says “Default Storage Policy” and then select the policy you want to have applied to VMs by default for this datastore. As shown below, for me that is RAID-5!

Introducing vSAN 8 – Express Storage Architecture (ESA)

Duncan Epping · Aug 30, 2022 ·

I debated whether I would write this blog now or wait a few weeks, as I know that the internet will be flooded with articles. But as it helps me as well to write down these things, I figured why not. So what is this new version of vSAN? vSAN Express Storage Architecture (vSAN ESA) introduces a new architecture for vSAN specifically with vSAN 8.0. This new architecture was developed to cater to this wave of new flash devices that we have seen over the past years, and we expect to see in the upcoming years. Not just storage, it also takes the huge improvements in terms of networking throughput and bandwidth into consideration. On top of that, we’ve also seen huge increases in available CPU and Memory capacity, hence it was time for a change.

Does that mean the “original” architecture is gone? No, vSAN Original Storage Architecture (OSA) still exists today and will exist for the foreseeable future. VMware understands that customers have made significant investments, so it will not disappear. Also, vSAN 8 brings fixes and new functionality for users of the current vSAN architecture (the logical cache capacity has been increased to 1.6TB instead of 600GB for instance.) VMware also understands that not every customer is ready to adopt this “single tier architecture”, which is what vSAN ESA delivers in the first release, but mind that this architecture also caters to other implementations (two-tier) in the future. What does this mean? When you create a vSAN cluster, you get to pick the architecture that you want to deploy for that environment (ESA or OSA), it is that simple! And of course, you do that based on the type of devices you have available. Or even better, you look at the requirements of your apps and you base your decision of OSA vs ESA and the type of hardware you need on those requirements. Again, to reiterate, vSAN Express Storage Architecture provides a flexible architecture that will use a single tier in vSAN 8 taking modern-day hardware (and future innovations) into consideration.

Before we look at the architecture, why would a customer care, what does vSAN ESA bring?

Simplified storage device provisioning
Lower CPU usage per processed IO
Adaptive RAID-5 and RAID-6 at the performance of RAID-1
Up to 4x better data compression
Snapshots with minimal performance impact

When you create a vSAN ESA cluster the first thing that probably stands out is that you no longer need to create disk groups, which speaks to the “Simplified storage device provisioning” bullet point. With the OSA implementation, you create a disk group with a caching device and capacity devices, but with ESA that is no longer needed. This is the first thing I noticed. You now simply select all devices and they will be part of your vSAN datastore. It doesn’t mean though that there’s no caching mechanism, it just has been implemented differently. With vSAN ESA, all devices contribute to capacity and all devices contribute to performance. It has the added benefit that if one device fails, it doesn’t impact anything else but what is stored on that device. With OSA, of course, it could impact the whole disk group that the device belonged to.

So now that we know that we no longer have disk groups with caching disks, how do we ensure we still get the performance customers expect? Well, there were a couple of things that were introduced that helped with that. First of all, a new log-structured file system was introduced. This file system helps with coalescing writes and enables fast acknowledgments of the IOs. This new layer will also enable direct compression of the data (enabled by default, and can be disabled via policy) and packaging of full stripes for the capacity “leg”. Capacity what? Yes, this is a big change that is introduced as well. With vSAN ESA you have a capacity leg and a performance leg. Let me show you what that looks like, and kudos to Pete Koehler for the great diagram!

As the above diagram indicates, you have a performance leg which is RAID-1 and then there’s a capacity leg which can be RAID-1 but will typically be RAID-5 or RAID-6, depending on the size of your cluster of course. Another thing that depends on the size of the cluster is the size of your RAID-5 configuration, and that is where the adaptive RAID-5 comes into play. It is an interesting solution, and it enables customers to use RAID-5 implementations starting with only 3 hosts all the way up to 6 hosts or more. If you have 3-5 hosts then you will get a 2+1 configuration, meaning 2 components for data and 1 for parity. When you have 6 hosts or more you will get a 4+1 configuration. This is different from the original implementation as there you would always get 3+1. For RAID-6 the implementation is 4+2 by the way.
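The adaptive RAID-5 rule described above can be written down as a small function. This is a sketch of the selection logic as described in this post, not vSAN code, and the names are made up for illustration.

```python
from typing import Tuple

RAID6_LAYOUT = (4, 2)  # RAID-6 is 4 data + 2 parity


def raid5_layout(hosts: int) -> Tuple[int, int]:
    """Return (data, parity) components for ESA RAID-5 given the cluster size."""
    if hosts < 3:
        raise ValueError("RAID-5 needs at least 3 hosts")
    if hosts <= 5:
        return (2, 1)  # 3-5 hosts: 2+1
    return (4, 1)      # 6 hosts or more: 4+1


for hosts in (3, 4, 5, 6, 8):
    data, parity = raid5_layout(hosts)
    print(hosts, f"{data}+{parity}", f"{100 * parity / data:.0f}% overhead")
```

The capacity overhead drops from 50% with 2+1 to 25% with 4+1, which is why larger clusters get the wider stripe.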

I’ve already briefly mentioned it, but compression is now enabled by default. The reason for it is that the cost of compression is really low with the current implementation as compression happens all the way at the top. That means that when a write is performed the blocks are actually sent over the network compressed as well to their destination and they are stored immediately. So no need to unpack and compress again. The other interesting thing is that the implementation of compression has also changed, leading to an improved efficiency that can go up to an 8:1 data reduction. The same applies to the encryption implementation: it also happens at the top, so you get data-at-rest and data-in-transit encryption automatically when it is enabled. Enabling encryption still happens at the cluster level though, whereas compression can now be enabled/disabled on a per-VM basis.

Another big change is the snapshot implementation. We’ve seen a few changes in snapshot implementation over the years, but this one is a major change. I guess the big change is that when you create a snapshot vSAN does not create a separate object. This means that the snapshot basically exists within the current object layout. The big benefit, of course, is that the object count doesn’t skyrocket when you create many snapshots. Another added benefit is the performance of this implementation. Consolidation of a snapshot, for instance, went 100x faster when tested, which means much lower stun times, which I know everyone can appreciate. Not only is it much, much faster to consolidate, but normal IO is also much faster during consolidation and during snapshot creation. I love it!

The last thing I want to mention is that from a networking perspective vSAN ESA not only performs much better, but it also is much more efficient, allowing for even faster resyncs and faster virtual machine I/O. On top of that, because of the way compression has been implemented, there’s simply also more bandwidth remaining.

For those who prefer to hear the vSAN 8 ESA story through a podcast, make sure to check the Unexplored Territory Podcast next week, as we will have Pete Koehler answering all questions about vSAN ESA. Also, on core.vmware.com you will find ALL details of this new architecture in the upcoming weeks, and also make sure to read this official blog post on vmware.com.