
Yellow Bricks

by Duncan Epping


vSAN 8.0 ESA – Dude, where’s my vSAN disk group?

Duncan Epping · Nov 29, 2022 · 5 Comments

Last week I was talking to a customer who mentioned that he had deployed vSAN 8.0 in his lab and was shocked to find that disk groups no longer exist. Well, not in vSAN 8.0 ESA (Express Storage Architecture) that is. They do still exist in the Original Storage Architecture! The big change with vSAN 8.0 ESA is that the “bottleneck” in the previous architecture has been removed. No longer do you select a single caching device for a particular disk group, and no longer do you designate devices purely for capacity.

With vSAN 8.0 ESA all your devices will be part of a single storage pool, and all those devices will contribute to both storage capacity and storage performance! The added benefit, of course, is that writes and reads are distributed across all devices, removing a potential choke point as well as a single point of failure. Why? Well, with vSAN OSA, when the caching device fails the whole disk group becomes unavailable. With ESA that is no longer the case, as there's no caching device!

So how does vSAN ESA provide both optimal efficiency for capacity and optimal performance? Well, it does this by introducing additional layers. The idea is that vSAN provides write performance at the level of RAID-1 but space efficiency at the level of RAID-5 or RAID-6. That would be the best of both worlds. It needs to do this, however, while taking into account that we are dealing with different types of flash devices than you normally would with vSAN OSA. In other words, writes also need to be optimized for the types of devices used (TLC), and the design needs to be future-proof for devices that may be supported later on (QLC).

One of the key elements in this new architecture is the introduction of the “log-structured filesystem” and the “durable log”. Let’s look at the below diagram first.

What vSAN ESA does is write all data to the log-structured file system first, in the durable log. This ensures that data is persistently stored. This is what the “performance leg” provides. The performance leg literally stores the writes first. That could be 4KB blocks, or 32KB blocks, or whatever. It stores the data first, collects a full stripe write (512KB), and then writes the data to the capacity leg. Why these two layers? Well, the performance leg is a RAID-1 configuration, so it is optimal for write performance, while in general the capacity leg will be RAID-5 or RAID-6, which is optimal for space efficiency. By creating this small performance leg component that holds the durable log, vSAN can acknowledge writes immediately as the data is persisted in the log, and then, once there's a full stripe, write it efficiently as RAID-5 or RAID-6.
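To make that flow a bit more concrete, here is a minimal Python sketch of the concept. This is not vSAN code: the class and helper names are purely illustrative, and the only numbers taken from this post are the small incoming writes and the 512KB full stripe.

```python
# Toy model of the two-leg write path: acknowledge writes once they are in the
# mirrored durable log, and only send full 512KB stripes to the capacity leg.
STRIPE_SIZE = 512 * 1024  # full stripe written to the capacity leg (512KB)


class TwoLegWritePath:
    def __init__(self):
        self.durable_log = []    # models the mirrored (RAID-1) performance leg
        self.buffered_bytes = 0
        self.capacity_leg = []   # models the RAID-5/6 capacity leg

    def write(self, data: bytes) -> str:
        # 1. Persist the write in the mirrored durable log (performance leg) first.
        self.durable_log.append(data)
        self.buffered_bytes += len(data)
        # 2. The write can be acknowledged as soon as it is durable in the log.
        #    (In this toy model the flush below runs inline; think of it as a
        #    background task that does not hold up the acknowledgement.)
        ack = "ack"
        # 3. Once a full stripe has accumulated, write it to the capacity leg
        #    as one large, parity-friendly write.
        while self.buffered_bytes >= STRIPE_SIZE:
            self._flush_full_stripe()
        return ack

    def _flush_full_stripe(self):
        stripe, taken = [], 0
        while self.durable_log and taken < STRIPE_SIZE:
            chunk = self.durable_log.pop(0)
            stripe.append(chunk)
            taken += len(chunk)
        self.buffered_bytes -= taken
        self.capacity_leg.append(b"".join(stripe))  # one full-stripe RAID-5/6 write


if __name__ == "__main__":
    path = TwoLegWritePath()
    for _ in range(20):
        path.write(b"x" * 32 * 1024)  # 32KB writes, acked from the durable log
    print(len(path.capacity_leg), "full stripe(s) written to the capacity leg")
```

The point of the model is simply that the acknowledgement never waits for the capacity leg: durability comes from the mirrored log, and efficiency comes from the full-stripe RAID-5/6 write.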

Now of course, in the UI you will be able to see those new performance leg components and the capacity leg components. They are not marked as “performance” or “capacity” but they are easily recognizable. I created a quick demo that talks you through the above. If you are interested, check it out!

Unexplored Territory Podcast 31 – VMware Edge Compute Stack? Featuring Marilyn Basanta!

Duncan Epping · Nov 21, 2022 · Leave a Comment

In episode 30 we spoke with Alan Renouf about the potential future of edge deployments, aka Project Keswick. We figured we should also cover what is available today in the form of VMware Edge Compute Stack, so we invited Marilyn Basanta, Senior Director for Edge at VMware! Marilyn explains what the VMware Edge Compute Stack looks like, which customer use cases she encounters in the field, and how VMware Edge Compute Stack can help you run and deploy applications securely and efficiently in remote, and sometimes strange, locations. You can listen via Spotify – https://spoti.fi/3WWNIKu, Apple – https://apple.co/3hEFu9L, or use the embedded player below!

Can you exceed the number of FT enabled vCPUs per host or number of FT enabled vCPUs per VM?

Duncan Epping · Nov 18, 2022 · Leave a Comment

Not sure why, but over the last couple of weeks I have had several questions about FT (Fault Tolerance). The questions were around the limits: what is the limit per VM, what is the limit per host, and can I somehow exceed these? All of this is documented by VMware, but somehow it seems to be either difficult to find or difficult to understand. Let me write a short summary that hopefully clarifies things.

First of all, the license you use dictates the maximum number of vCPUs a VM can have when enabling FT on that VM:

  • vSphere Standard and Enterprise: up to 2 vCPUs per FT enabled VM
  • vSphere Enterprise Plus: up to 8 vCPUs per FT enabled VM

Now, there are also two other things that come into play: you can have a maximum of 4 FT enabled VMs per host, and a maximum of 8 FT enabled vCPUs per host. You can change these settings, and doing so is fully supported, as I already discussed in this blog post. There is, however, a caveat: while VMware has tested with more than 4 FT enabled VMs per host and with a higher number of FT enabled vCPUs, there is no guarantee that you will get acceptable performance. The more you increase these default values, the bigger the chance that there will be a performance impact.
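For those who like to see this as a simple rule, below is a small Python sketch that checks a proposed FT enablement against the default values mentioned above (2 or 8 vCPUs per VM depending on the license, 4 FT enabled VMs and 8 FT enabled vCPUs per host). The function and parameter names are hypothetical and purely illustrative, not a VMware API.

```python
# Default FT limits as described in the post; these are the values you would be
# raising (at your own risk, performance-wise) if you change the settings.
FT_VCPU_LIMIT_PER_VM = {"standard": 2, "enterprise": 2, "enterprise_plus": 8}
DEFAULT_MAX_FT_VMS_PER_HOST = 4
DEFAULT_MAX_FT_VCPUS_PER_HOST = 8


def can_enable_ft(vm_vcpus: int, license_edition: str,
                  ft_vms_on_host: int, ft_vcpus_on_host: int) -> bool:
    """Return True if enabling FT on this VM stays within the default limits."""
    if vm_vcpus > FT_VCPU_LIMIT_PER_VM[license_edition]:
        return False  # the license caps the number of vCPUs per FT enabled VM
    if ft_vms_on_host + 1 > DEFAULT_MAX_FT_VMS_PER_HOST:
        return False  # too many FT enabled VMs on this host
    if ft_vcpus_on_host + vm_vcpus > DEFAULT_MAX_FT_VCPUS_PER_HOST:
        return False  # too many FT enabled vCPUs on this host
    return True


# Example: an 8 vCPU VM on Enterprise Plus, placed on a host that already runs
# one 2 vCPU FT enabled VM, exceeds the default of 8 FT enabled vCPUs per host.
print(can_enable_ft(8, "enterprise_plus", ft_vms_on_host=1, ft_vcpus_on_host=2))  # False
```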

When FT is enabled, a significant amount of communication between hosts (Primary / Shadow VM) needs to occur to ensure the VMs stay in lockstep. This overhead can cause a slowdown, and this is the reason those limitations are in place. If you have sufficient network bandwidth and CPU capacity, you can increase these numbers. Note that VMware development typically does not test beyond the specified maximums. If performance is impacted, or you receive unexpected errors/results, and you contact support, support may ask you to lower the numbers, as that impact unfortunately cannot be solved in a different way. I hope that clarifies it.

vSAN 8.0 ESA – Introducing Adaptive RAID-5

Duncan Epping · Nov 15, 2022 · 2 Comments

Starting with vSAN 8.0 ESA (Express Storage Architecture), VMware has introduced an adaptive RAID-5 mechanism. What does this mean? Essentially, vSAN deploys a particular RAID-5 configuration depending on the size of the cluster! There are two options; let's list them and then discuss each individually.

  • RAID-5, 2+1, 3-5 hosts
  • RAID-5, 4+1, 6 hosts or more

As mentioned in the above list, depending on the cluster size you will see a particular RAID-5 configuration. Clusters of up to 5 hosts will get a 2+1 configuration when RAID-5 is selected. For those wondering, the diagram below shows what this looks like. A 2+1 configuration consumes 1.5x the stored capacity, meaning that when you store 100GB of data, it will consume 150GB of capacity.

Now, when you have a larger cluster, meaning 6 hosts or more, vSAN will deploy a 4+1 configuration. The big benefit is that the capacity consumed goes down from 1.5x to 1.25x; in other words, 100GB of data will consume 125GB of capacity.
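If you want to play with these numbers, here is a small Python sketch of the selection logic and the resulting capacity consumption, using only the values from this post (2+1 for 3-5 hosts, 4+1 for 6 hosts or more). It is a conceptual model, not the actual vSAN placement logic.

```python
def adaptive_raid5_layout(hosts: int) -> tuple[int, int]:
    """Return (data, parity) for the RAID-5 layout vSAN ESA would pick."""
    if hosts < 3:
        raise ValueError("RAID-5 needs at least 3 hosts")
    return (4, 1) if hosts >= 6 else (2, 1)


def capacity_consumed(data_gb: float, layout: tuple[int, int]) -> float:
    """Raw capacity consumed for data_gb of data under a d+p layout."""
    data, parity = layout
    return data_gb * (data + parity) / data


for hosts in (4, 6):
    data, parity = adaptive_raid5_layout(hosts)
    print(f"{hosts} hosts -> {data}+{parity}, "
          f"100GB consumes {capacity_consumed(100, (data, parity))}GB")
# 4 hosts -> 2+1, 100GB consumes 150.0GB
# 6 hosts -> 4+1, 100GB consumes 125.0GB
```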

What is great about this solution is that vSAN monitors the cluster size. If you have 6 hosts and a host fails, or a host is placed into maintenance mode, etc., vSAN will automatically scale down the RAID-5 configuration from 4+1 to 2+1 after a period of 24 hours. I of course had to make sure that it actually works, so I created a quick demo that shows vSAN changing the RAID-5 configuration from 4+1 to 2+1, and then back to 4+1 when we reintroduce a host into the cluster.

One more thing I need to point out. The Adaptive RAID-5 functionality also works in a stretched cluster. So if you have a 3+3+1 stretched cluster you will see a 2+1 RAID-5 set. If you have a 6+6+1 cluster (or more in each location) then you will see a 4+1 set. Also, if you place a few hosts into maintenance mode or hosts have failed then you will see the configuration change from 4+1 to 2+1, and the other way around when hosts return for duty!

For more details, watch the demo, or read this excellent post by Pete Koehler on the VMware website.

Are Nested Fault Domains supported with 2-node configurations with vSAN 8.0 ESA?

Duncan Epping · Oct 28, 2022 · 7 Comments

Short answer: yes, 2-node configurations with vSAN 8.0 ESA support Nested Fault Domains. This means that with a 2-node configuration you can also protect your data within each host with RAID-1, RAID-5, or RAID-6! The configuration is pretty straightforward: you create a policy with “Host Mirroring” and select the protection you want within each host. The screenshot below demonstrates this.

In the above example, I mirror the data across hosts and then have a RAID-5 configuration within each host. When I create a RAID-5 configuration within each host I get the new vSAN ESA 2+1 configuration (2 data blocks, 1 parity block). If you have 6 devices or more in your host, you can also create a RAID-6 configuration, which is 4+2 (4 data blocks, 2 parity blocks). This provides a lot of flexibility and can lower the overhead within each host when desired (RAID-1 = 100% overhead, RAID-5 2+1 = 50% overhead, RAID-6 4+2 = 50% overhead). When you use RAID-5 or RAID-6 and look at the layout of the data, it will look as shown in the next two screenshots: the first screenshot shows the RAID-5 configuration, and the second the RAID-6 configuration.

vSAN ESA 2-node nested fault domain raid-5

vSAN ESA 2-node nested fault domain raid-6

One thing you may wonder when looking at the screenshots is why there is also a RAID-1 configuration for the VMDK object; this is the “performance leg” that vSAN ESA implements. For RAID-5, which is FTT=1, this means you get 2 performance leg components. For RAID-6, which is FTT=2, you will get 3 components so you can tolerate 2 failures.
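To put some numbers on the capacity side of this, below is an illustrative Python sketch (my own back-of-the-envelope math, not an official sizing tool) that combines host mirroring with the per-host 2+1 RAID-5 or 4+2 RAID-6 layouts described above, and shows the total raw capacity consumed in a 2-node nested fault domain configuration.

```python
# (data, parity) widths for the per-host protection schemes described in the post.
LAYOUTS = {"RAID-1": (1, 1), "RAID-5": (2, 1), "RAID-6": (4, 2)}


def per_host_multiplier(scheme: str) -> float:
    """Capacity multiplier of the per-host protection (e.g. 1.5 for a 2+1 layout)."""
    data, parity = LAYOUTS[scheme]
    return (data + parity) / data


def two_node_nested_multiplier(scheme: str) -> float:
    """Host mirroring stores the data twice, and each copy pays the per-host overhead."""
    return 2 * per_host_multiplier(scheme)


for scheme in ("RAID-1", "RAID-5", "RAID-6"):
    print(f"{scheme} in each host: {two_node_nested_multiplier(scheme)}x raw capacity")
# RAID-1 in each host: 4.0x raw capacity
# RAID-5 in each host: 3.0x raw capacity
# RAID-6 in each host: 3.0x raw capacity
```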

I hope that helps answer some of the questions folks had on this subject!

 

