Last week I was talking to a customer and he mentioned that he deployed vSAN 8.0 in his lab and he was shocked that when he wanted to define disk groups he noticed that they don’t exist anymore. Well, not in vSAN 8.0 ESA (Express Storage Architecture) that is. They do still exist in the Original Storage Architecture! The big change with vSAN 8.0 ESA is that the “bottleneck” in the previous architecture has been removed. No longer will you select a single device for caching for a particular disk group, and no longer do you designate devices purely for capacity.
With vSAN 8.0 ESA all your devices will be part of a single storage pool, and all those devices will contribute to both storage capacity as well as storage performance! The added benefit of course is the fact that writes and reads will be distributed across all devices, removing a potential choking point, and also removing a single point of failure. Why? Well with vSAN OSA when the caching device fails the whole disk group becomes unavailable. With ESA that is no longer the case as there’s no caching device!
So how does vSAN ESA provide both optimal efficiency for capacity as well as optimal performance? Well, it does this by introducing additional layers. The idea is that vSAN will provide write performance at the level of RAID-1 but space efficiency at the level of RAID-5 or RAID-6. That would be the best of both worlds. It would need to do this however while taking into consideration that we are also dealing with different types of flash devices than you normally would be with vSAN OSA. In other words, writes will also need to be optimized for the types of devices used (TLC), and it will also need to be future-proof for devices that may be supported later on (QLC).
One of the key elements in this new architecture is the introduction of the “log-structured filesystem” and the “durable log”. Let’s look at the below diagram first.
What we do with vSAN ESA is that all data is written to the log-structured file system first in the durable log. This ensures that data is persistently stored. This is what the “performance leg” provides. The performance leg literally stores the writes first. That could be 4KB blocks, or 32KB blocks, or whatever. It stores the data first, collects a full stripe write (512KB), and then writes the data to the capacity leg. Why these 2-layers? Well, the performance leg is a RAID-1 configuration, so it is optimal for write performance, while in general, the capacity leg will be RAID-5 or RAID-6, which is optimal for space efficiency. By creating this small performance leg component that holds the durable log, vSAN is capable of immediately acknowledging the writes as it is persisted in the log, and then when there’s a full stripe write it efficiently as RAID-5 or RAID-6.
Now of course, in the UI you will be able to see those new performance leg components and the capacity leg components. They are not marked as “performance” or “capacity” but they are easily recognizable. I created a quick demo that talks you through the above. If you are interested, check it out!
Great article
Cisco did this many years ago log structured 🙂 at least they should change the name.
Log Structured File Systems have been around for decades. Mendel (VMware founder) actually invented it roughly 40 years ago. There’s nothing new about it indeed conceptually. Of course implementation details matter, and implementation (and the results ultimately) will wildy vary. for me, the new architecture is a huge step forward, which I think many customers will appreciate!
Great aticle Duncan, maybe in a next one it’s worth looking into Metadata, Metadata log, btree , splinterdb …
I will need to figure out what/if I can share any details to be honest.