vSAN

Virtual SAN: Generic storage platform of the future

Duncan Epping · Nov 20, 2015 ·

Over the last couple of weeks I have been presenting at various events on the topic of Virtual SAN. One of the sections in my deck is about the future of Virtual SAN and where it is heading. Someone recently tweeted one of the diagrams from my slides, which got picked up by Christian Mohn, who provided his thoughts on the diagram and what it may mean for the future. I figured I would share my story behind this slide, which is actually a new version of a slide that was originally presented by Christos and also discussed in one of his blog posts. First, let’s start with the diagram:

If you ask people what VSAN is today, most will answer: a “virtual machine” storage system. But VSAN to me is much more than that. VSAN is a generic object storage platform, which today is primarily used to store virtual machines. But these objects can be anything if you ask me, and on top of that they can be presented as anything.

So what is VMware working towards, what is our vision? VSAN was designed to serve as a generic object storage platform from the start, and is being extended to serve as a platform for different types of data by providing an abstraction layer. In the diagram you see “REST” and “FILE” and things like Mesos and Docker. It isn’t difficult to imagine what types of workloads we envision running on top of VSAN and what types of access you will have to resources managed by VSAN. This could be through a native REST API that is part of the platform, which developers can use directly to store their objects, or through a specific driver for direct “block” access, for instance.

Combine that with the prototype of the distributed filesystem which was demonstrated at VMworld and I think it is fair to say that the possibilities are endless. VSAN isn’t just a storage system for virtual machines, it is a generic object-based storage platform which leverages local resources and will be able to share those in a clustered fashion in any shape or form in the future. Christian definitely had a point. In which shape or form all of this will be delivered remains to be seen though; this is not something I can (or want to) speculate on. Whether that is through Photon Platform or something else is in my opinion beside the point. Even today VSAN has no dependencies on vCenter Server and can be fully configured, managed and monitored using the APIs and/or the different command-line interface options we offer. Agility and choice have always been the key design principles for the platform.

Where things will go exactly and when this will happen is still to be seen. But if you ask me, exciting times are ahead for sure, and I can’t wait to see how everything plays out.

Virtual SAN and support for SATADOM

Duncan Epping · Nov 2, 2015 ·

It seems that a lot of people haven’t picked up on this… With Virtual SAN in the past, or better said with vSphere, booting from SATADOM was not supported. This had to do with the default location of the scratch partition, the number of expected writes to the SATADOM device and simply the fact that we did not know how fast the device would wear out.

For those who don’t know, SATADOM devices are basically flash chips on a SATA module which usually goes directly on the motherboard. It is a great solution, as it is as fast as an SSD and as small as SD/USB, which means you don’t lose a disk slot.

After many tests over the last year it was concluded that SATADOM can be fully supported for vSphere and Virtual SAN but that there are some requirements for the device itself:

When you boot a Virtual SAN host from a SATADOM device:
- The device must be a single-level cell (SLC) device.
- The size of the boot device must be at least 16 GB.

Again, the key reason for this is that all the trace logs, vSphere logs and so on end up on this device, and we don’t want it to wear out and cause all sorts of unexpected behaviour. As our documentation says: It is important that the SATADOM device meets the specifications outlined in this guide!
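The two requirements above are simple enough to express as a pre-flight check. What follows is only a minimal sketch: the `BootDevice` class, its fields and the `satadom_boot_issues` helper are hypothetical names made up for illustration, not part of any VMware tool or API, and the values would have to come from the vendor's specification sheet.

```python
from dataclasses import dataclass

# Minimum boot device size stated in the requirements above.
MIN_BOOT_SIZE_GB = 16


@dataclass
class BootDevice:
    """Hypothetical description of a SATADOM device, filled in from a spec sheet."""
    model: str
    cell_type: str   # e.g. "SLC" or "MLC"
    size_gb: float


def satadom_boot_issues(device: BootDevice) -> list[str]:
    """Return the requirements the device fails to meet (empty list means OK)."""
    issues = []
    if device.cell_type.upper() != "SLC":
        issues.append(f"{device.model}: cell type is {device.cell_type}, SLC is required")
    if device.size_gb < MIN_BOOT_SIZE_GB:
        issues.append(f"{device.model}: {device.size_gb} GB is below the {MIN_BOOT_SIZE_GB} GB minimum")
    return issues


# Example with two made-up devices.
good = BootDevice(model="example-slc-16", cell_type="SLC", size_gb=16)
bad = BootDevice(model="example-mlc-8", cell_type="MLC", size_gb=8)
for dev in (good, bad):
    problems = satadom_boot_issues(dev)
    print(dev.model, "OK" if not problems else problems)
```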

Anyway, now you know… there are more supported options when it comes to booting ESXi, which is especially handy when you want to use your disk slots for Virtual SAN!

Stretched Clusters: Disable failover of specific VMs during full site failure

Duncan Epping · Oct 21, 2015 ·

Last week at VMworld, when presenting on Virtual SAN Stretched Clusters, someone asked me if it was possible to “disable the fail-over of VMs during a full site failure while allowing a restart during a host failure”. I thought about it and said “no, that is not possible today”. Yes, you can “disable HA restarts” on a per-VM basis, but you can’t do that for a particular type of failure.

The last statement is correct: HA does not allow you to disable restarts for a site failure. You can fully disable HA for a particular VM though. But when I was back at my hotel I started thinking about this question and realized that there is a workaround to achieve this. I didn’t note down the name of the customer who asked the question, so hopefully you will read this.

When it comes to a stretched cluster configuration, you will typically use VM/Host rules. These rules “dictate” where VMs will run, and typically you use the “should” rule, as you want to make sure VMs can run anywhere when there is a failure. However, you can also create “must” rules, and yes, this means that the rules will not be violated and that those VMs can only run within that site. If a host fails within a site, then the impacted VMs will be restarted within the site. If the site fails, then the “must” rule will prevent the VMs from being restarted on the hosts in the other location. The must rules are pushed down to the “compatibility list” that HA maintains, which will never be violated by HA.

A simple workaround to prevent VMs from being restarted in another site.
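To make the difference between the two rule types concrete, here is a toy model in Python. It is not how HA is implemented; the function name, the `rule` strings and the host names are all made up for illustration. It only mimics the behaviour described above: a “must” rule limits restart candidates to the VM's own site, while a “should” rule prefers that site but falls back to any surviving host.

```python
def restart_candidates(vm_site, rule, hosts, failed_hosts):
    """Return the hosts on which a VM could be restarted after a failure.

    vm_site      -- site the VM is tied to by its VM/Host rule
    rule         -- "must" or "should" (hypothetical labels for the two rule types)
    hosts        -- dict mapping host name -> site name
    failed_hosts -- set of host names that are down
    """
    alive = [h for h in hosts if h not in failed_hosts]
    same_site = [h for h in alive if hosts[h] == vm_site]
    if rule == "must":
        # A "must" rule is never violated: only hosts in the VM's site qualify.
        return same_site
    # A "should" rule is a preference: use the VM's site if possible, else anywhere.
    return same_site or alive


# Two sites with two hosts each (made-up names).
hosts = {"a1": "site-a", "a2": "site-a", "b1": "site-b", "b2": "site-b"}

# Single host failure in site A: the VM restarts within the site either way.
print(restart_candidates("site-a", "must", hosts, {"a1"}))
print(restart_candidates("site-a", "should", hosts, {"a1"}))

# Full failure of site A: "must" leaves no candidates, "should" fails over to site B.
print(restart_candidates("site-a", "must", hosts, {"a1", "a2"}))
print(restart_candidates("site-a", "should", hosts, {"a1", "a2"}))
```

The empty candidate list in the “must” case is exactly the effect the workaround relies on: with nowhere compliant to go, the VM stays down until its own site returns.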

SMP-FT support for Virtual SAN ROBO configurations

Duncan Epping · Oct 12, 2015 ·

When we announced Virtual SAN 2-node ROBO configurations at VMworld we received a lot of great feedback and responses. A lot of people asked if SMP-FT was supported in that configuration. Apparently many of the customers using ROBO still have legacy applications which can use some form of extra protection against a host failure. The Virtual SAN team had not anticipated this and had unfortunately not tested this specific scenario, so our response had to be: not supported today.

We took the feedback to the engineering and QA team and these guys managed to do full end-to-end tests for SMP-FT on 2-node Virtual SAN ROBO configurations. I am proud to announce that as of today this is fully supported with Virtual SAN 6.1! I want to point out that all SMP-FT requirements still apply, which means 10GbE for SMP-FT! Nevertheless, if you have the need to provide that extra level of availability for certain workloads, now you can!

Dell FX2 platform certified for VSAN with storage blades!

Duncan Epping · Oct 8, 2015 ·

A couple of weeks ago the Dell FX2 disk controller was added to the Virtual SAN Compatibility Guide, and shortly after that the Ready Node configurations were added. For those who haven’t looked at the Dell FX2 platform, it is (in my opinion) hyper-converged on steroids. Not only can it provide you with 4 compute nodes in 2U, it also packs a 10GbE switch and can hold two storage blades with 16 disks each. What? Yes indeed, that is a lot of horsepower in a single system.

I am working with a customer right now who is designing a new cluster configuration leveraging the Dell FX2 platform. They are planning on 16 hosts in total. After assessing their current workloads they are going with the FC430 E5-2670 v3 series with 12 cores (dual processor). Each host will have 256GB of memory and will boot from SD.

From a storage perspective they are looking to use the FD332 storage blades. Two per FX2 chassis, fully maxed out with 32 drives in total, which is 8 drives per host. All-flash by the way, leveraging 1.6TB devices for the capacity tier and 400GB devices for the write cache. Yes, that is 38.4TB of raw capacity per FX2 chassis, times 4… ~153TB. It is not a coincidence that the configuration is very similar to the “AF-6 Series – Dell FX2 Platform”: they prefer to use a certified and tested solution instead of picking their own components, which makes sense if you ask me.
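The capacity numbers can be reproduced with a few lines of arithmetic. One assumption is needed to get there: 38.4TB per chassis at 1.6TB per device implies 24 capacity devices, so this sketch assumes that 2 of the 8 drives per host are 400GB cache devices and the remaining 6 are capacity devices. That split is inferred from the totals, it is not stated explicitly. Cache devices do not count towards raw capacity.

```python
# Figures taken from the text above.
DRIVES_PER_CHASSIS = 32
HOSTS_PER_CHASSIS = 4
CHASSIS_COUNT = 4
CAPACITY_DEVICE_GB = 1600      # 1.6TB capacity-tier device

# Assumption inferred from the 38.4TB total, not stated in the text.
CACHE_DEVICES_PER_HOST = 2

drives_per_host = DRIVES_PER_CHASSIS // HOSTS_PER_CHASSIS
capacity_devices_per_host = drives_per_host - CACHE_DEVICES_PER_HOST
capacity_devices_per_chassis = capacity_devices_per_host * HOSTS_PER_CHASSIS

# Only the capacity tier counts towards raw capacity.
raw_per_chassis_gb = capacity_devices_per_chassis * CAPACITY_DEVICE_GB
raw_total_gb = raw_per_chassis_gb * CHASSIS_COUNT

print(f"drives per host:          {drives_per_host}")
print(f"raw capacity per chassis: {raw_per_chassis_gb / 1000:.1f} TB")
print(f"raw capacity, {CHASSIS_COUNT} chassis:  {raw_total_gb / 1000:.1f} TB")
```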

One of the key reasons for them to go with all-flash is the beta which is coming up. They want to get their hands dirty with functionality like deduplication, checksumming and RAID-5/6 (aka erasure coding) as soon as possible. All 4 chassis will initially run in one site for testing purposes, and after the initial tests they are considering deploying them across two sites in a stretched configuration. They asked me what the big benefit was of RAID-5 or RAID-6 over the network (aka erasure coding), and it definitely is the lower raw capacity requirement. If you look at the current FTT=1 implementation, a 20GB disk requires an additional 20GB for availability reasons, which means 40GB in total. With a RAID-5 implementation instead of RAID-1 this 20GB disk would only require roughly 26.7GB of disk space, which is a saving of more than 13GB immediately. And that is before any type of space efficiency (dedupe) is enabled. Anyway, back to the FX2.
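The RAID-1 versus RAID-5 comparison comes down to the ratio of total fragments to data fragments. The sketch below assumes RAID-5 is laid out as 3 data fragments plus 1 parity fragment, which is the layout that matches the ~26.6GB figure for a 20GB disk; the function name is just an illustrative helper.

```python
def raw_capacity_gb(usable_gb, data_fragments, redundancy_fragments):
    """Raw capacity consumed for a given usable size and fragment layout."""
    total_fragments = data_fragments + redundancy_fragments
    return usable_gb * total_fragments / data_fragments


disk_gb = 20

# RAID-1 with FTT=1: one full extra copy (1 data + 1 mirror), so 2x overhead.
raid1_gb = raw_capacity_gb(disk_gb, data_fragments=1, redundancy_fragments=1)

# RAID-5, assumed here to be 3 data + 1 parity, so 4/3 overhead.
raid5_gb = raw_capacity_gb(disk_gb, data_fragments=3, redundancy_fragments=1)

print(f"RAID-1: {raid1_gb:.1f} GB")
print(f"RAID-5: {raid5_gb:.1f} GB")
print(f"saving: {raid1_gb - raid5_gb:.1f} GB")
```

In other words, the overhead drops from 100% of the disk size to about 33%, and the absolute saving scales linearly with the size of the disk.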

So far only “all-flash” has made it to the VSAN Ready Node list, and of course components are also listed, such as the disk controller “FD332-PERC” (single and dual ROC), and I’ve also seen the 1.8″ flash devices on the list. I am waiting to see what one of these boxes would cost in an all-flash configuration, and hoping to also see a hybrid configuration soon. I’m a fan of the Dell FX2 systems, that is for sure.