Virtual SAN (related) PEX Updates

I am at VMware Partner Exchange this week and figured I would share some of the Virtual SAN related updates.

  • On the 6th of March there is an online Virtual SAN event with Pat Gelsinger, Ben Fathi and John Gilmartin… Make sure to register for it!
  • Ben Fathi (VMware CTO) stated that VSAN will be GA in Q1, more news in the upcoming weeks
  • Maximum cluster size has been increased from 8 (beta) to 16 according to Ben Fathi; the VMware VSAN engineering team is ahead of schedule!
  • VSAN has linear scalability, close to a million IOPS with 16 hosts in a cluster (100% read, 4K blocks). Mixed IOPS close to half a million. All of this with less than 10% CPU/Memory overhead. That is impressive if you ask me. Yeah yeah I know, numbers like these are just a part of the overall story… still it is nice to see that these kinds of performance numbers can be achieved with VSAN.
  • I noticed a tweet by Chetan Venkatesh and it looks like Atlantis ILIO USX (an in-memory storage solution) has been tested on top of VSAN and they were capable of hitting 120K IOPS using 3 hosts, WOW. There is a white paper on this topic to be found here, interesting read.
  • It was also reiterated that customers who sign up and download the beta will get a 20% discount on their first purchase of 10 VSAN licenses or more!
  • Several hardware vendors announced support for VSAN, a nice short summary by Alberto to be found here.

Operational simplicity through Flash

A couple of weeks back I had the honor of being one of the panel members at the opening of the Pure Storage office in the Benelux. The topic of course was flash, and the primary discussion was around the benefits. The next day I tweeted a quote of one of the answers I gave during the session, which was picked up by Frank Denneman in one of his articles. This is the quote:

David Owen responded to my tweet saying that many performance acceleration platforms introduce an additional layer of complexity, and Frank followed up on that in his article. However this is not what my quote was referring to. First of all, I don’t agree with David that many performance acceleration solutions increase operational complexity. However, I do agree that they don’t always make life a whole lot easier either.

I guess it is fair to say that performance acceleration solutions (hypervisor-based SSD caching) are not designed to replace your storage architecture or to simplify it. They are designed to enhance it, to boost the performance. During the Pure Storage panel session I was talking about how flash changed the world of storage, or better said is changing the world of storage. When you purchased a storage array in the past two decades, it would come with days' worth of consultancy. Two days was typically the minimum, and in some cases a week or even more (depending on the size, the different functionality used, etc.). And that was just the install / configure part. It also required the administrators to be trained, in some cases multiple five-day courses. This says something about the complexity of these systems.

The complexity however was not introduced by storage vendors just because they wanted to sell extra consultancy hours. It was simply the result of how the systems were architected, which by itself was the result of one major constraint: magnetic disks. But the world is changing, primarily because a new type of storage was introduced: flash!

Flash allowed storage companies to re-think their architecture. It is probably fair to state that this was kickstarted by the startups out there who took flash and saw it as their opportunity to innovate. Innovating by removing complexity. Removing (front-end) complexity by flattening their architecture.

Complex constructs to improve performance are no longer required, as (depending on which type you use) a single flash disk delivers more than 1000 magnetic disks typically do. Even when it comes to resiliency, most new storage systems introduced different types of solutions to mitigate (disk) failures. No longer is a 5-day training course required to manage your storage systems. No longer do you need weeks of consultancy just to install/configure your storage environment. In essence, flash removed a lot of the burden that was placed on customers. That is the huge benefit of flash, and that is what I was referring to with my tweet.

One thing left to say: Go Flash!

How about an All Flash Virtual SAN?

Yeah, that title got your attention right… For now it is just me writing about it; nothing has been announced or promised. At VMworld I believe it was Intel who demonstrated the possibilities in this space, an All Flash Virtual SAN. A couple of weeks back, during my holiday, someone pointed me to a couple of articles around SSD endurance. Typically these types of articles deal with the upper end of the spectrum and as such are irrelevant to most of us, and some of the articles I have read in the past around endurance were disappointing to be honest.

TechReport.com however decided to look at consumer grade SSDs. We are talking about SSDs like the Intel 335, Samsung 840 series, Kingston Hyper-X and the Corsair Neutron. All of the SSDs used had a capacity of around 250GB and are priced anywhere between $175 and $275. Now if you look at the guarantees given in terms of endurance, we are talking about anything ranging from "20GB of writes per day for the length of its three-year warranty" for the Intel (22TB in total) to three years and 192TB in total for the Kingston, and anything in between for the other SSDs.
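For those wondering where that 22TB figure for the Intel comes from, it is simply the daily write guarantee multiplied out over the warranty period. A quick back-of-the-napkin check (my own arithmetic, just using the numbers quoted above):

    # Intel 335: "20GB of writes per day" over a three-year warranty
    print(20 * 365 * 3 / 1000.0)    # ~21.9 TB, which lines up with the 22TB total mentioned above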

Tech Report had set their first checkpoint at 22TB. After running through a series of tests, which are described in the article, they compare the results between the various SSDs after 22TB of writes. Great to see that all SSDs did what they promised: all of them passed the 22TB mark without any issues. They had another checkpoint at the 200TB mark, which showed the first signs of weakness. As expected, the lower end SSDs dropped out first. The next checkpoint was set at the 300TB mark, where they also added an unpowered retention test to see how well the drives retain data when unplugged. So far impressive results, and a blog series I will follow with interest. The articles clearly show that from an endurance perspective the SSDs perform a lot better than most had assumed in past years. It is fair to say that the consumer grade SSDs are up to the challenge.

Considering the low price points of these flash devices, I can see how an All Flash Virtual SAN solution would be possible, leveraging these consumer grade SSDs as the capacity tier (reads) and using enterprise grade SSDs to provide write performance (write buffer). Hopefully we will see the capacity of these types of devices increase even further; today some of them go up to 500GB, others up to 800GB. Wouldn't it be nice to have a 1TB (or more) version?

Anyway, I am excited and definitely planning on running some tests with an all flash Virtual SAN solution in the future… What about you?

** 500TB blog update! **
** 600TB blog update! **

How to calculate what your Virtual SAN datastore size should be

I have had this question so many times that I figured I would write an article about it: how to calculate what your Virtual SAN datastore size should be? Ultimately this determines which kind of server hardware you can use, which disk controller you need and which disks… So it is important that you get it right. I know the VMware Technical Marketing team is developing collateral around this topic; when that has been published I will add a link here. Let's start with a quote by Christian Dickmann, one of our engineers, as it is the foundation of this article:

In Virtual SAN your whole cluster acts as a hot-spare

Personally I like to work top-down, meaning that I start with an average for virtual machines or a total combined number. Let's take an example to go through the exercise, as that makes it a bit easier to digest.

Let's assume the average VM disk size is 50GB. On average the VMs have 4GB of memory provisioned. And we have 100 virtual machines in total that we want to run on a 4 host cluster. Based on that info the formula would look something like this:

(total number of VMs * average VM size) + (total number of VMs * average VM memory size) = total capacity required

In our case that would be:

(100 * 50GB) + (100 * 4GB) = 5400 GB

So that is it? Well not really; like every storage / file system there is some overhead, and we will need to take the "failures to tolerate" into account. If I set my "failures to tolerate" to 1 then I would have 2 copies of my VMs, which means I need 5400 GB * 2 = 10800 GB. Personally I also add an additional 10% in disk capacity to ensure we have room for things like: metadata, log files, vmx files and some small snapshots when required. Note that VSAN by default provisions all VMDKs as thin objects (note that swap files are thick, Cormac explained that here), so there should be room available regardless. Better safe than sorry though. This means that 10800 GB actually becomes 11880 GB. I prefer to round this up to 12TB. The formula I have been using thus looks as follows:

(((Number of VMs * Avg VM size) + (Number of VMs * Avg mem size)) * (FTT+1)) + 10%
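For those who prefer to script this, below is a minimal sketch of that formula in Python. The function and variable names are my own; the (FTT+1) multiplier and the 10% overhead simply follow the formula above.

    # Rough Virtual SAN datastore size estimate, following the formula above
    def vsan_capacity_gb(num_vms, avg_vm_disk_gb, avg_vm_mem_gb, ftt=1, overhead=0.10):
        raw = num_vms * (avg_vm_disk_gb + avg_vm_mem_gb)   # VMDK capacity plus swap space
        protected = raw * (ftt + 1)                        # mirror copies for "failures to tolerate"
        return protected * (1 + overhead)                  # ~10% extra for metadata, logs, vmx files, small snapshots

    # The example from this article: 100 VMs, 50GB disks, 4GB memory, FTT=1
    print(vsan_capacity_gb(100, 50, 4))                    # ~11880 GB, which I round up to 12TB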

Now the next step is to see how you divide that across your hosts. I mentioned we would have 4 hosts in our cluster. We have two options: we create a cluster that can re-protect itself after a full host failure, or we create a cluster that cannot. Just to clarify, in order to have 1 host worth of spare capacity available we will need to divide the total capacity by 3 instead of 4. Let's look at those two options and what the impact is:

  • 12TB / 3 hosts = 4TB per host (for each of the 4 hosts)
    • Allows you to re-protect (sync/mirror) all virtual machine objects even when you lose a full host
    • All virtual machines will maintain availability levels when doing maintenance
    • Requires an additional 1TB per host!
  • 12TB / 4 hosts = 3TB per host (for each of the 4 hosts)
    • If all disk space is consumed, when a host fails virtual machines cannot be “re-protected” as there would be no capacity to sync/mirror the objects again
    • When entering maintenance mode data availability cannot be maintained as there would be no room to sync/mirror the objects to another disk

Now if you look at the numbers, we are talking about an additional 1TB per host. With 4 hosts, and let's assume we are using 2.5″ SAS 900GB Hitachi drives, that would be 4 additional drives at a cost of around 1000 per drive. When using 3.5″ SATA drives the cost would be even lower. Although this is just a number I found on the internet, it does illustrate that the cost of providing additional availability could be small. Prices could differ depending on the server brand used. But even at double the cost, I would go for the additional drives and as such additional "hot spare capacity".
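To put the two options side by side, here is a quick continuation of the sketch above (again my own back-of-the-napkin Python, using the 12TB from the example):

    # Per-host capacity: with and without spare capacity for re-protection
    total_tb = 12
    hosts = 4

    print(total_tb / hosts)        # 3TB per host: no room left to re-protect after a full host failure
    print(total_tb / (hosts - 1))  # 4TB per host: the cluster can re-sync/mirror all objects after losing a host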

To make life a bit easier I created a calculator. I hope this helps everyone who is looking at configuring hosts for their Virtual SAN based infrastructure.

VSAN VDI Benchmarking and Beta refresh!

I was reading this blog post on VSAN VDI benchmarking today on Vroom, the VMware Performance blog. You see a lot of people doing synthetic tests (max IOPS with sequential reads) on all sorts of storage devices, but lately more and more vendors are doing these more "real world" performance tests. While reading this article about VDI benchmarking (and I suggest you check out all parts: part 1, part 2, part 3), there was one thing that stood out to me, and that was the comparison between VSAN and an all flash array.

The following quotes show the strength of VSAN if you ask me:

we see that VSAN can consolidate 677 heavy users (VDImark) for 7-node and 767 heavy users for 8-node cluster. When compared to the all flash array, we don’t see more than 5% difference in the user consolidation.

Believe me when I say that 5% is not a lot. If you are actively looking at various solutions, I would highly recommend including the "overhead costs" in your criteria list, as depending on the solution chosen this could make a substantial difference. I have seen other solutions requiring a lot more resources. But what about response time? Because that is where the typical all flash array shines: ultra low latency. How about VSAN?

Similar to the user consolidation, the response time of Group-A operations in VSAN is similar to what we saw with the all flash array.

Both very interesting results if you ask me. Especially the < 5% difference in user consolidation is what stood out to me most! Once again, for more details on these tests read the VDI Benchmarking blog part 1, part 2, part 3!

Beta Refresh

For those who are testing VSAN, there is a beta refresh available as of today. This release has a fix for the AHCI driver issue… and it increases the disk group limit from 6 to 7 HDDs. This will come in handy, as many servers have 8, 16 or 24 disk slots, allowing you to do 7 HDDs + 1 SSD per group. Also, some additional RVC commands have been added in the storage policy space; I am sure they will come in handy!

A nice side effect of the number of HDDs going up is an increase in max capacity:

(8 hosts * (5 diskgroups * 7 HDDs)) * Size of HDD = Total capacity

With 2 TB disks this would result in:

(8 * (5 * 7)) * 2TB = 560TB
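Or, as a quick sketch in case you want to plug in your own disk sizes (the variable names are mine, the limits are the ones mentioned above):

    # Maximum raw capacity with the beta refresh limits
    hosts = 8
    diskgroups_per_host = 5
    hdds_per_diskgroup = 7        # raised from 6 to 7 in this beta refresh
    hdd_size_tb = 2

    print(hosts * diskgroups_per_host * hdds_per_diskgroup * hdd_size_tb)   # 560 TB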

Now keep on testing with VSAN and don't forget to report feedback through the community forums or your VMware rep.

Startup intro: Coho Data

Today a new startup is revealed named Coho Data, formerly known as Convergent.io. Coho Data was founded by Andrew Warfield, Keir Fraser and Ramana Jonnala, who are probably best known for the work they did at Citrix on XenServer. For those who care, they are backed by Andreessen Horowitz. What is it they introduced / revealed this week?

Coho Data introduces a new scale-out hybrid storage solution (NFS for VM workloads). With hybrid meaning a mix of SATA and SSD, for obvious reasons: SATA bringing you capacity and flash providing raw performance. Let me point out that Coho is not a hyperconverged solution, it is a full storage system.

What does it look like? It is a 2U box which holds 2 "MicroArrays", with each MicroArray having 2 processors, 2 x 10GbE NIC ports and 2 Intel 910 PCIe cards. Each 2U block provides you 39TB of capacity and ~180K IOPS (random 80/20 read/write, 4K block size), starting at $2.50 per GB pre-dedupe & compression (which they of course offer). A couple of things I liked looking at their architecture: first and probably foremost the "scale-out" architecture, scale to infinity in a linear fashion is what they say. On top of that, it comes with an OpenFlow-enabled 10GbE switch to allow for ease of management and again scalability.

If you look closely at how they architected their hardware, they created these highspeed IO lanes: 10GbE NIC <–> CPU <–> PCIe Flash Unit. Each lane has its dedicated CPU, NIC port and on top of that the PCIe flash, allowing for optimal performance, efficiency and fine-grained control. Nice touch if you ask me.

Another thing I really liked was their UI. You can really see they put a lot of thought into the user experience aspect by keeping things simple and presenting data in an easily understandable way. I wish every vendor did that. I mean, if you look at the screenshot below, how simple does that look? Dead simple right!? I’ve seen some of the other screens, like for instance for creating a snapshot schedule… again the same simplicity. Apparently, and I have not tested this but I will take them at their word, they brought that simplicity all the way down to the "install / configure" part of things. Getting Coho Data up and running literally only takes 15 minutes.

What I also liked very much about the Coho Data solution is that Software-defined Networking (SDN) and Software-defined Storage (SDS) are tightly coupled. In other words, Coho configures the network for you… As just said, it takes 15 minutes to set up. Try creating the zoning / masking scheme for a storage system and a set of LUNs these days; even that takes more than 15 – 20 minutes. There aren’t too many vendors combining SDN and SDS in a smart fashion today.

When they briefed me they gave me a short demo and Andy explained the scale-out architecture. During the demo it happened various times that I could draw a parallel between the VMware virtualization platform and their solution, which made it easy for me to understand and relate to their solution. For instance, Coho Data offers what I would call DRS for Software-Defined Storage. If for whatever reason defined policies are violated, then Coho Data will balance the workload appropriately across the cluster. Just like DRS (and Storage DRS) does, Coho Data will do a risk/benefit analysis before initiating the move. I guess the logical question would be: why would I want Coho to do this when VMware can also do this with Storage DRS? Well, keep in mind that Storage DRS works "across datastores", but as Coho presents a single datastore you need something that allows you to balance within it.

I guess the question then remains: what do they lack today? Well, today as a 1.0 platform Coho doesn’t offer replication outside of their own cluster. But considering they have snapshotting in place I suspect their architecture already caters for it, and it is something they should be able to release fairly quickly. Another thing which is lacking today is a vSphere Web Client plugin, but then again, if you look at their current UI and the simplicity of it, I do wonder if there is any point in having one.

All in all, I have been impressed by these newcomers in the SDS space and I can’t wait to play around with their gear at some point!

Designing your hardware for Virtual SAN

Over the past couple of weeks I have been watching the VMware VSAN Community Forum and Twitter with close interest. One thing that struck me was the type of hardware people used to test VSAN on. In many cases this is the type of hardware one would use at home, for their desktop. Now I can see why that happens, I mean something new / shiny and cool is released and everyone wants to play around with it, but not everyone has the budget to buy the right components… And as long as that is for “play” only that is fine, but lately I have also noticed that people are looking at building an ultra cheap storage solution for production. But guess what?

Virtual SAN reliability, performance and overall experience is determined by the sum of the parts

You say what? Not shocking, right, but something that you will need to keep in mind when designing a hardware / software platform. Simple things can impact your success: first and foremost check the HCL, and think about components like:

  • Disk controller
  • SSD / PCIe Flash
  • Network cards
  • Magnetic Disks

Some thoughts around this, for instance the disk controller. You could leverage a 3Gb/s on-board controller, but when attaching let's say 5 disks to it and a high performance SSD, do you think it can still cope, or would a 6Gb/s PCIe disk controller be a better option? Or even leverage the 12Gb/s that some controllers offer for SAS drives? Not only can this make a difference in terms of the number of IOPS you can drive, it can also make a difference in terms of latency! On top of that, there will be a difference in reliability…
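To illustrate why the controller link speed matters, here is a rough back-of-the-napkin calculation. These are my own illustrative numbers, not from any spec sheet, and they assume the usual 8b/10b encoding overhead on SATA/SAS links:

    # Usable bandwidth per SATA/SAS link, roughly 80% of the line rate due to 8b/10b encoding
    def usable_mb_per_s(link_gbps):
        return link_gbps * 1000 / 8 * 0.8

    print(usable_mb_per_s(3))    # ~300 MB/s: a single fast SSD can already saturate this
    print(usable_mb_per_s(6))    # ~600 MB/s: headroom for a high performance SSD
    print(usable_mb_per_s(12))   # ~1200 MB/s: 12Gb/s SAS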

I guess the next component is the SSD / Flash device; this one is hopefully obvious to each of you. But don’t let those performance tests you see on Tom’s or AnandTech fool you: there is more to an SSD than just sheer IOPS. For instance durability: how many writes per day, for how many years, can your SSD handle? Some of the enterprise grade drives can handle 10 full writes or more per day for 5 years. You cannot compare that with some of the consumer grade drives out there, which obviously will be cheaper but also will wear out a lot faster! You don’t want to find yourself replacing SSDs every year at random times.
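A rough way to compare endurance is to multiply the drive writes per day (DWPD) out over the warranty period. A simple sketch with illustrative numbers (these are not the specs of any particular drive):

    # Total lifetime writes in TB, based on drive writes per day (DWPD)
    def lifetime_writes_tb(capacity_gb, dwpd, years):
        return capacity_gb * dwpd * 365 * years / 1000.0

    print(lifetime_writes_tb(400, 10, 5))    # enterprise-class example: ~7300 TB written over 5 years
    print(lifetime_writes_tb(250, 0.3, 3))   # consumer-class example: ~82 TB written over 3 years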

Of course network cards are a consideration when it comes to VSAN. Why? Well, because I/O will more than likely hit the network. Personally, I would rule out 1GbE… Or you would need to go for multiple cards and ports per server, but even then I think 10GbE is the better option here. Most 10GbE cards are of decent quality, but make sure to check the HCL and any recommendations around configuration.

And last but not least, magnetic disks… Quality should always come first here. I guess this goes for all of the components; I mean, you are not buying an empty storage array and filling it up with random components either, right? Think about what your requirements are. Do you need 10k / 15k RPM, or does 7.2k suffice? SAS vs SATA vs NL-SATA? Also, keep in mind that performance comes at a cost (typically capacity). Another thing to realize: high capacity drives are great for… yes, adding capacity indeed, but keep in mind that when IO needs to come from disk, the number of IOPS you can drive and your latency will be determined by these disks. So if you are planning on increasing the “stripe width” then it might also be useful to factor this in when deciding which disks you are going to use.

I guess to put it differently, if you are serious about your environment and want to run a production workload then make sure you use quality parts! Reliability, performance and ultimately your experience will be determined by these parts.

<edit> Forgot to mention this, but soon there will be “Virtual SAN” ready nodes… This will make your life a lot easier I would say.

</edit>