
Yellow Bricks

by Duncan Epping



All-Flash HCI is taking over fast…

Duncan Epping · Nov 23, 2016 ·

Two weeks ago I tweeted about All-Flash HCI taking over fast; maybe I should have said All-Flash vSAN, as I am not sure every vendor is seeing the same trend. The reason for it, of course, is the price of flash dropping while capacity goes up. At the same time, with vSAN 6.5 we introduced "all-flash for everyone" by dropping the "all-flash" license option down to vSAN Standard.

I love getting these emails about huge vSAN environments… this week alone 900TB and 2PB raw capacity in a single all-flash vSAN cluster

— Duncan Epping (@DuncanYB) November 10, 2016

So the question naturally came: can you share what these customers are deploying and using? I shared those details later via tweets, but I figured it would make sense to share them here as well. When it comes to vSAN there are two layers of flash used, one for capacity and the other for caching (the write buffer, to be more precise). For the write buffer I am starting to see a trend: the 800GB and 1600GB NVMe devices are becoming more and more popular. Write-intensive SAS-connected SSDs are also often used. Which one you pick largely depends on budget; needless to say, NVMe has my preference when the budget allows for it.

For the capacity tier there are many different options; most people I talk to are looking at the read-intensive 1.92TB and 3.84TB SSDs. SAS-connected devices are a typical choice for these environments, but they do come at a price. The SATA-connected S3510 1.6TB (available at under 1 euro per GB even) seems to be the choice for many people with a tighter budget, as these devices are relatively cheap compared to the SAS-connected devices. The downside is the shallower queue depth, but if you are planning on having multiple devices per server then this probably isn't a problem. (Something I would like to see at some point is a comparison between SAS- and SATA-connected drives with similar performance capabilities under real-life workloads, to see if there actually is an impact.)

With prices still coming down and capacity still going up, it will be interesting to see how the market shifts in the upcoming 12-18 months. If you ask me, hybrid is almost dead. Of course there are still situations where it may make sense (low $-per-GB requirements), but in most cases all-flash just makes more sense these days.

I would be interested in hearing from you as well: if you are doing all-flash HCI/vSAN, what are the specs, and why did you select specific devices/controllers/types?

Benchmarking an HCI solution with legacy tools

Duncan Epping · Nov 17, 2016 ·

I was driving back home from Germany on the autobahn this week, thinking about the 5-6 conversations I have had over the past couple of weeks about performance tests for HCI systems. (Hence the pic on the right side being very appropriate ;-)) What stood out during these conversations is that many folks repeat the tests they once conducted on their legacy array and then compare the results 1:1 to their HCI system. Fairly often people even use a legacy tool like Atto Disk Benchmark. Atto is a great tool for testing the speed of the drive in your laptop, or maybe even a RAID configuration, but the name already more or less reveals its limitation: "disk benchmark". It wasn't designed to show the capabilities and strengths of a distributed / hyper-converged platform.

Now I am not trying to pick on Atto, as similar problems exist with tools like IOMeter, for instance. I see people doing a single-VM IOMeter test with a single disk. In most hyper-converged offerings that doesn't result in a spectacular outcome. Why? Simply because that is not what the solution is designed for. Sure, there are ways to demonstrate what your system is capable of with legacy tools: simply create multiple VMs with multiple disks. Even with a single VM you can produce better results by picking the right policy, as vSAN allows you to stripe data across 12 devices for instance (which can be across hosts, disk groups, etc.). Without selecting the right policy or having multiple VMs, you may not be hitting the limits of your system, but simply the limits of your VM's virtual disk controller, the host disk controller, the capabilities of a single device, etc.
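
For those who want to experiment with this, here is a minimal PowerCLI sketch of such a policy. The policy name, VM name, and the exact SPBM capability names ("VSAN.stripeWidth", "VSAN.hostFailuresToTolerate") are assumptions on my end, so verify them in your own environment first.

# Minimal sketch, not a definitive recipe: a vSAN policy with a higher stripe
# width so a single test VM's objects land on more capacity devices.
Connect-VIServer -Server "vcenter.lab.local"

$stripe  = New-SpbmRule -Capability (Get-SpbmCapability -Name "VSAN.stripeWidth") -Value 4
$ftt     = New-SpbmRule -Capability (Get-SpbmCapability -Name "VSAN.hostFailuresToTolerate") -Value 1
$ruleSet = New-SpbmRuleSet -AllOfRules $stripe, $ftt
New-SpbmStoragePolicy -Name "Bench-Stripe4" -AnyOfRuleSets $ruleSet

# Assign the policy to the benchmark VM's disks (VM name is hypothetical)
$policy = Get-SpbmStoragePolicy -Name "Bench-Stripe4"
Get-VM -Name "bench-vm-01" | Get-HardDisk | Set-SpbmEntityConfiguration -StoragePolicy $policy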

But there is an even better option: pick the right toolset and select the right workload (surely only doing 4K blocks isn't representative of your production environment). VMware has developed a benchmarking solution called HCIBench that works with both traditional and hyper-converged offerings. HCIBench can be downloaded and used for free through the VMware Flings website. Instead of that single-VM, single-disk test, you will now be able to test many VMs with multiple disks to show how a scale-out storage system behaves. It will give you great insight into the capabilities of your storage system, whether that is vSAN, any other HCI solution, or even a legacy storage system for that matter. Just like the world of storage has evolved, so has the world of benchmarking.

#STO7904 VSAN Management Current and Futures by @cdickmann

Duncan Epping · Aug 31, 2016 ·

Christian Dickmann (VSAN Development Architect) talked about VSAN Management futures in this session. First of all, a big fat disclaimer: all of these features may or may not ever make it into a release, and no promises of timelines were made. The session revolved around VSAN's mission: providing radically simple HCI with choice. Keep that in mind when reading the rest of the article. Also, this session literally just finished a second ago and I wanted to publish it asap, so if there are any typos, my apologies.

First Christian went over the current VSAN management experience, discussing the creation of a VSAN cluster, health monitoring, and performance monitoring. VSAN is already dead simple from a storage point of view, but there is room for improvement from an operational point of view, mostly in the vSphere space: installs, updates, and upgrades of drivers, firmware, ESXi, vCenter, etc.

1st demo: HCI Installer

In this demo a deployment of the vCenter Server Appliance is shown. We connect to an ESXi server first, then provide all the normal vCenter Server details like the password. Where do you want to deploy the appliance? How about on VSAN? Well, you can actually create the VSAN datastore during the deployment of the VCSA: you specify the VSAN details and go ahead. During the install/configuration process VSAN is simply configured as a single-host cluster. When vCenter is installed and configured, you simply add the rest of the hosts to the cluster. Very cool if you ask me!
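
As a rough illustration of that last step (purely a sketch on my end; host names and credentials are made up), adding the remaining hosts with PowerCLI could look like this:

# Sketch only: join the remaining hosts to the cluster that the installer
# bootstrapped on a single-host vSAN datastore. Names/credentials are examples.
Connect-VIServer -Server "vcsa.lab.local" -User "administrator@vsphere.local" -Password "VMware1!"

$cluster = Get-Cluster -Name "vSAN-Cluster"   # created during the VCSA deployment
"esxi-02.lab.local", "esxi-03.lab.local", "esxi-04.lab.local" | ForEach-Object {
    Add-VMHost -Name $_ -Location $cluster -User "root" -Password "VMware1!" -Force
}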

2nd demo: Simple VMkernel interface creation

In this demo the creation of VMkernel interfaces is shown. Creating the interfaces is dead simple: you specify the IP ranges and it configures every host using the specified details. Literally 4 hosts and their interfaces were created in seconds.
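
For reference, the scripted equivalent with today's PowerCLI would be something along these lines (a sketch with a made-up port group, vSwitch, and IP range; adjust to your own network layout):

# Sketch: create a vSAN-tagged VMkernel interface on every host in the cluster.
# Port group, vSwitch, and the 192.168.50.x range are illustrative assumptions.
$esxHosts  = Get-Cluster -Name "vSAN-Cluster" | Get-VMHost | Sort-Object -Property Name
$lastOctet = 101
foreach ($esx in $esxHosts) {
    $vSwitch = Get-VirtualSwitch -VMHost $esx -Name "vSwitch0"
    New-VMHostNetworkAdapter -VMHost $esx -PortGroup "vSAN-PG" -VirtualSwitch $vSwitch `
        -IP "192.168.50.$lastOctet" -SubnetMask "255.255.255.0" -VsanTrafficEnabled $true
    $lastOctet++
}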

3rd demo: Firmware Upgrade

In this demo the VSAN Health Check shows that the firmware of the disk controller is out of date. When you choose to update, vendor-specific tools are downloaded and installed first. When this is completed you can remediate your cluster and install drivers and firmware for all nodes, all done through the UI (Web Client), in a rolling fashion and literally in minutes. I wish I had had this when I upgraded my lab in the past.

4th demo: VUM Integration

80% of vSphere customers use VUM, so integrating VSAN upgrades and updates with VUM makes a lot of sense. During the upgrade process VUM validates which version of vSphere/VSAN is supported for your environment. If for whatever reason the latest version is not supported for your configuration, it recommends a different version. When you remediate, VSAN provides the image needed and there is no need even to create baselines; all of this manual work is done by VSAN for you. Upgrades literally become 1 or 2 clicks, and risks are mitigated by validating hardware and software against the compatibility matrix.

5th demo: Automation

In this demo Christian showed how to automate the deployment of 10 ROBO clusters end to end using PowerCLI. One by one all the different locations are created, and every single aspect is fully automated, including even the deployment of the witness appliance. The second part was the upgrade of the VSAN on-disk format using Python: all clusters are upgraded in a fully automated, rolling fashion. No magic here, all using public APIs.
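
To give an idea of what that could look like, here is a sketch of the concept (the CSV layout, naming, and credentials are assumptions on my end, and the witness appliance deployment is left out):

# Sketch of an end-to-end ROBO rollout loop; site list, naming and credentials
# are illustrative, and the witness appliance deployment is intentionally omitted.
$sites = Import-Csv -Path ".\robo-sites.csv"    # assumed columns: SiteName, Host1, Host2
foreach ($site in $sites) {
    $dc      = New-Datacenter -Name $site.SiteName -Location (Get-Folder -NoRecursion)
    $cluster = New-Cluster -Name "$($site.SiteName)-vSAN" -Location $dc `
                   -VsanEnabled -VsanDiskClaimMode Automatic -HAEnabled
    foreach ($esxName in @($site.Host1, $site.Host2)) {
        Add-VMHost -Name $esxName -Location $cluster -User "root" -Password "VMware1!" -Force
    }
}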

6th demo: VSAN Analytics

Apparently with 6.2 Christian found out that admins don't read all the KB articles VMware releases; based on an issue experienced with a disk controller, he decided to solve this problem. Can we proactively inform you about such a problem? Yes we can: using a "cloud connected" VSAN Health Check, VMware knows what you are using and can inform you about KBs, potential issues, and recommendations that may apply to you. That is what was shown in this demo: a known issue is bubbled up through the health check and the KB details are provided. Mitigating it is simply a matter of applying the recommendation. This is still a manual step, and probably will stay that way, as Christian emphasized that you as the administrator need to have control and should decide whether you want to apply the steps/patches or not.

To conclude: in literally 40 minutes Christian showed how the VSAN team is planning on simplifying your life, not just from a storage perspective but for your complete vSphere infrastructure. I am hoping I can share the demos at some point in the future, as they are worth watching. Thanks for sharing, Christian, great job!

Disk format version 4.0 update to 2.0 suggested

Duncan Epping · Jun 15, 2016 ·

I've seen some people reporting a strange message in the Virtual SAN UI. The UI states the following: "Disk format version 4.0 (update to 2.0 suggested)". This is what that looks like (I stole the pic from VMTN, thanks Phillip):

[Image: Disk format version 4.0 (update to 2.0 suggested)]

A bit strange: considering you apparently have 4.0, why would you go to 2.0? Well, you are actually on 2.0 and are supposed to go to 3.0. The reason this happens is, most likely, that not all hosts within your cluster are on the same version of Virtual SAN, or vCenter Server was not updated to the latest version while ESXi is on a higher version. So far I have seen this being reported when people upgrade to vSphere 6.0 Update 2. If you are upgrading, make sure to upgrade all hosts to ESXi 6.0 Update 2, but before you do, upgrade vCenter Server to 6.0 Update 2 first!

600GB write buffer limit for VSAN?

Duncan Epping · May 17, 2016 ·

I get this question on a regular basis and it has been explained many, many times, so I figured I would dedicate a blog to it. Now, Cormac has written a very lengthy blog on the topic and I am not going to repeat it; I will simply point you to the math he has provided around it. I do, however, want to provide a quick summary:

When you have an all-flash VSAN configuration, the current write buffer limit is 600GB (this limit applies to all-flash only). As a result many seem to think that when an 800GB device is used for the write buffer, 200GB will go unused. This simply is not the case. We have a rule of thumb of a 10% cache-to-capacity ratio, and this rule of thumb has been developed with both performance and endurance in mind, as described by Cormac in the link above. The 200GB above the 600GB write buffer limit is actively used by the flash device for endurance. Note that an SSD is usually over-provisioned by default; most have extra cells for endurance and write performance, which makes the experience more predictable and at the same time more reliable. The same applies in this case with the Virtual SAN write buffer.

The image at the top right shows how this works. This SSD has 800GB of advertised capacity. The "write buffer" is limited to 600GB; however, the white space is considered "dynamic over-provisioning" capacity, as it will be actively used by the SSD automatically (SSDs do this by default). Then there is an additional percentage of over-provisioning by default on all SSDs, which in the example is 28% (typical for enterprise grade), and even after that there usually is an extra 7% for garbage collection and other SSD internals. If you want to know more about why this is and how it works, Seagate has a nice blog on the topic.

So let's recap: as a consumer/admin, the 600GB write buffer limit should not be a concern. Although the write buffer is limited in terms of buffer capacity, the flash cells will not go unused, and the rule of thumb as such remains unchanged: a 10% cache-to-capacity ratio. Let's hope this puts this (non-)discussion finally to rest.
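
To make the rule of thumb concrete, here is a quick back-of-the-napkin example (the host configuration is made up, and I am using raw capacity for simplicity; check the official sizing guidance for your version):

# Back-of-the-napkin sketch of the 10% cache:capacity rule of thumb for one host.
# The device counts/sizes below are illustrative, not a recommendation.
$capacityDevices     = 8
$capacityPerDeviceGB = 1920                                      # 1.92TB read-intensive SSDs
$rawCapacityGB       = $capacityDevices * $capacityPerDeviceGB   # 15360GB raw per host

$recommendedCacheGB  = $rawCapacityGB * 0.10                     # ~1536GB of cache per host
# Two disk groups, each with an 800GB write-buffer device, gets you to ~1600GB.
# Each device only exposes up to 600GB as write buffer, but the remainder is still
# used by the SSD for over-provisioning/endurance, so nothing goes to waste.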
