VAAI support in vSphere Standard and up as of 6.0!

After some internal discussions over the last months it was decided to move VAAI (vSphere APIs for Array Integration) and Multi-Pathing down to vSphere Standard as of 6.0. The main reason for this was that Virtual Volumes, considered by many to be the natural evolution of VAAI, is also part of vSphere Standard. So if you have vSphere Standard and a VAAI-capable array and are looking to move to 6.0, make sure to check the configuration of your hosts and use this great functionality! Note that VAAI did indeed already work in lower editions, but from a licensing point of view you weren’t entitled to it… I guess many folks never really looked at enabling / disabling it explicitly, but for those who did… now you can use it. More details on what is included with which license can be found here: http://www.vmware.com/au/products/vsphere/compare.html
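If you want to check whether VAAI is actually enabled on a host, esxcli is a quick way to do it. A minimal sketch, to be run over SSH on an ESXi host (the three advanced settings below are the standard host-level VAAI toggles, where a value of 1 means enabled):

```shell
# Check the three host-level VAAI settings (Int Value: 1 = enabled, 0 = disabled):
esxcli system settings advanced list -o /DataMover/HardwareAcceleratedMove   # XCOPY (Full Copy)
esxcli system settings advanced list -o /DataMover/HardwareAcceleratedInit   # WRITE SAME (Block Zeroing)
esxcli system settings advanced list -o /VMFS3/HardwareAcceleratedLocking    # ATS (Hardware Assisted Locking)

# Per-device view of which VAAI primitives the array actually supports:
esxcli storage core device vaai status get
```

These are host configuration commands, so they only run on an ESXi host itself, not from a regular workstation.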


Host Profile noncompliant when using local SAS drives with vSphere 6?

A couple of years ago I wrote an article titled “Host Profile noncompliant when using local SAS drives with vSphere 5?” I was informed by one of our developers that we actually solved this problem in vSphere 6. It is not something I had seen yet, so I figured I would look at what we did to prevent this from happening, and it appears there are two ways to solve it. In 5.x we solved it by disabling the whole tree, which is kind of a nasty workaround if you ask me. In 6.0 we fixed it in a far better way.

When you create a new host profile and edit it, you now have some extra options. One of those options is the ability to mark whether a disk is a shared cluster resource or not. By disabling this for your local SAS drives you avoid the scenario where your host profile shows up as noncompliant on each of your hosts.

There is another way of solving this. You can use “esxcli” to mark your devices correctly and then create the host profile. (SSH into the host.)

First, list all devices using the following command. I took a screenshot of my output, but yours will of course look slightly different.

esxcli storage core device list
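The full output of that command can be quite long; if you just want the device identifiers together with the clusterwide flag, you can filter it with awk. A sketch below, using a shortened, made-up sample of the output (your device names and values will differ — on a live host you would pipe the esxcli command itself into awk instead):

```shell
# Hypothetical, shortened sample of `esxcli storage core device list` output
# (the real output contains many more fields per device):
sample='naa.600508b1001c3a1aabcd
   Display Name: Local Disk (naa.600508b1001c3a1aabcd)
   Is Shared Clusterwide: true
naa.600605b005a847b0ef01
   Display Name: Local SAS Disk (naa.600605b005a847b0ef01)
   Is Shared Clusterwide: false'

# Remember the most recent unindented device name, then print it together
# with the value of the "Is Shared Clusterwide" field:
result=$(printf '%s\n' "$sample" |
  awk '/^naa\./ {dev=$1} /Is Shared Clusterwide/ {print dev ": " $NF}')
printf '%s\n' "$result"
```

This prints one line per device, e.g. `naa.600605b005a847b0ef01: false`, which makes it easy to spot the local SAS devices that still need changing.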

Now that you know the naa identifier of the device, you can make the change by issuing the following command, setting “Is Shared Clusterwide” to false:

esxcli storage core device setconfig -d naa.1234 --shared-clusterwide=false
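To confirm the change took effect before extracting the host profile, you can list just that device again. A sketch, reusing the same hypothetical naa.1234 identifier from above:

```shell
# List only the device we just changed and check the clusterwide flag;
# replace naa.1234 with the identifier of your own local SAS device.
# After the setconfig command above, this should now show "false".
esxcli storage core device list -d naa.1234 | grep -i "Is Shared Clusterwide"
```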

Now you can create the host profile. Hopefully you will find this cool little enhancement in esxcli and host profiles useful; I certainly do!

Startup introduction: Springpath

Last week I was briefed by Springpath, and they launched their company officially yesterday, although they have been around for a long time. Springpath was founded by Mallik Mahalingam and Krishna Yadappanavar. For those who don’t know them, Mallik was responsible for VXLAN (see the IETF draft) and Krishna was one of the folks responsible for VMFS. (Together with Satyam, who started PernixData.) I believe it was early 2013, or the end of 2012, when Mallik reached out to me wanting to validate some of his thinking around the software-defined storage space. I agreed to meet up, and we discussed the state of the industry at that time and where some of the gaps were. Since May 2012 they operated in stealth (under the name Storvisor) and landed a total of 34 million dollars from investors like Sequoia, NEA and Redpoint. Well-established VC names indeed, but what did they develop?

Springpath is what most folks would refer to as a Server SAN solution; some may also refer to it as “hyper-converged”. I don’t label them as hyper-converged, as Springpath doesn’t sell a hardware solution; they sell software and have a strict hardware compatibility list. The list of server vendors on the HCL seemed to cover the majority of the big players out there though: I was told Dell, HP, Cisco and SuperMicro are on the list and that others are being worked on as we speak. According to Springpath this approach offers customers a bit more flexibility, as they can choose their own preferred vendor, leverage the server vendor relationship they already have for discounts, and maintain similar operational processes.

Springpath’s primary focus in the first release is vSphere, which knowing the background of these guys makes a lot of sense, and the solution comes in the shape of a virtual appliance. This virtual appliance is installed on top of the hypervisor and grabs local spindles and flash. With a minimum of three nodes you can then create a shared datastore which is served back to vSphere as an NFS mount. There are of course also plans to support Hyper-V, and when they do the appliance will provide SMB capabilities; for KVM it will use NFS. That is on the roadmap right now, but not too far out according to Mallik. (Note that support for Hyper-V, KVM etc. will all be released in different versions. KVM and Docker are in beta as we speak; if you are interested, go to their website and drop them an email!) There is even talk about supporting the Springpath solution running as a Docker container and providing shared storage for Docker itself. All these different platforms should be able to leverage the same shared data platform according to Springpath; the diagram below shows this architecture.

They demonstrated the configuration / installation of their stack and I must say I was impressed with how simple it was. They showed a simple UI which allowed them to configure the IP details etc., but they also showed how they could simply drop in a JSON file with all the config details, which would then be used to deploy the storage environment. When fully configured, the whole environment can be managed from the Web Client; no need for a separate UI or anything like that. All integrated within the Web Client, and for Hyper-V and other platforms they have similar plans… no separate client, but everything manageable through the familiar interfaces those platforms already offer.

Slow backup of VM on VSAN Datastore

Someone at our internal field conference asked me why doing a full backup of a virtual machine on a VSAN datastore is slower than doing the same exercise for that virtual machine on a traditional storage array. Note that the test conducted here was done with a single virtual machine. The best way to explain why this is, is to take a look at the architecture of VSAN. First, let me mention that the full backup of the VM on the traditional array was done on a storage system that had many disks backing the datastore on which the virtual machine was located.

Virtual SAN, as hopefully all of you know, creates a shared datastore out of host-local resources. This datastore is formed out of disks and flash. Another thing to understand is that Virtual SAN is an object store. Each object is typically stored in a resilient fashion, and as such on two hosts; hence 3 hosts is the minimum. Now, by default a component of an object is not striped, which means that components are in most cases stored on a single physical spindle. As you can see in the diagram below, the disk (object) has two components and, without striping, is stored on 2 physical disks.

Now let’s get back to the original question: why did the backup on VSAN take longer than with a traditional storage system? It is fairly simple to explain with the above info. In the case of the traditional storage array you are reading from multiple disks (10+), but with VSAN you are only reading from 2 disks. As you can imagine, read performance / throughput will differ depending on the resources that the total number of disks being read from can provide. In this test, as just a single virtual machine is being backed up, the VSAN result will be different as it has a lower number of disks (resources) at its disposal; on top of that, as the VM is new, there is no data cached, so the flash layer is not used. Now, depending on your workload you can of course decide to stripe the components, but when it comes to backup you can also decide to increase the number of concurrent backups… if you increase the number of concurrent backups, the results will get closer, as more disks are being leveraged across all VMs. I hope that helps explain why results can be different; hopefully everyone understands that when you test things like this, parallelism is important, or you should provide the right level of stripe width.
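For completeness, a sketch of how stripe width can be changed from the command line on a VSAN host. Note that the more common route is a VM Storage Policy (“Number of disk stripes per object”) in the Web Client, which can be applied per VM; the esxcli default-policy syntax below is as I recall it for vSphere 5.5/6.0 and should be verified against your build:

```shell
# Show the current default VSAN policy for virtual disk objects:
esxcli vsan policy getdefault

# Set the default policy for virtual disk objects to 2 stripes per component,
# while keeping failures-to-tolerate at 1:
esxcli vsan policy setdefault -c vdisk \
  -p '(("hostFailuresToTolerate" i1) ("stripeWidth" i2))'
```

With a stripe width of 2, each component is split across two spindles, so a single-VM sequential read (like a full backup) has twice the number of disks to pull from.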

Software Defined Storage, which phase are you in?!

Working within R&D at VMware means you typically work with technology which is 1-2 years out, and discuss futures of products which are 2-3 years out. Especially in the storage space a lot has changed. Not just through innovations within the hypervisor by VMware, like Storage DRS, Storage IO Control, VMFS-5, VM Storage Policies (SPBM), vSphere Flash Read Cache, Virtual SAN etc., but also by partners who offer software-based solutions like PernixData (FVP), Atlantis (ILIO) and SanDisk FlashSoft. Of course there is the whole Server SAN / hyper-converged movement with Nutanix, ScaleIO, Pivot3, SimpliVity and others. Then there is the whole slew of new storage systems, some of which are scale-out and all-flash, others which focus more on simplicity; here we are talking about Nimble, Tintri, Pure Storage, XtremIO, Coho Data, SolidFire and many many more.

Looking at it from my perspective, I would say there are multiple phases when it comes to the SDS journey:

  • Phase 0 – Legacy storage with NFS / VMFS
  • Phase 1 – Legacy storage with NFS / VMFS + Storage IO Control and Storage DRS
  • Phase 2 – Hybrid solutions (Legacy storage + acceleration solutions or hybrid storage)
  • Phase 3 – Object granular policy driven (scale out) storage

<edit>

Maybe I should have abstracted a bit more:

  • Phase 0 – Legacy storage
  • Phase 1 – Legacy storage + basic hypervisor extensions
  • Phase 2 – Hybrid solutions with hypervisor extensions
  • Phase 3 – Fully hypervisor / OS integrated storage stack

</edit>

I have written about Software Defined Storage multiple times in the last couple of years and have worked with various solutions which are considered to be “Software Defined Storage”. I have a certain view of what the world looks like; however, when I talk to some of our customers, reality is different — some seem very happy with what they have in Phase 0. Although all of the above is the way of the future, and for some may be reality today, I do realise that Phase 1, 2 and 3 may be far away for many. I would like to invite all of you to share:

  1. Which phase are you in, and where would you like to go?
  2. What are you struggling with most today that is driving you to look at new solutions?