
Yellow Bricks

by Duncan Epping


vvol

Some recent updates to HA deepdive

Duncan Epping · Apr 15, 2016 ·

Just a short post to point out that I updated the VVol section in the HA Deepdive. If you downloaded it, make sure to grab the latest version. Note that I have added a version number to the intro and a changelog at the end so you can see what changed. I also recommend subscribing to it, as I plan to do some more updates in the upcoming months. For this update I've been playing with a Nimble (virtual) array all day today, which allowed me to create some cool screenshots of how HA works in a VVol environment. I was seriously impressed by how easy it was to set up the Nimble (virtual) array and how simple VVol was to configure on it. I was also amazed by the number of policy options Nimble exposes. Below is just an example of some of the things you can configure!

[Screenshot: example of the policy options Nimble exposes for VVols]

The screenshot below shows the Virtual Volumes created for a VM; this is the view from the storage system's perspective:
[Screenshot: Virtual Volumes created for a VM, as seen from the storage system]

What does support for vMotion with active/active (a)sync mean?

Duncan Epping · Mar 23, 2015 ·

Having seen so many cool features released by VMware over the last 10 years, you sometimes wonder what more they can do. It is amazing to see the level of integration we've seen between the different datacenter components. Many of you have seen the announcements around Long Distance vMotion support by now.

When I saw this slide, one part instantly stood out to me:

  • Replication Support
    • Active/Active only
      • Synchronous
      • Asynchronous

What does this mean? First of all, "active/active" refers to "stretched storage", aka vSphere Metro Storage Cluster. So when it comes to long distance vMotion, some changes have been introduced for synchronously replicated stretched storage. (Note that "active/active" storage is not required for long distance vMotion.) With stretched storage, writes to a volume can come from both sides at any time and are replicated synchronously. The vMotion process has been optimized to avoid writes during switchover, so that replication traffic does not delay the switchover itself.

For active/active asynchronous the story is a bit different. Here again we are talking about "stretched storage", but in this case the asynchronous flavour. One important aspect which was not mentioned in the deck is that async requires Virtual Volumes. At the time of writing there is no vendor yet with a VVol-capable solution that offers active/active async. More importantly though, is this process any different from the sync process? Yes it is!

During the migration of a virtual machine which uses virtual volumes backed by an "active/active async" configuration, the array is informed that a migration is taking place and is requested to switch from asynchronous to synchronous replication. This ensures that the destination is in sync with the source when the VM is switched over from side A to side B. Besides switching from async to sync, the array is also informed when the migration has completed. This allows the array to, for instance, switch the "bias" of the VM; especially in a stretched environment this is important to ensure availability.
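To make the order of operations a bit more tangible, below is a minimal sketch in Python of the handshake described above. The class and method names are hypothetical and are not any vendor's actual API; they only illustrate the async-to-sync switch during the migration and the bias change afterwards.

```python
# Hypothetical sketch of the array-side handshake during a long distance
# vMotion of a VM backed by active/active *asynchronous* stretched storage.
# None of these calls represent a real vendor API; they only illustrate the
# order of operations described in the paragraph above.

class StretchedArray:
    def __init__(self):
        self.replication_mode = "async"
        self.bias = "site-A"

    def notify_migration_start(self, vm):
        # The array is told a migration is in flight and temporarily switches
        # replication for this VM's vVols from async to sync, so the
        # destination site is in sync at switchover time.
        print(f"{vm}: switching replication async -> sync")
        self.replication_mode = "sync"

    def notify_migration_complete(self, vm, destination_site):
        # The array is told the migration finished; it can now flip the "bias"
        # (the site that wins access during a failure scenario) to the VM's
        # new site and fall back to asynchronous replication.
        print(f"{vm}: bias {self.bias} -> {destination_site}, back to async")
        self.bias = destination_site
        self.replication_mode = "async"


array = StretchedArray()
array.notify_migration_start("vm01")
# ... vMotion copies memory and switches the VM over to site B ...
array.notify_migration_complete("vm01", "site-B")
```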

I can’t wait for the first vendor to announce support for this awesome feature!

vVols primer

Duncan Epping · Mar 9, 2015 ·

I was digging through my blog for a link to a vVols primer article and I realized I never wrote one. I did write an article back in 2012 describing what Virtual Volumes (VVol) is, but that is it. I am certain that Virtual Volumes is a feature that will be heavily used with vSphere 6.0 and beyond, so it was time to write a primer. What are vVols about? What will they bring to the table?

First and foremost, vVols was developed to make your life (as vSphere admin) and that of the storage administrator easier. This is done by providing a framework that enables the vSphere administrator to assign policies to virtual machines or virtual disks. In these policies, capabilities of the storage array can be defined. These capabilities can be things like snapshotting, deduplication, RAID level, thin/thick provisioning etc. What is offered to the vSphere administrator is up to the storage administrator, and of course up to what the storage system can offer to begin with. When a virtual machine is deployed and a policy is assigned, the storage system will enable certain functionality of the array based on what was specified in the policy. So there is no longer a need to assign capabilities to a LUN which holds many VMs; instead you get per-VM or even per-VMDK control. So how does this work? Well, let's take a look at an architectural diagram first.

[Diagram: Virtual Volumes architecture]

The diagram shows a couple of components which are important in the VVol architecture. Let's list them:

  • Protocol Endpoints aka PE
  • Virtual Datastore and a Storage Container
  • Vendor Provider / VASA
  • Policies
  • vVols

Let's take a look at each of these in the above order. Protocol Endpoints, what are they?

Protocol Endpoints are literally the access point to your storage system. All IO to vVols is proxied through a Protocol Endpoint, and you can have one or more of these per storage system, if your storage system supports having multiple of course. (Implementations will vary per vendor.) PEs are compatible with different protocols (FC, FCoE, iSCSI, NFS), and if you ask me, with vVols that whole protocol discussion will come to an end. You could see a Protocol Endpoint as a "mount point" or a device, and yes, they count towards your maximum number of devices per host (256). (Virtual Volumes themselves won't count towards that!)
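As a quick illustration of that device budget, here is a small Python snippet. Only the 256-device limit comes from the text above; the other numbers are made up for the example. The point is simply that PEs count against the limit while the vVols behind them do not.

```python
# Illustration of the per-host device count: Protocol Endpoints count
# against the (at the time) 256-device-per-host limit, individual vVols
# do not. All environment numbers below are invented for the example.

MAX_DEVICES_PER_HOST = 256

classic_luns = 40          # traditional VMFS/RDM devices presented to the host
protocol_endpoints = 4     # PEs exposed by the vVols-capable array(s)
virtual_volumes = 5000     # vVols behind those PEs -- these do NOT count

devices_in_use = classic_luns + protocol_endpoints
print(f"Devices in use: {devices_in_use} / {MAX_DEVICES_PER_HOST}")
print(f"vVols behind the PEs (not counted): {virtual_volumes}")
```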

Next up is the Storage Container. This is the place where you store your virtual machines, or better said, where your vVols end up. The Storage Container is a logical construct on the storage system and is represented within vSphere as a "virtual datastore". You need at least one per storage system, but you can have many when desired. To this Storage Container you can apply capabilities. So if you want your virtual volumes to be able to use array-based snapshots, the storage administrator will need to assign that capability to the storage container. Note that a storage administrator can grow a storage container without even informing you. A storage container isn't formatted with VMFS or anything like that, so you don't need to grow a volume or filesystem in order to use the added space.

But how does vSphere know which container is capable of doing what? In order to discover a storage container and its capabilities, we need to be able to talk to the storage system first. This is done through the vSphere APIs for Storage Awareness (VASA). You simply point vSphere to the Vendor Provider, and the Vendor Provider will report to vSphere what's available; this includes both the storage containers as well as the capabilities they possess. Note that a single Vendor Provider can manage multiple storage systems, which in turn can have multiple storage containers with many capabilities. These vendor providers also come in different flavours: for some storage systems it is part of the array software, but for others it comes as a virtual appliance that sits on top of vSphere.
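To give an idea of the kind of information a Vendor Provider reports back, here is a rough Python sketch. The container names and capability labels are invented, and a real Vendor Provider exposes this through the VASA API rather than a simple data structure; this only models the "who offers what" question that vSphere can answer after discovery.

```python
# Rough data model of what vSphere learns once it is pointed at a Vendor
# Provider (VASA). Names and capability labels are made up for the example.

vendor_provider = {
    "name": "array-vendor-vp01",
    "storage_systems": [
        {
            "name": "array01",
            "storage_containers": [
                {
                    "name": "gold-container",
                    "capabilities": ["snapshot", "dedupe", "qos", "replication"],
                },
                {
                    "name": "bronze-container",
                    "capabilities": ["thin-provisioning"],
                },
            ],
        }
    ],
}

# After discovery, vSphere can answer "which containers offer capability X?"
def containers_with(capability, provider):
    for system in provider["storage_systems"]:
        for container in system["storage_containers"]:
            if capability in container["capabilities"]:
                yield system["name"], container["name"]

print(list(containers_with("snapshot", vendor_provider)))
```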

Now that vSphere knows which systems there are and which containers are available with which capabilities, you can start creating policies. These policies are a combination of capabilities and will ultimately be assigned to virtual machines, or even to individual virtual disks. You can imagine that in some cases you would like Quality of Service enabled to ensure performance for a VM, while in other cases that isn't as relevant but you need to have a snapshot every hour. All of this is enabled through these policies. No longer will you be maintaining that spreadsheet with all your LUNs and which data services were enabled and what not; you simply assign a policy. (Yes, a proper naming scheme will be helpful when defining policies.) When requirements change for a VM you don't move the VM around; you change the policy, and the storage system will do what is required to make the VM (and its disks) compliant with the policy again. Not the VM really, but the vVols.
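As a rough sketch of this model: a policy is essentially a set of required capabilities, assigned per VM or even per VMDK, and compliance boils down to checking those requirements against what the backing container offers. All names and capability strings below are made up for illustration; real policy-based management involves far more than a set comparison.

```python
# Minimal sketch of policy-based management as described above. A policy is
# modelled as a set of required capabilities; changing the policy, rather
# than moving the VM, is what changes what the storage system must deliver.
# All names and capability strings are invented for the example.

gold_policy = {"snapshot-hourly", "qos", "replication"}
silver_policy = {"snapshot-daily"}

container_capabilities = {
    "gold-container": {"snapshot-hourly", "snapshot-daily", "qos", "replication"},
    "bronze-container": {"snapshot-daily", "thin-provisioning"},
}

vm = {"name": "db01", "container": "bronze-container", "policy": gold_policy}

def compliant(vm):
    offered = container_capabilities[vm["container"]]
    return vm["policy"] <= offered  # every required capability must be offered

print(compliant(vm))          # False: bronze cannot satisfy the gold policy
vm["policy"] = silver_policy  # requirements changed? change the policy...
print(compliant(vm))          # True: the storage system can now comply
```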

The great thing about vVols is the fact that you now have granular control over your workloads. Some storage systems will even allow you to assign IO profiles to your VM to ensure optimal performance. Also, when you delete a VM, the vVols are deleted and the space is automatically reclaimed by the storage system; no more fiddling with vmkfstools. Another great thing about virtual volumes is that even when you delete something within your VM, that space can also be reclaimed by the storage system, when your storage system supports T10 UNMAP that is.

That is, in short, how vVols work and what they bring. You as the vSphere administrator create policies and assign those to VMs, while the storage administrator manages capacity and capabilities. Easy, right?!

vVols and queueing

Duncan Epping · Feb 23, 2015 ·

I was reading an article last week by Ray Lucchesi on Virtual Volumes (vVols) and queueing. In that article (and the accompanying podcast) Ray and friends describe vVols and the benefits they bring, but also a potential danger. I have written about vVols before, and if you don't know what they are or do, I recommend reading those articles. I had been wondering as well how all of this works, as I also felt that there could easily be a bottleneck. I had some conversations over the last couple of weeks and figured I would share the outcome with you instead of just leaving a comment on Ray's blog. Let's look at an architectural diagram first:

In the diagram above (which I borrowed from the vSphere Storage blog, thanks Rolo) you see two important constructs which are part of the overall vVols architecture, namely the Storage Container (aka Virtual Datastore) and the Protocol Endpoint (PE). The Storage Container is where the vVols are stored. The IO, though, is proxied through the Protocol Endpoint. You can imagine that if we did not do this and exposed every single vVol directly to vSphere, you would have thousands of devices connected to vSphere, and as you know vSphere has a 256-device limit at the moment. That would never scale, and as such the Protocol Endpoint is used as the access point to a vVols-capable storage system.

Now think about a VMFS volume and look at the vVols architectural diagram again. Yes, there is a potential bottleneck indeed. However, what the diagram does not show is that you can have multiple Protocol Endpoints. Ray mentions the following in his post: "I am also not aware of any VASA 2.0 requirement that restricts the number of PEs for a storage system's support of a single vSphere cluster". And I can confirm that VMware did not limit the number of Protocol Endpoints in any shape or form. I read the specifications, and they literally state one PE at a minimum, and preferably more. Note that vendor implementations of vVols may differ: I have seen implementations that describe many PEs per storage system, but also implementations which have one PE per storage system. And in the case of one PE per storage system, can that be a bottleneck?

The queue depth of a Protocol Endpoint isn't limited to 32, like a regular LUN is when multiple VMs are contending for IO ("Disk.SchedNumReqOutstanding"), or to 64 (a typical device queue depth); it is set to 128 by default. This can be increased when required, however. Before you do, please consult your storage vendor, as there are a couple of variables that need to be taken into account, like the maximum device queue depth and the HBA maximum queue depth. (For NFS, queue depth is typically no concern.) So the potential constraint in the (uncommon) case of a single PE can be mitigated. What is important here is that vVols itself does not impose any constraints.
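To illustrate how these limits interact, here is a small back-of-the-envelope calculation in Python. The 32, 64 and 128 values come from the paragraph above; the HBA queue depth is just an example value, as it differs per adapter and driver.

```python
# Rough illustration of the queue depths mentioned above. The PE default
# (128), the Disk.SchedNumReqOutstanding value (32) and the typical LUN
# device queue depth (64) are taken from the post; the HBA queue depth is
# an example value only.

dsnro_regular_lun = 32         # Disk.SchedNumReqOutstanding, contended LUN
device_queue_regular_lun = 64  # typical device queue depth for a LUN
pe_default_queue_depth = 128   # default queue depth for a Protocol Endpoint

hba_queue_depth = 1024         # example adapter-level limit (varies per HBA)

# The effective number of outstanding IOs per device is bounded by the
# smallest applicable limit in the path.
effective_lun = min(dsnro_regular_lun, device_queue_regular_lun, hba_queue_depth)
effective_pe = min(pe_default_queue_depth, hba_queue_depth)

print(f"Effective per-LUN outstanding IOs (contended): {effective_lun}")
print(f"Effective per-PE outstanding IOs (default):    {effective_pe}")
```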

Also, note that some storage vendors have an implementation where the array can actually distinguish between regular IO and control/management-related IO. Regular IO in those cases isn't proxied through the PE, which means you will not fill up the queue of the PE. Pretty smart.

I am hoping that clears up some of the misunderstandings out there.

Re: Re: The Rack Endgame: A New Storage Architecture For the Data Center

Duncan Epping · Sep 5, 2014 ·

I was reading Frank Denneman's article with regards to new datacenter architectures. This in turn was a response to Stephen Foskett's article about how the physical architecture of datacenter hardware should change. I recommend reading both articles, as that will give a bit more background, plus they are excellent reads by themselves. (Gotta love these blogging debates.) Let's start with an excerpt from each article which summarizes the posts for those who don't want to read them in full.

Stephen:
Top-of-rack flash and bottom-of-rack disk makes a ton of sense in a world of virtualized, distributed storage. It fits with enterprise paradigms yet delivers real architectural change that could “move the needle” in a way that no centralized shared storage system ever will. SAN and NAS aren’t going away immediately, but this new storage architecture will be an attractive next-generation direction!

If you look at what Stephen describes, I think it is more or less in line with what Intel is working towards. The Intel Rack Scale Architecture aims to disaggregate traditional server components and then aggregate them by type of resource, backed by a super-performing and optimized rack fabric, enabled by the new photonic architecture Intel is currently working on. This is not long-term future; this is what Intel showcased last year and said would be available in 2015/2016.

Frank:
The hypervisor is rich with information, including a collection of tightly knit resource schedulers. It is the perfect place to introduce policy-based management engines. The hypervisor becomes a single control plane that manages both the resource as well as the demand. A single construct to automate instructions in a single language providing a correct Quality of Service model at application granularity levels. You can control resource demand and distribution from one single pane of management. No need to wait on the completion of the development cycles from each vendor.

There's a bit in Frank's article as well where he talks about Virtual Volumes and VAAI, how long it took for all storage vendors to adopt VAAI, and how he believes the same may apply to Virtual Volumes. Frank aims more towards the hypervisor being the aggregator, instead of doing it through changes in the physical space.

So what about Frank's arguments? Well, Frank has a point with regards to VAAI adoption and the fact that some vendors took a long time to implement it. The reality though is that Virtual Volumes is going full steam ahead. With many storage vendors demoing it at VMworld in San Francisco last week, I have the distinct feeling that things will be different this time. Maybe timing is part of it, as it seems that many customers are at a crossroads and want to optimize their datacenter operations/architecture by adopting the SDDC, of which policy-based storage management happens to be a big chunk.

I agree with Frank that the hypervisor is positioned perfectly to be that control plane. However, in order to be that control plane for the future, there needs to be a way to connect "things" to it which allows for far better scale and more flexibility. VMware, if you ask me, has done that for many parts of the datacenter, but one aspect that still needs to be overhauled for sure is storage. VAAI was a great start, but with VMFS there simply are too many constraints and it doesn't cater for granular controls.

I feel that the datacenter will need to change on both ends in order to take that next step in the evolution towards the SDDC. Intel Rack Scale Architecture will allow for far greater scale and efficiency than ever seen before. But it will only be successful when the layer that sits on top has the ability to take all of these disaggregated resources, turn them into large shared pools, and allow you to assign resources in a policy-driven (and programmable) manner. Not just assign resources, but also allow you to specify what the level of availability (HA, DR, but also QoS) should be for whatever consumes those resources. Granularity is important here, and of course it shouldn't stop with availability but apply to any other (data) service that one may require.

So where does what fit in? If you look at some of the initiatives that were revealed at VMworld, like Virtual Volumes, Virtual SAN and the vSphere APIs for IO Filters, you can see where the world is moving towards, fast. You can see how vSphere is truly becoming that control plane for all resources and how it will be able to provide you with end-to-end policy-driven management. In order to make all of this a reality, the current platform will need to change: changes that allow for more granularity/flexibility and higher scalability, and that is where all these (new) initiatives come into play. Some partners may take longer to adopt than others, especially those that require fundamental changes to the architecture of underlying platforms (storage systems for instance), but just like with VAAI I am certain that over time this will happen, as customers will drive this change by making decisions based on the availability of functionality.

Exciting times ahead if you ask me.

