
Yellow Bricks

by Duncan Epping


vvols

VVols design and procurement considerations

Duncan Epping · Feb 21, 2017 ·

Over the past couple of months I have had more and more discussions with customers and partners about VVols. It seems that Policy Based Management and the VVol granular capabilities are really starting to sink in, and more and more customers are starting to see the benefit of using vSphere as the management plane. The other option of course is pre-defining what is enabled on a datastore/LUN level and using spreadsheets and complex naming schemes to determine where a VM should land, which is far from optimal. I am not going to discuss the VVols basics at this point; if you need to know more about that, simply do a search on VVol.

When having these discussions a bunch of things typically come up, and these all have to do with design and procurement considerations when it comes to VVol capable storage. VMware provided a framework and an API, and based on this each vendor has developed their own implementation. These vary from vendor to vendor, as not all storage systems are created equal. So what do you have to think about when designing a VVols environment or when you are procuring new VVol capable storage? Below you will find a list of questions to ask, with a short explanation of why each may be important. I will try to add new questions and considerations as I come up with them.

  • What level of software is needed for my storage system to support VVol?

In many cases, especially with existing legacy storage systems, a software upgrade is needed to support VVols, so ask:

  • What does this upgrade entail?
  • What is the risk?

When it is clear what you need to support VVols from a software point of view, ask:

  • What are the constraints and limits?
    • How many Protocol Endpoints can I have per storage system?
      • Do you support all protocols? (FC, NFS, iSCSI etc)
      • Is the IO proxied via the Protocol Endpoint? If it is, is there an impact with a large number of VMs?
        • Some systems can make a distinction between traffic types, and normal IO will not go through the PE, which means you don’t hit any PE limitations (queue depth being one of them)
    • How many Storage Pools can you have per storage system?
      • In some cases (legacy storage systems) the storage pool equals an existing physical construct on the array; what is it and what is the impact of this?
        • What kind of options do I select during the creation of the pool? Anything you select on a per-pool level means that when you change a policy, VVols may have to migrate to other pools, and I prefer to avoid data movement. In some cases, for instance, “replication” is enabled on a storage pool level; I prefer to have this as a policy option
    • How many VVols can I have per storage system? (How many VMs do you have, and how many VVols do you expect to have per VM?)
      • In some cases, usually legacy storage systems, the number of VVols per array is limited. I have seen limits as “low” as 2000; with 3 VVols per VM at a minimum (typically 5) you can imagine this restricts the number of VMs you can run on a single storage system (see the quick calculation after this list)
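
To put such a limit in perspective, here is a minimal Python sketch of the arithmetic. The 2000 VVol limit and the per-VM counts are just the example numbers from above, not a statement about any specific array:

    # Rough estimate of how many VMs fit under a per-array VVol limit.
    # A VM needs at least a config, a swap and one data VVol (3), and
    # typically around 5 once extra VMDKs and snapshots are included.
    def max_vms(array_vvol_limit, vvols_per_vm):
        return array_vvol_limit // vvols_per_vm

    print(max_vms(2000, vvols_per_vm=3))  # best case: 666 VMs
    print(max_vms(2000, vvols_per_vm=5))  # typical case: 400 VMs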

And then there is the control / management plane:

  • How is the VASA (vSphere APIs for Storage Awareness) Provider implemented?
    • There are two options here, either it comes as part of the storage system or it is provided as a virtual machine.
  • Then as part of that there’s also the decision around the availability model of the VASA Provider:
    • Is it a single instance?
    • Active/Standby?
    • Active/Active?
    • Scale-out?

Note that, as it stands today, in order to power on or create a VM the VASA Provider needs to be available. Hence the availability model is probably of importance, depending on the type of environment you are designing. Also, some prefer to avoid having it implemented on the storage system, as any update means touching the storage system. Others prefer to have it as part of the storage system, as it removes the need for a separate VM that needs to be managed and maintained.

Last but not least, policy capabilities (a small sketch of the placement idea follows the list):

  • What is exposed through policy?
    • Availability? (RAID type / number of copies of object)
    •  QoS?
      • Reservations
      • Limits
    • Replication?
    • Snapshot (scheduling)?
    • Encryption?
    • Application type?
    • Thin provisioning?
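
To make the policy angle a bit more tangible, below is a minimal Python sketch of the idea behind policy based placement. The capability names and data structures are purely illustrative (this is not a VASA or SPBM API): a policy lists required capabilities, a storage container advertises what it can deliver, and only containers that satisfy every requirement are compatible.

    # Illustrative only: policy based placement as a simple capability match.
    policy = {"availability": "raid-5", "replication": True, "encryption": True}

    containers = {
        "gold-pool":   {"availability": "raid-5", "replication": True, "encryption": True},
        "bronze-pool": {"availability": "raid-6", "replication": False, "encryption": False},
    }

    def compatible(required, advertised):
        # every required capability must be advertised with the same value
        return all(advertised.get(key) == value for key, value in required.items())

    print([name for name, caps in containers.items() if compatible(policy, caps)])
    # ['gold-pool']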

I hope this helps you have the conversation with your storage vendor, develop your design, or guide the discussion during the procurement process. If anyone has additional considerations, please leave a comment so I can add them to the list where applicable.

Virtually Speaking Podcast episode 32 – VVol 2.0

Duncan Epping · Dec 6, 2016 ·

Just wanted to share the Virtually Speaking Podcast with you; this episode (32) is on the topic of VVol 2.0 and features Pete Flecha, Ben Meadowcroft (PM for VVol) and me. Make sure to listen to it; it has some good info on where VVol is today and where it may be going in the near future!

Where do I run my VASA Vendor Provider for vVols?

Duncan Epping · Jan 6, 2016 ·

I was talking to someone before the start of the holiday season about running the Vendor Provider (VP) for vVols as a VM and what the best practices are around that. I was thinking about the implications of the VP not being available and came to the conclusion that when the VP is unavailable, a bunch of things stop working, of which “bind” is probably the most important.

The “bind” operation is what allows vSphere to access a given Virtual Volume (vVol), and this operation is issued during a power-on of a VM. This is how the vVols FAQ describes it:

When a vVol is created, it is not immediately accessible for IO. To Access vVol, vSphere needs to issue a “Bind” operation to a VASA Provider (VP), which creates IO access point for a vVol on a Protocol Endpoint (PE) chosen by a VP. A single PE can be the IO access point for multiple vVols. “Unbind” Operation will remove this IO access point for a given vVol.

This means that when the VP is unavailable, you can’t power on VMs at that particular time. For many storage systems that problem is mitigated by having the VP as part of the storage system itself, and of course there is the option to have multiple VPs as part of your solution, either in an active/active or in an active/standby configuration. In the case of VSAN for instance, each host has a VASA Provider, of which one is active and the others are standby; if the active one fails, a standby will take over automatically. So to be clear, it is up to the vendor to decide what type of availability to provide for the VP: some have decided to go for a single instance and rely on vSphere HA to restart the appliance, others have created active/standby configurations, etc.
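
To make the dependency explicit, here is a small Python sketch (not a real API, the class and method names are made up for illustration): power-on requires a successful “bind” for each vVol, and the bind goes to the VASA Provider, which returns the IO access point on a Protocol Endpoint. If the VP cannot be reached, the bind, and therefore the power-on, fails.

    # Concept sketch only: power-on depends on binds served by the VASA Provider.
    class VasaProviderUnavailable(Exception):
        pass

    class VasaProvider:
        def __init__(self, reachable=True):
            self.reachable = reachable

        def bind(self, vvol):
            if not self.reachable:
                raise VasaProviderUnavailable("cannot bind " + vvol)
            return ("PE-1", vvol)  # IO access point: PE chosen by the VP for this vVol

    def power_on_vm(vvols, vp):
        # every vVol of the VM (config, swap, data) needs a bind before IO can start
        return [vp.bind(v) for v in vvols]

    print(power_on_vm(["config", "swap", "data-1"], VasaProvider(reachable=True)))
    # with VasaProvider(reachable=False) the bind raises and the power-on fails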

But back to vVols, what if you own a storage system that requires an external VASA VP as a VM?

  • Run your VP VMs in a management cluster; if the hosts in the “production” cluster are impacted and VMs are restarted, then at least the VP VMs should be up and running in your management cluster
  • Use multiple VP VMs if and when possible; if active/active or active/standby is supported, make sure to run your VPs in that configuration
  • Do not use vVols for the VP itself; you don’t want any (circular) dependency between the availability of the VP and being able to power on the VP itself
  • If there is no availability story for the VP, then depending on the configuration of the appliance, vSphere FT should be considered.

One more thing: if you are considering buying new storage, I think one question you definitely need to ask your vendor is what their story is around the VP. Is it a VM or is it part of the storage system itself? Is there an availability story for the VP, and if so, is this “active/active” or “active/standby”? If not, what do they have on their roadmap around this? You are probably also asking yourself what VMware has planned to solve this problem; well, there are a couple of things cooking and I can’t say too much about it. One important effort though is the inclusion of bind/unbind in the T10 SCSI standard, which would allow us to power on new VMs even when the VP is unavailable as it would be a SCSI command, but as you can imagine, those things take time. Until then, when you design a vVol environment, take the above into account when it comes to your Vendor Provider aka VP!

vVols and queueing

Duncan Epping · Feb 23, 2015 ·

I was reading an article last week by Ray Lucchesi on Virtual Volumes (vVols) and queueing. In that article (and the accompanying podcast) Ray and friends describe vVols and the benefits they bring, but also a potential danger. I have written about vVols before, and if you don’t know what it is or does then I recommend reading those articles. I have been wondering as well how all of this works, as I also felt that there could easily be a bottleneck. I had some conversations over the last couple of weeks and figured I would share the outcome with you instead of just leaving a comment on Ray’s blog. Let’s look at an architectural diagram first:

In the diagram above (which I borrowed from the vSphere Storage blog, thanks Rolo) you see two important constructs which are part of the overall vVols architecture, namely the Storage Container aka Virtual Datastore and the Protocol Endpoint (PE). The Storage Container is where the vVols are stored. The IO though is proxied through the Protocol Endpoint. You can imagine that if we did not do this and exposed every single vVol directly to vSphere, you would have thousands of devices connected to vSphere, and as you know vSphere has a 256 device limit at the moment. This would never scale, and as such the Protocol Endpoint is used as an access point to a vVols capable storage system.

Now think about a VMFS volume and look at the vVols architectural diagram again. Yes, there is a potential bottleneck indeed. However, what the diagram does not show is that you can have multiple Protocol Endpoints. Ray mentions the following in his post: “I am also not aware of any VASA 2.0 requirement that restricts the number of PEs for a storage system’s support of a single vSphere cluster”. And I can confirm that VMware did not limit the number of Protocol Endpoints in any shape or form. I read the specifications, and they literally state 1 PE at a minimum and preferably more. Note that vendor implementations of vVols may differ; I have seen implementations that describe many PEs per storage system, but also implementations which have 1 PE per storage system. And in the case of 1 PE per storage system, can that be a bottleneck?

The queue depth of a Protocol Endpoint isn’t limited to 32, like a regular LUN when multiple VMs are contending for IO (“Disk.SchedNumReqOutstanding”), or 64 (a typical device queue depth), but is set to 128 by default. This can be increased when required; before you do, please consult your storage vendor, as there are a couple of variables that need to be taken into account, like the max device queue depth and the HBA max queue depth. (For NFS, queue depth is typically no concern.) So the potential constraint when there is only a single PE (which is uncommon) can be mitigated. What is important here is that vVols itself does not impose any constraints.
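
As a back-of-the-envelope illustration, the sketch below simply multiplies devices by their default queue depth, using the numbers mentioned above (32 per LUN under contention, 128 per PE). Real limits also depend on HBA and device queue depths, so treat these purely as illustrative figures.

    # Aggregate outstanding IOs = number of devices * per-device queue depth.
    def outstanding_io(devices, per_device_queue_depth):
        return devices * per_device_queue_depth

    print(outstanding_io(devices=8, per_device_queue_depth=32))   # 8 VMFS LUNs -> 256
    print(outstanding_io(devices=1, per_device_queue_depth=128))  # 1 PE        -> 128
    print(outstanding_io(devices=4, per_device_queue_depth=128))  # 4 PEs       -> 512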

Also, note that some storage vendors have an implementation where the array can actually make the distinction between regular IO and control/management related IO. Regular IO in those cases isn’t proxied through the PE, which means you will not fill up the queue of the PE. Pretty smart.

I am hoping that clears up some of the misunderstandings out there.

VMware Storage APIs for VM and Application Granular Data Management

Duncan Epping · Aug 7, 2012 ·

Last year at VMworld there was a lot of talk about the VMware vStorage APIs for VM and Application Granular Data Management, aka Virtual Volumes aka VVOL / VVOLs. The video of this session was just posted on YouTube and I am guessing people will have questions about it after watching it. What I wanted to do in this article is explain what VMware is trying to solve and how VMware intends to solve it. I tried to keep this article as close to the information provided during the session as possible. Note that this session was a Technology Preview; in no way, shape or form did VMware commit to ever delivering this solution, let alone mention a timeline. Before we go any further… if you want to hear more and are attending VMworld, sign up for this session by Vijay Ramachandra and Tom Phelan!

INF-STO2223 – Tech Preview: vSphere Integration with Existing Storage Infrastructure

Background

The storage integration effort started with the vSphere Storage APIs for Array Integration, also known as VAAI. VAAI was aimed at offloading data operations to the array to reduce the load and overhead on the hypervisor, but more importantly to allow for greater scale and better performance. In vSphere 5.0 the vSphere Storage APIs for Storage Awareness (aka VASA) were introduced, which allowed for an out-of-band communications channel to discover storage characteristics. For those who are interested in VASA, I would like to recommend reading Cormac’s excellent article where he explains what it is and shows how VMware partners have implemented it.

Although these APIs have bridged a huge gap they do not solve all of the problems customers are facing today.

What is VMware trying to solve?

In general, VMware is trying to increase the agility and flexibility of the VMware storage stack by providing a general framework in which any current and future data operations can be implemented with minimal effort for both VMware and partners. Customers have asked for a solution which allows them to differentiate services to their customers on a per-application level. Currently, when provisioning LUNs, typically large LUNs, this is impossible.

Another area of improvement is granularity. For instance, it is desirable to have per-VM failover, or to allow deduplication on a per-VMDK level. This is currently impossible with VMFS. A VMFS volume is usually a LUN, and data management happens at a LUN / volume granularity. In other words, a LUN is the level at which you operate from a storage perspective, but it is shared by many VMDKs or VMs which might have different requirements.

As mentioned in last year’s VMworld presentation, the current wish list is:

  1. Ability to offload to storage system on a per VMDK level
  2. Snapshots / cloning / replication / deduplication etc
  3. A framework where any current or future storage system operation can be leveraged
  4. No disruption to the existing VM creation workflows
  5. Highly scalable

These should maximize your ROI on hardware investments and reduce the operational effort associated with storage management and virtual machine deployment. It will also allow you to enforce application level SLAs by specifying policies on a VMDK or VM level instead of a datastore level. The granularity that this will allow for is, in my opinion, the most important part here!

How does VMware intend to solve it?

During the research phase many different options were looked at. Many of these, however, did not take full advantage of the capabilities of the storage system, and they introduced more complexity around data layout. The best way of solving this problem is leveraging well-known objects… volumes / LUNs.

These objects are referred to as VM Volumes, but are also sometimes referred to as vVOLs. A VM Volume is a VMDK (or its derivative) stored natively inside a storage system. Each VM Volume will have a representative on the storage system. By creating a volume for each VMDK you can set policies at the lowest possible level. Not only that, the SAN vs NAS debate is over. This however does imply that when every VMDK is a storage object, there could be thousands of VM Volumes. Will this require a complete redesign of storage systems to allow for this kind of scalability? Just think about the current 256 LUNs per host limit for instance. Will this limit the number of VMs per host/cluster?

In order to solve this potential problem a new concept is introduced, called an “IO De-multiplexer” or “IO Demux”. This is one single device which exists on a storage system and represents a logical I/O channel from the ESXi hosts to the entire storage system. Multi-pathing and path policies will be defined on a per IO Demux basis, which typically would be done once. Behind this IO Demux device there could be thousands of VM Volumes.

This however introduces a new challenge. Where in the past the storage administrator was in control, now the VM administrator could possibly create hundreds of large disks without discussing it with the storage admin. To solve this problem a new concept called Capacity Pools is introduced. A Capacity Pool is an allocation of physical storage space and a set of allowed services for any part of that storage space. Services could be replication, cloning, backup etc. This would be allowed until the set threshold is exceeded. It is the intention to allow Capacity Pools to span multiple storage systems and even physical sites.

In order to allow setting specific QoS parameters, another new concept is introduced called Profiles. Profiles are a set of QoS parameters (performance and data services) which apply to a VM Volume, or even a Capacity Pool. The storage administrator can create these profiles and assign them to Capacity Pools, which allows the tenant of a pool to assign these policies to their VM Volumes.
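
Purely to make these concepts concrete, here is a small Python sketch of how a Capacity Pool (space, allowed services, threshold) and a Profile could relate to a VM Volume. The names and structure are mine, not anything from the tech preview:

    # Concept sketch only: a Capacity Pool is an allocation of space plus a set of
    # allowed services; a Profile is applied to a VM Volume carved from that pool.
    class CapacityPool:
        def __init__(self, capacity_gb, allowed_services):
            self.capacity_gb = capacity_gb
            self.allowed_services = set(allowed_services)
            self.used_gb = 0

        def create_vm_volume(self, size_gb, profile):
            if self.used_gb + size_gb > self.capacity_gb:
                raise RuntimeError("capacity threshold exceeded")
            if not set(profile["services"]) <= self.allowed_services:
                raise RuntimeError("profile requests a service this pool does not allow")
            self.used_gb += size_gb
            return {"size_gb": size_gb, "profile": profile}

    pool = CapacityPool(capacity_gb=1024, allowed_services={"replication", "cloning"})
    gold = {"services": ["replication"], "iops_limit": 2000}
    vmdk = pool.create_vm_volume(size_gb=100, profile=gold)
    print(vmdk, pool.used_gb)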

As you can imagine this shifts responsibilities between teams within the organization; however, it will allow for greater granularity, scale, flexibility and, most importantly, business agility.

Summarizing

Many customers have found it difficult to manage storage in virtualized environments. VMFS volumes typically contain dozens of virtual machines and VMDKs, making differentiation on a per-application level very difficult. VM Volumes will allow for more granular data management by leveraging the strength of a storage system: the volume manager. VM Volumes will simplify data and virtual infrastructure management by shifting responsibilities between teams and removing multiple layers of complexity.

