
Yellow Bricks

by Duncan Epping



VMware Storage APIs for VM and Application Granular Data Management

Duncan Epping · Aug 7, 2012 ·

Last year at VMworld there was a lot of talk about the VMware vStorage APIs for VM and Application Granular Data Management, also known as Virtual Volumes, aka VVOL / VVOLs. The video of this session was just posted on YouTube, and I am guessing people will have questions after watching it. What I want to do in this article is explain what VMware is trying to solve and how VMware intends to solve it. I have tried to keep this article as close to the information provided during the session as possible. Note that this session was a Technology Preview; in no shape or form did VMware commit to ever delivering this solution, let alone mention a timeline. Before we go any further… if you want to hear more and are attending VMworld, sign up for this session by Vijay Ramachandra and Tom Phelan!

INF-STO2223 – Tech Preview: vSphere Integration with Existing Storage Infrastructure

Background

The storage integration effort started with the vSphere API for Array Integration, also known as VAAI. VAAI was aimed at offloading data operations to the array to reduce the load and overhead on the hypervisor, but more importantly to allow for greater scale and better performance. vSphere 5.0 introduced the vSphere Storage APIs for Storage Awareness (aka VASA), which provide an out-of-band communications channel to discover storage characteristics. For those who are interested in VASA, I would like to recommend reading Cormac’s excellent article, where he explains what it is and shows how VMware partners have implemented it.

Although these APIs have bridged a huge gap, they do not solve all of the problems customers are facing today.

What is VMware trying to solve?

In general, VMware is trying to increase the agility and flexibility of the VMware storage stack by providing a general framework in which any current and future data operations can be implemented with minimal effort for both VMware and its partners. Customers have asked for a solution that allows them to differentiate services on a per-application level. With today's typical provisioning model, where many VMs share one (typically large) LUN, this is impossible.

Another area for improvement is granularity. For instance, it would be desirable to have per-VM failover, or to allow deduplication on a per-VMDK level. This is currently impossible with VMFS. A VMFS volume is usually a single LUN, and data management happens at LUN / volume granularity. In other words, the LUN is the level at which you operate from a storage perspective, but it is shared by many VMDKs or VMs which might have different requirements.
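To make the granularity problem concrete, here is a minimal Python sketch (all names and policies are hypothetical, purely for illustration): because data services apply to the LUN, every VMDK on that LUN inherits the same policy, whether it wants it or not.

```python
# Hypothetical sketch: the array only knows about the LUN, so every
# VMDK on it gets the LUN's policy regardless of its own requirements.

lun_policy = {"lun-01": {"replication": True, "dedup": False}}

# Three VMDKs from different VMs share lun-01 but have different needs.
vmdk_requirements = {
    "db-vm/data.vmdk":  {"replication": True,  "dedup": False},
    "web-vm/os.vmdk":   {"replication": False, "dedup": True},
    "test-vm/tmp.vmdk": {"replication": False, "dedup": False},
}

def effective_policy(vmdk, lun="lun-01"):
    # Policy is resolved per LUN, not per VMDK.
    return lun_policy[lun]

mismatched = [v for v, want in vmdk_requirements.items()
              if effective_policy(v) != want]
print(mismatched)  # two of the three VMDKs get a policy they did not ask for
```

Only the database VMDK happens to match the LUN-wide policy; the other two are stuck with services they do not need, which is exactly the mismatch VM Volumes aim to remove.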

As mentioned in last year's VMworld presentation, the current wish list is:

  1. Ability to offload to storage system on a per VMDK level
  2. Snapshots / cloning / replication / deduplication etc
  3. A framework where any current or future storage system operation can be leveraged
  4. No disruption to the existing VM creation workflows
  5. Highly scalable

These five items should maximize the ROI on your hardware investment and reduce the operational effort associated with storage management and virtual machine deployment. They will also allow you to enforce application-level SLAs by specifying policies on a per-VMDK or per-VM level instead of at the datastore level. The granularity this allows for is, in my opinion, the most important part!

How does VMware intend to solve it?

During the research phase many different options were looked at. Many of these, however, did not take full advantage of the capabilities of the storage system, and they introduced more complexity around data layout. The best way of solving this problem is to leverage well-known objects: volumes / LUNs.

These objects are referred to as VM Volumes, but also sometimes as vVOLs. A VM Volume is a VMDK (or a derivative of it) stored natively inside a storage system. Each VM Volume will have a representative object on the storage system. By creating a volume for each VMDK you can set policies at the lowest possible level. Not only that, the SAN vs NAS debate is over. This does imply, however, that when every VMDK is a storage object there could be thousands of VM Volumes. Will this require a complete redesign of storage systems to allow for that kind of scalability? Just think about the current 256-LUNs-per-host limit, for instance. Will this limit the number of VMs per host or cluster?

In order to solve this potential problem a new concept is introduced, called the "IO De-multiplexer" or "IO Demux". This is a single device which exists on a storage system and represents a logical I/O channel from the ESXi hosts to the entire storage system. Multipathing and path policies are defined on a per-IO-Demux basis, which typically would only need to be done once. Behind this IO Demux device there could be thousands of VM Volumes.
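A rough Python sketch of the idea (class and method names are my own, not from the session): the host addresses one logical device per array, and the per-volume fan-out happens behind it, so the 256-LUNs-per-host limit no longer caps the number of VM Volumes.

```python
# Hypothetical sketch of the IO Demux concept: the ESXi host sees a single
# logical device per storage system; the array routes each I/O to the
# correct VM Volume behind that device.

class IODemux:
    def __init__(self, array_name):
        self.array_name = array_name
        self.vm_volumes = {}  # volume_id -> backing object on the array

    def register_volume(self, volume_id):
        self.vm_volumes[volume_id] = f"{self.array_name}:{volume_id}"

    def route_io(self, volume_id, op):
        # One I/O channel from the host, demultiplexed per VM Volume.
        return (self.vm_volumes[volume_id], op)

demux = IODemux("array-01")
for i in range(5000):              # thousands of VM Volumes...
    demux.register_volume(f"vvol-{i}")

# ...yet the host consumes only a single device slot, not 5000 LUNs.
host_visible_devices = 1
print(len(demux.vm_volumes), host_visible_devices)
```

The point of the design choice: multipathing state scales with the number of demux devices (typically one per array), not with the number of VM Volumes.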

This does, however, introduce a new challenge. Where in the past the storage administrator was in control, now the VM administrator could possibly create hundreds of large disks without discussing it with the storage admin. To solve this problem another new concept, called Capacity Pools, is introduced. A Capacity Pool is an allocation of physical storage space plus a set of allowed services for any part of that space. Services could be replication, cloning, backup, and so on. Provisioning would be allowed until the set threshold is exceeded. The intention is to allow Capacity Pools to span multiple storage systems and even physical sites.

In order to allow setting specific QoS parameters, yet another new concept is introduced, called Profiles. A Profile is a set of QoS parameters (performance and data services) which apply to a VM Volume, or even to a Capacity Pool. The storage administrator creates these profiles and assigns them to Capacity Pools, which allows the tenant of a pool to assign these policies to his VM Volumes.
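As a sketch of how these two concepts fit together (class and field names are hypothetical, not from the session): a Capacity Pool pairs a space quota with a set of permitted services, and a VM administrator can self-provision VM Volumes only within what the storage administrator granted.

```python
# Hypothetical sketch of Capacity Pools and Profiles as described in the session.

class CapacityPool:
    def __init__(self, capacity_gb, allowed_services):
        self.capacity_gb = capacity_gb
        self.allowed_services = set(allowed_services)
        self.used_gb = 0

    def provision(self, size_gb, profile):
        # The VM admin can provision without asking the storage admin,
        # but only within the quota and services granted to this pool.
        if self.used_gb + size_gb > self.capacity_gb:
            raise ValueError("capacity pool threshold exceeded")
        if not set(profile) <= self.allowed_services:
            raise ValueError("profile requests a service this pool does not allow")
        self.used_gb += size_gb
        return {"size_gb": size_gb, "profile": set(profile)}

pool = CapacityPool(1000, {"replication", "cloning", "backup"})
vol = pool.provision(200, {"replication", "backup"})  # within quota and services
try:
    pool.provision(100, {"deduplication"})            # service not granted to pool
except ValueError as e:
    print(e)
```

This captures the shift in responsibilities: the storage admin sets the boundaries (quota plus allowed services) once, and per-volume policy decisions move to the VM admin.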

As you can imagine this shifts responsibilities between teams within the organization; however, it will allow for greater granularity, scale, flexibility and, most importantly, business agility.

Summarizing

Many customers have found it difficult to manage storage in virtualized environments. VMFS volumes typically contain dozens of virtual machines and VMDKs, making differentiation on a per-application level very difficult. VM Volumes will allow for more granular data management by leveraging the strength of the storage system: its volume manager. VM Volumes will simplify data and virtual infrastructure management by shifting responsibilities between teams and removing multiple layers of complexity.

NetApp is now officially vMSC certified

Duncan Epping · Jul 27, 2012 ·

As I have had many people asking about this over the last couple of months, I figured I would share it. I just noticed that NetApp is now finally officially vSphere Metro Storage Cluster certified (see the SAN HCL). NetApp has certified their platform for the following array types:

  • NFS
  • iSCSI

Yes indeed, FC is currently not listed… but for me the great news is that NFS is! A KB article has been published with all the details; make sure to read it if you are looking to deploy a stretched cluster with NetApp and vSphere 5.0.

Deploying Hadoop with Serengeti…

Duncan Epping · Jul 26, 2012 ·

I read Richard McDougall’s blog post about Project Serengeti, in which Richard describes how you can deploy a Hadoop cluster in literally 10 minutes using Serengeti. I am not a Hadoop expert, so I am probably the best qualified to test the 10-minute claim.

First of all, download the OVA. I would also suggest downloading the user guide; I needed it for the username / password to log in to the Serengeti VM (which is: serengeti / password). So what do you need to do to deploy a Hadoop cluster in your vSphere environment? This is what I did:

  • Import the OVA
    • I decided to provide a static IP instead of using DHCP, as I don’t have DHCP in my datacenter
  • Upgrade VMware Tools
    • Just right-click the VM and upgrade the tools automatically; works like a charm. Note that this is not a requirement!
  • Login to the console
    • ssh serengeti@10.27.51.21
    • username/password: serengeti/password
  • Go to the Serengeti CLI by typing
    • serengeti
  • I don’t run DHCP, so in order for the Hadoop nodes to get an IP address I needed to tell Serengeti which IPs to use. I had to remove the default network first and create a new one
    • network list
    • network delete --name defaultNetwork
    • network add --name defaultNetwork --portGroup "VM Network" --ip 10.27.51.165-200 --dns 10.27.51.122 --gateway 10.27.51.254 --mask 255.255.255.0
  • Create the Hadoop cluster by running the following command
    • cluster create --name myHadoop
  • Now you will see a whole set of new virtual machines being created, and when that completes your Hadoop cluster is ready

How long did that take me? Indeed, ~10 minutes… I tested whether the cluster worked as follows: first SSH into the Hadoop client node, then do the following:

  • cd /usr/local/share/pig-0.9.2/test/e2e/pig/lib/
  • hadoop jar hadoop-examples.jar teragen 1000000000 tera-data

Now you should see the worker nodes in vSphere ramping up to 100% CPU and memory utilization. It worked for me… So why did I deploy it? Just to see if Richard was right and it would only take me 10 minutes? That wasn’t the reason, of course. I wanted to see how it deployed and how it leveraged vSphere components. Note that this is not a GA release yet; it is version 0.5 and is probably still under heavy development. It is exciting to see which direction we are heading in, though, and I am looking forward to the integration points with vSphere and vCloud Director (eventually).

One of the integration points that had my interest was the HA part mentioned in Richard’s post. Hortonworks was responsible for that, and apparently they plug in to vSphere HA’s VM and Application Monitoring. I haven’t been able to test it yet, but when I do I will update you on this. If you get to the point of testing this yourself, please note (and this is not well documented) that you will have to enable “VM and Application Monitoring” explicitly, as it is not enabled by default on a vSphere HA cluster.

  • Right-click your cluster
  • Click “Edit Settings”
  • Click the “VM Monitoring” tab and make sure “VM and Application Monitoring” is turned on
  • Also check the settings for the individual VMs; they will need to be set to “include” where applicable

That is it for now… again I am by no means a Hadoop expert and am not going to try to pretend, just exploring and broadening my horizon.

VMworld session added: INF-VSP1168 Architecting a Cloud Infrastructure

Duncan Epping · Jul 24, 2012 ·

I was just informed that I have been added to the panel session “INF-VSP1168 – Architecting a Cloud Infrastructure” as a speaker. This session will be moderated by Chris Colotti and will have Aidan Dalgleish, David Hill and me as panel members. The session is currently scheduled for Monday, Aug 27, 10:30 AM – 11:30 AM. Almost 500 have signed up so far! So if you want to be part of it, I would suggest signing up asap, as I am guessing it will sell out!

This session will discuss the various design considerations when architecting the foundation of every solid cloud environment: vSphere 5. We will start with sizing and scaling, and provide tips and deep technical facts throughout the session. Different examples will be used to show the impact design considerations can have on the availability of your services.

David and I ran this session at Partner Exchange with help from Frank Denneman and Chris Colotti. We received a lot of great feedback and we are hoping it will be no different at VMworld. See you on Monday at 10:30!

Understanding VXLAN and the value prop in just 4 minutes…

Duncan Epping · Jul 23, 2012 ·

I already shared this video through Twitter, but I love it so much I figured I would blog it as well. In this video, VXLAN is explained in clear, understandable language in just four minutes. We need more videos like these: fast and easy to digest!
