Yesterday I was at a Software Defined Datacenter event organized by IBM and VMware. The famous Cormac Hogan presented on Software Defined Storage and I very much enjoyed hearing about the VMware vision and of course Cormac’s take on this. Coincidentally, last week I read this article by long-time community guru Jason Boche on VAAI and the number of VMs, and after a discussion with a customer yesterday (at the event) about their operational procedures for provisioning new workloads, I figured it was time to write down my thoughts.
I have seen many different definitions so far for Software Defined Storage and I guess there is a grain of truth in all of them. Before I explain what it means to me, let me describe the challenges people commonly face today.
In a lot of environments, managing storage and associated workloads is a tedious task. It is not uncommon to see large spreadsheets with a long list of LUNs, IDs, capabilities, groupings and whatever else is relevant to them and their workloads. These spreadsheets are typically used to decide where to place a virtual machine or virtual disk. Based on the requirements of the application, a specific destination will be selected. On top of that, a selection will need to be made based on the currently available disk space of a datastore and, of course, the current I/O load. You do not want to randomly place your virtual machine and find out two days later that you are running out of disk space… Well, that is if you have a relatively mature provisioning process. Of course it is also not uncommon to just pick a random datastore and hope for the best.
To be honest, I can understand why many people randomly provision virtual machines. Keeping track of virtual disks, datastores, performance, disk space and other characteristics… it is simply too much work, and boring work at that. Didn’t we invent computer systems to do these repeatable, boring tasks for us? That leads us to the question: where and how should Software Defined Storage help you?
A common theme recurring in many “Software Defined” solutions presented by VMware is:
Abstract, Pool, Automate.
This also applies to Software Defined Storage in my opinion. These are three basic requirements that a Software Defined Storage solution should meet. But what does this mean and how does it help you? Let me try to make some sense out of that nice three-word marketing slogan:
Software Defined Storage should enable you to provision workloads to a pool of virtualized physical resources based on service level agreements (defined in a policy) in an automated fashion.
I understand that is a mouthful, so let’s elaborate a bit more. Think about the challenges I described above… or what Jason described with regards to “VMs per Volume” and how various different components can impact your service level. A Software Defined Storage (SDS) solution should be able to intelligently place virtual disks (virtual machines / vApps) based on the policy selected for the object (virtual disk / machine / appliance). These policies typically contain characteristics of the provided service level. On top of that, a Software Defined Storage solution should take risks / constraints into account. Meaning that you don’t want your workload to be deployed to a volume which is running out of disk space, for instance.
What about those characteristics, what are they? Characteristics could be anything; here are two simple examples to make it a bit more obvious, with a short sketch after the list:
- Does your application require recoverability after a disaster? –> SDS selects a destination which is replicated, or instructs the storage system to create a replicated object for the VM
- Does your application require a certain level of performance? –> SDS selects destination that can provide this performance, or instructs storage system to reserve storage resources for the VM
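To make that placement logic a little more concrete, here is a minimal sketch in Python. All of the datastore names, capability keys and thresholds are made up purely for illustration; this is not any particular product’s API, just the kind of decision a policy-driven placement engine could make:

```python
# Minimal, hypothetical sketch of policy-driven placement.
# Datastore capabilities, policy keys and thresholds are made up for illustration.

POLICY_GOLD = {"replicated": True, "max_latency_ms": 5}

datastores = [
    {"name": "ds-01", "replicated": True,  "latency_ms": 3, "free_gb": 150},
    {"name": "ds-02", "replicated": False, "latency_ms": 2, "free_gb": 900},
    {"name": "ds-03", "replicated": True,  "latency_ms": 4, "free_gb": 40},
]

def place(vm_size_gb, policy, candidates):
    """Pick a datastore that satisfies the policy and still has headroom."""
    for ds in sorted(candidates, key=lambda d: d["free_gb"], reverse=True):
        meets_policy = (ds["replicated"] == policy["replicated"]
                        and ds["latency_ms"] <= policy["max_latency_ms"])
        # Constraint check: never deploy to a volume that is about to fill up.
        has_headroom = ds["free_gb"] - vm_size_gb > 20
        if meets_policy and has_headroom:
            return ds["name"]
    return None  # nothing meets the SLA -> surface this instead of guessing

print(place(vm_size_gb=60, policy=POLICY_GOLD, candidates=datastores))  # ds-01
```

The point is not the code itself but the shift it represents: the administrator expresses requirements once, in a policy, and the placement decision is made for every new workload automatically.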
Now this all sounds a bit vague, but I am purposely trying to avoid using product or feature names. Software Defined Storage is not about a particular feature, product or storage system. Although I dropped the word policy, note that enabling Profile Driven Storage within vCenter Server does not provide you with a Software Defined Storage solution. It shouldn’t matter either (to a certain extent) whether you are using EMC, NetApp, Nimbus, a VMware software solution or any of the other thousands of different storage systems out there. Any of those systems, or even a combination of them, should work in the software defined world. To be clear, in my opinion (today) there isn’t such a thing as a Software Defined Storage product; it is a strategy. It is a way of operating that particular part of your datacenter.
To be fair, there is a huge difference between various solutions. There are products and features out there that will enable you to build a solution like this and transform the way you manage your storage and provision new workloads. Products and features that will allow you to create a flexible offering. VMware has been and is working hard to be a part of this space, vSphere Replication / Storage DRS / Storage IO Control / Virsto / Profile Driven Storage are part of the “now”, but just the beginning… Virtual Volumes, Virtual Flash and Distributed Storage have all been previewed at VMworld and are potentially what is next. Who knows what else is in the pipeline or what other vendors are working on.
If you ask me, there are exciting times ahead. Software Defined Storage is a big part of the Software Defined Data Center story and you can bet this will change datacenter architecture and operations.
** There are two excellent articles on this topic: the first by Bill Earl, and the second by Christos Karamanolis. Make sure to read their perspectives. **
Well said, Duncan. I agree it’s an exciting time to be in this industry (again), and especially interesting for us storage folks.
I can’t resist giving a +1 on this one. Well said, Duncan.
Everyone out there right now that claims to be SDS is in fact still somewhere (stuck) in Storage Virtualisation. SDS is the next level of virtualised storage. And indeed, as you mentioned, the ones that use the protocols/APIs handed to them by VMware (vVols, Distributed Storage, …) will be able to call themselves Software Defined Storage.
Bold statement: VMware will be the Software Defined Storage; the vendors/products are either going to be “compatible” or not.
Yes, but …
I think there’s much more to the discussion than providing better control-plane integration and a more convenient consumption model for storage services. To be fair, much of what you describe can be done today, with existing (shipping!) technology. Could it be done better? Of course — but that doesn’t necessarily justify the existence of a new model such as software-defined storage.
I did my best to get into the deeper implications here: http://chucksblog.emc.com/chucks_blog/2013/02/software-defined-storage-and-the-potential-for-disruption.html feedback always appreciated ….
— Chuck
Not sure which technology you are talking about, Chuck. But from what I have seen on the market today, nothing comes close to what I describe. There are parts of the machine available, but that is about it.
Maybe you could give an example.
Hi Duncan
To do this, I need to delve into EMC’s portfolio — not to pimp our products necessarily, but to provide a concrete example, as you requested.
Take the recent VMAX Cloud Edition as one example. Storage service offerings are abstracted at the catalog level (gold, silver, etc.) which are essentially a predefined mix of performance and protection attributes. They’re not configured that way, they’re purchased that way.
Automation (driven by either the VMware administrator, the storage administrator, or some other administrator) accepts the new storage service request, and drives a customer-tailored workflow to review/approve and then implement. Ports are transparently made available, LUNs are invisibly masked, etc., and the user gets the service they requested, plus an admin portal to see what they’ve consumed, how it’s performing, costs associated, etc. There’s decent integration into the VMware-centric experience today, with more coming I am told.
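Purely to illustrate the catalog idea (the tier names and attributes below are hypothetical, not the actual VMAX Cloud Edition interface), a catalog-style request might reduce to something like this sketch:

```python
# Hypothetical service-catalog model; tier names and attributes are
# illustrative only, not an actual product interface.
CATALOG = {
    "gold":   {"iops_per_tb": 5000, "replication": "remote", "snapshots": "hourly"},
    "silver": {"iops_per_tb": 2000, "replication": "local",  "snapshots": "daily"},
}

def request_storage(tier: str, capacity_tb: int) -> dict:
    """Turn a catalog selection into a provisioning request; a downstream
    workflow engine (approval, port/LUN setup) would consume this."""
    attrs = CATALOG[tier]
    return {"tier": tier, "capacity_tb": capacity_tb, **attrs}

print(request_storage("gold", 2))
```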
There are other similar examples from the EMC portfolio, as well as those of other vendors.
So, unless I’m missing something, that’s basically the gist of what you’re proposing, just using a modestly different approach. We could certainly argue the merits of one approach over another, but — in my mind — there needs to be a much stronger case for software-defined storage before we can convince people to abandon one way of doing things in favor of another.
My two cents.
Well said, Duncan. I think your comment that the storage vendors out there have a strategy, but not a product, is correct. The good news is that several of the storage vendors are making strong pushes to become integrated into an SDS environment. One interesting potential outcome of this that may not be as obvious is that once the storage vendors have built all the tools needed to provide the level of abstraction and communication required to fully participate in an SDS environment, those same tools will most likely apply to any hypervisor or hosted environment. In some cases they may become the mechanism to provide heterogeneous access and data mobility between hypervisors.
Many storage vendors are indeed going full steam ahead to solve the problem. Hopefully they do it top down instead of bottom up from an architecture perspective. Ensuring the APIs / integration will be there will be more important than anything else with SDS.
Software Defined Storage. IMHO, it is software + a physical “store form” (SSD/disk/flash/tape/etc.), and nothing else. I should be able to get any piece of “store form”, from any supplier, add it to the pool, and the software does the intelligence. The challenge with proprietary, expensive hardware is that customers can’t afford another box for learning/testing, and it is complex to install, configure and master. As a result, a silo is born, as we need a dedicated team to manage what is essentially a subsystem in the larger scheme of the infrastructure.
Having said the above, SDS will have its own challenges, limitations and complexity too, so we will continue seeing both for a long time. One is not better than the other (at the time of writing; prediction is difficult).
The “winners” are the storage people and professionals. It’s an exciting time, and it makes the work more meaningful!
PS: I work for VMware, but the above is my thought as an engineer.
For me, SDS will arrive when I no longer configure disks within the storage array. I look forward (and it is coming) to the deployment of storage being the registration of the array within vCenter (or similar). From vCenter a storage policy is created – 5,000 IOPS, 2 TB, 60/40 R/W, tiering (yes/no/etc.). Then vCenter orchestrates the creation of the RAID groups, volumes, datastores, and SDRS. vCenter also monitors the storage array’s performance and updates the configuration to deliver on the policy (either proactively or through alerting). As mentioned above, the storage array will be “VMware Compatible”.
Also, when workloads are removed, the disks should be released back to the pool for redeployment.
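As a rough illustration of the idea in the comment above (the policy fields and orchestration steps are hypothetical; this is not an existing vCenter API), such a declarative policy could look like this:

```python
# Hypothetical sketch of a declarative storage policy that an orchestrator
# (vCenter or similar) could translate into array configuration.
from dataclasses import dataclass

@dataclass
class StoragePolicy:
    iops: int              # e.g. 5000
    capacity_tb: float     # e.g. 2
    read_write_ratio: str  # e.g. "60/40"
    tiering: bool

def provision(policy: StoragePolicy) -> list:
    """Return the (hypothetical) steps an orchestrator would drive."""
    return [
        f"create RAID groups sized for {policy.iops} IOPS",
        f"create a volume of {policy.capacity_tb} TB",
        "create the datastore and add it to an SDRS cluster",
        "monitor performance and reconfigure when the policy is at risk",
        "release disks back to the pool when the workload is removed",
    ]

for step in provision(StoragePolicy(5000, 2, "60/40", True)):
    print(step)
```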
Agreed! Thanks for the comment.
Hello, I like the idea of a VM per volume. I was thinking about that a long time ago when I first tried ZFS.
Just think about it: every time you create a VM, a new ZFS filesystem would be created and automatically mounted on all ESXi hosts. With this filesystem you can choose the compression level, deduplication (or not), the snapshot schedule…
It would be awesome.
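A rough sketch of what that per-VM provisioning could look like, assuming a ZFS pool named “tank” and NFS exports to the ESXi hosts (the pool name and property values are just examples):

```python
# Rough sketch: create one ZFS filesystem per VM and export it over NFS.
# Pool name "tank" and the property values are assumptions for illustration.
import subprocess

def provision_vm_dataset(vm_name: str, compression: str = "lz4",
                         dedup: bool = False) -> None:
    dataset = f"tank/vms/{vm_name}"
    subprocess.run([
        "zfs", "create", "-p",            # -p creates missing parent datasets
        "-o", f"compression={compression}",
        "-o", f"dedup={'on' if dedup else 'off'}",
        "-o", "sharenfs=on",              # export so every ESXi host can mount it
        dataset,
    ], check=True)

provision_vm_dataset("web-01")
```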
But you still have an issue here: performance. For example, if I stay with ZFS, you can’t create filesystems like this indefinitely (every filesystem adds metadata, for example, and the limit will be somewhere near 10,000 filesystems, so 10,000 VMs).
I’m sure you can already do that with a script.
The other issue is that everything goes to one pool. So if one VM hits the pool hard, you still have the problem, even if you have one VM per volume.
You can still create different pools, but the issue will appear sooner or later. The best would be to have a pool on which you define the number of RAID groups each workload can hit.
Many storage vendors are rewriting their stack/software to ensure they don’t hit the limits you mention. Especially with Virtual Volumes, which is what you are referring to above, this will be a challenge. It is not unthinkable to have thousands, if not tens of thousands, of objects.
Hmm, with ZFS it’s not really a virtual volume. Actually it’s a real filesystem with its own properties, and you just have to make an NFS share of this filesystem.
But you can create a zvol on it if you prefer iSCSI.
The main goal is to have one filesystem per VM.
I understand that… I am just saying the concept that you describe is close to virtual volumes and I guess there is a reason other vendors are rewriting their stack.
Duncan,
I think what you described is also possible with traditional SAN storage arrays. To me, Software Defined Storage is the capability to create a SAN/NAS storage array in software, separating the physical infrastructure from the storage infrastructure.
Below is a link to a blog that I wrote comparing traditional SAN vs Software Defined Storage
http://blog.zadarastorage.com/2013/01/software-defined-storage-vs-traditional.html
Agreed that this can be done with legacy environments as well. That is where the “abstract” part comes in. Whether it is a legacy SAN / NAS or a new storage appliance shouldn’t matter.
> event organized by IBM and VMware
Hey, how’d that get past EMC public relations!? VMware have always treated IBM like one of those crazy uncles you see each time at Christmas. 🙂
HP StoreVirtual 4000 VSA is the leader in Software Defined Storage.. 🙂
You sound like you’re describing Dell Compellent.