I have a search column open on Twitter with the term “software defined storage”. One thing that kept popping up over the last couple of days was a tweet from various IBM people about how SDS will change flash. Or let me quote the tweet:
“What does software-defined storage mean for the future of #flash?”
It is part of a Twitter chat scheduled for today, initiated by IBM. Either I am misreading the tweets, or the IBM folks look at SDS and flash in a completely different way than I do. Yes, SDS is a nice buzzword these days. I guess with the billion-dollar investment in flash that IBM has announced, they are going all-in with regards to marketing. If you ask me, they should have flipped it and the tweet should have stated: “What does flash mean for the future of Software Defined Storage?” Or, to make it sound even more like marketing: is flash the saviour of Software Defined Storage?
Flash is a disruptive technology, and it is changing the way we architect our datacenters. Not only has it allowed many storage vendors to introduce additional tiers of storage, it has also allowed them to add an additional layer of caching in their storage devices. Some vendors have even created all-flash storage systems offering thousands of IOps (some will claim millions); with those devices, performance issues are a thing of the past. On top of that, host-local flash is the enabler of scale-out virtual storage appliances. Without flash those types of solutions would not be possible, or at least not with decent performance.
Over the last couple of years host-side flash has also become more common, especially since several companies jumped into the huge gap in the market and started offering caching solutions for virtualized infrastructures. These solutions allow companies that cannot move to hybrid or all-flash solutions to increase the performance of their virtual infrastructure without changing their storage platform. Basically, what these solutions do is make a distinction between “data at rest” and “data in motion”. Data in motion should reside in cache, if configured properly, and data at rest should reside on your array. These solutions will once again change the way we architect our datacenters. They provide a significant performance increase, removing many of the performance constraints linked to traditional storage systems; your storage system can once again focus on what it is good at… storing data / capacity / resiliency.
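To make the “data in motion in cache, data at rest on the array” split concrete, here is a minimal sketch of how a host-side read cache behaves. This is a toy illustration, not any vendor’s implementation; the class and names are hypothetical, and a plain dict stands in for the backing array.

```python
from collections import OrderedDict

class HostSideCache:
    """Toy host-side flash cache: hot blocks ("data in motion") are served
    from local flash; cold blocks ("data at rest") stay on the backing
    array. Write-through, so the array always holds a consistent copy."""

    def __init__(self, capacity_blocks, array):
        self.capacity = capacity_blocks
        self.flash = OrderedDict()   # block id -> data, kept in LRU order
        self.array = array           # backing store (dict stands in for the array)
        self.hits = self.misses = 0

    def read(self, block):
        if block in self.flash:          # hot block: served from flash
            self.flash.move_to_end(block)
            self.hits += 1
            return self.flash[block]
        self.misses += 1                 # cold block: fetch from array, promote
        data = self.array[block]
        self._insert(block, data)
        return data

    def write(self, block, data):
        self.array[block] = data         # write-through: array stays consistent
        self._insert(block, data)

    def _insert(self, block, data):
        self.flash[block] = data
        self.flash.move_to_end(block)
        if len(self.flash) > self.capacity:
            self.flash.popitem(last=False)   # evict the coldest block

array = {n: f"block-{n}" for n in range(10)}
cache = HostSideCache(capacity_blocks=3, array=array)
for b in (1, 2, 1, 1, 9, 1):
    cache.read(b)
print(cache.hits, cache.misses)  # repeated reads of block 1 hit flash
```

Real products of course add write-back modes, replication of dirty data across hosts, and much smarter eviction, but the basic idea is the same: the working set lives close to the host, while the array keeps the authoritative copy.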
I think I have answered the question, but for those who have difficulties reading between the lines: how does flash change the future of software defined storage? Flash is the enabler of many new storage devices and solutions, be it a virtual storage appliance in a converged stack, an all-flash array, or host-side IO accelerators. Through flash, new opportunities arise: new options for virtualizing existing (I/O-intensive) workloads. With it, many new storage solutions have been developed from the ground up. Storage solutions that run on standard x86 hardware, storage solutions with tight integration with the various platforms, solutions which offer things like end-to-end QoS capabilities and a multitude of data services. These solutions can change your datacenter strategy; they can be a part of your software defined storage strategy to take that next step forward in optimizing your operational efficiency.
Although flash is not a must for a software defined storage strategy, I would say that it is here to stay and that it is a driving force behind many software defined storage solutions!
Marc Farley says
There are few concepts as vacuous as software defined storage, considering that storage has always been tied to host system software. But I understand the desire and/or need to discuss it given the marketing context we are in.
Flash on the other hand, as you point out so well, is the real deal – the disruptive technology that is enabling whole different ways of storing data. I’m not sure I agree with your distinction between data at rest and data in motion, however. The non-volatile characteristics of flash make data in flight recoverable by the definition you use, and that’s incongruous. Data in flight is data at risk. Timeouts exist to protect systems from the risk of data in flight not succeeding.
Flash is a big enabler of InfiniBand – a technology that I never thought would be relevant but seems to be the I/O tech of choice for Big Data appliances. If IB catches on, then yes, there will be a lot more SDS – and most of it will be host driven. I’m not sure that SDS will be around in two years, but flash will be, in ever-increasing numbers and use cases.
Duncan Epping says
Not sure I am following your comment with regards to “data in flight” to be honest…
Marc Farley says
Thanks Duncan, my comments about data at rest and in flight (or in motion) were hastily written. Defining data at rest as being data in a storage array is probably not so useful. Data written to non-volatile storage, such as flash, regardless of its location in a storage stack, can be considered at rest. The difference between what is a cache and what is a storage tier is blurring. My opinion is that if it’s non-volatile and recoverable it should be considered at rest. The terms “data at rest” and “data in flight” might be better used for scenarios when data is moved across storage management boundaries by processes such as SVMotion or Storage Live Migration – or when uploading/downloading data to/from cloud storage.
I was referring to these terms purely from a “host-side caching” perspective. Reason for this is that there are only a few highly available host-side write caching solutions out there, so I would never consider this to be a new tier of storage in many of those cases.
I can see where you are coming from, and the world is definitely changing. Thanks for taking the time to comment!
Alexey Miasoedov says
Duncan, can you tell me who, besides PernixData, offers host-side write-back caching? And how does PernixData solve array snapshot consistency?
Tim Davoren says
Well, the topics you discuss here are certainly the flavour of the month. However, I have a couple of comments, for what it’s worth, in relation to both flash and SDS.
1. Flash is only disruptive now because of cost. In and of itself flash is old (albeit incrementally develo
Sorry…hazard of posting from iPhone!!
1. Flash is an old tech (with continual refinement) that has recently attained a production cost that makes it a compelling choice. However, it is still weak in a number of storage characteristics (longevity, price:capacity ratio). The true enabler for flash is intelligent software that leverages its strengths whilst hiding its weaknesses (tiering, dedupe, etc.).
2. SDS… well, one could argue that storage has always been defined by software, whether that’s a file system itself or the underlying RAID algorithms. What I think the marketeers mean by SDS is twofold: firstly, SDS means key traditional functions of storage (like redundancy, snapshots, cache, etc.) are now able to be instantiated in portable software containers that abstract the control plane away from software-programmable ASICs and other dedicated hardware bits. Secondly, the flexibility implied by the above means that storage services are available as on-demand/just-in-time features rather than hard-coded at the point of storage repository creation.
I agree, and it was definitely not my intention to say that flash is new at all, but due to the changed price/GB ratio over the last few years it is definitely becoming more interesting by the second.
Thanks for the comment,
Nice write-up regarding flash and its relation to SDS, and good points regarding how flash is part of the SDS strategy and will enable people to provide performance abstraction. The main thing people overlook and often do not digest is that flash is a means to an end, not an end in itself.
Meaning that if your end goal is a high-performance environment for your apps (VDI, SQL, etc.), then you absolutely should look at flash as a way to reach this goal. But flash is just a single component that cannot be fully utilized if your entire stack of networking, storage, and servers is not designed to take advantage of the performance and latency it provides. Thus you need to make sure the performance you get from your flash (host-side or SAN/NAS) is not lost in transport over the network.
This aspect is something many people overlook, and it is critical when evaluating any type of flash solution, as a flash card alone won’t increase the performance of your SDS if you don’t look at the entire stack.
Where do you believe these technologies will go? As a user of both host-side and storage-side cache mechanisms I can only look to the future for greater opportunities. Many flash implementations still handle IO like spinning disk for the most part, just at much faster rates.
They will go back to the beginning. As we have segmented the datacenter into components (computes, storage, transport) the technology has become significantly more complex to deal with the distribution.
The next generation will look like the “hyper-converged” solutions such as Nutanix/Simplivity/etc. When compute/storage/transport are on the same PCI bus, performance management is simply resolved. The complexity comes in making data highly available (replication) and managing large data.
Flash is the next step for the slowest performer in the datacenter, but SDS is unrelated. Being able to dynamically carve up and manage storage capacity and performance as an extension of the workload or instance doesn’t rely on the hardware. Flash may improve the flexibility of the solution and extend its scope, but it does not define, enable, or drive SDS.
I am looking forward to grid style datacenters.