No one ever got fired for buying IBM/HP/DELL/EMC etc

Last week on Twitter there was a discussion about hyper-converged solutions and how they are not what someone who works in an enterprise environment would buy for their tier 1 workloads. I asked the question: well, what about buying Pure Storage, Tintri, Nimble or SolidFire systems? All non-hyper-converged solutions, but relatively new. The answer was straightforward: not buying those either, too big a risk. Then the classic comment came:

No one ever got fired for buying IBM (Dell, HP, NetApp, EMC… pick one)

A brilliant marketing slogan by the way (IBM's), which has stuck around since the 70s and is now being used by many others. I wondered though… did anyone ever get fired for buying Pure Storage? Or for buying Tintri? What about Nutanix? Or VMware Virtual SAN? Hold on, maybe someone got fired for buying Nimble, yeah probably Nimble then. No, of course not; even after a dozen Google searches nothing shows up. Why, you may ask yourself? Because people typically don't get fired for buying a certain solution. People get fired for being incompetent / lazy / stupid. In the case of infrastructure and workloads that translates into managing and placing workloads incorrectly or misconfiguring infrastructure. Fatal mistakes which result in data loss or long periods of downtime, that is what gets you fired.

Sure, buying from a startup may introduce some risks. But I would hope that everyone reading this weighs those risks against the benefits; that is what you do as an architect, in my opinion. You assess risks and you determine how to mitigate them within your budget (while of course taking requirements and constraints into account as well).

Now when it comes to these newer storage solutions, and "new" is relative in this case as some have been around for over 5 years, I would argue that the risk is in most cases negligible. Will those newer storage systems be free of bugs? No, but neither will your legacy storage system. Some of those legacy systems have been around for over a decade and are now used in scenarios they were never designed for, which means that new problems may be exposed. I am not saying that legacy storage systems will break under your workload, but are you taking that risk into account? Probably not. Why not? Because hardly anyone talks about that risk.

If you (still) don't feel comfortable with that "new" storage system (yet), but it does appear to give you that edge or a bigger bang for the buck, simply ask the sales rep a couple of questions which will help build trust:

  • How many systems similar to what you are looking to buy have been sold worldwide, and for similar platforms?
    • If they sold thousands, but none of them running vSphere for instance, then what are the chances of you hitting that driver problem first? And if they sold thousands, it will be useful to know…
  • How many customers do they have for that particular model?
    • It wouldn't be the first time a vendor sells thousands of boxes to a single customer for a very specific use case, and it works great for them, just not in your particular use case.
    • But if they have many customers, maybe ask…
  • If you can talk to a couple of customers
    • The best thing you can ask for, in my opinion: a reference call or visit. That is when you find out whether what is promised is actually reality.

I do believe that the majority of infrastructure related startups are great companies with great technology. Personally I see a bigger threat in terms of sustainability than technology; not every startup is going to be around 10 years from now. But if you look at all the different storage (or infra) startups out there today, and then look at how they are doing in the market, it shouldn't be too difficult to figure out who is in it for the long run. Whether you buy from a well-established vendor or from a relatively new storage company, it is all about your workload. What are the requirements, and how can those requirements be satisfied by that platform? Assess the risks, weigh them against the benefits and make a decision based on that. Don't make decisions based on a marketing slogan that has been around since the 70s. The world looks different now; technology is moving faster than ever before, and being stuck in the 70s is not going to help you or your company compete in this day and age.

Requirements Driven Data Center

I've been thinking about the term Software Defined Data Center for a while now. "Software defined" is a great term, but many seem to agree that things have been defined by software for a long time now. When talking about the SDDC with customers, it is typically described as the ability to abstract, pool and automate all aspects of an infrastructure. To me these are very important factors, but not the most important, as they don't necessarily speak to the agility and flexibility a solution like this should bring. So what is an even more important aspect?

I've had some time to think about this lately, and to me what is truly important is the ability to define requirements for a service and have the infrastructure cater to those needs. I know this sounds really fluffy, but ultimately the service doesn't care what is running underneath, and typically neither do the business owner and the application owner, as long as all requirements can be met. Key is delivering a service with consistency and predictability. Consistency and repeatability increase availability and predictability, and nothing is more important for the user experience.

When it comes to providing a positive user experience, it is of course key to first figure out what you want and what you need. Typically this information comes from your business partner and/or application owner. Once you know what those requirements are, they can be translated into technical specifications and ultimately drive where workloads end up. A good example of how this works is VMware Virtual Volumes. VVols is essentially requirements-driven placement of workloads. Not just placement, but of course also all the other aspects of satisfying the requirements that determine user experience, like QoS, availability, recoverability and whatever else is desired for your workload.

With Virtual Volumes, placement of a VM (or VMDK) is based on how the policy is constructed and what is defined in it. The Storage Policy Based Management engine gives you the flexibility to define policies any way you like; of course you are limited to what your storage system is capable of delivering, but from the vSphere platform point of view you can create many different variations. If you specify that the object needs to be thin provisioned, or has a specific IO profile, or needs to be deduplicated or… then those requirements are passed down to the storage system, which makes its placement decisions based on them and will ensure that the demands can be met. As stated earlier, requirements like QoS and availability are passed down as well. This could be things like latency, IOPS and how many copies of an object are needed (number of 9s of resiliency). On top of that, when requirements change or when for whatever reason the SLA is breached, a requirements-driven infrastructure will assess and remediate to ensure requirements are met.
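To make the idea of policy-driven placement a bit more concrete, here is a minimal Python sketch. To be clear: this is not the actual SPBM or VVols API, and the capability names are invented; it only illustrates the pattern of matching a policy's requirements against what each datastore advertises.

```python
# Illustrative sketch only: NOT the real SPBM/VVols API, just a toy model of
# how policy requirements could drive placement decisions. All names invented.

POLICY = {
    "provisioning": "thin",   # hypothetical capability names
    "deduplication": True,
    "max_latency_ms": 5,
    "copies": 2,              # e.g. number of object replicas required
}

DATASTORES = [
    {"name": "array-gold",
     "capabilities": {"provisioning": "thin", "deduplication": True,
                      "max_latency_ms": 2, "copies": 3}},
    {"name": "array-silver",
     "capabilities": {"provisioning": "thin", "deduplication": False,
                      "max_latency_ms": 10, "copies": 2}},
]

def satisfies(policy, caps):
    """Return True if a datastore's capabilities meet every policy requirement."""
    for key, required in policy.items():
        offered = caps.get(key)
        if offered is None:
            return False
        if key == "max_latency_ms":
            if offered > required:   # lower latency than required is fine
                return False
        elif key == "copies":
            if offered < required:   # more copies than required is fine
                return False
        elif offered != required:
            return False
    return True

compatible = [ds["name"] for ds in DATASTORES
              if satisfies(POLICY, ds["capabilities"])]
print(compatible)  # -> ['array-gold']
```

The real engine obviously does far more (compliance checking, remediation, and talking to the array through VASA), but the matching principle is the same.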

That is what a requirements-driven solution should provide: agility, availability, consistency and predictability. Ultimately your full data center should be controlled through policies and defined by requirements. If you look at what VMware offers today, I think it is fair to say that we are closing in on this ideal fast.

Startup introduction: Springpath

Last week I was briefed by Springpath, and they officially launched their company yesterday, although they have been around for a long time. Springpath was founded by Mallik Mahalingam and Krishna Yadappanavar. For those who don't know them: Mallik was responsible for VXLAN (see the IETF draft) and Krishna was one of the folks responsible for VMFS (together with Satyam, who started PernixData). I believe it was late 2012 or early 2013 when Mallik reached out to me wanting to validate some of his thinking around the software defined storage space; I agreed to meet up and we discussed the state of the market at that time and where some of the gaps were. Since May 2012 they operated in stealth (under the name Storvisor) and landed a total of 34 million dollars from investors like Sequoia, NEA and Redpoint. Well-established VC names indeed, but what did they develop?

Springpath is what most folks would refer to as a Server SAN solution; some may also refer to it as "hyper-converged". I don't label them as hyper-converged, as Springpath doesn't sell a hardware solution: they sell software and have a strict hardware compatibility list. The list of server vendors on the HCL seemed to cover the majority of the big players out there; I was told Dell, HP, Cisco and SuperMicro are on the list and that others are being worked on as we speak. According to Springpath this approach offers customers a bit more flexibility, as they can choose their own preferred vendor, leverage the server vendor relationship they already have for discounts, and maintain similar operational processes.

Springpath's primary focus in the first release is vSphere, which knowing the background of these guys makes a lot of sense, and the solution comes in the shape of a virtual appliance. This virtual appliance is installed on top of the hypervisor and grabs local spindles and flash. With a minimum of three nodes you can then create a shared datastore which is served back to vSphere as an NFS mount. There are of course also plans to support Hyper-V, in which case the appliance will provide SMB capabilities, and for KVM it will use NFS. That is on the roadmap right now, but not too far out according to Mallik. (Note that support for Hyper-V, KVM etc. will be released in separate versions. KVM and Docker are in beta as we speak; if you are interested, go to their website and drop them an email!) There is even talk about running the Springpath solution as a Docker container and providing shared storage for Docker itself. All these different platforms should be able to leverage the same shared data platform according to Springpath; the diagram below shows this architecture.

They demonstrated the configuration / installation of their stack and I must say I was impressed with how simple it was. They showed a simple UI which allowed them to configure the IP details etc., but they also showed how they could simply drop in a JSON file with all the config details, which would then be used to deploy the storage environment. When fully configured, the whole environment can be managed from the Web Client; no need for a separate UI or anything like that. All integrated within the Web Client, and for Hyper-V and other platforms they had similar plans… no separate client, but everything manageable through the familiar interfaces those platforms already offer.
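Springpath has not published their configuration schema, so the sketch below is purely hypothetical; every field name is invented. It just illustrates the declarative pattern they demonstrated: describe the whole cluster in a single JSON file and hand it to the installer.

```python
# Hypothetical example: field names are invented, as Springpath has not
# published its configuration schema. The point is the pattern: a declarative
# JSON file describing the whole storage cluster, consumed by the installer.
import json

cluster_config = {
    "cluster_name": "springpath-demo",
    "datastore_name": "sp-datastore-01",
    "nodes": [
        {"host": "esx01.lab.local", "mgmt_ip": "192.168.1.11"},
        {"host": "esx02.lab.local", "mgmt_ip": "192.168.1.12"},
        {"host": "esx03.lab.local", "mgmt_ip": "192.168.1.13"},  # three-node minimum
    ],
    "network": {
        "netmask": "255.255.255.0",
        "gateway": "192.168.1.1",
        "dns": ["192.168.1.2"],
    },
}

# Write the config out; the installer would consume a file like this instead
# of an admin clicking through a UI, which makes deployments repeatable.
with open("cluster.json", "w") as f:
    json.dump(cluster_config, f, indent=2)
```

The appeal of this approach is repeatability: the same file deploys an identical environment every time, which ties back nicely to the consistency and predictability theme above.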

Software Defined Storage, which phase are you in?!

Working within R&D at VMware means you typically work with technology which is 1-2 years out, and discuss futures of products which are 2-3 years out. Especially in the storage space a lot has changed. Not just through innovations within the hypervisor by VMware, like Storage DRS, Storage IO Control, VMFS-5, VM Storage Policies (SPBM), vSphere Flash Read Cache, Virtual SAN etc., but also through partners who build software-based solutions like PernixData (FVP), Atlantis (ILIO) and SanDisk FlashSoft. Of course there is the whole Server SAN / hyper-converged movement with Nutanix, ScaleIO, Pivot3, SimpliVity and others. Then there is the whole slew of new storage systems, some of which are scale-out and all-flash, others which focus more on simplicity; here we are talking about Nimble, Tintri, Pure Storage, XtremIO, Coho Data, SolidFire and many many more.

Looking at it from my perspective, I would say there are multiple phases when it comes to the SDS journey:

  • Phase 0 – Legacy storage with NFS / VMFS
  • Phase 1 – Legacy storage with NFS / VMFS + Storage IO Control and Storage DRS
  • Phase 2 – Hybrid solutions (Legacy storage + acceleration solutions or hybrid storage)
  • Phase 3 – Object granular policy driven (scale out) storage

<edit>

Maybe I should have abstracted a bit more:

  • Phase 0 – Legacy storage
  • Phase 1 – Legacy storage + basic hypervisor extensions
  • Phase 2 – Hybrid solutions with hypervisor extensions
  • Phase 3 – Fully hypervisor / OS integrated storage stack

</edit>

I have written about Software Defined Storage multiple times in the last couple of years and have worked with various solutions which are considered to be "Software Defined Storage". I have a certain view of what the world looks like. However, when I talk to some of our customers, reality is different: some seem very happy with what they have in Phase 0. Although all of the above is the way of the future, and for some may be reality today, I do realise that Phases 1, 2 and 3 may be far away for many. I would like to invite all of you to share:

  1. Which phase are you in, and where would you like to go?
  2. What are you struggling with most today that is driving you to look at new solutions?

Quality of components in Hybrid / All flash storage

Today I was answering some questions on the VMTN forums and one of the questions was around the quality of components in some of the all flash / hybrid arrays. This person kept coming back to the type of flash used (eMLC vs MLC, SATA vs NL-SAS vs SAS). One of the comments he made was the following:

I talked to Pure Storage but they want $$$ for 11TB of consumer grade MLC.

I am guessing he did a quick search on the internet, found a price for some SSDs, multiplied it, and figured that Pure Storage was asking way too much… And even compared to some more traditional arrays filled with SSDs they could sound more expensive. I guess this also applies to other solutions, so I am not calling out Pure Storage here. One thing some people seem to forget is that these new storage architectures are built with flash in mind.

What does that mean? Well, everyone has heard the horror stories about consumer grade flash wearing out extremely fast and blowing up in your face. Fortunately that is only true to a certain extent, as some consumer grade SSDs easily reach 1PB of writes these days. On top of that, there are a couple of things I think you should know and consider before making statements like these, or before being influenced by a sales team that says "well, we offer SLC versus MLC so we are better than them".
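To put that 1PB figure in perspective, here is a quick back-of-the-envelope calculation in Python; the daily write volume and write amplification factor are assumptions for illustration, not vendor specs.

```python
# Back-of-the-envelope flash endurance math. All numbers are illustrative
# assumptions, not vendor specifications.
endurance_tb = 1000        # ~1 PB of total writes, as mentioned above
writes_per_day_gb = 500    # assumed sustained host writes per day
write_amplification = 2.0  # assumed: the controller writes more than the host does

# Effective data written to flash per day, in TB
effective_daily_tb = writes_per_day_gb / 1000 * write_amplification

lifetime_years = endurance_tb / effective_daily_tb / 365
print(f"Estimated lifetime: {lifetime_years:.1f} years")  # -> ~2.7 years
```

Notice how the write amplification assumption directly eats into the drive's useful life; a system that truly understands its flash keeps that factor low, which stretches exactly this budget and is part of the architectural value discussed next.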

For instance (as Pure Storage lists on their website), there are many more MLC drives shipped than any other type at this point, which means MLC has been tested inside out by consumers. Who can break devices in more ways than you or your QA team can? Right, the consumer! More importantly, if you ask me, ALL of these new storage architectures have in-depth knowledge of the type of flash they are using. That is how their systems were architected! They know how to leverage flash, they know how to write to flash, they know how to avoid fast wear-out. They developed an architecture which was not only designed but also highly optimized for flash… This is what you pay for. You pay for the "total package", which means the whole solution, not just the flash devices that are leveraged. The flash devices are a part of the solution, and a relatively small part if you ask me. You pay for total capacity with low latency, and for functionality like deduplication, compression and replication (in some cases). You pay for the ease of deployment and management (operational efficiency), meaning you get to spend your time on the stuff that matters to your customers… their applications.

You can summarize all of it in a single sentence: the physical components used in all of these solutions are just a small part of the solution. Whenever someone tries to sell you the "hardware", that is when you need to be worried!