Yesterday Maish and Christian had a nice little back and forth on their blogs about VSAN. Maish published a post titled “VSAN – The Unspoken Truth”, which basically argues that VSAN doesn’t fit blade environments, and that many enterprise environments adopted blades to get better density from a physical point of view. That is, to increase the number of physical servers relative to the number of rack units consumed. On top of that, according to Maish, the centralized management that many of these blade solutions offer is a major consideration.
Christian countered this with a great article titled “VSAN – The Unspoken Future“. I very much agree with Christian’s vision. His basic point is that when virtualization was introduced, IT started moving to blade infrastructures because that was a good fit for the environment they needed to build. Christian then explains how you can leverage, for instance, the SuperMicro Twin architecture to get a similar (high physical) density while using VSAN at the same time. (See my Twin posts here.) However, the essence of the article is: “it shows us that the Software Defined Data Center (SDDC) is not just about the software, it’s about how we think, manage AND design our back-end infrastructure.”
There are three aspects here in my opinion:
- Density – the old physical servers vs rack units discussion.
- Cost – investment in new equipment and (potential) licensing impact.
- Operations – how do you manage your environment, will this change?
First of all, I would like to kill the whole density discussion. Do we really care how many physical servers you can fit in a rack? Do we really care that you can fit 8 or maybe even 16 blades in 8U? Especially when you take into consideration that the storage system sitting next to it takes up another full rack. Then on top of that there is the impact density has in terms of power and cooling (hot spots). I mean, if I can run 500 VMs on those 8 or 16 blades and that 20U storage system, is that better or worse than 500 VMs on 12 x 1U rack mounted servers with VSAN? I guess the answer to that one is simple: it depends… It all boils down to the total cost of ownership and the return on investment. So let’s stop looking at a simple metric like physical density, as it doesn’t say much!
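To make that “it depends” a bit more concrete, here is a quick back-of-the-envelope sketch. Every number in it (capex, rack-unit cost, amortization period, VM count) is a made-up assumption purely for illustration; plug in your own figures and the answer can easily flip:

```python
# Back-of-the-envelope comparison: raw density vs cost per VM.
# All numbers below are illustrative assumptions, not real pricing.

def cost_per_vm(vms, rack_units, cost_per_ru_per_year, capex):
    """Blend yearly rack-space cost and amortized hardware capex
    into a single per-VM-per-year figure (3-year amortization)."""
    yearly_space = rack_units * cost_per_ru_per_year
    return (capex / 3 + yearly_space) / vms

# Scenario A: 16 blades (8U) plus a 20U storage array, running 500 VMs
blade_san = cost_per_vm(vms=500, rack_units=28,
                        cost_per_ru_per_year=1500, capex=900_000)

# Scenario B: 12 x 1U rack servers with VSAN, running the same 500 VMs
rack_vsan = cost_per_vm(vms=500, rack_units=12,
                        cost_per_ru_per_year=1500, capex=600_000)

print(f"blade + SAN: ${blade_san:,.0f} per VM per year")
print(f"rack + VSAN: ${rack_vsan:,.0f} per VM per year")
```

The point of the sketch is not the numbers themselves, but that density (rack units) is only one input into the per-VM cost; capex and the amortization window dominate.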
Before I forget… how often have we had those “eggs in a basket” discussions in the last two years? This was a huge debate five years back: in 2008/2009, did you really want to run 20 virtual machines on a single physical host? What if that host failed? Those discussions are not as prevalent any longer, for a good reason. Hardware improved, stability of the platforms increased, admins became more skilled and fewer mistakes are made… the chances of hitting failures simply declined. Kind of like the old Microsoft blue screen of death joke: people probably still make the joke today, but ask yourself how often it actually happens.
Of course there is the cost impact. As Christian indicated, you may need to invest in new equipment… As people mentioned on Twitter: so did we when we moved to a virtualized environment. And I would like to add: we all know what that brought us. Yes, there is a cost involved. The question is how you balance this cost. Does it make sense to use a blade system for VSAN when each blade can only hold a couple of disks at this point in time? It means you need a lot of hosts, and also a lot of VSAN licenses (plus maintenance costs). It may be smarter, from an economic point of view, to invest in new equipment. Especially when you factor in operations…
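As a rough illustration of why disk slots per host drive the host count (and with it the license count), here is a hypothetical sketch. The capacity target, disk sizes, and slot counts are all assumptions, and the failures-to-tolerate overhead is simplified to plain mirroring:

```python
# Illustration: fewer disk slots per host -> more hosts -> more licenses.
# Capacity target and disk sizes are made-up assumptions.
import math

def hosts_needed(usable_tb_target, disks_per_host, tb_per_disk, ftt=1):
    """Hosts required to reach a usable-capacity target.
    With FTT=1 each object is mirrored, so raw capacity is doubled;
    this ignores slack space and other real-world overhead."""
    raw_tb = usable_tb_target * (ftt + 1)
    per_host_tb = disks_per_host * tb_per_disk
    return math.ceil(raw_tb / per_host_tb)

# Blade: typically one capacity disk (the other slot holds the SSD)
blades = hosts_needed(usable_tb_target=40, disks_per_host=1, tb_per_disk=1.2)
# Rack server: room for, say, six capacity disks plus the SSD
racks = hosts_needed(usable_tb_target=40, disks_per_host=6, tb_per_disk=1.2)

print(f"blade hosts needed: {blades}")
print(f"rack hosts needed:  {racks}")
# every extra host means extra VSAN licenses plus maintenance
```

Even with generous assumptions, a one-capacity-disk blade design needs several times the host count of a disk-dense rack server, and the licensing and maintenance bill scales with it.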
Operations, indeed… what does it take / cost today to manage your environment “end to end”? Do you need specialized storage experts to operate your environment? Do you need to hire storage consultants to add more capacity? What about when things go bad, can you troubleshoot the environment by yourself? And what about the compute layer? Most blade environments offer centralized management for those 8 or 16 hosts, but can I reduce the number of physical hosts from 16 or 8 to, for instance, 5 with a slightly larger form factor? What would the management overhead be, if any? Each of these things needs to be taken into consideration and somehow quantified in order to compare.
Reality is that VSAN (and all other hyper-converged solutions) brings something new to the table, just like virtualization did years ago. These (hyper-converged) solutions are changing the way the game is played, so you had better revise your playbook!
Lovely article and strong points well put. The truth is that this new wave of solutions is bringing a paradigm shift to every data center. To take advantage of these new capabilities, organizations and administrators are going to have to get out of their comfort zones.
Mike Sheehy says
Agreed. Operationally, as you point out, the benefits swing in favor of scale-out hyper-converged if you can remove the frame-based storage solution you have today. However, you may incur additional costs for equipment installation and setup time when you compare it to, say, Cisco UCS. You would, of course, lose the quick deployment that UCS brings with Service Profiles coupled with something like Auto Deploy. Granted, I guess you would need to look at how often you are really installing new hardware. My guess is that most places, with the exception of service providers, are not installing new servers weekly, or even monthly for that matter. Even so, the trade-off of managing entire stacks versus a hypervisor policy-based solution is, imho, where you’ll see the positive return.
Nick Bignell says
I don’t understand the argument here. Blade vs rack mount servers is irrelevant to VSAN; that’s the whole point of the SDDC. Why can’t you run blade servers with blade storage attached to each and run VSAN over that? I.e. hyper-converged blades.
Form factor is how you match physical density with requirements; VSAN just takes the SAN out of the physical discussion…
Am I missing something there?
Duncan Epping says
You will have very limited capacity, as storage blades (JBOD systems) are not supported with VSAN today.
Nick Bignell says
Yeah, fair enough from a product perspective, but using virtual SANs from other vendors it works a treat. VMware will increase capacity, and product support will improve quickly, I’m sure.
Duncan Epping says
Oh it will work with VSAN as well, it just isn’t tested / supported today 🙂
I remember seeing this limitation (JBOD) in the beta but haven’t read about it in the final release. Is it still not supported in GA?
And what about storage blades that are not JBOD but just attached via PCI Express? (For example, HP storage blades which contain the HCL-supported Smart Array P420i.)
Duncan Epping says
It is not supported in GA.
Iwan 'e1' Rahabok says
Thanks Duncan. Perfect timing. I just spent two hours with one of our global customers (they have 60K VMs). He called the TAM and SE to tell us that they are considering Super Micro. In the end, we agreed that this is the third option they should start exploring; rack mount and blade are the first two. The idea of storage being integrated is a huge win for the bank.
Stu Duncan says
Agreed. I think the point I tripped over is the VM density. If I have 16 blades & 20U of storage, I had better be running way more than 500 VMs; more like 2500 VMs. Currently, in one cluster (10 blades, 6U of SATA storage) I’m running 1000 VMs, with IOPS being the limiting factor. Throw in 2U of SSDs and I can easily push 2000 VMs.
I think VSAN is a great solution for smaller installs. But I’m not sure it scales well.
Great timing Duncan! I just finished a costing exercise comparing blade/NetApp/EMC solutions with something like Nutanix. Depending on how much storage capacity and I/O performance you need (as with VDI or SQL), the hyper-converged solution wins slightly on the CAPEX side. Then, if you factor in the simplicity, ease of management (OPEX), power/cooling, performance, etc., it appears to be a huge win. However, it can be difficult to express these lower costs in performance, scalability, and simplicity terms when developing a TCO model, but they are real indeed.
Any ideas with explaining the OPEX win with real numbers?
Customers with an abundance of available floor space may not care much about server U, but I think it’s fair to say that, by virtue of the virtualization movement and the business we’re in, they are the minority. I’ve worked in large enterprise environments which suffered from datacenter sprawl due to progressive growth, mergers, and acquisitions. The consistent need for land, power, and cooling, to the tune of $50M per datacenter, drove datacenter buildout avoidance and other efficiency-related programs in the organization. Virtualization was obviously a major contributor to the most efficient use of floor space, and pods of blades augmented the infrastructure design.
There is nothing inherently wrong with choosing blades over conventional rack mount boxes, or vice versa. Going back to the earlier days, this has traditionally been a “six of one, half a dozen of the other” discussion. Though the discussion of 19″ boxes may feel a little strange, because the pendulum had swung pretty far in the direction of blades, and virtualization helped drive that. Bell bottoms came back, and I believe one day disco will too.
Getting back to larger 2- or 4-socket boxes to support VSAN isn’t a stretch of the imagination, but for customers who are sensitive to floor space, there will be an expectation of higher consolidation ratios to compensate. Some customers are cool with that, but many identify the risk (eggs/basket) and draw the line at 20, 50, or 100, well short of the 512 VMs per host supported in vSphere 5.5.
Chris Wahl (@ChrisWahl) says
“Do we really care how many physical servers you can fit in a rack?” – Yes, those ToR switches and colo SqFt aren’t free 🙂
Duncan Epping says
I think you missed the point 🙂
Alexandru Covaliov (@peposimo) says
Duncan has strong points, but… just a picture as an example:
Duncan works for an oil company. They decided to make a new type of oil, and this oil is compatible with BMW only. BMW is a nice, decent car which offers you power and comfort, and everybody wants that car, and Duncan shows how good this car is with this oil. Sweeeeeett…
Now the reality… how many people can afford a BMW? Not so many. The oil is not compatible with other cars except… BMW, Mercedes, Audi. People try to use that oil on other cars like Honda, Toyota, Mazda, etc. Some are lucky, some aren’t.
And Duncan tries to explain to people: if you want that oil, go for a BMW as described above, because your cars are a piece of…
But how many people can afford a BMW or a Mercedes? How many cars of those brands do you see on the road? Of course everybody could afford that brand if they switched from meat to potatoes, but is that sustainable in the long term?
I understand Duncan’s point of view and his comments are very strong, but can anybody here just drop everything and move to VSAN? I doubt it.
Duncan Epping says
I always wanted to do this, talk about myself in 3rd person so here we go:
Duncan doesn’t state that everyone should use VSAN. Duncan doesn’t state that everything else should be dropped. Duncan is the first to admit that there is a use case for it, and that not every customer is a fit for VSAN. The reality is that Duncan responded to articles by Maish and Christian which were on the topic of hardware for VSAN; if we were talking about other types of deployments that don’t need local disks, then Duncan might have a different opinion.
PS: Duncan is also a BMW driver, used to drive a Toyota and not a fan of Mercedes.
tom miller says
As VMware introduces new features, it seems these features are indirectly steering you towards traditional rack servers versus blade technology. I can only speak to UCS and IBM blades, but neither of these solutions can house 3 drives, only 2. VSAN requires 2 drives – 1 SSD and 1 traditional HDD – and if you want to utilize vSphere Flash Read Cache, it requires its own SSD, as VSAN and vFlash cannot share an SSD. So, indirectly, blades may limit your ability to utilize the breadth of VMware features if you’re an Enterprise Plus customer.