EVO:RAIL engineering interview with Dave Shanley (Lead Dev)

A couple of weeks ago we launched EVO:RAIL, a new VMware solution. I have been part of this since the very beginning: the project started with just Dave and myself on the prototype team, with Mornay van der Walt as the executive sponsor (an interview with Mornay will follow shortly, as this project involves many different disciplines). After Dave developed the initial UI mock-ups and we worked on the conceptual architecture, Dave started developing what then became known internally as MARVIN. If my memory serves me correctly it was our director at Integration Engineering (Adam Z.) who came up with the name and acronym (Modular Automated Rackable Virtual Infrastructure Node). All of this was done under the umbrella of Integration Engineering, in stealth mode with a very small team. Something not a lot of people know is that William Lam, for instance, was very instrumental when it came to figuring out in which order to configure what (a lot of dependencies, as you can imagine) and which API calls to use for what. After a couple of months things really started to take shape; the prototype was demoed at C level, and before we realized it a new team was formed and gears shifted.

Personally, whenever I talk to start-ups I like to know where they came from, what they have done in the past and how things came about, as that gives me a better understanding of why the product is what it is. The same applies to EVO:RAIL, and there is no better place to start than with the lead developer and founding team member, Dave Shanley.

Good morning Dave. As not all of my readers will know who you are and what you did before joining the EVO:RAIL team, can you please introduce yourself?
I’m the lead engineer, designer and software architect of the EVO:RAIL platform. I joined VMware about two and a half years ago. Starting out in Integration Engineering, I got to see and experience a lot of the frustration that often comes with trying to install, configure and integrate our technology. I’ve worked in web application engineering pretty much my entire career, which has given me really broad experience across consumer and enterprise technology. Before VMware I was the CTO of a really cool VC-funded start-up in the UK, as well as the lead engineer over at McCann Erickson’s EMEA HQ.

VSAN with AHCI controller with vSphere 5.5 U2

I’ve been following a thread on the community forums closely around the AHCI disk controller. This on-board disk controller caused some problems when used in conjunction with VSAN because of a driver issue. Note that this controller is not on the HCL and is not recommended for use in a production environment or ANY environment where reasonable performance is expected and endurance / availability is key. Many homelabbers used this controller, however, and I am happy to say that Philzy reported that the fix mentioned in KB 2079729 appears to have solved the issues experienced.
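If you want to verify which driver your controllers are actually using before and after applying the fix, a quick pyVmomi sketch along these lines can list every host bus adapter and its driver. This is just a minimal sketch; the vCenter address and credentials are placeholders, so adjust them for your own lab.

```python
# Minimal pyVmomi sketch: list each host's storage adapters and their drivers,
# handy for confirming whether the "ahci" driver is in use on your lab hosts.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local",            # placeholder vCenter
                  user="administrator@vsphere.local",  # placeholder credentials
                  pwd="VMware1!",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        for hba in host.config.storageDevice.hostBusAdapter:
            print(f"{host.name}: {hba.device} driver={hba.driver} model={hba.model}")
    view.Destroy()
finally:
    Disconnect(si)
```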

For all those wanting to use VSAN in their homelabs… Game on!

EVO:RAIL vs VSAN Ready Node vs Component based

EVO:RAIL is awesome! That is typically what I hear from customers when pitching the EVO:RAIL play and showing the config and management demo. Customers are all over it, I can tell you. They love the ease of deployment, management, procurement and support… Now, every now and then a geeky person pops up and says: “But, euhm, I want more disks, I want to scale per node, and all of my configuration stuff is scripted. How will that work with EVO:RAIL?”

This is when I show them this slide:

It is a very valid question, to be honest. It is something which I, as a geek, would ask as well. How can I tweak the configuration so that it meets my requirements, and can I just use my own deployment mechanism? Sure you can, but not necessarily with EVO:RAIL. Keep in mind that EVO:RAIL is built using trusted VMware technology like VMware vSphere, vCenter Server, Virtual SAN and Log Insight. Although the EVO:RAIL engine (the configuration and management interface) cannot be downloaded separately, the components can be. We very much realize that EVO:RAIL may not be a fit for all customers, and that is exactly why VMware offers choice, as the slide above shows.

If you are a geek, love digging through hardware compatibility lists, like to configure your own servers part by part and want absolute maximum flexibility, then option 1 is your best choice. Using the “Component Based” approach you select your own server from the vSphere HCL and then pick your components, like the disk controller, SSD and magnetic drives, from the VSAN HCL. You get to pick how many drives, which type of flash, how much memory, how many cores per CPU… you name it. Note, though, that it does mean you will need to do research to find out which components work well together and what kind of performance you can expect from disk controller x, y or z. But it is doable, many customers have already done this, and it allows you to design to your specific needs. Do note: you will need to configure it yourself and purchase licenses / support.

If you prefer a simpler approach, but still a certain level of flexibility, then the “Virtual SAN Ready Node” approach is definitely a great option. It provides a selection of around 40 different OEM configurations which have been validated by both the OEMs and VMware. Note, though, that these configurations are typically based on VM configuration profiles and IO profiles. This is mentioned in the Virtual SAN Ready Node list; there are low / medium / high configurations and also two different VDI configurations for each of the different server platforms. If you prefer a pre-validated solution but need some flexibility, then this is the way to go. Again, you will need to install and configure it yourself and purchase licenses / support, but it is definitely easier than “component based”.

The third option is “VMware EVO:RAIL“. EVO:RAIL is at the far right of the slider –> Maximum Ease of Use. EVO:RAIL is pre-built on a qualified platform. This means that it comes pre-installed and can be configured within 15 minutes. It has an easy, simple management interface that allows for easy patching / updating, simple VM creation and management, and even easier automatic scale-out (a couple of clicks). On top of that, it is sold as a single SKU (all licenses and support included) and all support goes through one channel. No more being pointed from one vendor to the other; you contact that single vendor for support of both software and hardware… Maximum Ease of Use, as I said. If this is what you are looking for, EVO:RAIL is what you need.

As you see, when it comes to scale-out server SAN / hyper(visor) converged solutions… VMware offers you maximum choice.

Virtual SAN / EVO:RAIL use cases versus supported

I have seen this debated many times on Twitter now, and I’ve seen various Virtual SAN (VSAN) and EVO:RAIL competitors use it in the past to mislead potential customers.

I think we have all seen these slides at VMworld or at a VMUG when it comes to VSAN or EVO:RAIL. The slide contains a couple of primary use cases:

[Slide: EVO:RAIL use cases]

So what does that mean? Does this mean that VMware does not support Exchange or MS SQL on top of VSAN or EVO:RAIL? Does it mean that VMware does not support it as a DR target? Or what about a management cluster? Or what about running Oracle? Or maybe SAP? Or what about my WordPress instance? Or what about MySQL? Or, although you mention VDI, would that only be VMware View? What about… Yes, by now you get my drift.

Let me try to make it really simple: a primary use case says nothing about support. A primary use case is where the vendor expects the product or solution to fit best. In this case it is where VMware expects VSAN / EVO:RAIL to fit best; this is the target market VMware will be going after with this release.

Why include this in a slide deck? Well, it allows you (the user / consultant / architect) to quickly identify where the majority of opportunities will be with the current version, for your environment or for your customers. It does NOT mean that a use case that is not listed, like running your Exchange environment on top of VSAN for instance, is not supported. (Try listing all use cases on a slide; it would get pretty lengthy.)

Running Tier-1 applications on top of VSAN (or EVO:RAIL) is fully supported as it stands today. However, your application requirements and your service level agreement will determine whether EVO:RAIL or VSAN is a good fit. One example: if your agreed SLA requires an RPO (recovery point objective) of zero, then synchronous replication (or stretched clustering) is the only option, and you will need to determine whether that is possible with the platform you want to use (this goes for any solution!). (And yes, before anyone goes there: you will be able to make that happen with the platform pretty soon…)
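To make that SLA reasoning concrete, here is a toy decision helper. It is purely illustrative, not a VMware tool or API; the function name and option strings are mine.

```python
# Toy sketch of the reasoning above: an RPO of zero rules out asynchronous
# replication, leaving synchronous replication or a stretched cluster.
def replication_options(rpo_seconds: float) -> list[str]:
    options = ["synchronous replication", "stretched cluster"]
    if rpo_seconds > 0:
        options.append(f"async replication with an interval <= {rpo_seconds}s")
    return options

print(replication_options(0))    # RPO zero: sync or stretched only
print(replication_options(300))  # 5-minute RPO: async becomes an option
```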

I hope that clears things up a bit.

Re: Re: The Rack Endgame: A New Storage Architecture For the Data Center

I was reading Frank Denneman’s article with regards to new datacenter architectures, which in its turn was a response to Stephen Foskett’s article about how the physical architecture of datacenter hardware should change. I recommend reading both articles as that will give a bit more background, plus they are excellent reads in their own right. (Gotta love these blogging debates.) Let’s start with an excerpt from each article that summarizes the posts for those who don’t want to read them in full.

Stephen:
Top-of-rack flash and bottom-of-rack disk makes a ton of sense in a world of virtualized, distributed storage. It fits with enterprise paradigms yet delivers real architectural change that could “move the needle” in a way that no centralized shared storage system ever will. SAN and NAS aren’t going away immediately, but this new storage architecture will be an attractive next-generation direction!

If you look at what Stephen describes, I think it is more or less in line with what Intel is working towards. The Intel Rack Scale Architecture aims to disaggregate traditional server components and then aggregate them by type of resource, backed by a high-performance, optimized rack fabric enabled by the new photonic architecture Intel is currently working on. This is not long-term future; this is what Intel showcased last year and said would be available in 2015 / 2016.

Frank:
The hypervisor is rich with information, including a collection of tightly knit resource schedulers. It is the perfect place to introduce policy-based management engines. The hypervisor becomes a single control plane that manages both the resource as well as the demand. A single construct to automate instructions in a single language providing a correct Quality of Service model at application granularity levels. You can control resource demand and distribution from one single pane of management. No need to wait on the completion of the development cycles from each vendor.

There’s a bit in Frank’s article as well where he talks about Virtual Volumes and VAAI, how long it took for all storage vendors to adopt VAAI, and how he believes the same may apply to Virtual Volumes. Frank aims more towards the hypervisor being the aggregator instead of doing it through changes in the physical space.

So what about Frank’s arguments? Well, Frank has a point with regards to VAAI adoption and the fact that some vendors took a long time to implement it. The reality, though, is that Virtual Volumes is going full steam ahead. With many storage vendors demoing it at VMworld in San Francisco last week, I have the distinct feeling that things will be different this time. Maybe timing is part of it, as it seems many customers are at a crossroads and want to optimize their datacenter operations / architecture by adopting SDDC, of which policy-based storage management happens to be a big chunk.

I agree with Frank that the hypervisor is perfectly positioned to be that control plane. However, in order to be the control plane of the future there needs to be a way to connect “things” to it which allows for far better scale and more flexibility. VMware, if you ask me, has done that for many parts of the datacenter, but one aspect that still needs to be overhauled for sure is storage. VAAI was a great start, but with VMFS there simply are too many constraints and it doesn’t cater for granular controls.

I feel that the datacenter will need to change on both ends in order to take the next step in the evolution to the SDDC. Intel Rack Scale Architecture will allow for far greater scale and efficiency than ever seen before. But it will only be successful when the layer that sits on top has the ability to take all of these disaggregated resources, turn them into large shared pools and assign resources in a policy-driven (and programmable) manner. Not just assign resources, but also allow you to specify what the level of availability (HA, DR, but also QoS) should be for whatever consumes those resources. Granularity is important here, and of course it shouldn’t stop with availability but apply to any other (data) service that one may require.
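To illustrate what that policy-driven model could look like, here is a toy sketch: a policy declares the required service levels, and the control plane only places workloads on resource pools that can satisfy them. All names and numbers here are illustrative, not an actual VMware API.

```python
# Toy model of policy-driven placement: policies declare requirements,
# the control plane filters the pools that can satisfy them.
from dataclasses import dataclass

@dataclass
class Policy:
    failures_to_tolerate: int   # availability requirement
    min_iops: int               # QoS requirement

@dataclass
class Pool:
    name: str
    fault_domains: int
    iops_capacity: int

def satisfies(pool: Pool, policy: Policy) -> bool:
    # Tolerating N failures requires at least N + 1 fault domains.
    return (pool.fault_domains >= policy.failures_to_tolerate + 1
            and pool.iops_capacity >= policy.min_iops)

pools = [Pool("rack-a-flash", 3, 100_000), Pool("rack-b-hybrid", 2, 20_000)]
gold = Policy(failures_to_tolerate=2, min_iops=50_000)
print([p.name for p in pools if satisfies(p, gold)])  # -> ['rack-a-flash']
```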

So where does what fit in? If you look at some of the initiatives that were revealed at VMworld, like Virtual Volumes, Virtual SAN and vSphere APIs for IO Filters, you can see where the world is quickly moving. You can see how vSphere is truly becoming that control plane for all resources and how it will be able to provide you end-to-end policy-driven management. In order to make all of this a reality the current platform will need to change: changes that allow for more granularity / flexibility and higher scalability, and that is where all these (new) initiatives come into play. Some partners may take longer to adopt than others, especially those that require fundamental changes to the architecture of underlying platforms (storage systems for instance), but just like with VAAI I am certain that over time this will happen, as customers will drive this change by making decisions based on availability of functionality.

Exciting times ahead if you ask me.