Two logical PCIe flash devices for VSAN

A couple of days ago I was asked whether I would recommend using two logical PCIe flash devices carved out of a single physical PCIe flash device. The reason for the question was VMware's recommendation to have two Virtual SAN disk groups instead of (just) one disk group.

First of all, I want to make it clear that this is a recommended practice but definitely not a requirement. The reason people have started recommending it is "failure domains". As some of you may know, when a flash device that is used for read caching / write buffering and fronts a given set of disks becomes unavailable, all the disks in the disk group associated with that flash device become unavailable as well. As such, a disk group can be considered a failure domain, and when it comes to availability it is typically best to spread risk, so having multiple failure domains is desirable.

When it comes to PCIe devices, does it make sense to carve up a single physical device into multiple logical devices? From a failure point of view I personally don't think it adds much value: if the physical device fails, it is likely that both logical devices fail with it. From an availability point of view there isn't much that two logical devices add; however, multiple logical devices could be beneficial if you have more than 7 disks per server.
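
To make the failure-domain argument concrete, here is a minimal sketch in plain Python; the device names and disk counts are made up purely for illustration. It compares how much of a host's capacity is impacted when a flash device fails for three layouts: one disk group, two disk groups backed by two physical flash devices, and two disk groups backed by two logical devices carved out of the same physical device.

    # Hypothetical host layouts: each disk group is (flash_device, number_of_capacity_disks).
    # A flash device failure takes down every disk group that uses it.
    def capacity_lost(disk_groups, failed_flash):
        """Fraction of the host's capacity disks that becomes unavailable."""
        total = sum(disks for _, disks in disk_groups)
        lost = sum(disks for flash, disks in disk_groups if flash == failed_flash)
        return lost / total

    one_group    = [("pcie-A", 6)]                 # a single disk group
    two_physical = [("pcie-A", 3), ("pcie-B", 3)]  # two disk groups, two physical flash devices
    two_logical  = [("pcie-A", 3), ("pcie-A", 3)]  # two disk groups carved out of one physical device

    print(capacity_lost(one_group, "pcie-A"))     # 1.0 -> all capacity disks impacted
    print(capacity_lost(two_physical, "pcie-A"))  # 0.5 -> only half the disks impacted
    print(capacity_lost(two_logical, "pcie-A"))   # 1.0 -> both logical devices share the same failure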

As most of you will know, each disk group can have at most 7 disks, and each host can have at most 5 disk groups. If there is a requirement for the server to have more than 7 disks, there will be a need for multiple flash devices, and in that scenario creating multiple logical devices would be one way to get there, although from a failure-tolerance perspective I would still prefer multiple physical devices over multiple logical devices. But I guess it all depends on what type of devices you use, whether you have sufficient PCIe slots available, and so on. In the end the decision is up to you, but do make sure you understand the impact of your decision.
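
For what it is worth, a quick back-of-the-envelope sketch of the arithmetic behind those limits; the 10-disk example is purely illustrative.

    import math

    MAX_DISKS_PER_DISK_GROUP = 7  # at most 7 capacity disks per disk group
    MAX_DISK_GROUPS_PER_HOST = 5  # at most 5 disk groups per host

    def flash_devices_needed(capacity_disks):
        """Minimum number of flash devices (logical or physical) fronting the given disk count."""
        groups = math.ceil(capacity_disks / MAX_DISKS_PER_DISK_GROUP)
        if groups > MAX_DISK_GROUPS_PER_HOST:
            raise ValueError("more capacity disks than a single host supports")
        return groups  # one flash device is needed per disk group

    print(MAX_DISKS_PER_DISK_GROUP * MAX_DISK_GROUPS_PER_HOST)  # 35 capacity disks per host at most
    print(flash_devices_needed(10))  # 2 -> more than 7 disks means more than one flash device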

Operational Efficiency (You’re not Facebook/Google/Netflix)

In previous roles, before I joined VMware, I was a system administrator and a consultant. The tweets below reminded me of the kind of work I did in the past and triggered a train of thought that I wanted to share…

Howard has a great point here. For some reason many people have started using Google, Facebook or Netflix as the prime example of operational efficiency. Startups use it in their pitches to describe what they can bring and how they can simplify your life, and yes, I've also seen companies like VMware use it in their presentations.

When I look back at when I managed these systems, my pain was not the infrastructure (servers / network / storage)… even though the environment I was managing was based on what many refer to as legacy: EMC Clariion, NetApp FAS or HP EVA. The servers were never really a problem to manage either; sure, updating firmware was a pain, but it was not my biggest pain point. Provisioning virtual machines was never a huge deal either… My pain was caused by the application landscape many of my customers had.

At companies like Facebook and Google the ratio of applications to admins is different, as Howard points out. I would also argue that in many cases their applications are developed in-house and are designed around agility, availability and efficiency… Unfortunately, for most of you this is not the case. Most applications are provided by vendors that don't really seem to care about your requirements; they don't design for agility and availability. No, instead they do what is easiest for them. In the majority of cases these are legacy, monolithic (cr)applications with a simple database, all of which needs to be hosted on a single VM, and when you get an update that is where the real pain begins. At one of the companies I worked for, a single department used over 80 different applications to calculate mortgages for the different banks and offerings out there. Believe me when I say that that is not easy to manage, and that is where I would spend most of my time.

I do appreciate the whole DevOps movement and I do see the value in optimizing your operations to align with your business needs, but we also need to be realistic. Expecting your IT org to run as efficiently as Google/Facebook/Netflix is just not realistic and is not going to happen. Unless, of course, you invest deeply and develop the majority of your applications in-house, and do so using the same design principles these companies use. Even then I doubt you would reach the same efficiency, as most simply won't have the scale to reach it. This does not mean you should not aim to optimize your operations though! Everyone can benefit from optimizing operations, from re-aligning the IT department to the demands of today's world, from revising procedures… Everyone should go through this motion, constantly, but at the same time stay realistic. Set your expectations based on what lands on the infrastructure, as that is where a lot of the complexity comes from.

NetApp joins the EVO:RAIL party and includes a FAS

NetApp announced yesterday that they are now part of the EVO:RAIL partner program. Although I have been part of the EVO:RAIL team, this is not something I would have seen coming, but I can see why they decided to join and I find their announcement interesting. I wasn't planning on writing about it as Mike Laverick already did so yesterday, but as I received 8 emails overnight on this topic, I figured I would share what is going to be included in this package.

NetApp has created a rapid deployment mechanism for the NetApp FAS unit that will be integrated with the NetApp EVO:RAIL appliance. The FAS unit will connect into the same top of rack switch that the EVO:RAIL appliance will connect into. We have created a link and launch capability that NetApp can leverage from within the EVO:RAIL configuration engine to rapidly configure/integrate the FAS unit with the EVO:RAIL appliance.

Yes, this does mean that in NetApp's case the 2U hyper-converged appliance, which includes vSphere, VSAN and Log Insight, will now also include a FAS unit (a FAS 2500, judging by NetApp's website?). This is not the first time I have seen vendors adding hardware to the VMware EVO:RAIL offering, but in most other cases physical switches were included. I think this is a very interesting play though, and I am looking forward to seeing how these two products will be integrated. From a configuration perspective I can envision what this would look like, but from a management point of view it will be a bit more challenging and may take some more time. With cool features like Virtual Volumes coming out in the near future, this could be a nice way of providing a customer with multiple types of storage in a seamless way.

Slow backup of VM on VSAN Datastore

Someone at our internal field conference asked me why doing a full backup of a virtual machine on a VSAN datastore is slower than doing the same exercise for that virtual machine on a traditional storage array. Note that the test conducted here was done with a single virtual machine. The best way to explain this is by taking a look at the architecture of VSAN. First, let me mention that the full backup of the VM on the traditional array was done on a storage system that had many disks backing the datastore on which the virtual machine was located.

Virtual SAN, as hopefully all of you know, creates a shared datastore out of host-local resources. This datastore is formed out of disk and flash. Another thing to understand is that Virtual SAN is an object store. Each object is typically stored in a resilient fashion and as such on two hosts, hence the 3-host minimum (the third host holds a witness component). Now, by default an object is not striped, which means that its components are in most cases stored on a single physical spindle. As you can see in the diagram below, this means the disk (object) has two components and, without striping, is stored on 2 physical disks.
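
As a rough illustration of how an object breaks down into data components, here is a small sketch; it assumes mirroring with one failure to tolerate (two copies on two hosts, as described above) and ignores witness components, so it simplifies the real placement logic.

    def data_components(failures_to_tolerate=1, stripe_width=1):
        """Number of data-carrying components (and thus, roughly, spindles) per object."""
        mirror_copies = failures_to_tolerate + 1  # two copies when tolerating one failure
        return mirror_copies * stripe_width       # each copy is split across 'stripe_width' spindles

    print(data_components())                 # 2 -> the default: one component per mirror copy, 2 disks
    print(data_components(stripe_width=3))   # 6 -> striping spreads each copy over more spindles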

Now let's get back to the original question: why did the backup on VSAN take longer than on a traditional storage system? It is fairly simple to explain with the above info in mind. In the case of the traditional storage array you are reading from many disks (10+), but with VSAN you are only reading from 2 disks. As you can imagine, read performance / throughput will differ depending on the total number of disks being read from. In this test, as there is just a single virtual machine being backed up, the VSAN result will be different because it has fewer disks (resources) at its disposal, and on top of that, since the VM is new, there is no data cached and the flash layer is not used. Now, depending on your workload you can of course decide to stripe the components, but when it comes to backup you can also decide to increase the number of concurrent backups… If you increase the number of concurrent backups, the results will get closer, as more disks are being leveraged across all VMs. I hope that helps explain why results can be different, but hopefully everyone understands that when you test things like this, parallelism is important, or you should provide the right level of stripe width.
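
To put some illustrative numbers on that reasoning, here is a simple back-of-the-envelope model; the 100 MB/s per-spindle figure and the disk counts are assumptions for illustration only, and it assumes the components of concurrently backed-up VMs land on different disks.

    PER_DISK_MBPS = 100  # assumed sequential read throughput of a single spindle

    def backup_throughput(disks_per_vm, concurrent_vms=1, per_disk_mbps=PER_DISK_MBPS):
        """Aggregate read throughput when each VM's data sits on 'disks_per_vm' spindles."""
        return disks_per_vm * concurrent_vms * per_disk_mbps

    print(backup_throughput(10))                   # traditional array, 1 VM:  ~1000 MB/s
    print(backup_throughput(2))                    # VSAN default,      1 VM:   ~200 MB/s
    print(backup_throughput(2, concurrent_vms=5))  # VSAN, 5 concurrent VMs:  ~1000 MB/s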

What is coming for vSphere and VSAN? VMworld reveals…

I've been prepping a presentation for upcoming VMUGs, but wanted to also share this with my readers. The session is all about vSphere futures: what is coming soon? Before anyone says I am breaking NDA, I've harvested all of this info from public VMworld sessions, except for the VSAN details, which were announced to the press at VMworld EMEA. Let's start with Virtual SAN…

The Virtual SAN details were posted in this Computer Weekly article, and by the looks of it they interviewed VMware’s CEO Pat Gelsinger and Alberto Farronato from the VSAN product team. So what is coming soon?

  • All Flash Virtual SAN support
    Considering the price of MLC has dropped to roughly the same price per GB as SAS HDDs, I think this is a great new feature to have. Being able to build all-flash configurations at the price point of a regular configuration, and probably with many supported configurations, is a huge advantage of VSAN. I would expect VSAN to support various types of flash as the "capacity" layer, so this is an architect's dream… designing your own all-flash storage system!
  • Virsto integration
    I played with Virsto when it was first released and was impressed by the performance and the scalability. Functionality that was part of Virsto, such as snapshots and clones, has been built into VSAN, and it will bring VSAN to the next level!
  • JBOD support
    Something many have requested, primarily to be able to use VSAN in blade environments… Well, with the JBOD support announced this will become a lot easier. I don't know the exact details, but just the "JBOD" part got me excited.
  • 64 host VSAN cluster support
    VSAN doesn't scale? Here you go.

That is a nice list by itself, and I am sure there is plenty more for VSAN. At VMworld, for instance, Wade Holmes also spoke about support for disk-controller-based encryption. Cool right?! So what about vSphere? Considering even the version number was dropped during the keynote, which hints at a major release, you would expect some big functionality to be introduced. Once again, all the stuff below is harvested from various public VMworld sessions:

  • VMFork aka Project Fargo – discussed here…
  • Increased scale!
    • 64 host HA/DRS cluster, I know a handful of customers who asked for 64 host clusters, so here it is guys… or better said: soon you will have it!
  • SMP vCPU FT – up to 4 vCPU support
    • I like FT from an innovation point of view, but it isn’t a feature I would personally use too much as I feel “fault tolerance” from an app perspective needs to be solved by the app. Now, I do realize that there are MANY legacy applications out there, and if you have a scale-up application which needs to be highly available then SMP FT is very useful. Do note that with this release the architecture of FT has changed. For instance you used to share the same “VMDK” for both primary and secondary, but that is no longer the case.
  • vMotion across anything
    • vMotion across vCenter instances
    • vMotion across Distributed Switch
    • vMotion across very large distances, with support for up to 100ms latency
    • vMotion to vCloud Air datacenter
  • Introduction of Virtual Datacenter concept in vCenter
    • Enhances the "policy driven" experience within vCenter. A Virtual Datacenter aggregates compute clusters, storage clusters, networks, and policies!
  • Content Library
    • Content Library provides storage and versioning of files including VM templates, ISOs, and OVFs
    • Includes powerful publish and subscribe features to replicate content
    • Backed by vSphere Datastores or NFS
  • Web Client performance / enhancement
    • Recent tasks pane drops to the bottom instead of on the right
    • Performance vastly improved
    • Menus flattened
  • DRS placement “network aware”
    • Hosts with high network contention can show low CPU and memory usage; DRS will take this into account when looking for VM placements
    • Provide network bandwidth reservation for VMs and migrate VMs in response to reservation violations!
  • vSphere HA component protection
    • Helps when hitting “all paths down” situations by allowing HA to take action on impacted virtual machines
  • Virtual Volumes, bringing the VSAN “policy goodness” to traditional storage systems

Of course there is more, but these are the things that were discussed at VMworld… For the remainder you will have to wait until the next version of vSphere is released, or, I believe, you can still sign up for the beta!