Startup intro: ZeroStack

A couple of months back one of the people I used to work with a lot in the DRS team reached out to me. He told me that he had started a company with some other people I knew, and we spoke about the state of the industry and some of the challenges customers face. Fast forward to today: ZeroStack just came out of stealth and announced to the world what they are building, along with an A round of funding of roughly $5.6m.

At the head of the company as CEO we have Ajay Gulati, a former VMware employee best known for Storage IO Control, Storage DRS and DRS. Kiran Bondapalati is the CTO, and some may recognize that name as he was a lead architect at Bromium. The DNA of the company is a mix of VMware, Nutanix, Bromium, Cisco, Google and more. Not a bad list I must say.

So what are they selling? ZeroStack has developed a private cloud solution which is delivered in two parts:

  1. Physical 2U/4Node Appliance which comes with KVM preinstalled named ZS1000
  2. Management / Monitoring solution which is delivered in a SaaS model.

ZeroStack showed me a demo, and getting their appliance up and running took about 15 minutes. The configuration wizard wasn’t unlike EVO:RAIL and looked very easy to run through. The magic however, if you ask me, isn’t in their configuration section but in the SaaS-based management solution. I stole a diagram from their website which immediately shows the potential.


The SaaS management layer provides you a single pane of glass of all the deployed appliances. These can be in a single site or in multiple sites. You can imagine that especially for ROBO deployments this is very useful, but also in larger environments. Now it doesn’t just show you the physical aspect, it also shows you all the logical constructs that have been created like “projects”.

At this point in the demo, by the way, I was reminded of vCloud Director a bunch of times, and AWS for that matter. ZeroStack allows you to create “tenants” and designate resources to them in the form of projects. These can even have lease times, which is similar to what vCloud Director offers.

When looking at the networking aspects of ZeroStack’s solution, it also has the familiar constructs like private networks and public networks. On top of that, networking services like routing and firewalling are implemented in a distributed fashion. And before I forget, everything you see in the UI can also be automated through the APIs, which are fully OpenStack compatible.

Last but not least we had a discussion about patching and updating. With most systems this is usually the most complicated part. ZeroStack took a very customer-friendly approach. The SaaS layer is updated by them, and this can happen as frequently as once every ten days. The team said they are very receptive to feedback and have a short turnaround time for implementing new functionality, as their goal is to provide most functionality through the SaaS layer. The appliance will be on a different patch/update scheme, probably once every 3 or 6 months, depending of course on the problems fixed and features introduced. The updates are done in a rolling fashion and are non-disruptive to your workloads, as expected.

That sounds pretty cool, right? Well, as always with a 1.0 version there is still some functionality missing. One example is a “high availability” feature for your workloads: if a host fails, then you as an admin will need to restart those VMs yourself. Also, when it comes to load balancing, there is no “DRS-alike” functionality today. Considering the background of the team though, I can imagine both of those showing up at some point in the near future. It does however mean that for some workloads the 1.0 version may not be the right solution for now. Nevertheless, test/dev and things like cloud native apps could land on it.

All in all, a nice set of announcements and some cool functionality coming. These guys are going to be at VMworld so make sure to stop by their booth if you want to see what they are working on.

Virtual SAN Ready Nodes taking charge!

Yes that is right, Virtual SAN Ready Nodes are taking charge! As of today, when you visit the VMware Compatibility Guide for Virtual SAN it will all revolve around Virtual SAN Ready Nodes instead of individual components. You may ask yourself why that is. Basically, we want to make it easier for you to purchase the hardware needed while removing the complexity of selecting components. This means that if you are a Dell customer and want to run Virtual SAN, you can simply select Dell in the VMware Compatibility Guide and then look at the different models available in the different sizes. It is very easy, as can be seen in the screenshot below.

[Screenshot: Virtual SAN Ready Nodes]

Traditionally there were 3 different sizes for “Server Virtualization”, but with the full overhaul of the VSAN VCG a new size was added. The naming of the sizes has also changed. Let me explain what it looks like now. Note that these “sizing profiles” are the same across all vendors, so comparing HP to Dell or IBM (etc) has never been easier!

New Name   Old Name
HY-2       Hybrid Server Low
HY-4       ** new **
HY-6       Hybrid Server Medium
HY-8       Hybrid Server High
HY-8       Hybrid VDI Linked Clones
           Hybrid VDI Full Clones
AF-6       All Flash Server Medium
AF-8       All Flash Server High
           AF VDI Linked Clones
           AF VDI Full Clones

The new model introduced is the HY-4 Series. It was introduced because some customers felt that the price difference between HY-2 and HY-6 was too big. By introducing a model in between we now cover all price ranges. Note that it is still possible to make changes to the configuration when selecting a model. If you want model HY-2 with an additional 2 disks, or with 128GB of memory instead of 32GB, then you can simply request this.

So what are we talking about in terms of capacity etc? Of course this is all documented and listed on the VCG as well, but let me share it with you here for your convenience. Note that performance and VM numbers may be different for your scenario, as they depend on your workload and the size of your VMs.

Model   CPU / Mem             Storage Cap   Storage Perf   VMs per node
HY-2    1 x 6 core / 32GB     2TB           4000 IOPS      Up to 20
HY-4    2 x 8 core / 128GB    4TB           10K IOPS       Up to 30
HY-6    2 x 10 core / 256GB   8TB           20K IOPS       Up to 50
HY-8    2 x 12 core / 348GB   12TB          40K IOPS       Up to 100
AF-6    2 x 12 core / 256GB   8TB           50K IOPS       Up to 60
AF-8    2 x 12 core / 348GB   12TB          80K IOPS       Up to 120

In my opinion, this new “Ready Node” driven approach to the VMware Compatibility Guide is definitely 10 times easier than focusing on individual components. You pick the ready node that comes close to what you are looking for, provide your OEM with the SKU listed and tell them about any modifications needed in terms of CPU/Mem or Disk Capacity. PS: If you want to access the “old school HCL” then just click on the “Build Your Own based on Certified Components” link on the VCG page.
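To make the "pick the ready node that comes close" step concrete, here is a minimal sketch in Python that walks the sizing table above and returns the smallest profile meeting a per-node requirement. The numbers are copied from the table as listed; the class name, the function name `smallest_fit` and the idea of filtering on VMs, capacity and IOPS together are my own assumptions for illustration, not something the VCG itself offers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReadyNodeProfile:
    name: str
    cores: int          # total cores per node (sockets x cores per socket)
    memory_gb: int
    capacity_tb: int
    iops: int
    max_vms: int        # "up to" number of VMs per node


# Values copied from the sizing table; ordered small to large per family.
PROFILES = [
    ReadyNodeProfile("HY-2", 6, 32, 2, 4_000, 20),
    ReadyNodeProfile("HY-4", 16, 128, 4, 10_000, 30),
    ReadyNodeProfile("HY-6", 20, 256, 8, 20_000, 50),
    ReadyNodeProfile("HY-8", 24, 348, 12, 40_000, 100),
    ReadyNodeProfile("AF-6", 24, 256, 8, 50_000, 60),
    ReadyNodeProfile("AF-8", 24, 348, 12, 80_000, 120),
]


def smallest_fit(vms_per_node: int, capacity_tb: float, iops: int,
                 all_flash: bool = False) -> Optional[ReadyNodeProfile]:
    """Return the first (smallest) profile that meets all three requirements."""
    family = "AF" if all_flash else "HY"
    for profile in PROFILES:
        if (profile.name.startswith(family)
                and profile.max_vms >= vms_per_node
                and profile.capacity_tb >= capacity_tb
                and profile.iops >= iops):
            return profile
    return None  # nothing fits; scale out or ask the OEM for a modified config


if __name__ == "__main__":
    choice = smallest_fit(vms_per_node=25, capacity_tb=3, iops=8_000)
    print(choice.name if choice else "no matching profile")
```

Keep in mind that the VM and IOPS numbers are guidance only, so treat the result as a starting point for the conversation with your OEM rather than a final answer.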

Taking on some additional responsibilities… lead evangelist

Some of you may have noticed that this week I posted my first article on the VMware Virtual Blocks blog. In the future you will see me posting there more frequently. Of course I will also keep posting here, but some of the more VSAN evangelism related stories will probably go to the Virtual Blocks blog first. So what has changed?

Chuck Hollis just published a blog post announcing that he has decided to pursue a new career opportunity outside of VMware. Chuck has been the lead evangelist for the Storage & Availability BU over the last couple of years, and if you ask me has been instrumental in putting Virtual SAN on the map. I want to wish Chuck all the best. Thanks for everything you have done for the company and Virtual SAN in particular.

When a door closes a window opens, or so they say, and in this situation it is a window of opportunity. I’m honoured and humbled that I’ve been asked to take on the responsibility as lead evangelist for the Storage and Availability BU, effective immediately, on top of my current responsibilities as a Chief Technologist in the CTO Office.

I am looking forward to spending even more time talking publicly about all of our efforts in the storage and availability space. Expect to hear from me soon!

Tintri announces all-flash storage device and Tintri OS 4.0

Last week I had the pleasure of catching up with Tintri. It has been a while since I spoke with them, but I have been following them from the very start. I met up with them in Mountain View a couple of times when it was just a couple of guys on a rather empty floor with a solution that sounded really promising. Tintri’s big thing is simplicity, if you ask me: super simple to set up, really easy to manage, and providing VM-granular controls for just about everything you can imagine. The solution comes in the form of a hybrid storage device (disks and flash) which is served up to the hypervisor as an NFS mount.

Today Tintri announces that they will be offering an all-flash system next to their hybrid systems. When talking to Kieran he made it clear that the all-flash system would probably be only for a subset of their customers. The key reason for this is that the hybrid solution already brings great performance, and of course at a much lower cost. The new all-flash model is named VMstore T5000 and comes in two variants: T5060 and T5080. The T5060 can hold up to 2500 VMs and around 36TB with dedupe and compression. For the T5080 that is 5000 VMs and around 73TB. Both are delivered in a 2U form factor, by the way. The expected use cases for the all-flash systems are large persistent desktops and multi-TB high performance databases. The key thing here is of course not just the number of IOPS it can drive, but the consistent low latency it can deliver.

Besides the hardware, there is also a software refresh. Tintri OS 4.0 and Global Center 2.1 are being announced. Tintri OS 4.0 is what runs on the VMstore storage systems and Global Center is their central management solution. With the 2.1 release Global Center now supports up to 100,000 VMs. It allows you to centrally manage both Tintri’s hybrid and all-flash systems from one UI and does smart things like informing you when a VM is provisioned to the wrong storage system (on hybrid while performance-wise it requires all-flash, for instance). It does not just inform you, it also has the ability to migrate the VM from storage system to storage system. Note that during the migration all aspects that were associated with the VM (QoS, replication etc) are kept. (Not unlike Storage DRS, but in this case the solution is aware of all that happens on the storage system.) What I personally liked about Global Center is the performance and health views. It is very easy to see what the state of your environment is, where latency is coming from etc. Also, if you need to configure things like QoS, replication or snapshotting for multiple VMs, you can do this from the Global Center console by simply grouping them, as shown in the screenshot below.

Tintri QoS was demoed during the call, and I found this particularly interesting as it allows you to define QoS on a VM (or VMDK) granular level. When you do things like specifying an IOPS limit, it is good to know that Tintri normalizes the IOPS based on the size of the IO. Simply said, every IO of 8KB or smaller counts as 1 normalized IOPS, an IO of 16KB counts as 2 normalized IOPS, and so on. This ensures fairness in environments where IO sizes vary greatly (which will be almost every environment). Those who have ever tried to profile their workloads will know why this is important. What I’ve always liked about Tintri is their monitoring: how they split latency up into hypervisor, network and storage, for instance, is very useful. They have done an excellent job again for QoS management.
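The IO size normalization described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Tintri’s implementation: the 8KB unit and the 8KB/16KB examples come from the briefing, while rounding sizes in between up to the next whole unit (so a 12KB IO counts as 2) is my assumption, and the function names are made up.

```python
import math

# 8KB is the normalization unit mentioned in the briefing.
NORMALIZATION_UNIT_BYTES = 8 * 1024


def normalized_iops(io_size_bytes: int) -> int:
    """Normalized cost of a single IO.

    Anything of 8KB or smaller costs 1, a 16KB IO costs 2, and so on.
    Rounding up for in-between sizes is an assumption.
    """
    if io_size_bytes <= 0:
        raise ValueError("IO size must be positive")
    return max(1, math.ceil(io_size_bytes / NORMALIZATION_UNIT_BYTES))


def normalized_rate(io_sizes_in_one_second) -> int:
    """Total normalized IOPS for all IOs issued within one second."""
    return sum(normalized_iops(size) for size in io_sizes_in_one_second)


if __name__ == "__main__":
    small_vm = [4 * 1024] * 1000   # 1000 IOs of 4KB in one second
    large_vm = [64 * 1024] * 1000  # 1000 IOs of 64KB in one second
    print(normalized_rate(small_vm))  # 1000
    print(normalized_rate(large_vm))  # 8000
```

Both VMs issue the same number of raw IOs per second, but against a normalized limit the VM doing 64KB IOs consumes eight times as much, which is exactly the fairness argument.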

Last but not least Tintri introduces Tintri VMstack. Basically their converged offering where Compute + Storage + Hypervisor is bundled and delivered as a single stack to customers. It will provide you the choice of storage platform (well needs to be Tintri of course), hypervisor, compute and network infrastructure. It can also include things like OpenStack or the vRealize Suite. Personally I think this is a smart move, but this is something I would have preferred to have seen launched 12-18 months ago. Nevertheless, it is a good move.

Using VM-Host rules without DRS enabled

This week I was playing with the VM-Host rules in my environment. In this particular environment I had DRS disabled and I noticed some strange things when I created the VM-Host rules. I figured it should all work in a normal way, as I was always told that VM-Host rules can be configured without DRS being enabled. And from a “configuration” perspective that is correct. However there is a big caveat here, so let’s look at the two options you have when creating a rule, namely “should” and “must”.

When using a VM-Host “must” rule while DRS is disabled, it all works as expected. When you have the rule defined you cannot place the VM on a host which is not within the VM-Host group. You cannot power it on on those hosts, you cannot vMotion it there, and HA will not place the VM there either after a failure. Everything as expected.

In the case of a VM-Host “should” rule while DRS is disabled, this is different! When you have a should rule defined and DRS is disabled, vCenter will allow you to power on a VM on a host which is not part of the rule. HA will also restart VMs on hosts which are not part of the rule, and you can migrate a VM to one of those hosts. All of this happens without a warning that the host is not in the rule and that you are violating the rule. Even after explicitly defining an alarm I don’t see anything triggered. The alarm, by the way, is called “VM is violating a DRS VM-Host affinity rule”.

I reached out to the HA/DRS engineering team and asked them why that is. It appears the logic for the “should” rule, contrary to the “must” rule, is handled by DRS. This includes the alerting. It makes sense to a certain extent, but it wasn’t what I expected. So be warned: if you don’t have DRS enabled, VM-Host “should” rules will not work. Must rules however will work perfectly fine. (Yes, I’ve asked them to look into this and fix it so that it behaves as you would expect and comes with a warning when you try anything that violates a should rule.)
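To summarize the behavior described above, here is a small Python model of what happens when a VM lands on a host (power on, vMotion or HA restart). It is purely a sketch of the observed outcomes and not how vCenter implements it; all names are made up, and the DRS-enabled “should” case (allowed, with the violation flagged) is my assumption based on the explanation that DRS handles the should-rule logic and alerting.

```python
from enum import Enum


class RuleType(Enum):
    MUST = "must"
    SHOULD = "should"


class Outcome(Enum):
    ALLOWED = "allowed"
    BLOCKED = "blocked"
    ALLOWED_WITH_ALARM = "allowed, violation flagged"


def placement_outcome(rule: RuleType, drs_enabled: bool,
                      host_in_group: bool) -> Outcome:
    """Outcome of placing a VM on a host, given its VM-Host rule."""
    if host_in_group:
        # The host is part of the VM-Host group, so the rule is satisfied.
        return Outcome.ALLOWED
    if rule is RuleType.MUST:
        # Must rules are enforced regardless of whether DRS is enabled.
        return Outcome.BLOCKED
    if drs_enabled:
        # Assumption: with DRS enabled the violation is at least flagged.
        return Outcome.ALLOWED_WITH_ALARM
    # Should rule, DRS disabled: placement succeeds silently, no alarm.
    return Outcome.ALLOWED


if __name__ == "__main__":
    for rule in RuleType:
        for drs in (True, False):
            result = placement_outcome(rule, drs, host_in_group=False)
            print(f"{rule.value:6} rule, DRS enabled={drs}: {result.value}")
```

The last branch is the caveat of this article: a should rule without DRS is effectively a no-op.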