Alert: vSphere 5.5 U1 and NFS issue!

Some had already reported on this on Twitter and in various blog posts, but I had to wait until I received the green light from our KB/GSS team. An issue has been discovered with vSphere 5.5 Update 1 that is related to loss of connectivity to NFS-based datastores. (NFS volumes include VSA datastores.)

This is a serious issue, as it results in an All Paths Down (APD) condition on the datastore, meaning that the virtual machines will not be able to do any I/O to the datastore for the duration of the APD. This by itself can result in BSODs for Windows guests and file systems becoming read-only for Linux guests.

Witnessed log entries can include:

2014-04-01T14:35:08.074Z: [APDCorrelator] 9413898746us: [vob.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
2014-04-01T14:35:08.075Z: [APDCorrelator] 9414268686us: [esx.problem.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
2014-04-01T14:36:55.274Z: No correlator for vob.vmfs.nfs.server.disconnect
2014-04-01T14:36:55.274Z: [vmfsCorrelator] 9521467867us: [esx.problem.vmfs.nfs.server.disconnect] 192.168.1.1/NFS-DS1 12345678-abcdefg0-0000-000000000000 NFS-DS1
2014-04-01T14:37:28.081Z: [APDCorrelator] 9553899639us: [vob.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
2014-04-01T14:37:28.081Z: [APDCorrelator] 9554275221us: [esx.problem.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
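
If you want to quickly check whether a host has hit this condition, a minimal Python sketch like the one below can scan a log file for the event identifiers shown above. The log path is an assumption on my part (on ESXi 5.x these events are typically logged by vobd); adjust it for your environment.

# Minimal sketch: scan a log file for the APD/NFS event identifiers
# shown above. The path is an assumption; adjust for your environment.
LOG_FILE = "/var/log/vobd.log"

APD_EVENTS = (
    "vob.storage.apd.start",           # datastore entered APD
    "vob.storage.apd.timeout",         # APD timeout hit, I/O fast-failed
    "vob.vmfs.nfs.server.disconnect",  # NFS server connection lost
)

with open(LOG_FILE) as log:
    for line in log:
        if any(event in line for event in APD_EVENTS):
            print(line.rstrip())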

If you are hitting these issues, then VMware recommends reverting to vSphere 5.5. Please monitor the following KB closely for more details and hopefully a fix in the near future: http://kb.vmware.com/kb/2076392

Startup News Flash part 16

Number 16 of the Startup News Flash, here we go:

Nakivo just announced the beta program for version 4.0 of their backup/replication solution. It adds some new features, such as recovery of Exchange objects directly from compressed and deduplicated VM backups, Exchange log truncation, and automated backup verification. If you are interested in testing it, make sure to sign up here. I haven’t tried it myself, but they seem to be a strong up-and-coming player in the backup and DR space for the SMB market.

SanDisk announced a new range of SATA SSDs called “CloudSpeed”. They released four different models with various endurance levels and workload targets, ranging in size from 100GB up to 960GB depending on the endurance level selected. Endurance levels range from 1 up to 10 full drive writes per day. (Just as an FYI, for VSAN we recommend 5 full drive writes per day as a minimum.) Performance numbers range between 15K and 20K write IOps and between 75K and 88K read IOps. More details can be found in the spec sheet here. What interests me most is the included FlashGuard Technology; it is interesting how SanDisk is capable of understanding wear patterns and workloads to a certain extent and placing data in a specific way to prolong the life of your flash device.

CloudPhysics announced the availability of their Storage Analytics card. I gave it a try last week and was impressed. I was planning on doing a write-up on their new offering, but as various bloggers have already covered it, I felt there was no point in repeating what they said. I think it makes a lot more sense to just try it out; I am sure you will like it, as it will show you valuable info like “performance” and the impact of “thin disks” vs “thick disks”. Sign up here for a 30-day free trial!

Startup News Flash part 14

Part 14 of the Startup News Flash. Just a short one, considering I am in Vietnam and have been away ‘from work’ for the last two weeks.

A3Cube is a startup which came out of stealth recently and announced what they call a ‘brain inspired’ data plane encapsulated in a NIC, designed to bring supercomputing benefits to the enterprise. The core of their solution is called RONNIEE Express. Their aim is to eliminate the I/O performance gap between CPU power and data access performance for HPC, Big Data, and data center applications. A3Cube’s In-Memory Network technology allows direct shared non-coherent global memory across the entire network, enabling global communication based on shared memory segments and direct load/store operations between the nodes. Basically, it is a server interconnect solution for large scale deployments. They took the word “scale” seriously, by the way, and can go up to 64,000 nodes. For more details, I highly recommend reading this excellent article by Enrico.

Infinio just announced Infinio Accelerator 1.2; this new version adds support for vSphere 5.5. Useful to know for those who have a home lab: Infinio is running a limited-time offer of free non-expiring licenses for test labs. Hit their website to find out more.

Startup News Flash part 13

Edition 13 of the Startup News Flash already. This week is VMware Partner Exchange 2014, so I expected some announcements to be made. There were a couple of announcements over the last week(s) which I felt were worth highlighting. One is not really a startup, but I figured it should at least be included in the article: ScaleIO and SuperMicro / LSI / Mellanox / VMware showed an appliance at PEX that was optimized for View deployments. I found it an interesting move and an appealing solution. Chris Mellor wrote an article about it here for The Register.

DataGravity announced their Partner Early Access Program this week. They haven’t revealed what they are building, but judging by the quotes in the announcement they are aiming to bring a simple, cost-effective solution for analyzing unstructured data. Definitely interesting, and something I will look into more closely at some point.

Atlantis ILIO USX was announced this week; I already mentioned it in my VSAN update. Atlantis ILIO USX is an in-memory storage solution. They added the ability to pool and optimize any class of storage, including SAN, NAS, RAM, or any type of DAS (SSD, Flash, SAS, SATA), to create a hybrid solution. A change of direction for Atlantis, as their primary focus so far was caching, but it makes a lot of sense to me, especially as they already have many of the data services from their caching platform.

PernixData announced the beta program for FVP 1.5. They added support for vSphere 5.5 and the vSphere Web Client, and this version also allows you to use a VMkernel interface other than the vMotion interface, which their product uses by default. If you want to know more, Chris Wahl wrote a nice article on his experience with FVP 1.5.

Tintri announced it has closed a $75 million Series E funding round led by Insight Venture Partners, with participation from existing investors Lightspeed Venture, Menlo Ventures and NEA. Good to see Tintri getting another boost; it will be interesting to see how they move forward. I have been following them from the very start and have always been impressed with the simplicity of the solution they have built.

Virtual SAN (related) PEX Updates

I am at VMware Partner Exchange this week and figured I would share some of the Virtual SAN related updates.

  • On the 6th of March there is an online Virtual SAN event with Pat Gelsinger, Ben Fathi and John Gilmartin… Make sure to register for it!
  • Ben Fathi (VMware CTO) stated that VSAN will be GA in Q1; more news in the upcoming weeks.
  • The maximum cluster size has been increased from 8 hosts (beta) to 16 according to Ben Fathi; the VMware VSAN engineering team is ahead of schedule!
  • VSAN has linear scalability: close to a million IOPS with 16 hosts in a cluster (100% read, 4K blocks), and mixed IOPS close to half a million. All of this with less than 10% CPU/memory overhead. That is impressive if you ask me. Yeah yeah, I know, numbers like these are just a part of the overall story… still, it is nice to see that these kinds of performance numbers can be achieved with VSAN.
  • I noticed a tweet by Chetan Venkatesh, and it looks like Atlantis ILIO USX (the in-memory storage solution) has been tested on top of VSAN; they were capable of hitting 120K IOPS using 3 hosts, WOW. There is a white paper on this topic to be found here; an interesting read.
  • It was also reiterated that customers who sign up and download the beta will get a 20% discount on their first purchase of 10 VSAN licenses or more!
  • Several hardware vendors announced support for VSAN; a nice short summary by Alberto can be found here.

Startup News Flash part 12

The first edition of the Startup News Flash of 2014. I expect this year to be full of announcements: new rounds of funding, new products, new features, and new companies. There are various startups planning to come out of stealth this year, all playing in the storage / flash space, so make sure to follow this series!

On Tuesday the 14th of January, Nutanix announced a new round of funding. The Series D financing is co-led by Riverwood Capital and SAP Ventures and totals $101 million. The company has now raised a total of $172.2 million in four rounds of funding and has been valued at close to $1 billion. Yes, that is huge. Probably one of the most successful startups of the last couple of years. Congrats to everyone involved!

Tintri announced a rather aggressive program. The Register reported on it here, and it is all about replacing NetApp systems with Tintri systems. In short: “The “Virtualize More with 50% Less” Program offers 50% storage capacity and rack space savings versus currently installed NetApp FAS storage to support deployed virtualization workloads”. I guess it is clear what kind of customers they are going after and who their primary competition is. Of course, there is a list of requirements and constraints, which The Register has already outlined nicely. If you are looking to replace your current NetApp storage infrastructure, I guess this could be a nice offer, or a nice way to get a bigger discount. Either way, you win.

SSD and PCIe flash devices are king these days, but SanDisk is looking to change that with the announcement of the availability of the ULLtraDIMM. The ULLtraDIMM is a combination of Diablo’s DDR3 translation protocol and SanDisk’s flash and controllers on a DIMM. Indeed, it doesn’t get closer to your CPU than straight on your memory bus. By the looks of it, IBM is one of the first vendors to offer it, as they recently announced the eXFlash DIMM as an option for their System x3850 and x3950 X6 servers, providing up to 12.8TB of flash capacity. Early benchmarks showed write latency of around 5-10 microseconds! I bet half the blogosphere just raised their hands to give this a go in their labs!

How to calculate what your Virtual SAN datastore size should be

I have had this question so many times that I figured I would write an article about it: how do you calculate what your Virtual SAN datastore size should be? Ultimately this determines which kind of server hardware you can use, which disk controller you need, and which disks… So it is important that you get it right. I know the VMware Technical Marketing team is developing collateral around this topic; when that has been published I will add a link here. Let’s start with a quote by Christian Dickmann, one of our engineers, as it is the foundation of this article:

In Virtual SAN your whole cluster acts as a hot-spare

Personally I like to work top-down, meaning that I start with an average for virtual machines or a total combined number. Let’s go through the exercise with an example, as that makes it a bit easier to digest.

Let’s assume the average VM disk size is 50GB, that on average the VMs have 4GB of memory provisioned, and that we have 100 virtual machines in total that we want to run on a 4-host cluster. Based on that info, the formula would look something like this:

(total number of VMs * average VM size) + (total number of VMs * average VM memory size) = total capacity required

In our case that would be:

(100 * 50GB) + (100 * 4GB) = 5400 GB

So is that it? Well, not really; like every storage / file system there is some overhead, and we will need to take the “failures to tolerate” setting into account. If I set “failures to tolerate” to 1, then I will have 2 copies of my VMs, which means I need 5400 GB * 2 = 10800 GB. Personally I also add an additional 10% in disk capacity to ensure we have room for things like metadata, log files, vmx files, and some small snapshots when required. Note that VSAN by default provisions all VMDKs as thin objects (swap files are thick, as Cormac explained here), so there should be room available regardless; better safe than sorry though. This means that 10800 GB actually becomes 11880 GB, which I prefer to round up to 12TB. The formula I have been using thus looks as follows:

(((Number of VMs * Avg VM size) + (Number of VMs * Avg mem size)) * (FTT + 1)) + 10%
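
To make the math easy to reproduce, here is a minimal Python sketch of the formula above (the function name and defaults are mine, not an official tool); the comments walk through the example numbers:

def vsan_datastore_gb(num_vms, avg_disk_gb, avg_mem_gb, ftt=1, slack=0.10):
    # Disk capacity plus (thick) swap space for every VM.
    base = num_vms * (avg_disk_gb + avg_mem_gb)   # 100 * (50 + 4) = 5400 GB
    # FTT=1 means 2 copies of every object.
    protected = base * (ftt + 1)                  # 5400 * 2 = 10800 GB
    # ~10% extra for metadata, logs, vmx files and small snapshots.
    return protected * (1 + slack)                # 10800 * 1.1 ≈ 11880 GB

print(vsan_datastore_gb(100, 50, 4))  # ~11880 GB, round up to 12TB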

Now the next step is to see how you divide that across your hosts. I mentioned we would have 4 hosts in our cluster. We have two options: we create a cluster that can re-protect itself after a full host failure, or we create a cluster that cannot. Just to clarify: in order to have 1 host of spare capacity available, we will need to divide the total capacity by 3 instead of 4. Let’s look at those two options and what the impact is (a small sketch of this calculation follows the list):

  • 12TB / 3 hosts = 4TB per host (for each of the 4 hosts)
    • Allows you to re-protect (sync/mirror) all virtual machine objects even when you lose a full host
    • All virtual machines will maintain availability levels when doing maintenance
    • Requires an additional 1TB per host!
  • 12TB / 4 hosts = 3TB per host (for each of the 4 hosts)
    • If all disk space is consumed, when a host fails virtual machines cannot be “re-protected” as there would be no capacity to sync/mirror the objects again
    • When entering maintenance mode data availability cannot be maintained as there would be no room to sync/mirror the objects to another disk
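
And here is the small sketch of the per-host split (again, the naming is mine): sizing with one host of spare capacity simply means dividing by N-1 hosts instead of N.

def per_host_tb(total_tb, hosts, hot_spare=True):
    # With hot_spare=True we size as if one host is already lost, so the
    # cluster can re-protect all objects after a full host failure.
    return total_tb / ((hosts - 1) if hot_spare else hosts)

print(per_host_tb(12, 4, hot_spare=True))   # 4.0 TB per host
print(per_host_tb(12, 4, hot_spare=False))  # 3.0 TB per host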

Now if you look at the numbers, we are talking about an additional 1TB per host. With 4 hosts, and assuming we are using 2.5″ SAS 900GB Hitachi drives, that would be 4 additional drives at a cost of around $1,000 per drive. When using 3.5″ SATA drives the cost would be even lower. Although this is just a number I found on the internet, it does illustrate that the cost of providing additional availability can be small. Prices will differ depending on the server brand used, but even at double the cost I would go for the additional drive and as such the additional “hot spare capacity”.

To make life a bit easier I created a calculator. I hope this helps everyone who is looking at configuring hosts for their Virtual SAN based infrastructure.