Alert: vSphere 5.5 U1 and NFS issue!

Some had already reported on this on Twitter and in various blog posts, but I had to wait until I received the green light from our KB/GSS team. An issue has been discovered with vSphere 5.5 Update 1 that is related to loss of connectivity to NFS-based datastores. (NFS volumes include VSA datastores.)

This is a serious issue, as it results in an APD (All Paths Down) condition on the datastore, meaning that the virtual machines will not be able to do any I/O to the datastore for the duration of the APD. This by itself can result in BSODs for Windows guests and filesystems becoming read-only for Linux guests.

Witnessed log entries can include:

2014-04-01T14:35:08.074Z: [APDCorrelator] 9413898746us: [vob.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
2014-04-01T14:35:08.075Z: [APDCorrelator] 9414268686us: [esx.problem.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
2014-04-01T14:36:55.274Z: No correlator for vob.vmfs.nfs.server.disconnect
2014-04-01T14:36:55.274Z: [vmfsCorrelator] 9521467867us: [esx.problem.vmfs.nfs.server.disconnect] 192.168.1.1/NFS-DS1 12345678-abcdefg0-0000-000000000000 NFS-DS1
2014-04-01T14:37:28.081Z: [APDCorrelator] 9553899639us: [vob.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
2014-04-01T14:37:28.081Z: [APDCorrelator] 9554275221us: [esx.problem.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
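If you suspect you are hitting this, two quick checks from the ESXi shell can help confirm it. This is just a minimal sketch; the log location is where my 5.5 hosts keep it, and the thing to look at in the esxcli output is the “Accessible” column for the datastore:

  # Check whether the NFS datastores are still mounted and reachable
  # (the "Accessible" column should read true under normal conditions)
  esxcli storage nfs list
  # The APD events themselves are raised through vobd, so they are easy to find:
  grep -i apd /var/log/vobd.log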

If you are hitting these issues then VMware recommends reverting to vSphere 5.5. Please monitor the following KB closely for more details and hopefully a fix in the near future: http://kb.vmware.com/kb/2076392

 

Disk Controller features and Queue Depth?

I have been working on various VSAN configurations, and a question that always comes up is: what are the features and the queue depth of disk controller X? (Local disks, not FC based…) Note that this is not only useful to know when using VSAN, but also when you are planning on doing host-local caching with solutions like PernixData FVP or SanDisk FlashSoft, for instance. The controller used can impact performance, and a really low queue depth will result in lower performance; it is as simple as that.

I found myself digging through documentation and doing searches on the internet until I stumbled across the following website. I figured I would share the link with you, as it will help you (especially consultants) when you need to go through this exercise multiple times:

http://forums.servethehome.com/index.php?threads/lsi-raid-controller-and-hba-complete-listing-plus-oem-models.599/

Just as an example, the Dell H200 Integrated disk controller is on the VSAN HCL. According to the website above it is based on the LSI 2008 and provides the following feature set: 2×4 port internal SAS, no cache, no BBU, RAID 0, 1 and 10. According to the VSAN HCL it also provides “Virtual SAN Pass-Through”. I guess the only info missing is the queue depth of the controller. I have not been able to find a good source for this, so I figured I would make this post a source for that info.

Before we dive in to that, I want to show something which is also important to realize. Some controllers take SAS / NL-SAS as well as SATA drives. Although the price difference between SATA and NL-SAS is typically negligible, the queue depth difference is not. Erik Bussink was kind enough to provide me with these details of one of the controllers he is using as an example: first in the list is the “RAID” device, second is SATA and third is SAS. As you can see SAS is the clear winner here, and that includes NL-SAS drives.

  mpt2sas_raid_queue_depth: int
      Max RAID Device Queue Depth (default=128)
  mpt2sas_sata_queue_depth: int
      Max SATA Device Queue Depth (default=32)
  mpt2sas_sas_queue_depth: int
      Max SAS Device Queue Depth (default=254)
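If you want to verify what these defaults are on your own host, the driver module parameters can be listed from the ESXi shell. A quick example, assuming the mpt2sas driver is the one backing your controller:

  # List all parameters (including the queue depth defaults shown above)
  # for the mpt2sas driver module
  esxcli system module parameters list -m mpt2sas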

If you want to contribute, please take the following steps and report the vendor, controller type and AQLEN value in a comment.

  1. Run the esxtop command on the ESXi shell / SSH session
  2. Press d
  3. Press f and select Queue Stats (d)
  4. The value listed under AQLEN is the queue depth of the storage adapter
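If you are not sure which vmhba shown in esxtop maps to which controller, it helps to first list the adapters and the driver each one uses. Purely a helper step, nothing more:

  # Map vmhba names to their drivers / controllers before reading AQLEN in esxtop
  esxcli storage core adapter list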

The following table shows the vendor, controller and queue depth. Note that this is based on what we (my readers and I) have witnessed in our labs; results may vary depending on the firmware and driver used. Make sure to check the VSAN HCL for the supported driver / firmware version. Also note that not all controllers below are on the VSAN HCL; this is a “generic” list, as I want it to serve multiple use cases.

Generally speaking it is recommended to use a disk controller with a queue depth > 256 when used for VSAN or “host local caching” solutions.

Vendor           Disk Controller         Queue Depth
Adaptec          RAID 2405               504
Dell (R610)      SAS 6/iR                127
Dell             PERC 6/i                925
Dell             PERC H200 Integrated    600
Dell             PERC H310               25
Dell (M710HD)    PERC H200 Embedded      499
Dell (M910)      PERC H700 Modular       975
Dell             PERC H700 Integrated    975
Dell (M620)      PERC H710 Mini          975
Dell (T620)      PERC H710 Adapter       975
Dell (T620)      PERC H710p              975
Dell             PERC H810               975
HP               Smart Array P220i       1020
HP               Smart Array P400i       128
HP               Smart Array P410i       1020
HP               Smart Array P420i       1020
HP               Smart Array P700m       1200
IBM              ServeRAID-M5015         965
Intel            C602 AHCI (Patsburg)    31 (per port)
Intel            C602 SCU (Patsburg)     256
Intel            RMS25KB040              600
LSI              2004                    25
LSI              2008                    25
LSI              2108                    600
LSI              2208                    600
LSI              2308                    600
LSI              3008                    600
LSI              9300-8i                 600

VSAN for ROBO?

I noticed this new SuperMicro VSAN Ready Node that was published last week. The configuration is potentially a nice solution for ROBO (remote office / branch office) deployments, primarily due to the cost of the system.

When I did the math it came in at around $3,800. This is the configuration:

  • SuperMicro SuperServer 1018D-73MTF
  • 1 x Intel E3-1270 V3 3.5GHz- Quadcore
  • 32GB Memory
  • 5 x 1TB 7200 RPM NL-SAS HDD
  • 1 x 200GB Intel S3700 SSD
  • LSI 2308 Disk controller
  • 4 x 1GbE NIC port

It is a nice configuration that will allow for roughly fifteen 1 vCPU virtual machines with 3GB of memory and 60GB of disk capacity per host. Personally I would probably use a different CPU and some more memory, as that gives you a bit more headroom, especially during maintenance. The cost from a software point of view is socket based, so you can increase memory and change the type of CPU with relatively low cost impact. The SuperMicro server listed, however, is limited to the E3 CPU family and to 32GB, but there are alternatives out there. (For instance the Dell R320, or maybe even the R210, etc.)
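Just to show where those fifteen VMs per host land from a capacity perspective, here is a quick back-of-the-envelope calculation. This is my own sketch and it assumes the VSAN default of Failures To Tolerate = 1 (mirroring), while ignoring metadata and slack space overhead:

  # 3 hosts x 5 x 1TB NL-SAS; FTT=1 mirroring roughly halves the usable capacity
  hosts=3; disks_per_host=5; disk_gb=1000
  raw_gb=$((hosts * disks_per_host * disk_gb))   # 15000 GB raw
  usable_gb=$((raw_gb / 2))                      # ~7500 GB usable with FTT=1
  vms=$((hosts * 15)); vm_disk_gb=60
  needed_gb=$((vms * vm_disk_gb))                # 2700 GB for 45 x 60GB VMs
  echo "raw=${raw_gb}GB usable=~${usable_gb}GB needed=${needed_gb}GB"

In other words, disk capacity is not the constraint in this configuration; memory is far more likely to be, which is one more reason I would add some.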

From a software point of view the cost of this configuration is limited to 3 x VSAN licenses and 3 x vSphere licenses. As VSAN even works with Essentials Plus and Standard, you could leverage that to keep the cost down, but keep in mind that you won’t have DRS if you drop down to Standard or lower. It still sounds like a nice ROBO package to me; especially when you have many sites, this could be a great way to create a standardized, packaged solution.

ESXi DCUI Shutdown vs vCenter Shutdown of a host

Today on the community forums someone mentioned he had shut down his host and that he expected vSphere HA to restart his virtual machines. For whatever reason he got into a situation where all of his VMs were still running but he couldn’t do much with them anymore, and as such he wanted to kill the host so that HA could safely restart the virtual machines. However, when he shut down his host nothing happened; the VMs remained powered off. Why did this happen?

I had seen this before in the past, but it never really sunk in until I saw the questions from this customer. I figured I would test it just to see what happened and whether I could spot a difference in the vSphere HA logs. I powered on a VM on one of my hosts and moved off all other VMs. I then went to the DCUI of the host and issued a “shutdown” using F12. I tailed the FDM log on one of my hosts and spotted the following log message:

2014-04-04T11:41:54.882Z [688C2B70 info 'Invt' opID=SWI-24c018b] [VmStateChange::SavePowerChange] vm /vmfs/volumes/4ece24c4-3f1ca80e-9cd8-984be1047b14/New Virtual Machine/New Virtual Machine.vmx curPwrState=unknown curPowerOnCount=0 newPwrState=powered off clnPwrOff=true hostReporting=host-113

In the above scenario the virtual machine was not restarted, even though the host was shut down. I did the exact same exercise again, only this time I did the shutdown using the vCenter Web Client. After I witnessed the VM being restarted, I also noticed a difference in the FDM log:

2014-04-04T12:12:06.515Z [68040B70 info 'Invt' opID=SWI-1aad525b] [VmStateChange::SavePowerChange] vm /vmfs/volumes/4ece24c4-3f1ca80e-9cd8-984be1047b14/New Virtual Machine/New Virtual Machine.vmx curPwrState=unknown curPowerOnCount=0 newPwrState=powered on clnPwrOff=false hostReporting=host-113

The difference is the power-off state reported by vSphere HA. In the first scenario the virtual machine is marked as “clnPwrOff=true”, which basically tells vSphere HA that an administrator has powered off the virtual machine; this is what happened when the “shutdown” was initiated through the DCUI, and hence no restart took place. (It seems that ESXi initiates a clean shutdown of all running virtual machines.) In the second scenario vSphere HA reported that the VM was not cleanly powered off (“clnPwrOff=false”), and as such it restarted the virtual machine, as it assumed something bad had happened to it.
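If you want to reproduce this in your own lab, the entries above come straight from the FDM log. An easy way to watch the power-state changes while you run the test (the path below is where my 5.5 U1 hosts write it):

  # Watch vSphere HA (FDM) record the VM power-state changes in real time
  tail -f /var/log/fdm.log | grep SavePowerChange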

So what did we learn? If you, for whatever reason, want vSphere HA to restart the virtual machines currently running on a host that you want to shut down, make sure that you use the vCenter Web Client instead of the DCUI!

Disclaimer: my tests were conducted using vSphere 5.5 Update 1. I believe that at some point in the past “shutdown” via the DCUI would also allow HA to restart the VMs. I am now investigating why this has changed and when. When I find out I will update this post.

30K for a VSAN host @theregister? I can configure one for 2250 USD!

I’ve been following the posts from The Register on VSAN and was surprised when they posted the cost of the hosts they configured: 30K each. With three hosts as the minimum, they concluded that for 90K you could buy yourself a nice legacy storage system. I don’t disagree with that, to be honest… for 90K you can buy a nice legacy storage system. I guess you need to ask yourself first, though, what you would do with that 90K storage system by itself? Not much indeed, as you would need compute resources sitting next to it in order to do anything. So if you want to make a comparison, do not compare a full VSAN environment (or any other hyper-converged solution out there) to just a storage system, as it just doesn’t make sense.

Now that still doesn’t make these hosts cheap, I can hear you think, and again I agree with that. I have absolutely no clue where the 30K came from, and judging by the tweets this morning most people don’t know either and feel it probably was overkill. Call me crazy, but I can configure a fully supported VSAN configuration for about 2250 USD (hardware only) on the Dell website:

  • Dell T320
  • Intel Xeon E5-2420 1.90GHz 6 Core
  • Perc H310 Disk Controller
  • 32GB Memory
  • 1 x 7200RPM 1TB NL-SAS
  • 1 x 100GB Intel S3700 SSD (or Dell equivalent drive)
  • 5 x 1GbE NIC Port

I would like to conclude that VSAN would be a lot cheaper than those legacy solutions; less than 7500 USD for 3 hosts is peanuts, right?! Yes, I know, the above configuration wouldn’t fit many use cases (except for maybe a ROBO deployment where only a couple of VMs are needed), and that was the whole point of the exercise: showing how pointless these exercises can be. You can twist these numbers any way you like, and you can configure your VSAN hosts any way you like, as long as the components (HDD/SSD/controller) are on the VSAN HCL and the system is on the vSphere HCL.

PS: Dear Register, next time you run through the exercise, you may want to post the configuration you selected… it makes things a bit clearer.

VSAN – Misconfiguration Detected

Although Cormac Hogan already wrote about this, I figured I would repeat some of his work. It seems various folks are hitting an issue where an error is thrown while configuring VSAN: “Misconfiguration Detected”. The misconfiguration in this case refers to how the physical network has been configured: in order for VSAN to be configured successfully, your layer 2 VSAN network will need to be enabled for multicast traffic. (Below is a screenshot of the error, which I borrowed from Cormac… thanks Cormac!)

In order to configure VSAN successfully you can do one of two things. Now, let me be clear that I am not a networking expert, and personally I would always advise discussing with your networking team what the best option is. Here are your two options:

  • Enable IGMP Snooping for your VSAN network (VLAN) and define an IGMP Snooping Querier. The default setting on most Cisco switches is IGMP Snooping enabled but without an IGMP Snooping Querier; in that configuration VSAN will not be able to configure correctly!
  • Disable IGMP Snooping for your VSAN network (VLAN). Please note that you can typically disable IGMP Snooping both globally and per VLAN; if you want to disable it… disable it on your VLAN only!

Please consult your network vendor documentation on how to do this.
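Whichever option your networking team picks, it helps to be able to tell them exactly which VMkernel interface, multicast groups and ports VSAN uses on your hosts. You can pull that from the ESXi shell; the exact output fields may differ slightly per build:

  # Show the VMkernel interface(s) used for VSAN, including the
  # agent/master multicast group addresses and ports
  esxcli vsan network list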

VSAN – The spoken reality

Yesterday Maish and Christian had a nice little back and forth on their blogs about VSAN. Maish published a post titled “VSAN – The Unspoken Truth”, which basically argues that VSAN doesn’t fit blade environments, and that many enterprise environments adopted blades to get better physical density, meaning more physical servers per rack unit consumed. On top of that, there is the centralized management aspect of many of these blade solutions, which is a major consideration according to Maish.

Christian countered this with a great article titled “VSAN – The Unspoken Future”. I very much agree with Christian’s vision. Christian’s point basically is that when virtualization was introduced, IT started moving to blade infrastructures as that was a good fit for the environment they needed to build. Christian then explains how you can leverage, for instance, the SuperMicro Twin architecture to get a similar (high physical) density while using VSAN at the same time. (See my Twin posts here.) However, the essence of the article is: “it shows us that the Software-Defined Data Center (SDDC) is not just about the software, it’s about how we think, manage AND design our back-end infrastructure.”

There are three aspects here in my opinion:

  • Density – the old physical servers vs rack units discussion.
  • Cost – investment in new equipment and (potential) licensing impact.
  • Operations – how do you manage your environment, will this change?

First of all, I would like to kill the whole density discussion. Do we really care how many physical servers you can fit in a rack? Do we really care that you can fit 8 or maybe even 16 blades in 8U? Especially when you take into consideration that the storage system sitting next to it takes up another full rack. Then, on top of that, there is the impact density has in terms of power and cooling (hot spots). I mean, if I can run 500 VMs on those 8 or 16 blades and that 20U storage system, is that better or worse than 500 VMs on 12 x 1U rack-mount servers with VSAN? I guess the answer to that one is simple: it depends… It all boils down to the total cost of ownership and the return on investment. So let’s stop looking at a simple metric like physical density, as it doesn’t say much!

Before I forget… how often have we had those “eggs in a basket” discussions in the last two years? This was a huge debate five years back, in 2008/2009: did you really want to run 20 virtual machines on a single physical host? What if that host failed? Those discussions are not as prevalent any longer, for a good reason. Hardware improved, stability of the platforms increased, admins became more skilled and fewer mistakes are made… the chances of hitting failures simply declined. Kind of like the old Microsoft blue screen of death joke: people probably still make it today, but ask yourself, how often does it actually happen?

Of course there is the cost impact. As Christian indicated, you may need to invest in new equipment… As people mentioned on Twitter: so did we when we moved to a virtualized environment, and I would like to add: we all know what that brought us. Yes, there is a cost involved. The question is how you balance that cost. Does it make sense to use a blade system for VSAN when each blade can only hold a couple of disks at this point in time? It means you need a lot of hosts, and also a lot of VSAN licenses (plus maintenance costs). It may be smarter, from an economical point of view, to invest in new equipment. Especially when you factor in operations…

Operations, indeed… What does it take / cost today to manage your environment “end to end”? Do you need specialized storage experts to operate your environment? Do you need to hire storage consultants to add more capacity? What about when things go bad, can you troubleshoot the environment by yourself? And how about the compute layer: most blade environments offer centralized management for those 8 or 16 hosts, but can I reduce the number of physical hosts from 16 or 8 to, for instance, 5 hosts with a slightly larger form factor? What would the management overhead be, if there is any? Each of these things needs to be taken into consideration and somehow quantified in order to compare.

The reality is that VSAN (and all other hyper-converged solutions) brings something new to the table, just like virtualization did years ago. These (hyper-converged) solutions are changing the way the game is played, so you had better revise your playbook!