Heartbleed Security Bug fixes for VMware

It seems to be patch Saturday, as a whole bunch of product updates were released today. All of these updates relate to the Heartbleed security bug fix. There is no point in listing every single product, as I assume you all know the VMware download page by now, but I do want to link the most commonly used for your convenience:

Time to update, but before you do… if you are using NFS based storage make sure to read this first before jumping straight to vSphere 5.5 U1a!

Alert: vSphere 5.5 U1 and NFS issue!

Some had already reported on this on Twitter and in various blog posts, but I had to wait until I received the green light from our KB/GSS team. An issue has been discovered with vSphere 5.5 Update 1 that is related to loss of connection of NFS based datastores. (NFS volumes include VSA datastores.)

*** Patch released, read more about it here ***

This is a serious issue, as it results in an APD (All Paths Down) of the datastore, meaning that the virtual machines will not be able to do any IO to the datastore at the time of the APD. This by itself can result in BSODs for Windows guests and filesystems becoming read-only for Linux guests.

Witnessed log entries can include:

2014-04-01T14:35:08.074Z: [APDCorrelator] 9413898746us: [vob.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
2014-04-01T14:35:08.075Z: [APDCorrelator] 9414268686us: [esx.problem.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
2014-04-01T14:36:55.274Z: No correlator for vob.vmfs.nfs.server.disconnect
2014-04-01T14:36:55.274Z: [vmfsCorrelator] 9521467867us: [esx.problem.vmfs.nfs.server.disconnect] 192.168.1.1/NFS-DS1 12345678-abcdefg0-0000-000000000000 NFS-DS1
2014-04-01T14:37:28.081Z: [APDCorrelator] 9553899639us: [vob.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
2014-04-01T14:37:28.081Z: [APDCorrelator] 9554275221us: [esx.problem.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
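If you want to check your own logs for these entries, the following is a minimal Python sketch, assuming the log lines look exactly like the ones above. The function names (find_apd_events, apd_durations) are made up for illustration and are not part of any VMware tool.

```python
import re
from datetime import datetime

# Matches APDCorrelator entries in the format shown in the log excerpt above.
APD_EVENT = re.compile(
    r"^(?P<ts>\S+Z): \[APDCorrelator\] \d+us: "
    r"\[(?P<event>(?:vob|esx\.problem)\.storage\.apd\.(?:start|timeout))\] "
    r"Device or filesystem with identifier \[(?P<device>[^\]]+)\]"
)


def find_apd_events(lines):
    """Return (timestamp, event, device) tuples for APD start/timeout lines."""
    events = []
    for line in lines:
        match = APD_EVENT.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S.%fZ")
            events.append((ts, match.group("event"), match.group("device")))
    return events


def apd_durations(events):
    """Seconds between the first APD start and first APD timeout per device."""
    started, durations = {}, {}
    for ts, event, device in events:
        if event.endswith(".start"):
            started.setdefault(device, ts)
        elif device in started and device not in durations:
            durations[device] = (ts - started[device]).total_seconds()
    return durations
```

Feed it the lines of the log file you are inspecting, for example find_apd_events(open(path)), where path is wherever you copied the log to.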

If you are hitting these issues, then VMware recommends reverting back to vSphere 5.5. Please monitor the following KB closely for more details and hopefully a fix in the near future: http://kb.vmware.com/kb/2076392

 

Disk Controller features and Queue Depth?

I have been working on various VSAN configurations and a question that always comes up is: what are the disk controller features and queue depth for controller X? (Local disks, not FC based…) Note that this is not only useful to know when using VSAN, but also when you are planning on doing host local caching with solutions like PernixData FVP or SanDisk FlashSoft for instance. The controller used can impact performance, and a really low queue depth will result in lower performance. It is as simple as that.

** NOTE: This post is not about VSAN disk controllers, but rather about disk controllers and their queue depth. Always check the HCL before buying! **

I have found myself digging through documentation and doing searches on the internet until I stumbled across the following website. I figured I would share the link with you, as it will help you (especially consultants) when you need to go through this exercise multiple times:

http://forums.servethehome.com/index.php?threads/lsi-raid-controller-and-hba-complete-listing-plus-oem-models.599/

Just as an example, the Dell H200 Integrated disk controller is on the VSAN HCL. According to the website above it is based on the LSI 2008 and provides the following feature set: 2×4 port internal SAS, no cache, no BBU, RAID 0, 1 and 10. According to the VSAN HCL it also provides “Virtual SAN Pass-Through”. I guess the only info missing is the queue depth of the controller. I have not been able to find a good source for this, so I figured I would make this thread a source for that info.

Before we dive into that, I want to show something which is also important to realize. Some controllers take SAS / NL-SAS and SATA drives. Although typically the price difference between SATA and NL-SAS is negligible, the queue depth difference is not. Erik Bussink was kind enough to provide me with these details of one of the controllers he is using as an example. First in the list is the “RAID” device, second is SATA and third SAS. As you can see SAS is the clear winner here, and that includes NL-SAS drives.

  mpt2sas_raid_queue_depth: int
     Max RAID Device Queue Depth (default=128)
  mpt2sas_sata_queue_depth: int
     Max SATA Device Queue Depth (default=32)
  mpt2sas_sas_queue_depth: int
     Max SAS Device Queue Depth (default=254)
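To put a number on that difference, here is a small Python sketch that parses a listing in the format above and compares the defaults. The regular expression only assumes the "name: int" / "(default=N)" layout shown here; parse_defaults is a name I made up for illustration.

```python
import re

# The parameter listing from above, pasted in as-is.
LISTING = """\
mpt2sas_raid_queue_depth: int
  Max RAID Device Queue Depth (default=128)
mpt2sas_sata_queue_depth: int
  Max SATA Device Queue Depth (default=32)
mpt2sas_sas_queue_depth: int
  Max SAS Device Queue Depth (default=254)
"""

# One parameter = a "name: int" line followed by a description line
# that ends in "(default=N)".
PARAM = re.compile(
    r"^[ \t]*(?P<name>\w+):[ \t]*int[ \t]*\n[ \t]*.*\(default=(?P<default>\d+)\)",
    re.MULTILINE,
)


def parse_defaults(text):
    """Map each parameter name to its default queue depth."""
    return {m.group("name"): int(m.group("default")) for m in PARAM.finditer(text)}


defaults = parse_defaults(LISTING)
ratio = defaults["mpt2sas_sas_queue_depth"] / defaults["mpt2sas_sata_queue_depth"]
print(f"SAS default queue depth is {ratio:.1f}x the SATA default")
```

With the defaults listed above, that works out to a SAS device queue roughly eight times as deep as the SATA one.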

If you want to contribute, please take the following steps and report the Vendor, Controller type and AQLEN value in a comment.

  1. Run the esxtop command on the ESXi shell / SSH session
  2. Press d
  3. Press f and select Queue Stats (d)
  4. The value listed under AQLEN is the queue depth of the storage adapter

The following table shows the Vendor, Controller and Queue Depth. Note that this is based on what we (my readers and I) have witnessed in our labs, and results may vary depending on the firmware and driver used. Make sure to check the VSAN HCL for the supported driver / firmware version. Note that not all controllers below are on the VSAN HCL; this is a “generic” list as I want it to serve multiple use cases.

Generally speaking it is recommended to use a disk controller with a queue depth > 256 when used for VSAN or “host local caching” solutions.

Vendor          Disk Controller         Queue Depth
Adaptec         RAID 2405               504
Dell (R610)     SAS 6/iR                127
Dell            PERC 6/i                925
Dell            PERC H200 Integrated    600
Dell            PERC H310               25
Dell            PERC H330               256
Dell (M710HD)   PERC H200 Embedded      499
Dell (M910)     PERC H700 Modular       975
Dell            PERC H700 Integrated    975
Dell (M620)     PERC H710 Mini          975
Dell (T620)     PERC H710 Adapter       975
Dell (T620)     PERC H710p              975
Dell            PERC H810               975
HP              Smart Array B110i       1020
HP              Smart Array B120i       31
HP              Smart Array P220i       1020
HP              Smart Array P400i       128
HP              Smart Array P410i       1020
HP              Smart Array P420i       1011
HP              Smart Array P440ar      1020
HP              Smart Array P700m       1200
IBM             ServeRAID-M5015         965
IBM             ServeRAID-M5016         975
IBM             ServeRAID-M5110         975
Intel           C602 AHCI (Patsburg)    31 (per port)
Intel           C602 SCU (Patsburg)     256
Intel           RMS25KB040              600
LSI             2004                    25
LSI             2008                    25 / 600 (firmware dependent!)
LSI             2108                    600
LSI             2208                    600
LSI             2308                    600
LSI             3008                    600
LSI             9271-8i                 975
LSI             9300-8i                 600
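As a quick illustration of how to use the list, here is a short Python sketch that checks a handful of the entries above against the "queue depth > 256" recommendation. The rows are copied from the table; the names MIN_QUEUE_DEPTH and meets_recommendation are my own, and the check is a strict "greater than", so a controller at exactly 256 does not pass.

```python
# Recommendation from this post: queue depth strictly greater than 256.
MIN_QUEUE_DEPTH = 256

# A few (vendor, controller, queue depth) rows copied from the table above.
OBSERVED = [
    ("Dell", "PERC H310", 25),
    ("Dell", "PERC H330", 256),
    ("Dell (M620)", "PERC H710 Mini", 975),
    ("HP", "Smart Array B120i", 31),
    ("HP", "Smart Array P420i", 1011),
    ("LSI", "3008", 600),
]


def meets_recommendation(depth, minimum=MIN_QUEUE_DEPTH):
    """True when the adapter queue depth is above the recommended minimum."""
    return depth > minimum


below = [
    (vendor, controller)
    for vendor, controller, depth in OBSERVED
    if not meets_recommendation(depth)
]

for vendor, controller, depth in OBSERVED:
    verdict = "ok" if meets_recommendation(depth) else "below recommendation"
    print(f"{vendor} {controller}: {depth} ({verdict})")
```

Remember that these are lab observations and the value can change with firmware and driver, so treat the outcome as a first filter and not as a replacement for the HCL.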

Startup News Flash part 17

Number 17 already… A short one this time. I expect more news next week when we have “Storage Field Day”, hence I figured I would release this one already. Make sure to watch the live feed if you are interested in getting the details on new releases from companies like Diablo, SanDisk, PernixData etc.

Last week Tintri announced support for the Red Hat Enterprise Virtualization platform. Kind of surprising to see them selecting a specific Linux vendor to be honest, but then again it probably also is the more popular option for people who want full support etc. What is nice in my opinion is that Tintri offers the exact same “VM Aware” experience for both platforms. Although I don’t see too many customers using both VMware and RHEV in production, it is nice to have the option.

CloudVolumes, no not a storage company, announced support for View 6.0. CloudVolumes developed a solution which helps you manage applications. They provide a central management solution, and the option to distribute applications and eliminate the need for streaming / packaging. I have looked at it briefly and it is an interesting approach they take. I like how they solved the “layering” problem by isolating the app in its own disk container. It does make me wonder how this scales when you have dozens of apps per desktop. Nevertheless, an interesting approach worth looking into.

Win a Jackery Giant backup battery, by just leaving a comment

**** CLOSED, WINNER = David ****

I was one of the lucky guys who won a prize during the Top Bloggers award “ceremony”. Veeam was kind enough to provide two of the exact same items so that every blogger who won a prize could also give away a prize to their readers. I am not going to make it more difficult than it needs to be. Leave a comment before Friday the 18th of April, make sure to use your real email address in the form, and I will let my daughter pick a random winner on Saturday morning. I will update this blog post and inform the winner.

What can you win? (Funny, I was at the point of buying one of these myself as I always run out of battery on my phone and iPad during all-day events!)

Jackery Giant

– Large power capacity with 2.1A output
– The world’s most powerful external rechargeable battery
– 2.1A fast charging
– Size, style and speed make this the most powerful external rechargeable battery to date

This large-capacity portable external battery has dual output ports and 10,400mAh, lengthening mobile device battery life by up to 500% for smartphones. Its compact, stylish design includes three LED charge status indicators and a two-LED flashlight for up to 700 hours of illumination.