Yellow Bricks

by Duncan Epping

vSphere 5.5 U1 patch released for NFS APD problem!

Duncan Epping · Jun 11, 2014 ·

On April 19th I wrote about an issue with vSphere 5.5 U1 where NFS based datastores were going into an All Paths Down (APD) state. People internally at VMware have worked very hard to root cause the issue and fix it. The log entries witnessed are:

YYYY-04-01T14:35:08.075Z: [APDCorrelator] 9414268686us: [esx.problem.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
YYYY-04-01T14:36:55.274Z: No correlator for vob.vmfs.nfs.server.disconnect
YYYY-04-01T14:36:55.274Z: [vmfsCorrelator] 9521467867us: [esx.problem.vmfs.nfs.server.disconnect] 192.168.1.1/NFS-DS1 12345678-abcdefg0-0000-000000000000 NFS-DS1
YYYY-04-01T14:37:28.081Z: [APDCorrelator] 9553899639us: [vob.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed. 
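
If you want to check your own logs for this pattern, a minimal sketch along the following lines may help. It is only an illustration: the function names are hypothetical, the regular expression is based purely on the four sample lines above, and it compares only the time-of-day portion of each timestamp (the year is masked in the sample), so it does not handle an APD event that spans midnight or distinguish between multiple datastores.

```python
import re

# Matches the time-of-day and the bracketed event ID in lines shaped like the
# samples above. The date is ignored because the sample masks the year as "YYYY".
LINE_RE = re.compile(
    r"T(\d{2}):(\d{2}):(\d{2}\.\d+)Z: \[\w+\] \d+us: \[([\w.]+)\]"
)


def parse_event(line):
    """Return (seconds_since_midnight, event_id), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    hours, minutes, seconds, event_id = m.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds), event_id


def apd_durations(lines):
    """Pair each apd.start with the next apd.timeout and return the elapsed seconds."""
    durations = []
    start = None
    for line in lines:
        event = parse_event(line)
        if event is None:
            continue  # e.g. the "No correlator for ..." line
        t, event_id = event
        if event_id.endswith("storage.apd.start"):
            start = t
        elif event_id.endswith("storage.apd.timeout") and start is not None:
            durations.append(t - start)
            start = None
    return durations


# The sample lines from this post, with the message text shortened.
sample = [
    "YYYY-04-01T14:35:08.075Z: [APDCorrelator] 9414268686us: [esx.problem.storage.apd.start] Device or filesystem ...",
    "YYYY-04-01T14:36:55.274Z: No correlator for vob.vmfs.nfs.server.disconnect",
    "YYYY-04-01T14:36:55.274Z: [vmfsCorrelator] 9521467867us: [esx.problem.vmfs.nfs.server.disconnect] 192.168.1.1/NFS-DS1 ...",
    "YYYY-04-01T14:37:28.081Z: [APDCorrelator] 9553899639us: [vob.storage.apd.timeout] Device or filesystem ...",
]

for seconds in apd_durations(sample):
    print(f"APD start to APD timeout: {seconds:.1f} seconds")
```

For the sample above this reports roughly 140 seconds between the APD start and the APD timeout, which lines up with the duration stated in the last log entry.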

More details on the fix can be found here: http://kb.vmware.com/kb/2077360


Comments

  1. Jeff says

    11 June, 2014 at 09:12

    Awesome work from vmware engineers as usual! Anxious to deploy this over the weekend.

  2. Admin says

    11 June, 2014 at 12:56

    Thanks for releasing the patch for an important bug after almost two months. Come on VMware, you can do better.

    Keeping enterprise customers in the dark about the release details will not make them happy.

    • Duncan Epping says

      11 June, 2014 at 14:38

      I recommend that you provide this feedback directly to your VMware pre-sales or sales contact. That way the people responsible will hear directly from customers how things like these are experienced. Thanks,

  3. Anthony Spiteri (@anthonyspiteri) says

    11 June, 2014 at 15:38

    Transparency on the root cause would be a favourable outcome given the time it took to resolve and the general hush hush nature of the problem.

    It otherwise causes unnecessary speculation.

    Glad to have the fix though…just in time for a platform upgrade.

  4. Andy says

    20 June, 2014 at 07:45

    I’ve patched our hosts and am still getting the APDCorrelator errors, with NFS datastores going offline for a minute or two in the vSphere Client.
    We are not getting BSODs or issues with Linux file systems, just performance issues to the point where any applications that have to connect to databases are crashing. I’ve got a ticket open with VMware and am waiting to hear something.

    • Will says

      1 July, 2014 at 16:32

      That is worrying! We are planning to upgrade and we use NFS for almost everything. Am I better off staying on ESXi 5.5 GA, I wonder!?

    • Jim says

      17 July, 2014 at 20:31

      My company also applied the patch and had APD occur again. We are using NetApp and found that NetApp still recommends setting NFS.MaxQueueDepth to 64.

      KB ID: 1014696 Version: 5.0 Published date: 07/11/2014
      https://kb.netapp.com/support/index?page=content&id=1014696&actp=LIST_RECENT
      VMware has published KB 2016122: NFS connectivity issues on NetApp NFS filers on ESXi 5.x and KB 2077360: VMware ESXi 5.5, Patch ESXi550-201406401-SG: Updates esx-base

      VMware’s claim is that there is a version of Data ONTAP that ‘resolves’ this issue.
      The Data ONTAP upgrade referenced will ONLY prevent the TCP window size from dropping to 0; it will NOT resolve all APD issues.
      Additionally, enabling SIOC will only ‘resolve’ the issue ‘after’ it begins occurring; it will not ‘prevent’ the issue from occurring.

      The only recommended way to resolve this is to limit NFS.MaxQueueDepth to 64.

About the author

Duncan Epping is a Chief Technologist in the Office of CTO of the Cloud Platform BU at VMware. He is a VCDX (# 007), the author of the "vSAN Deep Dive", the “vSphere Clustering Technical Deep Dive” series, and the host of the "Unexplored Territory" podcast.
