On April 19th I wrote about an issue with vSphere 5.1 where NFS-based datastores were going into an All Paths Down (APD) state. People internally at VMware have worked very hard to root-cause the issue and fix it. The log entries witnessed are:
YYYY-04-01T14:35:08.075Z: [APDCorrelator] 9414268686us: [esx.problem.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
YYYY-04-01T14:36:55.274Z: No correlator for vob.vmfs.nfs.server.disconnect
YYYY-04-01T14:36:55.274Z: [vmfsCorrelator] 9521467867us: [esx.problem.vmfs.nfs.server.disconnect] 192.168.1.1/NFS-DS1 12345678-abcdefg0-0000-000000000000 NFS-DS1
YYYY-04-01T14:37:28.081Z: [APDCorrelator] 9553899639us: [vob.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
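If you want to check whether a host shows the same pattern, the entries above can be picked out of a log with a few lines of Python. This is only a minimal sketch: the function name `find_apd_events` and the regular expression are my own, derived from the format of the entries shown above. It assumes you have already copied the relevant log (typically the vobd log) off the host. Other builds may format the lines differently.

```python
import re

# Event identifiers taken from the log entries shown above.
APD_EVENTS = {
    "esx.problem.storage.apd.start",
    "esx.problem.vmfs.nfs.server.disconnect",
    "vob.storage.apd.timeout",
}

# Assumed line layout: "<timestamp>: [<correlator>] <n>us: [<event id>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+): \[(?P<correlator>\w+)\] (?P<uptime_us>\d+)us: "
    r"\[(?P<event>[\w.]+)\] (?P<message>.*)$"
)


def find_apd_events(lines):
    """Return (timestamp, event id, message) for each APD-related entry."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        # Lines without a bracketed event id (e.g. "No correlator for ...")
        # do not match the pattern and are skipped.
        if match and match.group("event") in APD_EVENTS:
            hits.append(
                (match.group("ts"), match.group("event"), match.group("message"))
            )
    return hits
```

To use it, pass any iterable of log lines, for example `find_apd_events(open(path))` where `path` points at your local copy of the log. An APD start followed by an NFS server disconnect and an APD timeout for the same identifier is the sequence described here.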
More details on the fix can be found here: http://kb.vmware.com/kb/2077360