On the VMware Community Forums someone reported that they were having issues unmounting datastores when vSphere HA was enabled. Internally I contacted various folks to see what was going on. The error this customer was hitting was the following:
The vSphere HA agent on host '<hostname>' failed to quiesce file activity on datastore '/vmfs/volumes/<volume id>'
After some emails back and forth with Support and Engineering (awesome to work with such a team, by the way!) the issue was discovered: it turns out that in two separate instances, issues were resolved that had to do with unmounting datastores. Keith Farkas explained on the forums how you can figure out whether you are hitting those exact problems and in which release they are fixed, but as those kinds of threads are difficult to find, I figured I would post it here for future reference:
You can determine whether you are encountering this issue by searching the VC log files. Find the task corresponding to the unmount request, and see if the following error message is logged during the task's execution (fixed in 5.1 U1a):
2012-09-28T11:24:08.707Z [7F7728EC5700 error 'DAS'] [VpxdDas::SetDatastoreDisabledForHACallback] Failed to disable datastore /vmfs/volumes/505dc9ea-2f199983-764a-001b7858bddc on host [vim.HostSystem:host-30,10.112.28.11]: N3Csi5Fault16NotAuthenticated9ExceptionE(csi.fault.NotAuthenticated)
While we are on the subject, I'll also mention that there is another known issue in VC 5.0 that was fixed in VC 5.0 U1 (the fix is in VC 5.1 too). This issue relates to unmounting a force-mounted VMFS datastore. You can determine whether you are hitting this error by again checking the VC log files. If you see an error message such as the following with VC 5.0, then you may be hitting this problem. A workaround, as above, is to disable HA while you unmount the datastore.
2011-11-29T07:20:17.108-08:00 [04528 info 'Default' opID=19B77743-00000A40] [VpxLRO] -- ERROR task-396 -- host-384 -- vim.host.StorageSystem.unmountForceMountedVmfsVolume: vim.fault.PlatformConfigFault:
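To save some digging, the two log searches above can be sketched as a small shell helper. This is a minimal sketch, not an official tool: the function name `check_vpxd_log` is mine, and the default vpxd log location varies by vCenter version and install, so point it at wherever your VC logs actually live.

```shell
#!/bin/sh
# Hypothetical helper: scan a vCenter vpxd log for the two error
# signatures discussed above. Usage: check_vpxd_log /path/to/vpxd.log
check_vpxd_log() {
  log="$1"

  # Signature of the first issue (HA fails to quiesce file activity;
  # fixed in VC 5.1 U1a)
  grep -q "csi.fault.NotAuthenticated" "$log" && \
    echo "HA quiesce issue signature found (fixed in VC 5.1 U1a)"

  # Signature of the second issue (unmounting a force-mounted VMFS
  # datastore; fixed in VC 5.0 U1)
  grep -q "unmountForceMountedVmfsVolume.*PlatformConfigFault" "$log" && \
    echo "force-mounted VMFS unmount signature found (fixed in VC 5.0 U1)"

  return 0
}
```

If either line is printed, you are likely hitting the corresponding bug, and disabling HA before the unmount (or upgrading to the fixed release) is the way out.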
Ken Werneburg says
Very interesting, I wonder what the root cause is… I personally haven’t seen this in SRM, which of course does a lot of unmounts, so maybe it is world specific?
I guess I am encountering the first issue in one of my vSphere environments since updating to 5.1.
Normally the issue resolves itself when I try to delete the datastore a few minutes later; I have not yet looked into the issue in detail.
Excellent nugget as always. We have encountered this error in one of our environments as well. Usually a second attempt results in a successful unmount. Since we have only unmounted VMFS 3 datastores (in preparation to rebuild larger standardized LUNs formatted as VMFS 5), I’m not sure how this works when unmounting VMFS 5 LUNs. Also, knowing where to look in the logs for future reference is a bonus! Thanks,
Jason B says
I’ve seen this during SRM and NetApp VSC Restore cleanup. VC 5.1.0b and ESXi 5.1 EP3. Will see if I can get time to upgrade to VC 5.1 U1a and report back.
carlos quintas says
@ jason B
Hi Jason, please report back after U1. I’m having the same problem with SRM on VMAX.