Per-volume management features in 4.x

Last week I noticed that one of the articles I wrote in 2008 is still very popular. That article explains the various possible combinations of the advanced settings “EnableResignature” and “DisallowSnapshotLUN”. For those who don’t know what these options do in a VI3 environment: they allow you to access a volume which is marked as “unresolved” because the VMFS metadata doesn’t match the physical properties of the LUN. In other words, the LUN that you are trying to access could be a snapshot of a LUN or a copy (think replication), and vSphere is denying you access.

These advanced options were often used in DR scenarios where a fail-over of a LUN needed to occur, or for instance when a virtual machine needed to be restored from a snapshot of a volume. Many of our users would simply change EnableResignature to 1 or DisallowSnapshotLUN to 0 and force the LUN to be available again. Those readers who paid attention noticed that I used “LUN” instead of “LUNs”, and here lies one of the problems…. These settings were global settings, which means that ANY LUN that was marked as “unresolved” would be resignatured or mounted. This unfortunately more often than not led to problems where incorrect volumes were mounted or resignatured. These volumes should probably not have been presented in the first place, but that is not the point. The point is that a global setting increases the chances that issues like these occur. With vSphere this problem was solved, as VMware introduced “esxcfg-volume -r”.

This command enables you to resignature on a per-volume basis…. Well, not only that: “esxcfg-volume -m” enables you to mount volumes and “-l” is used to list volumes. Of course you can also do this through the vSphere Client:

  • Click the Configuration tab and click Storage in the Hardware panel.
  • Click Add Storage.
  • Select the Disk/LUN storage type and click Next.
  • From the list of LUNs, select the LUN that has a datastore name displayed in the VMFS Label column and click Next.
    The name presented in the VMFS Label column indicates that the LUN is a copy of an existing VMFS datastore.
  • Under Mount Options, select Assign a New Signature and click Next.
  • In the Ready to Complete page, review the datastore configuration information and click Finish.
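If you have more than a couple of volumes to deal with, the output of “esxcfg-volume -l” can be scripted. The sketch below parses that output for the volume UUIDs and only echoes the per-volume resignature command instead of executing it, so you can review before acting. Note that the sample output here is an approximation for illustration, not verbatim output from a host:

```shell
# Sample output of "esxcfg-volume -l" (format approximated for illustration)
sample_output='VMFS3 UUID/label: 4b057ec3-6bd10428-b37c-005056ab552a/datastore1
Can mount: Yes
Can resignature: Yes
VMFS3 UUID/label: 4b057ec3-1a2b3c4d-e5f6-005056ab552a/datastore2
Can mount: Yes
Can resignature: Yes'

# Extract the UUID part (before the "/") of every "UUID/label" line
uuids=$(printf '%s\n' "$sample_output" | awk -F': ' '/UUID\/label/ {split($2, a, "/"); print a[1]}')

# Dry run: print the per-volume resignature command instead of running it
for uuid in $uuids; do
  echo "esxcfg-volume -r $uuid"
done
```

On a real host you would replace the sample text with the live command output and, once happy, drop the echo.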

But what if you do want to resignature all of these volumes at once? What if you have a corner-case scenario where this is a requirement? Well, in that case you can actually still use the advanced settings, as they haven’t exactly disappeared: they have been hidden in the UI (vSphere Client), but they are still around. From the command line you can still query the status:

esxcfg-advcfg -g /LVM/EnableResignature

Or you can change the global configuration option:

esxcfg-advcfg -s 1 /LVM/EnableResignature

Please note that hiding them was only the first step; they will be deprecated in a future release. It is recommended to use “esxcfg-volume” and resignature on a per-volume basis.
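If you do go the global route for such a corner case, a sensible pattern is to set the option, rescan, and immediately set it back so the global behavior does not linger. A rough sketch (the adapter name is just an example; use the one relevant to your environment):

```shell
# Check the current value first
esxcfg-advcfg -g /LVM/EnableResignature

# Temporarily enable global resignaturing
esxcfg-advcfg -s 1 /LVM/EnableResignature

# Rescan so the unresolved volumes are picked up and resignatured
# (replace vmhba1 with the relevant adapter in your environment)
esxcfg-rescan vmhba1

# Reset the option to its default as soon as you are done
esxcfg-advcfg -s 0 /LVM/EnableResignature
```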

Mythbusters: ESX/ESXi caching I/O?

We had a discussion internally about ESX/ESXi caching I/Os. In particular this discussion was around the caching of writes, as a customer was concerned about the consistency of their data. I fully understand the concern, and I know that in the past some vendors did write caching, but VMware does not do this, for obvious reasons: although performance is important, it is worthless when your data is corrupt or inconsistent. Of course I looked around for data to back this claim up and bust this myth once and for all. I found a KB article that acknowledges this, and a quote from one of our VMFS engineers.

Source: Satyam Vaghani (VMware Engineering)
ESX(i) does not cache guest OS writes. This gives a VM the same crash consistency as a physical machine: i.e. a write that was issued by the guest OS and acknowledged as successful by the hypervisor is guaranteed to be on disk at the time of acknowledgement. In other words, there is no write cache on ESX to talk about, and so disabling it is moot. So that’s one thing out of our way.

Source: Knowledge Base
VMware ESX acknowledges a write or read to a guest operating system only after that write or read is acknowledged by the hardware controller to ESX. Applications running inside virtual machines on ESX are afforded the same crash consistency guarantees as applications running on physical machines or physical disk controllers.
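To see what this guarantee means in practice from inside a guest: an application that needs data on stable storage writes it and then flushes the guest OS page cache. Since ESX adds no write cache of its own, once the flush returns, the data has gone all the way down the stack. A minimal sketch (the file path is just an example):

```shell
# Write a piece of application data to a file inside the guest
printf 'critical record' > /tmp/app.dat

# Force the guest OS to flush its own page cache to the (virtual) disk.
# The guest OS cache is the only write cache in play here; the
# hypervisor does not add one of its own.
sync

# The record is now crash consistent: a crash after this point
# will not lose it
cat /tmp/app.dat
```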

Virtual Machine Storage and Snapshots Survey

It seems to be survey month…. Nevertheless, this survey will only take you a couple of minutes and is about Virtual Machine Storage and Snapshots. Most of our PMs are currently revising, updating, and prioritizing the roadmap, and real customer data and opinions are always welcome to help define it. We would appreciate it if you could take 5 minutes of your time to complete this one; it is only 12 questions.

Virtual Machine Storage and Snapshots Survey

ALUA and the useANO setting

Disclaimer: Let’s make this very clear: don’t touch “useANO” unless you are specifically instructed to do so. This article is just for educational purposes.

I had some issues in my lab with an ALUA array. (If you have no clue what ALUA is, read this post.) As you hopefully know, with an ALUA array you typically have 4 paths. Two of these paths are marked within vCenter as “Active (I/O)” and the remaining two are marked as “Active”. The command-line interface describes this slightly better in my opinion, as it says “Active” and “Active unoptimized”. So let’s assume for a second that you are using Round Robin: vSphere is smart enough to only use the paths marked in vCenter as “Active (I/O)”.

During the discussions around these issues the setting “useANO” came up. I thought I knew what it did, but during the discussion I started to doubt myself. So I did a quick search on the internet and noticed that not a lot of people actually know what it stands for and what it does. I’ve seen people stating that paths are disabled or hidden… That is not the case at all. It’s not a magic setting. Let’s start with explaining what it stands for.

useANO = Use Active-Non-Optimized

So in other words, “useANO” allows you to enable the usage of Active-Non-Optimized paths. By default this is of course set to 0, as in a normal situation you wouldn’t want to use a non-optimized path: traffic on such a path needs to be forwarded internally to the owning processor before it reaches the LUN. Chad Sakac made a nice diagram that depicts this scenario in his article on ALUA (must read!). Note that “SP B” is the processor that “owns” the LUN, and the right path would typically be the path marked as “Active (I/O)”. The left path would be the “Active” path, or less elegantly put, the Non-Optimized path.

As you can understand, having traffic flow through a non-optimized path is normally something you would want to avoid, as this will cause latency to go up (more hops). This is of course a scenario that could happen when the path to the “owning” processor (SP B in the diagram) is unavailable for whatever reason…. You could also force this to happen by setting “useANO=1”. That is what it does: it allows you to use non-optimized paths. For those who skipped it, please read the disclaimer at the top!
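For reference, on vSphere 4.x the setting is part of the Round Robin path selection policy configuration and can be inspected and changed per device with esxcli. The device identifier below is a placeholder; verify the exact option syntax against your build before using it, and keep the disclaimer above in mind:

```shell
# Show the current Round Robin settings (including useANO) for a device
# (naa.60060160a1b2c3d4 is a placeholder device identifier)
esxcli nmp roundrobin getconfig --device naa.60060160a1b2c3d4

# Allow Round Robin to schedule I/O on Active-Non-Optimized paths as well
esxcli nmp roundrobin setconfig --device naa.60060160a1b2c3d4 --useANO 1
```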

Enable TRIM on OS X 10.6.7

Thanks to Jason Nash’s article I managed to finally get TRIM enabled for the SSD in my Mac. The procedure works great; however, when you have a dual-SSD setup like I have (booting from a 120GB Intel and running my VMs on a 256GB Kingston) it doesn’t fully work, as replacing the identifier leaves you with one SSD without TRIM support. I googled around for a bit and found this article by Oskar Groth. Oskar made a nice GUI tool that enables TRIM on all the SSDs your Mac contains! Be aware that this is a hack and there is no support whatsoever for this. So why would you do it? Well, look at the tests that Jason did; I also did some testing, and the preliminary findings are that it vastly improves your performance!