Should I use many small LUNs or a couple large LUNs for Storage DRS?

At several VMUGs where I presented, a question that always came up was the following: “Should I use many small LUNs or a couple of large LUNs for Storage DRS? What are the benefits of either?”

I posted about VMFS-5 LUN sizing a while ago and I suggest reading that first if you haven’t yet, just to get an idea of the considerations involved when sizing datastores. I guess that article already more or less answers the question… I personally prefer many “small LUNs” over a couple of large LUNs, but let me explain why. As an example, let’s say you need 128TB of storage in total. What are your options?

You could create 2x 64TB LUNs, 4x 32TB LUNs, 16x 8TB LUNs or 32x 4TB LUNs. What would be easiest? Well, I guess 2x 64TB LUNs would be easiest, right? You only need to request 2 LUNs, and adding them to a datastore cluster is easy. The same goes for the 4x 32TB LUNs… but with 16x 8TB and 32x 4TB the amount of effort increases.
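To make the arithmetic concrete, here is a tiny Python sketch (purely illustrative, using just the numbers from this example) that lists the carving options:

    # Purely illustrative: carve the 128TB example into equally sized LUNs.
    TOTAL_TB = 128

    for lun_count in (2, 4, 16, 32):
        lun_size_tb = TOTAL_TB // lun_count
        print(f"{lun_count:>2} LUNs of {lun_size_tb:>2} TB each")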

However, that is just a one-time effort. You format them with VMFS, add them to the datastore cluster and you are done. Yes, it seems like a lot of work, but in reality it might take you 20-30 minutes to do this for 32 LUNs. Now if you take a step back and think about it for a second… why did I want to use Storage DRS in the first place?

Storage DRS (and Storage IO Control for that matter) is all about minimizing risk. In storage, two big risks are hitting an “out of space” scenario or extremely degraded performance. Those happen to be the two pain points that Storage DRS targets. In order to prevent these problems from occurring, Storage DRS will try to balance the environment, when a certain threshold is reached that is. You can imagine that things will be “easier” for Storage DRS when it has multiple options to balance. When you have one option (2 datastores – source datastore) you won’t get very far. However, when you have 31 options (32 datastores – source datastore) the chances increase of finding the right fit for your virtual machine or virtual disk while minimizing the impact on your environment.
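As a back-of-the-envelope illustration of that point (this is not how Storage DRS itself calculates anything), the sketch below simply counts the candidate destination datastores per layout:

    # Purely illustrative: the more datastores in the cluster, the more
    # possible migration targets Storage DRS has (everything but the source).
    for datastores in (2, 4, 16, 32):
        candidates = datastores - 1
        print(f"{datastores:>2} datastores -> {candidates:>2} possible destinations")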

I already dropped the name Storage IO Control (SIOC); this is another feature to take into account. Storage IO Control is all about managing your queues, and you don’t want to do that yourself. Believe me, it is complex, and no one likes queues, right? (If you have Enterprise Plus, enable SIOC!) The reality is, though, that there are many queues in between the application and the spindles your data sits on. The question is: would you prefer to have 2 device queues with many workloads potentially queuing up, or would you prefer to have 32 device queues? Look at the impact that this could have.
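To put some rough numbers to the queueing point: assume, hypothetically, a per-device queue depth of 32 (the real value depends on your HBA driver, array and host settings). The aggregate number of I/Os that can be outstanding against the array then scales with the number of devices, as this sketch shows:

    # Purely illustrative: aggregate device-queue headroom per layout.
    # The per-LUN queue depth of 32 is a hypothetical example value.
    QUEUE_DEPTH_PER_LUN = 32

    for lun_count in (2, 4, 16, 32):
        total_slots = lun_count * QUEUE_DEPTH_PER_LUN
        print(f"{lun_count:>2} LUNs x QD {QUEUE_DEPTH_PER_LUN} = {total_slots:>4} outstanding I/Os")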

Please don’t get me wrong… I am not advocating going really small and creating many small LUNs. Neither am I saying you should create a couple of really large LUNs. Try to find the sweet spot for your environment by taking failure domain (backup/restore time), IOps, queues (SIOC) and load balancing options for Storage DRS into account.


    Comments

    1. says

      I always have such a different take on features (one of the great things about VMware is that folks can do different things with it, it’s flexible).

      I always envisioned Storage DRS as being about different arrays, different controllers or different costs of storage – I guess the same thing can be said for different LUNs, but I felt like Storage DRS allowed the ability to have different storage targets (and yes, LUNs are different), by which I meant an iSCSI / SATA pool, an iSCSI / SAS pool, a 4/8 Gb FC / SAS pool, a 10 GbE NFS / SAS pool and an SSD pool or two.

      And then based on business use, cost of the storage and IO requirements, the VM workloads could move around and the business could try to avoid spending money on VMs that don’t need expensive storage, or try to fit the storage to the purpose of the VM.

      • says

        I am not sure I am following you.

        Storage DRS is not designed to handle various tiers in a single datastore cluster. In other words, if you place various tiers (SATA / FC / SSD) in a single datastore cluster it is difficult to control what ends up where.

    2. says

      Hi Duncan, good discussion. I think the question that also needs to be asked is what pool of physical disks the LUNs are ultimately backed by, and whether this is part of the same performance aggregate or pool. There is far less benefit in using Storage IO Control to load balance IO across LUNs ultimately backed by the same physical disks than in load balancing across separate physical storage pools. And when you may be thin provisioning the LUNs on the same disk pool behind the scenes as well, even initial placement means less when it is all the same storage.

      • says

        I agree that in a scenario like this that there is more to be discussed. But I am not sure I am following you with regards to SIOC and multiple spindles.

        Even if you have 1 large pool of disks shared by 10 LUNs, SIOC will find the workload causing the problems, when it is a virtual workload on a SIOC-enabled volume, and throttle it. SIOC is not about load balancing, it is about throttling and fairness of scheduling.

        If none of the virtual workloads is causing this latency then SIOC will simply back off. If one of your virtual workloads is causing the latency then SIOC will throttle it and ensure that at least your virtual workloads get the fair share they deserve.

        • says

          Hey Duncan, yes you are right (of course), and SIOC works with a queue-based mechanism regardless of the underlying physical layout. Glad I’m learning more today! Although it is best to turn it on for all pools and associated datastores backed by the same disks, it measures latency and will apply the throttling to the VMs even if datastores not under SIOC are backed by the same disks.

          Thanks for the discussion!

    3. Jonathan Meier says

      I agree with Duncan that it is all about finding the sweet spot. Julian’s comment about the storage backend is extremely valid. I find the additional overhead of having more LUNs to be cost-beneficial to the storage backend.

    4. Angelo says

      Thanks Duncan, however how does SIOC come into play with devices like a VNX with FAST? My understanding is you shouldn’t enable SIOC in this case and SDRS should be set to manual, which kinda defeats the purpose.

        • Angelo says

          Sorry Duncan, I was referring to the VMware vSphere Storage DRS™ Interoperability doc, pg. 6, Array-Based Auto-Tiering: “VMware recommends configuring Storage DRS in manual mode with I/O metric disabled.”

          • Duncan says

            Which doesn’t mean SIOC doesn’t work, just that moving VMs based on I/O metrics might be counterproductive if FAST also moves blocks.

            • Angelo says

              Understood, but if we are trying to get the best performance, then SIOC should be disabled in this type of scenario? If not, then there has got to be a better guideline. The reason I’m asking is that both the VMware and EMC VNX docs just say disable it and set it to manual, which to me is not the way I want to go.

            • Duncan says

              Not sure which doc you are referring to, but I have never seen the recommendation to disable SIOC. I have seen the recommendation to disable I/O load balancing for Storage DRS – these are two different things!

    5. Patrick says

      SIOC works well until you use an HP LeftHand array, which can’t separate VMware LUNs from raw iSCSI LUNs.

    6. Drew Henning says

      Duncan, thanks for sharing your thoughts on datastore sizing. I’ve been thinking about this lately in my environment.

      Just curious if you have the same feelings for NAS vs block?

      I agree with taking into account the pool of disks the LUN/volume resides on. But it seems to me there are fewer queues for vSphere to account for with NFS. (Maybe I’m wrong.)

      Also, with NetApp, the datastore/volume is the de-dupe boundary. Datastore sizing can have an effect on de-dupe rates and on data inflating/deflating as you move between datastores/volumes.