Queuedepth, and what’s next?

I’ve seen a lot of people picking up on the queue depth settings lately, especially when QLogic adapters are involved. Although setting the queue depth to 64 can be really beneficial, it’s totally useless if you forget about the “Disk.SchedNumReqOutstanding” setting. This setting always has to be aligned with the queue depth: if Disk.SchedNumReqOutstanding is given a lower value than the queue depth, only that many outstanding commands are issued from the ESX kernel to the LUN from all virtual machines. In other words, if you set a queue depth of 64 and a Disk.SchedNumReqOutstanding of 16, only 16 commands get issued to the LUN at a time instead of the 64 your queue depth allows.

You can set Disk.SchedNumReqOutstanding via the command line and via VirtualCenter:

  1. VirtualCenter -> Configuration Tab -> Advanced Settings -> Disk -> Disk.SchedNumReqOutstanding
  2. Commandline -> esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding
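
To make the interaction between the two values concrete, here is a minimal Python sketch of the capping rule described above. It is only a model of the arithmetic, not anything ESX itself runs; the function name and parameters are made up for illustration, and the single-VM exception follows the VMware paper quoted in the comments.

```python
def effective_lun_queue(hba_queue_depth: int,
                        sched_num_req_outstanding: int,
                        active_vms: int) -> int:
    """Model how many commands can be outstanding to one LUN (hypothetical helper)."""
    # Assumption taken from the quoted VMware paper: with a single VM issuing
    # I/O to the LUN, Disk.SchedNumReqOutstanding has no effect.
    if active_vms <= 1:
        return hba_queue_depth
    # With several VMs, the lower of the two values wins; a value above the
    # queue depth is capped at the queue depth.
    return min(hba_queue_depth, sched_num_req_outstanding)


# Queue depth raised to 64, but Disk.SchedNumReqOutstanding left at 16
print(effective_lun_queue(64, 16, active_vms=4))  # 16
# Both values aligned at 64
print(effective_lun_queue(64, 64, active_vms=4))  # 64
```

The first call shows why raising only the queue depth gains nothing once more than one VM shares the LUN.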

Another often overlooked setting is Disk.UseLunReset and/or Disk.UseDeviceReset. ESX defaults to Disk.UseLunReset=1 and Disk.UseDeviceReset=1. This means that when a SCSI bus is reset, all SCSI reservations are cleared, not for a specific LUN but for the complete device. This is useful when one uses local storage, but within a VMware environment most companies utilize a SAN, and you don’t want to disrupt the entire SAN when it’s not necessary. In a SAN environment, keep Disk.UseLunReset at 1 and set Disk.UseDeviceReset to 0. You can set this via the command line and via VirtualCenter:

  1. VirtualCenter -> Configuration Tab -> Advanced Settings -> Disk -> Disk.UseLunReset=1, Disk.UseDeviceReset=0
  2. Commandline -> esxcfg-advcfg -s 1 /Disk/UseLunReset
    Commandline -> esxcfg-advcfg -s 0 /Disk/UseDeviceReset

17 Responses to “Queuedepth, and what’s next?”

  1. Hi,

    Could it be that you mixed up the default values in this sentence:

    ESX defaults to Disk.UseLunReset=1 and Disk.UseDeviceReset=0

    Because this is what you’re setting it to.

    Regards,

    Lukas

  2. I checked my ESX 3.5u1 Servers and they have both device and lun-reset-parameters set to 1 (!!!). What’s up with that? I have not changed the parameters from the native install.

  3. Yeah you’re right Lukas, that’s a typo. Fixed it. It defaults to 1 - 1, and it should be 1 - 0 in a SAN environment.

  4. More ESX Disk Tuning from Planet VMware…

    A small but nice set of recommendations above and beyond queue depth to optimize SAN performance under ESX…

  5. Duncan,

    First of all great site, great information.

    I have a different appreciation of the purpose of the SchedNumReqOutstanding setting. I use it to ensure that a single high-I/O VM cannot swamp the HBA queue at the expense of other VMs on that LUN. In that case SchedNumReqOutstanding is a per-VM limit on the requests that can be sent to a particular LUN, and it should be less than the HBA queue depth so that other VMs can also queue requests.

    The VMware DSA course also notes that the SchedNumReqOutstanding limit only applies when there is more than one VM per LUN. If there is only one VM on the LUN then the HBA queue depth applies.

  6. Well that’s another way to put it I guess. If you reach 64 mainly due to 1 VM that’s a great way to cap the VM, or probably the only way.

    Edit:
    Just been looking for more info and this pdf came along:
    http://www.vmware.com/files/pdf/scalable_storage_performance.pdf

    Also make sure to set the Disk.SchedNumReqOutstanding parameter to the same value as the queue depth. If this parameter is given a higher value than the queue depth, it is still capped at the queue depth. However, if this parameter is given a lower value than the queue depth, only that many outstanding commands are issued from the ESX kernel to the LUN from all virtual machines. The Disk.SchedNumReqOutstanding setting has no effect when there is only one virtual machine issuing I/O to the LUN.

  7. Yes it says:

    Also make sure to set the Disk.SchedNumReqOutstanding parameter to the same value as the queue depth. 
    If this parameter is given a higher value than the queue depth, it is still capped at the queue depth. However, 
    if this parameter is given a lower value than the queue depth, only that many outstanding commands are 
    issued from the ESX kernel to the LUN from all virtual machines. 

    But what about the situation when you have multiple LUNs with multiple VMs?

  8. It’s 64 for each LUN per VM when multiple VMs access the LUN.

  9. Do all the ESX servers need to be rebooted after?

  10. Yes, and do an “esxcfg-boot -b” when you’ve applied the queue depth settings!

  11. What does that command do?

  12. it sets up the information required for booting which includes this parameter.

  13. I can run it and reboot at a later date, correct?

  14. Yes you can, but I would recommend doing it asap.

  15. Thanks Duncan. Much appreciated.

  16. There’s much more that goes into setting the Queue Depth than making the assumption that a queue depth of X value on the host will be sufficient and will drive high I/O.

    The fan-in ratio of host ports to a single target port needs to be considered, AND the queue depth on the array target port side. Also, given that A/A multipathing is not officially supported, the active path for each LUN should be balanced across all front-end target ports.

    Say I have a hypothetical target queue depth of 512 per port, and through that port I expose 4 LUNs (datastores) to a 4-node ESX cluster with a host queue depth value of 64 on each. Each host has an active path through the target port.

    Then I can, potentially, end up with 4 x 4 x 64 = 1024 outstanding IOs at which point the Target port will issue a QFULL condition simply because it will be saturated.

    The way FC drivers deal with such conditions is that they throttle I/O significantly so the target queues have time to clear, and then the initiator gradually increases I/O again. The end result is significant latency, and for some operating systems (e.g. AIX) this condition will result in an I/O error if 3 consecutive QFULL conditions occur for the same request.

    I’ve written an article on Dynamic Queue Depth management and what NetApp has done to control, monitor and dynamically allocate and change queue slot allocations without having to touch the host queue depth value beyond the initial setup.

    http://partners.netapp.com/go/techontap/matl/fc-sans.html

    cheers

  17. Thanks for the excellent reply, Nick!! This is valuable info.
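
The fan-in arithmetic from comment 16 can be checked with a few lines of Python. This is only a sketch of the worst-case calculation using the hypothetical numbers from that comment (4 hosts, 4 LUNs, host queue depth 64, target port queue depth 512); the function name is made up for illustration, and real arrays and drivers behave in more nuanced ways.

```python
def worst_case_outstanding(hosts: int, luns: int, host_queue_depth: int) -> int:
    """Worst case: every host fills its per-LUN queue on every LUN at once."""
    return hosts * luns * host_queue_depth


# Hypothetical values from the comment, not measurements
target_port_queue_depth = 512
demand = worst_case_outstanding(hosts=4, luns=4, host_queue_depth=64)

print(demand)                            # 1024
print(demand > target_port_queue_depth)  # True: the port can be saturated (QFULL)
```

With these numbers the hosts can collectively queue twice what the target port accepts, which is the point at which the QFULL condition described in the comment would occur.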
