
Yellow Bricks

by Duncan Epping


Queuedepth, and what’s next?

Duncan Epping · Jul 21, 2008 ·

I’ve seen a lot of people picking up on the queue depth settings lately, especially where QLogic adapters are involved. Although setting the queue depth to 64 can be really beneficial, it’s totally useless when one forgets about the “Disk.SchedNumReqOutstanding” setting. This setting always has to be aligned with the queue depth, because if Disk.SchedNumReqOutstanding is given a lower value than the queue depth, only that many outstanding commands are issued from the ESX kernel to the LUN from all virtual machines. In other words, if you set a queue depth of 64 and a Disk.SchedNumReqOutstanding of 16, only 16 commands get issued to the LUN at a time instead of the 64 your queue depth allows.

You can set Disk.SchedNumReqOutstanding via VirtualCenter or from the command line:

  1. VirtualCenter -> Configuration Tab -> Advanced Settings -> Disk -> Disk.SchedNumReqOutstanding
  2. Command line -> esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding
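
As a rough sketch of how the two settings line up on an ESX 3.x host with a QLogic HBA: the module name (qla2300_707) and option name (ql2xmaxqdepth) below are assumptions that depend on your adapter and driver version, so verify them against your own host before running anything.

    # Set the QLogic HBA queue depth to 64 (module and option names differ per driver version)
    esxcfg-module -s ql2xmaxqdepth=64 qla2300_707

    # Align Disk.SchedNumReqOutstanding with the new queue depth
    esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding

    # Rebuild the boot configuration so the module option is applied at boot, then reboot
    esxcfg-boot -b
    reboot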

The Disk.UseDeviceReset section is deprecated; see this article for more info.


Server ESX, Howto, Scripting, service console, VMware


Comments

  1. Lukas Beeler says

    21 July, 2008 at 09:47

    Hi,

    Could it be that you mixed up the default values in this sentence:

    ESX defaults to Disk.UseLunReset=1 and Disk.UseDeviceReset=0

    Because this is what you’re setting it to.

    Regards,

    Lukas

  2. Sascha Reuter says

    21 July, 2008 at 09:59

    I checked my ESX 3.5u1 servers and they have both the device reset and LUN reset parameters set to 1 (!!!). What’s up with that? I have not changed the parameters from the native install.

  3. Duncan Epping says

    21 July, 2008 at 10:13

    Yeah, you’re right Lukas, that’s a typo. Fixed it. It defaults to 1 – 1, and it should be 1 – 0 (UseLunReset – UseDeviceReset) in a SAN environment.

  4. Alastair says

    21 July, 2008 at 22:23

    Duncan,

    First of all great site, great information.

    I have a different appreciation of the purpose of the SchedNumReqOutstanding setting. I use it to ensure that a single high-I/O VM cannot swamp the HBA queue at the expense of other VMs on that LUN. In that case SchedNumReqOutstanding is a per-VM limit on the requests that can be sent to a particular LUN, and it should be less than the HBA queue depth so that other VMs can also queue requests.

    The VMware DSA course also notes that the SchedNumReqOutstanding limit only applies when there is more than one VM per LUN. If there is only one VM on the LUN then the HBA queue depth applies.

  5. Duncan Epping says

    21 July, 2008 at 22:52

    Well, that’s another way to put it, I guess. If you reach 64 mainly due to one VM, that’s a great way to cap the VM, or probably the only way.

    Edit:
    Just been looking for more info and this pdf came along:
    http://www.vmware.com/files/pdf/scalable_storage_performance.pdf

    Also make sure to set the Disk.SchedNumReqOutstanding parameter to the same value as the queue depth. If this parameter is given a higher value than the queue depth, it is still capped at the queue depth. However, if this parameter is given a lower value than the queue depth, only that many outstanding commands are issued from the ESX kernel to the LUN from all virtual machines. The Disk.SchedNumReqOutstanding setting has no effect when there is only one virtual machine issuing I/O to the LUN.
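
    To put some numbers on that: with a queue depth of 64 and Disk.SchedNumReqOutstanding left at its default of 32, two or more busy VMs on the same LUN are together limited to 32 outstanding commands to that LUN, while a single VM on the LUN can still drive the full 64.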

  6. Virgil says

    22 July, 2008 at 04:46

    Yes, it says:

    Also make sure to set the Disk.SchedNumReqOutstanding parameter to the same value as the queue depth. If this parameter is given a higher value than the queue depth, it is still capped at the queue depth. However, if this parameter is given a lower value than the queue depth, only that many outstanding commands are issued from the ESX kernel to the LUN from all virtual machines.

    But what about the situation where you have multiple LUNs with multiple VMs?

  7. Duncan Epping says

    22 July, 2008 at 12:37

    It’s 64 per LUN, across all VMs, when multiple VMs access the LUN.

  8. Matt says

    23 July, 2008 at 15:37

    Do all the ESX servers need to be rebooted after?

  9. Duncan Epping says

    23 July, 2008 at 15:59

    Yes, and do an “esxcfg-boot -b” when you’ve applied the queue depth settings!

  10. Matt says

    23 July, 2008 at 16:36

    What does that command do?

  11. Duncan says

    23 July, 2008 at 18:55

    It sets up the information required for booting, which includes this parameter.

  12. Matt says

    23 July, 2008 at 18:56

    I can run it and reboot at a later date, correct?

  13. Duncan says

    23 July, 2008 at 19:40

    Yes you can, but I would recommend doing it ASAP.

  14. Matt says

    23 July, 2008 at 19:52

    Thanks Duncan. Much appreciated.

  15. Nick Triantos says

    25 July, 2008 at 21:11

    There’s much more that goes into setting the queue depth than assuming that a queue depth of X on the host will be sufficient and will drive high I/O.

    The fan-in ratio of host ports to a single target port needs to be considered, as does the queue depth on the array target port side. And given that active/active multipathing is not officially supported, the active path for each LUN should be balanced across all front-end target ports.

    Say I have a hypothetical target queue depth of 512 per port, and through that port I expose 4 LUNs (datastores) to a 4-node ESX cluster with a host queue depth of 64 on each. Each host has an active path through the target port.

    Then I can, potentially, end up with 4 x 4 x 64 = 1024 outstanding IOs, at which point the target port will issue a QFULL condition simply because it is saturated.

    The way FC drivers deal with such conditions is that they throttle I/O significantly so the target queues have time to clear, and then the initiator gradually increases I/O again. The end result is significant latency, and for some operating systems (e.g. AIX) this condition will result in an I/O error if 3 consecutive QFULL conditions occur for the same request.

    I’ve written an article on dynamic queue depth management and what NetApp has done to control, monitor and dynamically allocate and change queue slot allocations without having to touch the host queue depth value beyond the initial setup.

    http://partners.netapp.com/go/techontap/matl/fc-sans.html

    cheers

  16. Duncan Epping says

    25 July, 2008 at 21:17

    Thanks for the excellent reply, Nick! This is valuable info.

  17. Paul Geerlings says

    1 April, 2010 at 15:37

    I see this is a two-year-old article; nevertheless, it is still current for our site.

    I read it and have a question (or two).

    How can you raise one Disk.Sched setting without raising (or changing) the other three?
    I mean, when you raise the Disk.SchedNumReqOutstanding value, are you not required to also raise one or more of the other Disk.Sched settings?

    If you, for example, look at the Disk.SchedQuantum default and max values (8 and 64), they are comparable (a factor of 8) to the default and max values of Disk.SchedNumReqOutstanding (32 and 256). So if you double the one, shouldn’t you be doubling the other too?

    The same goes for Disk.SchedQControlSeqReqs: to be able to max out the outstanding commands, shouldn’t you also double the default here?

    These are the four Disk.Sched settings:

    Disk.SchedNumReqOutstanding
    Number of outstanding commands to a target with competing worlds [1-256: default = 32]: 32

    Disk.SchedQuantum
    Number of consecutive requests from one World [1-64: default = 8]: 8

    Disk.SchedQControlSeqReqs
    Number of consecutive requests from a VM required to raise the outstanding commands to max [0-2048: default = 128]: 128

    Disk.SchedQControlVMSwitches
    Number of switches between commands issued by different VMs required to reduce outstanding commands to SchedNumReqOutstanding [0-2048: default = 6]: 6
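
    For reference, this is roughly how the current values can be read back from the service console before changing anything, assuming the -g (get) switch of esxcfg-advcfg works the same across ESX versions:

        esxcfg-advcfg -g /Disk/SchedNumReqOutstanding
        esxcfg-advcfg -g /Disk/SchedQuantum
        esxcfg-advcfg -g /Disk/SchedQControlSeqReqs
        esxcfg-advcfg -g /Disk/SchedQControlVMSwitches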

  18. Lee Goodwin says

    9 August, 2012 at 01:43

    The explanation vCenter gives for DSNRO is that it applies when two worlds are competing for the same resource (LUN/datastore).

    So what I have seen is: drop the QD value and the DSNRO to the same value, as it will operate better that way.

    Having too high a QD value will eat up the available port tags, so tag count / LUNs = queue depth.

    But, and it’s a big but, watch out: if you add more LUNs, the tags available per LUN will drop and you will swamp the port.

    8-16 is fine.
