I’ve seen a lot of people picking up on the queue depth settings lately, especially where QLogic adapters are involved. Although it can be really beneficial to raise the queue depth to 64, it’s useless if you forget about the “Disk.SchedNumReqOutstanding” setting. This setting always has to be aligned with the queue depth, because if Disk.SchedNumReqOutstanding is given a lower value than the queue depth, only that many outstanding commands are issued from the ESX kernel to the LUN for all virtual machines combined. In other words, if you set a queue depth of 64 and a Disk.SchedNumReqOutstanding of 16, only 16 commands are issued to the LUN at a time instead of the 64 your queue depth allows.
You can set Disk.SchedNumReqOutstanding via the command line and via VirtualCenter:
- VirtualCenter -> Configuration Tab -> Advanced Settings -> Disk -> Disk.SchedNumReqOutstanding
- Commandline -> esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding
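In case you want to double-check before changing anything, something like this from the service console should do the trick (a minimal sketch, assuming ESX 3.x; option paths can differ per version):
# Check the current value first
esxcfg-advcfg -g /Disk/SchedNumReqOutstanding
# Set it to match the HBA queue depth (64 in this example)
esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding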
The Disk.UseDeviceReset section is deprecated; see this article for more info.
Lukas Beeler says
Hi,
Could it be that you mixed up the default values in this sentence:
ESX defaults to Disk.UseLunReset=1 and Disk.UseDeviceReset=0
Because this is what you’re setting it to.
Regards,
Lukas
Sascha Reuter says
I checked my ESX 3.5u1 servers and they have both the device-reset and LUN-reset parameters set to 1 (!!!). What’s up with that? I have not changed the parameters from the native install.
Duncan Epping says
Yeah you’re right Lukas, that’s a typo. Fixed it. It defaults to 1 – 1, and it should be 1 – 0 in a SAN environment.
Alastair says
Duncan,
First of all great site, great information.
I have a different appreciation of the purpose of the SchedNumReqOutstanding setting. I use it to ensure that a single high-I/O VM cannot swamp the HBA queue at the expense of the other VMs on that LUN. In that case SchedNumReqOutstanding is a per-VM limit on the requests that can be sent to a particular LUN, and it should be less than the HBA queue depth so that other VMs can also queue requests.
The VMware DSA course also notes that the SchedNumReqOutstanding limit only applies when there is more than one VM per LUN. If there is only one VM on the LUN, the HBA queue depth applies.
Duncan Epping says
Well, that’s another way to put it I guess. If you reach 64 mainly due to one VM, that’s a great way to cap the VM, and probably the only way.
Edit:
Just been looking for more info and this pdf came along:
http://www.vmware.com/files/pdf/scalable_storage_performance.pdf
Virgil says
Yes it says:
“Also make sure to set the Disk.SchedNumReqOutstanding parameter to the same value as the queue depth. If this parameter is given a higher value than the queue depth, it is still capped at the queue depth. However, if this parameter is given a lower value than the queue depth, only that many outstanding commands are issued from the ESX kernel to the LUN from all virtual machines.”
But what about the situation where you have multiple LUNs with multiple VMs?
Duncan Epping says
It’s 64 per LUN, across all VMs, when multiple VMs access that LUN.
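To make the capping rule from that PDF concrete, here is a quick back-of-envelope sketch (plain shell arithmetic, example numbers only): the effective per-LUN limit is whichever of the two values is lower.
qdepth=64   # HBA queue depth per LUN
dsnro=16    # Disk.SchedNumReqOutstanding
# The effective limit is the lower of the two values
echo $(( dsnro < qdepth ? dsnro : qdepth ))   # prints 16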
Matt says
Do all the ESX servers need to be rebooted after?
Duncan Epping says
Yes, and do an “esxcfg-boot -b” when you’ve applied the queue depth settings!
Matt says
What does that command do?
Duncan says
It sets up the information required for booting, which includes this parameter.
Matt says
I can run it and reboot at a later date correct?
Duncan says
Yes you can, but I would recommend doing it ASAP.
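For reference, the whole sequence on an ESX 3.x host with a QLogic HBA would look something like the sketch below; the module name (qla2300_707 here) is just an example and depends on your ESX version and driver, so check it first:
# Find the exact QLogic module name loaded on this host
vmkload_mod -l | grep qla
# Set the queue depth option for the HBA driver (module name is an example)
esxcfg-module -s ql2xmaxqdepth=64 qla2300_707
# Rebuild the boot configuration so the option survives a reboot
esxcfg-boot -b
# Then reboot the host for the new queue depth to take effect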
Matt says
Thanks Duncan. Much appreciated.
Nick Triantos says
There’s much more that goes into setting the queue depth than assuming that a queue depth of value X on the host will be sufficient and will drive high I/O.
The fan-in ratio of host ports to a single target port needs to be considered, and so does the queue depth on the array target port side. And given that A/A multipathing is not officially supported, the active path for each LUN should be balanced across all front-end target ports.
Say I have a hypothetical target queue depth of 512 per port, and through that port I expose 4 LUNs (datastores) to a 4-node ESX cluster with a host queue depth of 64 on each, with each host having an active path through that target port.
Then I can potentially end up with 4 x 4 x 64 = 1024 outstanding I/Os, at which point the target port will issue a QFULL condition simply because it is saturated.
The way FC drivers deal with such conditions is that they throttle I/O significantly so the target queues have time to clear, and then the initiator gradually increases I/O again. The end result is significant latency, and for some operating systems (e.g. AIX) this condition results in an I/O error if three consecutive QFULL conditions occur for the same request.
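To put that fan-in math into a quick sketch (same hypothetical numbers as above):
hosts=4; luns=4; host_qdepth=64; target_port_qdepth=512
outstanding=$(( hosts * luns * host_qdepth ))
echo "worst case outstanding I/Os: $outstanding"         # 1024
echo "target port queue slots:     $target_port_qdepth"  # 512, so QFULL territory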
I’ve written an article on Dynamic Queue Depth management and what NetApp has done to control, monitor and dynamically allocate and change queue slot allocations without having to touch the host queue depth value beyond the initial setup.
http://partners.netapp.com/go/techontap/matl/fc-sans.html
cheers
Duncan Epping says
Thanks for the excellent reply, Nick! This is valuable info.
Paul Geerlings says
I see this is a two-year-old article, nevertheless still current for our site.
I read it and have a question (or two).
How can you raise one Disk.Sched setting without raising (or changing) the other three?
I mean, when you raise the Disk.SchedNumReqOutstanding value, aren’t you also required to raise one or more of the other Disk.Sched settings?
If you look, for example, at the Disk.SchedQuantum default and max values (8 and 64), they are comparable (a factor of 8) to the default and max values of Disk.SchedNumReqOutstanding (32 and 256). So if you double the one, shouldn’t you be doubling the other too?
The same goes for Disk.SchedQControlSeqReqs: to be able to max out the outstanding commands, shouldn’t you also double the default here?
These are the four Disk.Sched settings:
- Disk.SchedNumReqOutstanding: number of outstanding commands to a target with competing worlds [1-256, default: 32]
- Disk.SchedQuantum: number of consecutive requests from one world [1-64, default: 8]
- Disk.SchedQControlSeqReqs: number of consecutive requests from a VM required to raise the outstanding commands to the maximum [0-2048, default: 128]
- Disk.SchedQControlVMSwitches: number of switches between commands issued by different VMs required to reduce outstanding commands to SchedNumReqOutstanding [0-2048, default: 6]
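For what it’s worth, a quick way to check the current values of these four settings on a host would be something like this (service console, ESX 3.x assumed):
# Print the current value of each Disk.Sched* option
for opt in SchedNumReqOutstanding SchedQuantum SchedQControlSeqReqs SchedQControlVMSwitches; do
  esxcfg-advcfg -g /Disk/$opt
done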
Lee Goodwin says
The explanation vCenter gives for DSNRO is that it applies when two worlds are competing for the same resource (LUN/datastore).
So what I have seen is: drop the queue depth value and the DSNRO to the same value, as it will operate better that way.
Having too high a queue depth will eat up the available port tags, so tag count / LUNs = queue depth.
But, and it’s a big but, watch out: if you add more LUNs, the tags available per LUN will drop and you will swamp the port.
8-16 is fine.
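As a rough sketch of that tag-count rule of thumb (the numbers below are made up for illustration):
port_tags=2048   # tags/queue slots available on the array port (example value)
luns=32          # LUNs presented through that port
echo $(( port_tags / luns ))   # suggested per-LUN queue depth: 64
# Add more LUNs and the per-LUN share drops:
luns=128
echo $(( port_tags / luns ))   # now only 16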