Queuedepth, and what’s next?
I’ve seen a lot of people picking up on the queuedepth settings lately, especially when there are QLogic adapters involved. Although it can be really beneficial to set the queuedepth to 64, doing so is useless if you forget about the “Disk.SchedNumReqOutstanding” setting. This setting always has to be aligned with the queuedepth: if Disk.SchedNumReqOutstanding is given a lower value than the queuedepth, only that many outstanding commands are issued from the ESX kernel to the LUN from all virtual machines. In other words, if you set a queuedepth of 64 and a Disk.SchedNumReqOutstanding of 16, only 16 commands get issued to the LUN at a time instead of the 64 your queuedepth allows.
You can set Disk.SchedNumReqOutstanding via the command line and via VirtualCenter:
- VirtualCenter -> Configuration Tab -> Advanced Settings -> Disk -> Disk.SchedNumReqOutstanding
- Commandline -> esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding
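As a rough illustration of why the two values need to match, the relationship can be modelled as a simple minimum. This is only a sketch of the rule described above, not VMware code; the function name is made up for this example, and the cap for values higher than the queuedepth is taken from the VMware paper quoted in the comments below.

```python
def effective_outstanding(queue_depth: int, sched_num_req_outstanding: int) -> int:
    """Hypothetical helper: commands that can be outstanding to one LUN.

    Models the rule that the lower of the two settings wins:
    a lower Disk.SchedNumReqOutstanding limits the commands issued,
    and a higher one is capped at the queuedepth.
    """
    return min(queue_depth, sched_num_req_outstanding)


# The example from the post: queuedepth 64, Disk.SchedNumReqOutstanding 16.
misaligned = effective_outstanding(64, 16)   # 16
aligned = effective_outstanding(64, 64)      # 64
oversized = effective_outstanding(64, 128)   # 64, capped at the queuedepth
```

With the misaligned values only 16 of the 64 queue slots are ever used, which is why raising the queuedepth alone gains nothing.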
Another pair of settings that is often overlooked is Disk.UseLunReset and Disk.UseDeviceReset. ESX defaults to Disk.UseLunReset=1 and Disk.UseDeviceReset=1. This means that when a SCSI bus is reset, all SCSI reservations are cleared, not for a specific LUN but for the complete device. This is useful when one uses local storage, but within a VMware environment most companies use a SAN, and you don’t want to disrupt the entire SAN when it’s not necessary. You can set this via the command line and via VirtualCenter:
- VirtualCenter -> Configuration Tab -> Advanced Settings -> Disk -> Disk.UseLunReset=1 , Disk.UseDeviceReset=0
- Commandline -> esxcfg-advcfg -s 1 /Disk/UseLunReset
- Commandline -> esxcfg-advcfg -s 0 /Disk/UseDeviceReset




July 21st, 2008 at 09:47
Hi,
Could it be that you mixed up the default values in this sentence:
ESX defaults to Disk.UseLunReset=1 and Disk.UseDeviceReset=0
Because this is what you’re setting it to.
Regards,
Lukas
July 21st, 2008 at 09:59
I checked my ESX 3.5u1 Servers and they have both device and lun-reset-parameters set to 1 (!!!). What’s up with that? I have not changed the parameters from the native install.
July 21st, 2008 at 10:13
Yeah you’re right Lukas, that’s a typo. Fixed it. It defaults to 1 - 1, and it should be 1 - 0 in a SAN environment.
July 21st, 2008 at 10:37
More ESX Disk Tuning from Planet VMware…
A small but nice set of recommendations above and beyond queue depth to optimize SAN performance under ESX…
July 21st, 2008 at 22:23
Duncan,
First of all great site, great information.
I have a different appreciation of the purpose of the SchedNumReqOutstanding setting. I use it to ensure that a single high-I/O VM cannot swamp the HBA queue at the expense of other VMs on that LUN. In that case SchedNumReqOutstanding is a per-VM limit on the requests that can be sent to a particular LUN, and it should be less than the HBA queue depth so that other VMs can also queue requests.
The VMware DSA course also notes that the SchedNumReqOutstanding limit only applies when there is more than one VM per LUN. If there is only one VM on the LUN, then the HBA queue depth applies.
July 21st, 2008 at 22:52
Well that’s another way to put it I guess. If you reach 64 mainly due to 1 VM that’s a great way to cap the VM, or probably the only way.
Edit:
Just been looking for more info and this pdf came along:
http://www.vmware.com/files/pdf/scalable_storage_performance.pdf
July 22nd, 2008 at 04:46
Yes it says:
Also make sure to set the Disk.SchedNumReqOutstanding parameter to the same value as the queue depth.
If this parameter is given a higher value than the queue depth, it is still capped at the queue depth. However,
if this parameter is given a lower value than the queue depth, only that many outstanding commands are
issued from the ESX kernel to the LUN from all virtual machines.
But what about the situation when you have multiple LUNs with multiple VMs?
July 22nd, 2008 at 12:37
It’s 64 for each LUN per VM when multiple VMs access the LUN.
July 23rd, 2008 at 15:37
Do all the ESX servers need to be rebooted after?
July 23rd, 2008 at 15:59
Yes, and run “esxcfg-boot -b” once you’ve applied the queuedepth settings!
July 23rd, 2008 at 16:36
What does that command do?
July 23rd, 2008 at 18:55
it sets up the information required for booting which includes this parameter.
July 23rd, 2008 at 18:56
I can run it and reboot at a later date correct?
July 23rd, 2008 at 19:40
Yes you can, but I would recommend doing it ASAP.
July 23rd, 2008 at 19:52
Thanks Duncan. Much appreciated.
July 25th, 2008 at 21:11
There’s much more that goes into setting the Queue Depth than making the assumption that a queue depth of X value on the host will be sufficient and will drive high I/O.
The fan-in ratio of host ports to a single target port needs to be considered, AND
the queue depth on the array target port side, AND, given that A/A multipathing is not officially supported, the active path for each LUN should be balanced across all front-end target ports.
Say I have a hypothetical target queue depth of 512 per port, and through that port I expose 4 LUNs (datastores) to a 4-node ESX cluster with a host queue depth of 64 on each host. Each host has an active path through the target port.
Then I can, potentially, end up with 4 x 4 x 64 = 1024 outstanding I/Os, twice what the port can queue, at which point the target port will issue a QFULL condition simply because it is saturated.
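The fan-in arithmetic above can be sketched in a few lines. This is a simplified worst-case model, assuming the host queue depth applies per LUN and that every host keeps its active path for every LUN on the same target port; the function names are made up for this example.

```python
def worst_case_outstanding(hosts: int, luns: int, host_queue_depth: int) -> int:
    """Worst-case outstanding I/Os arriving at a single target port."""
    return hosts * luns * host_queue_depth


def max_safe_host_queue_depth(target_port_queue_depth: int, hosts: int, luns: int) -> int:
    """Largest host queue depth that cannot overrun the target port in this model."""
    return target_port_queue_depth // (hosts * luns)


# The hypothetical numbers from the comment: 4 hosts, 4 LUNs, queue depth 64,
# against a target port that can queue 512 commands.
demand = worst_case_outstanding(hosts=4, luns=4, host_queue_depth=64)   # 1024
oversubscribed = demand > 512                                           # True
safe_depth = max_safe_host_queue_depth(512, hosts=4, luns=4)            # 32
```

In this model a host queue depth of 32 would be the highest value that can never trigger QFULL on that port. Real workloads rarely fill every queue at once, so this is a conservative bound rather than a recommendation.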
The way FC drivers deal with such conditions is that they throttle I/O significantly so the target queues have time to clear, and then the initiator gradually increases I/O again. The end result is significant latency, and for some operating systems (e.g. AIX) this condition will result in an I/O error if 3 consecutive QFULL conditions occur for the same request.
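To make the throttle-and-ramp behaviour concrete, here is a toy model of it. It is not how any particular FC driver is implemented: the halving on QFULL and the one-slot ramp per successful interval are assumptions chosen purely for illustration, as is the function name.

```python
def next_queue_depth(current: int, configured_max: int, qfull: bool,
                     backoff_factor: float = 0.5, ramp_step: int = 1) -> int:
    """Toy throttle: cut the depth sharply on QFULL, otherwise ramp up slowly.

    backoff_factor and ramp_step are illustrative assumptions,
    not values taken from a real driver.
    """
    if qfull:
        # Back off hard so the target queue has time to drain.
        return max(1, int(current * backoff_factor))
    # No QFULL seen: creep back towards the configured queue depth.
    return min(configured_max, current + ramp_step)


# Simulate a few intervals, starting from a configured queue depth of 64.
depth = 64
trace = []
for saw_qfull in [False, True, False, False, True, False]:
    depth = next_queue_depth(depth, 64, saw_qfull)
    trace.append(depth)
# trace is now [64, 32, 33, 34, 17, 18]
```

Even in this simple model, two QFULL events leave the host running at a fraction of its configured depth for a long time, which is where the latency comes from.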
I’ve written an article on Dynamic Queue Depth management and what NetApp has done to control, monitor and dynamically allocate and change queue slot allocations without having to touch the host queue depth value beyond the initial setup.
http://partners.netapp.com/go/techontap/matl/fc-sans.html
cheers
July 25th, 2008 at 21:17
Thanks for the excellent reply, Nick! This is valuable info.