Today I was fooling around with my new lab environment when I noticed my Path Selection Policy (PSP) was set to Fixed while the array (CLARiiON CX4-120) most definitely supports Round Robin (RR). I wrote about it in the past (1, 2), but as the commands changed slightly with vSphere 4.1 I figured it wouldn’t hurt to write it down again:
First I validated which Storage Array Type Plugin (SATP) was currently in use and which Path Selection Policy was used:
esxcli nmp device list
(Note that in releases after 4.1 the “storage” bit was added, so the command becomes esxcli storage nmp device list… yes, a minor but important change!)
Then I wanted to make sure that every single LUN that would be added would get Round Robin as its default PSP:
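To check which devices are still not on Round Robin, the device list can be boiled down to one line per device. The sketch below is only an illustration: the function names summarize_psp and not_round_robin are made up, and it assumes each device starts on an unindented line with its identifier, followed by indented "Key: value" lines, one of which is "Path Selection Policy:". Verify that against the output on your own host before relying on it.

```shell
#!/bin/sh
# Print "<device> <PSP>" for each device found in the device list output.
# ASSUMPTION: device identifiers sit on unindented lines, and the policy is
# on an indented line that starts with "Path Selection Policy:".
summarize_psp() {
    awk '
        /^[^ \t]/ { dev = $1 }                  # unindented line: remember device id
        /^[ \t]+Path Selection Policy:/ {       # skips the "... Device Config:" line
            print dev, $NF                      # last field is the policy name
        }
    '
}

# Keep only the devices whose policy is not Round Robin.
not_round_robin() {
    summarize_psp | awk '$2 != "VMW_PSP_RR"'
}
```

Usage on a host would be something like: esxcli nmp device list | not_round_robin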
esxcli nmp satp setdefaultpsp --satp VMW_SATP_ALUA_CX --psp VMW_PSP_RR
Now I also needed to set the PSP per LUN, for which I used these two lines of “script”:
for i in `ls /vmfs/devices/disks | grep naa.600`; do esxcli nmp device setpolicy --device $i --psp VMW_PSP_RR;done
And I figured why not just set the number of IOps down to 1 as well just to see if it changes anything:
for i in `ls /vmfs/devices/disks/ | grep naa.600`; do esxcli nmp roundrobin setconfig --device $i --type "iops" --iops=1;done
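If you would rather review the commands before running them, a dry-run variant of the two loops could look like the sketch below. The function names list_devices and emit_rr_commands are hypothetical, and the sketch assumes that partition entries in /vmfs/devices/disks carry a ":<number>" suffix, which it skips so that only whole devices are touched.

```shell
#!/bin/sh
# List entries in directory $1 whose names start with prefix $2,
# skipping partition entries (ASSUMED to look like "<device>:<number>").
list_devices() {
    dir=$1
    prefix=$2
    for path in "$dir"/"$prefix"*; do
        [ -e "$path" ] || continue      # the glob matched nothing
        name=${path##*/}                # strip the directory part
        case $name in
            *:*) ;;                     # partition entry, skip it
            *)   echo "$name" ;;
        esac
    done
}

# Print (do not run) the two commands from the post for every device.
emit_rr_commands() {
    list_devices "$1" "$2" | while read -r dev; do
        echo "esxcli nmp device setpolicy --device $dev --psp VMW_PSP_RR"
        echo "esxcli nmp roundrobin setconfig --device $dev --type \"iops\" --iops=1"
    done
}
```

Running emit_rr_commands /vmfs/devices/disks naa.600 only prints the commands; once they look right, the same output can be piped into sh.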
Setting “iops=1” didn’t make much difference for me, but it appears to be a general recommendation these days so I figured it would be best to include it.
Before I forget, I wanted to document this as well. For my testing I used the following command which lets you clone a VMDK and time it:
time vmkfstools -i source.vmdk destination.vmdk
And the result would look as follows:
Destination disk format: VMFS zeroedthick
Cloning disk 'destination.vmdk'...
Clone: 100% done.
real 2m 9.67s
user 0m 0.33s
sys 0m 0.00s
Something that might be useful as well: timing the creation of an eagerzeroedthick VMDK:
time vmkfstools -c 30G -d eagerzeroedthick newdisk.vmdk
I am using this to measure the difference between using and not using VAAI on a storage platform. It is a lot easier than constantly kicking off tasks through vCenter. (Yes Alan and Luc, I know it is way easier with PowerCLI.)
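To compare two timed runs (for example one with and one without VAAI), it helps to turn the "real" value into plain seconds first. This is a small sketch with made-up helper names (to_seconds, speedup); it assumes the "Xm Y.YYs" duration format shown in the output above.

```shell
#!/bin/sh
# Convert a duration such as "2m 9.67s" into seconds.
# ASSUMPTION: the input is always "<minutes>m <seconds>s".
to_seconds() {
    echo "$1" | awk '{
        m = $1; s = $2
        sub(/m$/, "", m)                # drop the trailing "m"
        sub(/s$/, "", s)                # drop the trailing "s"
        printf "%.2f\n", m * 60 + s
    }'
}

# Ratio of the first duration to the second (how many times faster run 2 was).
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" 'BEGIN {
        if (b == 0) { print "n/a"; exit }   # avoid dividing by zero
        printf "%.2f\n", a / b
    }'
}
```

For the clone above, to_seconds "2m 9.67s" prints 129.67. Passing the "real" values of two runs to speedup then gives a single number to compare.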
Excellent post, Duncan. Here’s something we had to do on the storage side when we configured the Round Robin PSP for multipathing; posting it here as it may benefit the readers. The failover mode of the CLARiiON was changed from 1 to 4 using Unisphere (FLARE 30); use Navisphere for previous FLARE versions. By default the storage system is set to failover mode 1. By changing the failover mode to 4, the CLARiiON switches to ALUA (Asymmetric Logical Unit Access, or Active/Active) mode. Thank you!
Duncan, do you know of a way to set the IOPS value using PowerCLI? Also (and possibly the same answer), I’m curious how we’d go about setting the IOPS value on ESXi in future versions.
Thanks!
Philip,
See my response below.
Indeed, you should set the failover mode on the CLARiiON to 4 first so it will do ALUA. This is needed before you can use the RR policy on CLARiiON. (You can use the Failover wizard in Navisphere or Unisphere to do this.)
The default failover mode is still 1 (no ALUA, Active/Passive).
But since the PSP was Fixed in your situation, the failover mode is probably already set to 4 on your CLARiiON.
And regarding the IOPS, I think you were one of the ones arguing that this did not make sense. Did you change your mind on this?
(See http://www.yellow-bricks.com/2010/03/30/whats-the-point-of-setting-iops1/)
No Duco, I still don’t really see the point. Most of these tests were conducted with a “limited” set of hosts and VMs, which I would barely dare to call real life. In most scenarios you more than likely have over 50 VMs running on just a couple of hosts. That number of VMs drives plenty of randomization by itself, let alone when talking 250+ VMs.
Duncan,
Changing the failover mode on VNX/CX4 to enable ALUA is not only useful for changing the PSP to RR; it also affects the VAAI Full Copy operation.
On my blog I posted how ALUA mode on the VNX affects VAAI operations (Full Copy). Full Copy does not work if you don’t enable failover mode 4 (ALUA mode).
http://vmlab.ge/what-is-common-between-vaai-and-vnxcx4-alua-mode/
On the Celerra, enabling ALUA will double the number of possible paths (in conjunction with the --useANO 1 setting to double the number of active paths), which may not be necessary if the SPs are already multipathed.
Here’s how I did it using PowerCLI – http://www.gamersanon.com/?p=119
Essentially this is the command –
Get-VMHost | Get-ScsiLun -LunType "disk" | where {$_.MultipathPolicy -ne "RoundRobin"} | Set-ScsiLun -MultipathPolicy "RoundRobin"
That seems dangerous, wouldn’t that change local disks to RR as well which is probably not a supported configuration?
You can set the default PSP to Round Robin for an SATP using the new Get-EsxCli cmdlet from PowerCLI 4.1.1
(this example is for an EqualLogic array with VMW_SATP_EQL)
$esxCli = Get-EsxCli -Server
$esxCli.nmp.satp.setdefaultpsp("VMW_PSP_RR", "VMW_SATP_EQL")
But this would require a reboot to take effect for existing disks.
If you know the vendor ID of your disks you could run this command (the example assumes they are EqualLogic):
Get-VMHost | Get-ScsiLun -LunType "disk" | where {$_.Vendor -eq "EQLOGIC"} | Set-ScsiLun -MultipathPolicy "RoundRobin"
To find the vendor ID of your disks and other useful info, run: Get-VMHost | Get-ScsiLun -LunType "disk" | fl
Andrew,
Great point! We’ve never had an issue with that because we don’t use local disk for anything except the ESX install point, but if you were to use it for storing ISOs or high-priority/low-latency VMs that could be an issue.
MMAGeek has the updated line, which gives you more granular control over which disks get reconfigured for RR. I also edited the script at one point to use the canonical name, since the local disks for us always started with naa.600 and our SAN disks were naa.609, but I can’t find that. Use MMAGeek’s to accomplish it.
Yeah I just use the CanonicalName option.
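# $VMHosts is assumed to already hold the hosts to change (for example the output of Get-VMHost)
foreach ($VMHost in $VMHosts)
{
    # Only touch LUNs whose canonical name matches the SAN prefix
    $luns = $VMHost | Get-ScsiLun -LunType disk -CanonicalName eui.001* | Sort-Object CanonicalName | Set-ScsiLun -MultipathPolicy "RoundRobin"
}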
In my experience just “esxcli nmp satp setdefaultpsp” + a reboot is enough, you don’t need those for loops to set path policy on existing luns, except when you are not able to reboot.
I know, but I prefer not to reboot my hosts when I don’t have direct access to the datacenter.
Hi Duncan,
The code that you typed is for ESX, not ESXi/vMA.
I think that this command
$(esxcfg-scsidevs -c | awk '{ print $1 }' | grep naa.600);
instead of the ls can work for the job.
aLex
In the Tech Support Mode the ls works fine for me on ESXi 4.1 Update 1
Good post, I’m still wondering about the whole “set IOPS=1” thing. What is the overhead of that?
I have seen it set up in iSCSI environments using Jumbo Frames where, in my opinion, it does not make sense at all.
I have also seen it in both Dell and HP papers, but they fail to explain the “why” part to me.
What is your comment on this practice?
Most of the data I have seen around these tests came from setups with a few VMs on 3 to 5 hosts, of which a couple drive a lot of traffic. In real life this looks totally different. So indeed, the question is: will you benefit? Maybe, maybe not; it is difficult to say.
Nvm…just saw the http://www.yellow-bricks.com/2010/03/30/whats-the-point-of-setting-iops1/
Pretty slick. Good vCalendar 3.0 material.
I’ve been using RR in combination with HP EVA storage since our fresh install of ESXi 4.0 U1. (Did not change the IOPS setting.)
After upgrading to 4.1 (via VUM), it seems that the Path Selection Policy settings are back to their default settings. So for VMW_SATP_ALUA the PSP is reset to the default PSP of MRU. Luckily it is now easy to change with PowerShell.
Just came across this while searching for some information, but it can be done much more simply:
Get-Cluster "cluster_name" | Get-VMHost | Get-ScsiLun -CanonicalName "naa.600*" | Set-ScsiLun -MultipathPolicy "RoundRobin"
No need for the foreach loop etc.
I’d be hesitant to run the above scripts in a production environment with a mix of VMware NMP and 3rd-party MPIO providers (e.g. PowerPath/VE). It may just throw an error when attempting to set the PSP on non-NMP devices, but I didn’t have a lab to try this on, and it’s an easy workaround anyway.
Get-VMHost | Get-ScsiLun -CanonicalName "naa.600009700001*" | ?{$_.MultipathPolicy -notmatch "RoundRobin|Unknown"} | Set-ScsiLun -MultipathPolicy "RoundRobin"
Hey, thanks a ton for the script here. This was very helpful on one of my cases…