Of course by now we have all read the excellent and lengthy posts by Chad Sakac on ALUA. I’m just a simple guy and usually try to summarize posts like Chad’s in a couple of lines which makes it easier for me to remember and digest.
First of all, ALUA stands for “Asymmetric Logical Unit Access”. As Chad explains, and as a Google search shows, it’s common for midrange arrays these days to have ALUA support. With midrange we are talking about EMC Clariion, HP EVA and others. My interpretation of ALUA is that you can see any given LUN via both storage processors as active, but only one of these storage processors “owns” the LUN, and because of that there will be optimized and unoptimized paths. The optimized paths are the ones with a direct path to the storage processor that owns the LUN. The unoptimized paths connect to the storage processor that does not own the LUN but has an indirect path to the storage processor that does own it via an interconnect bus.
In the past, when you configured your HP EVA (Active/Active according to VMware terminology) attached VMware environment, you would have had two (supported) options as pathing policies. The first option would be Fixed and the second MRU. Most people used Fixed, however, and tried to balance the I/O. As Frank Denneman described in his article, this does not always lead to the expected results. This is because the path selection might not be consistent within the cluster, which could lead to path thrashing as one half of the cluster is accessing the LUN through storage processor A and the other half through storage processor B.
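To make the optimized / unoptimized distinction a bit more concrete, here is a minimal Python sketch. It is only an illustration of the idea as I described it above; the path names (`hba1:SPA` and so on) and the `classify_paths` helper are made up for this example and have nothing to do with how an array or ESX actually reports ALUA state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str        # hypothetical label, e.g. "hba1:SPA"
    processor: str   # storage processor this path lands on


def classify_paths(paths, owner):
    """Split paths into optimized (land on the owning SP) and unoptimized."""
    optimized = [p for p in paths if p.processor == owner]
    unoptimized = [p for p in paths if p.processor != owner]
    return optimized, unoptimized


# Two HBAs, each zoned to both storage processors (assumed layout).
paths = [
    Path("hba1:SPA", "A"),
    Path("hba1:SPB", "B"),
    Path("hba2:SPA", "A"),
    Path("hba2:SPB", "B"),
]

# The LUN is owned by storage processor A in this example.
optimized, unoptimized = classify_paths(paths, owner="A")
print("optimized:  ", [p.name for p in optimized])
print("unoptimized:", [p.name for p in unoptimized])
```

All four paths are "active" from the host's point of view, but only the two that land on the owning storage processor avoid the trip over the interconnect.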
This “problem” has been solved with vSphere. VMware vSphere is aware of what the most optimal path to the LUN is. In other words, VMware knows which processor owns which LUNs and preferably sends traffic directly to the owner. If the optimized path to a LUN is dead, an unoptimized path will be selected, and within the array the I/O will be directed via an interconnect to the owner again. The pathing policy MRU also takes optimized / unoptimized paths into account. Whenever there’s no optimized path available, MRU will use an unoptimized path; when an optimized path returns, MRU will switch back to the optimized path. Cool huh!?!
What does this mean in terms of selecting the correct PSP? Like I said, you will have three options: MRU, Fixed and RR. Picking between MRU and Fixed is easy in my opinion: because MRU is aware of optimized and unoptimized paths, it is less static and less error prone than Fixed. When using MRU, however, be aware of the fact that your LUNs need to be equally balanced between the storage processors. If they are not, you might be overloading one storage processor while the other is doing absolutely nothing. This might be something you want to make your storage team aware of. The other option of course is Round Robin. With RR, 1000 commands will be sent down a path before it switches over to the next one. Although theoretically this should lead to a higher throughput, I haven’t seen any data to back this “claim” up. Would I recommend using RR? Yes I would, but I would also recommend performing benchmarks to ensure you are making the right decision.
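The failover and failback behaviour described above can be sketched in a few lines of Python. This is a toy model of the behaviour as I understand it, not VMware's actual implementation; `PathState` and `pick_path` are hypothetical names invented for this example.

```python
from typing import NamedTuple, Optional, Sequence


class PathState(NamedTuple):
    name: str        # hypothetical label, e.g. "hba1:SPA"
    processor: str   # storage processor this path lands on
    alive: bool      # whether the path is currently usable


def pick_path(paths: Sequence[PathState], owner: str,
              current: Optional[str] = None) -> Optional[str]:
    """Prefer live optimized paths; fall back to live unoptimized ones."""
    alive = [p for p in paths if p.alive]
    optimized = [p for p in alive if p.processor == owner]
    candidates = optimized or alive      # unoptimized only if nothing better
    if not candidates:
        return None                      # all paths dead
    names = [p.name for p in candidates]
    # Stick with the most recently used path while it is still a candidate.
    return current if current in names else names[0]


healthy = [PathState("hba1:SPA", "A", True), PathState("hba1:SPB", "B", True)]
degraded = [PathState("hba1:SPA", "A", False), PathState("hba1:SPB", "B", True)]

first = pick_path(healthy, owner="A")                      # optimized path
failed_over = pick_path(degraded, owner="A", current=first)
failed_back = pick_path(healthy, owner="A", current=failed_over)
print(first, failed_over, failed_back)
```

In this sketch the host starts on the optimized path, moves to the unoptimized one when the optimized path dies, and returns to the optimized path as soon as it is available again.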
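To get a feel for what "1000 commands per path" means in practice, here is a small Python sketch. It is a deliberately simplified model (one LUN, a steady stream of commands, every path treated as equal), and the function names `round_robin` and `seconds_per_switch` are my own for this illustration, not anything from VMware.

```python
def round_robin(paths, total_commands, iops_limit=1000):
    """Count commands per path when switching paths every `iops_limit` commands."""
    counts = {p: 0 for p in paths}
    for i in range(total_commands):
        # Integer division gives the index of the current block of commands.
        path = paths[(i // iops_limit) % len(paths)]
        counts[path] += 1
    return counts


def seconds_per_switch(iops_limit, commands_per_second):
    """How long one path stays in use at a steady command rate."""
    return iops_limit / commands_per_second


print(round_robin(["hba1", "hba2"], total_commands=5000))
print(seconds_per_switch(1000, commands_per_second=1000))
```

With 5000 commands and the default limit, the first path ends up with 3000 commands and the second with 2000. Over a longer run the difference evens out. At a steady 1000 commands per second the switch happens once per second, so in this simplified model the load would look balanced on any graph that averages over more than a few seconds.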
Kenneth van Ditmarsch says
Do you know how this was arranged in ESX3.5?
I’ve tested RR in an ESX3.5 environment a while ago (HP EVA) and saw that both paths were loaded equally.
For example, without RR my IOPS were 1000 on the active HBA, and with RR both HBAs did 500.
I’m wondering how this “1000 commands-switch-over-policy” is working. (I know that the 1000-value is customizable.)
Question obviously is, what’s the average time it takes for the 1000 commands to be reached? If this would take for example 1 minute you would see 1 minute load on one HBA and one minute load on the second HBA, right?
Chad Sakac says
The IOoperationsLimit value (default of 1000) isn’t bad – and depending on your environment – can result in a very balanced config. As you increase LUN counts, and VM counts, it all averages out.
It’s only aggregated in the case (which is often what people test) – of a single datastore…..
How long it takes for the command limit to be reached depends on a million parameters….
Can anyone please comment on how ALUA is set up or enabled within an HP MSA 2012i SAN array??
Apparently this is now required because both HP and VMware will only support hardware iSCSI for this SAN.
I have asked HP but have not yet received an answer.
It’s really not practical for us to get a new SAN, sigh, the iSCSI cards are expensive but cheaper than any other alternatives.
Also, for the MSA 2012i, the SAN guide specifically says ALUA and MRU, I imagine this is the only supported configuration…
Thank you, Tom
From what I understand, you cannot have path thrashing with an active/active SAN like the EVA.
You can only have path thrashing with an active/passive SAN like the IBM DS4000 with the path policy set to Fixed. In the case of an active/passive SAN like the IBM DS4000, the cache between the 2 SPs (storage controllers) is not synchronized like on a NetApp FAS3070. So, if you have 2 ESX hosts accessing the same LUN through different SPs, the ownership will flip between the SPs. In some circumstances, like high I/O, the SP will not have time to flush its cache before passing the ownership to the other SP.
In the case of an active/active SAN like the NetApp FAS and probably the EVA (I'm not an EVA expert), the caches are synchronized, so 2 ESX hosts can access the same LUN through different SPs without path thrashing. BUT since one ESX host will not use the preferred path, the data will need to go through the interconnect cable between the SPs (NetApp) so that the SP that owns the LUN can write the data to disk. I did a benchmark on the performance impact of using a non-preferred SP and I saw a degradation of 20% in my environment.
Duncan Epping says
Well, an EVA is an asymmetric active/active array. Although you can access a LUN via both SPs, there’s only 1 SP that owns the LUN. In other words, there’s a penalty for using the non-owner. Check Frank’s excellent article on this topic: http://www.frankdenneman.nl/?p=5
Thanks for the article, Duncan! Is ALUA just a specific protocol which is used with FC? Or is it also usable with iSCSI, independent of the array?
dan pritts says
iSCSI arrays (e.g., Fujitsu DX80) can be ALUA.
Does anyone know whether an IBM DS3500 SAN that is initially active/passive, but gains ALUA functionality with the most recent firmware, becomes an active/active SAN?
I currently have MRU set up, and when trying to split LUN paths between 2 controllers I occasionally get high latency, causing VMware host failures.
Any info would be appreciated.