I really love the discussions going on in some of the blog posts. Some posts even trigger other bloggers to respond. Frank Denneman commented on my “Balancing LUN paths with Powershell” post and briefly explained why load balancing your LUNs on some Active/Active SANs might not always lead to a performance increase – it can even lead to a performance decrease.
Frank was kind enough to elaborate on exactly why this is on his own blog:
The arrays from the EVA family are asymmetric active-active (AAA) arrays. In an asymmetric active-active array both controllers are online and both can accept I/O, but one controller is assigned as the preferred (owning) controller of the LUN. The owning controller can issue I/O commands directly to the LUN. The non-owning controller – or, to make this text more legible, the proxy controller – can accept I/O commands, but cannot communicate with the LUN directly. For example, if a read request reaches the array through the proxy controller, it is forwarded to the owning controller of the LUN.
If the array detects that, within a 60-minute window, at least 2/3 of the total read requests to a LUN are proxy reads, ownership of the LUN is transitioned to the non-owning proxy controller, making it the owning controller. Justin’s PowerShell script assigns the paths the same way on every server, so the EVA should switch the managing controller within the hour. (Assuming multiple ESX hosts run multiple VMs on the LUN, of course.)
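The transition rule above can be sketched as a tiny simulation. This is a simplified model of the behavior as described, not HP’s actual firmware logic; the class and method names are all hypothetical:

```python
# Simplified model of the EVA's implicit LUN ownership transition rule:
# if, over a 60-minute evaluation window, at least 2/3 of the read requests
# to a LUN arrive via the non-owning (proxy) controller, ownership of the
# LUN moves to that controller.

class Lun:
    def __init__(self, owner):
        self.owner = owner          # controller "A" or "B"
        self.owner_reads = 0        # reads arriving via the owning controller
        self.proxy_reads = 0        # reads arriving via the proxy controller

    def read(self, via_controller):
        if via_controller == self.owner:
            self.owner_reads += 1
        else:
            self.proxy_reads += 1   # forwarded internally to the owner

    def evaluate_window(self):
        """Called once per 60-minute window: apply the 2/3 rule, then reset."""
        total = self.owner_reads + self.proxy_reads
        if total and self.proxy_reads * 3 >= total * 2:
            # Transition ownership to the other controller.
            self.owner = "B" if self.owner == "A" else "A"
        self.owner_reads = self.proxy_reads = 0

lun = Lun(owner="A")
for _ in range(80):
    lun.read("B")   # 80 proxy reads
for _ in range(20):
    lun.read("A")   # 20 owning-controller reads
lun.evaluate_window()
print(lun.owner)    # 80% proxy reads exceeds 2/3, so ownership moves to "B"
```

This also illustrates the point about Justin’s script: if every ESX host sends its reads down the same (proxy) path, the 2/3 threshold is crossed and the array quietly moves ownership behind your back.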
Now, you will probably say that this is just what should happen… But for LUNs replicated via HP’s Continuous Access this might be a problem. Go to Frank’s blog and read why…
I was just about to publish this article and noticed that Chad also wrote an article on this subject yesterday! Chad seems to be reading my mind.
An “Active/Passive” array using VMware’s definition would be an EMC CLARiiON, an HP EVA, NetApp FAS or LSI Engenio (rebranded as several IBM midrange platforms). These are usually called “mid-range” arrays by the array vendors. It’s notable that all the array vendors (including EMC) call these “Active/Active” – so we have a naming conflict (hey… “SRM” to storage people means “Storage Resource Management” – not “Site Recovery Manager” 🙂). They are “Active/Active” in the sense that historically each head (storage processor) can carry an active workload, but not for a single LUN. I say historically, because they can also support something called “ALUA”, or “Asymmetric Logical Unit Access” – where LUNs that are “owned” by one storage processor can be accessed via ports on the other through an internal interconnect – each vendor’s implementation and internal interconnect varies. This is moot for the topic of load balancing a given LUN with ESX 3.5, though, because until the next major release, ALUA is not supported. I prefer to call this an “Active/Passive LUN ownership” array. The other big standout is that these “midrange” Active/Passive arrays lose half their “brains” (each vendor calls these something different) if one fails – so you either accept that and oversubscribe, accepting some performance degradation if you lose a brain (acceptable in many use cases), or use the array at only 50% of its performance envelope.
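To make the ALUA idea concrete, here is a minimal sketch of what an ALUA-aware path selector does conceptually: each target port reports an access state for the LUN, and the multipathing layer prefers ports in the “active/optimized” state (on the owning storage processor) over “active/non-optimized” ones (reachable only via the internal interconnect). The function, port names and state strings are illustrative assumptions, not any vendor’s actual API:

```python
# Conceptual ALUA path selection: prefer ports on the owning storage
# processor ("active/optimized") over ports that would proxy the I/O
# across the internal interconnect ("active/non-optimized").

def pick_path(paths):
    """paths: list of (port_name, alua_state) tuples; return the preferred port."""
    preference = {"active/optimized": 0, "active/non-optimized": 1, "standby": 2}
    usable = [p for p in paths if p[1] in preference]
    if not usable:
        raise RuntimeError("no usable path to LUN")
    # Pick the port with the most preferred (lowest-ranked) ALUA state.
    return min(usable, key=lambda p: preference[p[1]])[0]

paths = [
    ("SPB-port0", "active/non-optimized"),  # reachable, but via the interconnect
    ("SPA-port1", "active/optimized"),      # port on the owning storage processor
    ("SPB-port1", "standby"),
]
print(pick_path(paths))   # SPA-port1
```

Since ESX 3.5 does not support ALUA, it cannot make this distinction itself – which is exactly why manually balancing paths on such arrays can accidentally push I/O down the non-optimized side.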
Read Chad’s full article here, because there’s a lot of useful information in it! Thanks, Chad, for clearing this up.