
Yellow Bricks

by Duncan Epping


ESX

Which Metrics does DRS use?

Duncan Epping · Oct 15, 2009 ·

I received a question a while back about DRS-initiated VMotions. One of my customers wanted to know which metrics are used by DRS to decide whether a VM needs to be VMotioned to a different host. These metrics are:

Host CPU: Active (includes run and ready MHz)

Host Memory: Active

Just a little something that's nice to know, I guess. I still need to dive into the actual algorithm that is being used by DRS; if I can find some decent info and have some spare time on my hands, I will definitely write an article about it.
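In the meantime, here is a minimal, purely illustrative sketch of how those two counters could be combined into per-host load figures. The host names and numbers are made up for the example; this is not the actual DRS algorithm, just a way to picture what "active" CPU (run + ready MHz) and active memory look like side by side.

    # Illustrative only: combine "active" CPU (run + ready MHz) and active memory
    # into simple per-host utilization figures. Not the real DRS algorithm.
    hosts = {
        # hypothetical sample data per host
        "esx01": {"run_mhz": 9000, "ready_mhz": 500, "cpu_capacity_mhz": 16000,
                  "active_mem_mb": 24000, "mem_capacity_mb": 32768},
        "esx02": {"run_mhz": 4000, "ready_mhz": 100, "cpu_capacity_mhz": 16000,
                  "active_mem_mb": 8000, "mem_capacity_mb": 32768},
    }

    def host_load(h):
        cpu_active = h["run_mhz"] + h["ready_mhz"]           # "active" CPU in MHz
        cpu_pct = cpu_active / h["cpu_capacity_mhz"]         # fraction of host CPU
        mem_pct = h["active_mem_mb"] / h["mem_capacity_mb"]  # fraction of host memory
        return cpu_pct, mem_pct

    for name, h in hosts.items():
        cpu_pct, mem_pct = host_load(h)
        print(f"{name}: CPU active {cpu_pct:.0%}, memory active {mem_pct:.0%}")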

Active / Standby etherchannels?

Duncan Epping · Oct 12, 2009 ·

I've seen this a couple of times already and just had a very long phone call with a customer who created the following setup:

So basically the first two NICs are active, with load balancing set to IP-Hash and configured as an Etherchannel on the stacked Cisco 3750s. The second pair is "standby", also with load balancing set to IP-Hash and configured as a second Etherchannel on the stacked Cisco 3750s. A diagram probably makes more sense:

Explanation: All NICs belong to the same vSwitch. Etherchannel 01 consists of "vmnic0" and "vmnic3", and both are active. Etherchannel 02 consists of "vmnic1" and "vmnic4", and both are standby.

My customer created this to ensure a 2Gb link is always available. In other words, if "vmnic3" fails, "vmnic1" and "vmnic4" should take over as they are a "pair". But is this really what happens when "vmnic3" fails?

As it turns out, what they expected to happen did not happen. When "vmnic3" failed, VMware ESX "promoted" the first standby NIC to active, which in this case belongs to a different Etherchannel. What happened next was not a pretty sight: the MAC address table went completely nuts, with "MAC flaps" all over the place. I'm not a networking guy, but I can tell you this: introducing a loop when you have PortFast configured is not a smart idea. DON'T DO THIS AT HOME KIDS!
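To illustrate why this goes wrong, here is a small, purely illustrative model of the failover behaviour described above: the teaming policy simply promotes the first available standby NIC, without any notion of which Etherchannel that NIC belongs to on the physical switch. The NIC names match the example; everything else is made up.

    # Illustrative model of NIC teaming failover on this vSwitch. ESX promotes
    # the first standby NIC when an active NIC fails; it has no concept of the
    # Etherchannel "pairs" configured on the stacked 3750s.
    etherchannels = {
        "vmnic0": "Etherchannel 01", "vmnic3": "Etherchannel 01",
        "vmnic1": "Etherchannel 02", "vmnic4": "Etherchannel 02",
    }
    active = ["vmnic0", "vmnic3"]
    standby = ["vmnic1", "vmnic4"]

    def fail_nic(nic):
        """Simulate a link failure and the resulting standby promotion."""
        active.remove(nic)
        promoted = standby.pop(0)        # first standby wins, "pair" or not
        active.append(promoted)
        return promoted

    promoted = fail_nic("vmnic3")
    groups = sorted({etherchannels[n] for n in active})
    print(f"Promoted {promoted}; the active NICs now span: {groups}")
    # The active set now spans both Etherchannels, which is exactly what
    # caused the MAC flapping on the physical switches.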

Disable your USB controller?

Duncan Epping · Sep 30, 2009 ·

While going through my email folders I noticed an email chain about disabling your USB controller to avoid the well-known IRQ sharing situation. This has been extensively described in KB Article 1003710. In short: IRQ sharing limits interrupts to a single CPU. Normally this isn't a problem, but the Service Console is locked to CPU 0, which could lead to a bottleneck when using high-interrupt devices.

With vSphere this isn't the case anymore. As of vSphere, the VMkernel is the proud owner of, for instance, your USB controller. So remember: when you are doing an implementation or creating a design, there's no need to disable the USB controller anymore!
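If you are curious how interrupts are actually spread across CPUs, the sketch below counts them per CPU by parsing /proc/interrupts. This assumes the standard Linux layout of that file and that your (Linux-based) console exposes it; treat it as a quick, generic check rather than an ESX-specific tool.

    # Rough sketch: count how many interrupts each CPU has serviced, assuming
    # the standard Linux /proc/interrupts layout is available.
    with open("/proc/interrupts") as f:
        lines = f.read().splitlines()

    cpus = lines[0].split()              # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)

    for line in lines[1:]:
        fields = line.split()
        # rows look like: "16:  12345  0  IO-APIC-level  usb-uhci, eth0"
        if not fields or not fields[0].endswith(":"):
            continue
        for i in range(len(cpus)):
            if i + 1 < len(fields) and fields[i + 1].isdigit():
                totals[i] += int(fields[i + 1])

    for cpu, total in zip(cpus, totals):
        print(f"{cpu}: {total} interrupts serviced")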

Re: VMFS-3, How Do I Despise Thee

Duncan Epping · Sep 30, 2009 ·

Jason Perlow writes for ZDNet.com, a respectable online magazine. Jason wrote an "excellent" article about VMFS-3 and the fact that it was so hard to copy files from and to these volumes. ("VMware bad / Microsoft good" is the tone of the article.) Jason could only get the trick done by creating an NFS share, mounting it and then copying the data. He even experienced ESX crashes due to storage and network contention problems. (First time I ever heard this, and it makes me wonder what kind of environment we are talking about here.) Scott Lowe already responded, and at first I did not want to respond, but it started itching again when one of my colleagues reminded me of this article today.

In the past I've used various methods to copy files to and from a VMFS-3 volume, but after reading Jason's article I started doubting myself. Did I really attach a USB drive, mount the FAT32 partition and copy files, or am I delusional? Is my vBrain playing tricks on me? No, I'm pretty sure it did work in the past. Other options I've used in the past are of course Veeam's excellent tool FastSCP (Jason, try Google the next time!) and tools like WinSCP. Another option, and most definitely the best option when importing VMs, is of course the freely available VMware Converter, which everyone who is familiar with VMware probably knows and has used at one point in time.

So let's repeat the options mentioned to do this simple task:

  1. Mounting a FAT32 formatted USB Drive
  2. FastSCP
  3. WinSCP
  4. VMware Converter
  5. VMware vSphere/vCenter Client

I guess anyone can use options 2 to 5, but option 1 might be a bit more difficult for some, as according to Jason it is impossible to mount a FAT32-formatted USB drive. Hold on, here we go, and I promise it is going to be a bumpy ride:

  1. Log on to the ESX host and insert the USB drive
  2. Create a folder to mount the USB Drive to:
    mkdir /mnt/usbdrive
  3. Figure out which device to mount:
    dmesg | tail
  4. Mount the FAT32 formatted USB Drive:
    mount /dev/sdc1 /mnt/usbdrive -t vfat
  5. Do your magic

Just to make sure it worked, I copied /etc/vmware/esx.conf to the USB drive. Now let's copy the data to the VMFS volume… hey, that's weird, it works flawlessly. No bumps at all. I don't know what went wrong in Jason's situation / environment, but ESX is perfectly capable of mounting FAT32-formatted USB devices. Hey, if you want to you can even mount NTFS devices, although this is unsupported. But why take the difficult route? Why not hook your disks up to a workstation, install the free VMware Converter and import your VMs the easy way without worrying about mounting drives… Next time, before you try to bash VMFS / VMware, you might want to get your facts straight. Oh, and by the way, it's VMware and not VMWare.
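For completeness, the round trip I used to verify this can also be expressed as a short script. The paths are just examples; "datastore1" is a placeholder for whatever VMFS volume you have under /vmfs/volumes/.

    import shutil

    # Round trip used to verify the mount: copy a file from the ESX host onto
    # the FAT32 USB drive, then from the USB drive onto a VMFS volume.
    # "datastore1" is a placeholder name for an existing VMFS datastore.
    usb = "/mnt/usbdrive"
    vmfs = "/vmfs/volumes/datastore1"

    shutil.copy("/etc/vmware/esx.conf", usb + "/esx.conf")   # ESX -> USB
    shutil.copy(usb + "/esx.conf", vmfs + "/esx.conf")       # USB -> VMFS
    print("Copying to and from the FAT32 USB drive worked")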

What’s that ALUA exactly?

Duncan Epping · Sep 29, 2009 ·

Of course by now we have all read the excellent and lengthy posts by Chad Sakac on ALUA. I'm just a simple guy and usually try to summarize posts like Chad's in a couple of lines, which makes it easier for me to remember and digest.

First of all, ALUA stands for "Asymmetric Logical Unit Access". As Chad explains, and as a Google search shows, it's common for midrange arrays these days to have ALUA support. With midrange we are talking about EMC Clariion, HP EVA and others. My interpretation of ALUA is that you can see any given LUN via both storage processors as active, but only one of these storage processors "owns" the LUN, and because of that there will be optimized and unoptimized paths. The optimized paths are the ones with a direct path to the storage processor that owns the LUN. The unoptimized paths connect to the storage processor that does not own the LUN and reach the owning storage processor indirectly via an interconnect bus.

In the past, when you configured your HP EVA-attached VMware environment (Active/Active according to VMware terminology), you would have had two (supported) options as pathing policies. The first option would be Fixed and the second MRU. Most people, however, used Fixed and tried to balance the I/O. As Frank Denneman described in his article, this does not always lead to the expected results. This is because the path selection might not be consistent within the cluster, and this could lead to path thrashing as one half of the cluster is accessing the LUN through storage processor A and the other half through storage processor B.

This "problem" has been solved with vSphere. VMware vSphere is aware of the most optimal path to the LUN. In other words, VMware knows which storage processor owns which LUNs and preferably sends traffic directly to the owner. If the optimized path to a LUN is dead, an unoptimized path will be selected, and within the array the I/O will be directed via the interconnect to the owner again. The pathing policy MRU also takes optimized / unoptimized paths into account. Whenever there's no optimized path available, MRU will use an unoptimized path; when an optimized path returns, MRU will switch back to the optimized path. Cool huh!?!
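Here is a tiny, purely illustrative model of that behaviour: prefer an optimized path, fall back to an unoptimized one when no optimized path is alive, and fail back as soon as an optimized path returns. The path names are made up, and this is of course not how the NMP is actually implemented; it just captures the decision logic in a few lines.

    # Illustrative ALUA-aware path selection: prefer optimized paths, fall back
    # to unoptimized ones, and fail back when an optimized path returns.
    paths = [
        {"name": "vmhba1:C0:T0:L1", "optimized": True,  "alive": True},
        {"name": "vmhba2:C0:T1:L1", "optimized": False, "alive": True},
    ]

    def select_path(paths):
        alive = [p for p in paths if p["alive"]]
        optimized = [p for p in alive if p["optimized"]]
        return (optimized or alive)[0]   # an optimized path wins whenever one is alive

    print(select_path(paths)["name"])    # optimized path
    paths[0]["alive"] = False            # optimized path dies...
    print(select_path(paths)["name"])    # ...fall back to the unoptimized path
    paths[0]["alive"] = True             # optimized path returns...
    print(select_path(paths)["name"])    # ...fail back to the optimized path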

What does this mean in terms of selecting the correct PSP? Like I said, you will have three options: MRU, Fixed and RR. Picking between MRU and Fixed is easy in my opinion: as MRU is aware of optimized and unoptimized paths, it is less static and error-prone than Fixed. When using MRU, however, be aware of the fact that your LUNs need to be equally balanced between the storage processors; if they are not, you might be overloading one storage processor while the other is doing absolutely nothing. This might be something you want to make your storage team aware of. The other option of course is Round Robin. With RR, 1000 commands will be sent down a path before it switches over to the next one. Although theoretically this should lead to a higher throughput, I haven't seen any data to back this "claim" up. Would I recommend using RR? Yes I would, but I would also recommend performing benchmarks to ensure you are making the right decision.
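To make the Round Robin part concrete as well, here is an equally illustrative sketch of the "switch after 1000 commands" behaviour mentioned above. Again the path names are made up; the only number taken from the text is the 1000-command default.

    # Illustrative Round Robin: send IOPS_PER_PATH commands down a path,
    # then switch to the next path.
    from collections import Counter

    IOPS_PER_PATH = 1000
    paths = ["vmhba1:C0:T0:L1", "vmhba2:C0:T1:L1"]
    current, issued = 0, 0

    def next_path():
        """Return the path for the next command, switching every 1000 commands."""
        global current, issued
        if issued == IOPS_PER_PATH:
            current = (current + 1) % len(paths)   # move on to the next path
            issued = 0
        issued += 1
        return paths[current]

    # Issue 2500 commands and count how many went down each path.
    print(Counter(next_path() for _ in range(2500)))
    # -> 1500 on the first path, 1000 on the second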
