Recently I had a discussion about Layer 2 adjacency for the vMotion (VMkernel interface) network, meaning that all vMotion interfaces, aka VMkernel interfaces, would be required to be on the same subnet or vMotion would not function correctly.
Now I remember when this used to be part of the VMware documentation, but that requirement is nowhere to be found today. I even have a memory of documentation for previous versions stating that layer-2 adjacency was “recommended”, but even that is nowhere to be found. The only reference I could find was an article by Scott Lowe in which Paul Pindell from F5 chips in and debunks the myth, but as Paul is not a VMware spokesperson it is not definitive in my opinion. Scott also just published a correction to his article after we discussed this myth a couple of times over the last week.
So what are the current Networking Requirements around vMotion according to VMware’s documentation?
- On each host, configure a VMkernel port group for vMotion
- Ensure that virtual machines have access to the same subnets on source and destination hosts
- Ensure that the network labels used for virtual machine port groups are consistent across hosts
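To put the first requirement in concrete terms, creating such a VMkernel port from the command line looks roughly like this; the port group name, IP address and netmask are just placeholders, not values from the documentation:

# add a VMkernel NIC with a static IP to an existing port group called "vMotion"
esxcfg-vmknic -a -i 10.10.10.11 -n 255.255.255.0 "vMotion"
# list the VMkernel NICs to verify what was created
esxcfg-vmknic -l

Enabling vMotion on that interface is then a matter of ticking the vMotion checkbox on that VMkernel connection in the vSphere Client.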
Now that got me thinking: why would it even be a requirement? As far as I know vMotion is all layer 3 today, and besides that, the VMkernel interface even has the option to specify a gateway. On top of that, vMotion does not check whether the source VMkernel interface is on the same subnet as the destination interface, so why would we care?
Now that makes me wonder where this myth is coming from… Have we all assumed L2 adjacency was a requirement? Have the requirements changed over time? Has the best practice changed?
Well, one of those is easy to answer: no, the best practice hasn’t changed. Minimizing the number of hops to reduce latency is, and always will be, a best practice. Will vMotion work when your VMkernel interfaces are in two different subnets? Yes, it will. Is it supported? No, it is not, as it has not explicitly gone through VMware’s QA process. However, I have had several discussions with engineering and they promised me that a more conclusive statement will be added to our documentation and the KB in order to avoid any misunderstanding.
Hopefully this will debunk, once and for all, the myth that has been floating around for long enough. As stated, it will work; it just hasn’t gone through QA and as such cannot be supported by VMware at this point in time. I am confident, though, that over time this statement will change to increase flexibility.
RussellCorey says
Perhaps the requirement to have L2 adjacency for the VM itself, to avoid a service interruption, is being confused with a VMkernel requirement. After all, if you vMotion the VM into a different broadcast domain you are effectively dropping it off the network.
Duncan Epping says
Yes, that is an often-made mistake indeed…
Chuck says
Isn’t there a common default gateway for all VMkernel interfaces? If you have separate subnets for management, vMotion, and IP storage, only one of those could be routed. This would be less restrictive in ESX since the vswif has its own gateway, but it seems like if you want to remove the layer-2 adjacency requirement you have to give up isolating vMotion traffic to its own VLAN.
Brandon says
I believe Chuck is spot on. What about ESXi? Its management interface is also a VMkernel interface, so if it needs to be on its own subnet, separated from vMotion traffic, that would pretty much limit vMotion to being on a layer 2 isolated network, since the gateway would be different otherwise. The same goes for FT and iSCSI/NFS as far as segregated layer 2 goes. Both ESX and ESXi have the same problem, but as Chuck pointed out, at least ESX lets you separate out the management functions with their own network. Lots of places have requirements that the management interface be separated.
I recently went through this with ESXi… and sure enough, vMotion traffic is effectively forced to be unroutable in order to keep the management traffic separate.
NiTRo says
For an obscure reason coming from the networking team, I have a cluster where some ESXi hosts’ VMkernel ports are not on the same subnet as the others. vMotions have been fine for months, only slower.
Victor Costa says
Well,
That must be right only for nodes that are members of the same cluster… where DRS should move the VMs all the time…
Let’s assume that we have 320 hosts and are building a HUGE cloud…
We can split the 320 nodes into 10 clusters of 32 nodes, but this isn’t ideal, so let’s split them into 20 clusters of 16 nodes to permit growth… and one NAS will be adopted to permit mobility of VMs across the cloud (Storage vMotion to the NAS + machine vMotion to the destination cluster)…
Adopting these definitions, if all the clusters share the same vMotion subnet, the broadcast domain will be so HUGE that it could cause network constraints…
So, in this case, can we split the cloud into several subnets and still have the environment supported by VMware? Can mobility across the cloud be done, or will we be fenced into the cluster? If L2 is an explicit requirement, what about mobility across datacenters in an extended SAN scenario? And the future “teleport” solution?
The way should be to define the maximum latency for vMotion…
Nathan says
Way back when, in the 2.5 days when vMotion was new, it was advised that vMotion must have its own dedicated network. Even VMware reps here in OZ still tout this as good (best) practice, as do I.
– The primary reason for this is to isolate and minimise any possible disruption to the migration of the live state of the VM. See NiTRo’s comment above regarding performance.
– This is unencrypted data and was always touted as a possible attack vector. OK, this one is a bit thin, but it is possible.
As also previously stated, ESXi now supports only one gateway, so if you are following the old design rules you will have vMotion on a network for which you cannot configure a default gateway; even if the network is routable, the vMotion interface will not be routable.
Duncan Epping says
@Chuck / @Brandon: Not sure what the problem would be, as with esxcfg-route you can just add a static route per network. So even if you have 10 different storage networks which need to be routed, you could do it (rough example below).
Even with ESXi you can do this @Nathan.
I do agree that routing NFS/iSCSI/vMotion should never be recommended. Adding hops will increase latency, etc.
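As a rough illustration of that static-route approach (all addresses here are placeholders, not from a real environment), it would look something like this on the host itself:

# show the current VMkernel routing table and default gateway
esxcfg-route -l
# add a static route: reach the remote vMotion/storage subnet via the router on the local VMkernel network
esxcfg-route -a 10.20.30.0/24 10.10.10.1

Repeat the same for each routed storage network, which is what makes the “10 different storage networks” scenario workable.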
Michael Poore says
Technologies such as Cisco’s OTV (http://www.cisco.com/en/US/prod/switches/ps9441/nexus7000_preso.html) could introduce some issues in the future for those that use them. Potentially the vmkernel subnet could be made to span multiple datacentres. Clarification from VMware with regard to support would be good in this area.
For information, I’m due to be working on a couple of projects in this area in the coming months, and whilst the intention is not to configure vMotion between datacentres (the object of the first project is to simplify DR/BCP), there will be sufficient bandwidth to support long-distance vMotion, although the latency will be another issue. If nothing else, it’ll be interesting to experiment!
Brandon says
@Duncan, ah… I learn something new all the time. Just goes to show you what relying too much on the vSphere Client will do ;). I just checked and esxcfg-route is available via the vMA/vCLI. I do hope they add a way to do this to the GUI eventually, but that is a small gripe.
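For reference, running it remotely through the vCLI looks roughly like this; the connection options and exact flags are from memory, so double-check them against the vCLI reference, and the host name and addresses are just placeholders:

# add the same static route remotely via the vCLI flavour of the command
vicfg-route --server esx01.example.com --username root --add 10.20.30.0/24 10.10.10.1
# list the routing table to verify
vicfg-route --server esx01.example.com --username root --list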
There is still some additional administration burden compared to using a gateway (and just one more thing to remember), but usually for these types of traffic you always know where it is going and it probably doesn’t change often. Besides, using layer 3 seems somewhat of an anomaly, as every situation I’ve encountered for vMotion, iSCSI, FT or whatever is always on its own unroutable VLAN anyway.
Doug Baer says
I always looked at the L2 adjacency for vMotion as something that people inferred based on the fact that the same networks had to be available to the VMs that were being vMotioned. As @Duncan and others here mentioned, L2 adjacency for that network had the added benefit of keeping the latency to a minimum. With all of the tricks, like OTV mentioned by @Michael Poore, to make things look like they are L2 adjacent these days, it is probably simpler just to route the traffic and eliminate the overhead.
I think people often misunderstand that a ‘default gateway’ is just the path of last resort. There is a lot more to IP routing than that little gem. 🙂
Andy Daniel says
Duncan,
Any update on QA for this by any chance? We’ve been doing long-distance vMotion over OTV without issue but are thinking about just removing the OTV layer if it isn’t required.
Thanks,
Andy
John Kennedy says
Andy,
Don’t go throwing out OTV just yet. The VMs still need L2 adjacency. Also, configuring the routing on the ESX server to allow the vMotion network to route can cause the console interfaces to _not_ route (you do have those on separate VLANs, don’t you?). If you want to inject some static routing into your systems, fine, but I am of the opinion that minimizing complexity is its own reward. And since you have the OTV product running anyway, it seems awkward to do your routing at the server level when you could be managing it centrally.