Migrating to a VDS switch when using etherchannels

Duncan Epping · Feb 21, 2012

Last week at PEX I had a discussion about migrating to a Distributed Switch (VDS) and some of the challenges one of our partners faced. During their migration they ran into a lot of network problems, which made them decide to change back to a regular vSwitch. They were eager to start using the VDS but could not risk running into the same problems again.

I grabbed a piece of paper and we quickly drew out the architecture currently implemented at this customer site, and discussed the steps the customer took to migrate. The steps they described were exactly the steps documented here, and there was absolutely nothing wrong with them. At least not at first sight… When we dived into their architecture a bit more, a crucial keyword popped up: etherchannels. Why is this a problem? Well, look at this process for a minute:

  • Create Distributed vSwitch
  • Create dvPortgroups
  • Remove vmnic from vSwitch0
  • Add vmnic to dvSwitch0
  • Move virtual machines to dvSwitch0 port group

Just imagine you are using an etherchannel and traffic is being load balanced, but now you have one “leg” of the etherchannel ending up in dvSwitch0 and the other still in vSwitch0. Not a pretty sight indeed. In this scenario the migration path would need to be:

  1. Create Distributed vSwitch
  2. Create dvPortgroups
  3. Remove all the ports from the etherchannel configuration on the physical switch
  4. Change vSwitch load balancing from “IP Hash” to “Virtual Port ID” (see the sketch below the list)
  5. Remove vmnic from vSwitch0
  6. Add vmnic to dvSwitch0
  7. Move virtual machines to dvSwitch0 port group
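
For those who prefer to script step 4, below is a minimal pyVmomi (Python) sketch of that change. The vCenter and host names, credentials and the vSwitch name are placeholders, and the exact same change can of course be made through the vSphere Client in the vSwitch NIC Teaming properties.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Step 4 sketch: flip vSwitch0 from "Route based on IP hash" to
# "Route based on the originating virtual port ID" before the first
# vmnic is pulled out of the etherchannel.
si = SmartConnect(host="vcenter.local", user="administrator",
                  pwd="secret",
                  sslContext=ssl._create_unverified_context())  # lab only

host = si.content.searchIndex.FindByDnsName(dnsName="esxi01.local",
                                            vmSearch=False)
net_sys = host.configManager.networkSystem

# Reuse the current vSwitch0 specification and only change the teaming policy.
vswitch = next(v for v in net_sys.networkInfo.vswitch if v.name == "vSwitch0")
spec = vswitch.spec
spec.policy.nicTeaming.policy = "loadbalance_srcid"  # was "loadbalance_ip"
net_sys.UpdateVirtualSwitch(vswitchName="vSwitch0", spec=spec)

# Portgroups can override the vSwitch teaming policy, so any portgroup that
# explicitly sets IP hash would need the same treatment via UpdatePortGroup().
Disconnect(si)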

For the second vmnic (and any subsequent NICs) only steps 5 and 6 need to be repeated. After this the dvPortgroup can be configured to use “IP hash” load balancing and the physical switch ports can be added to the etherchannel configuration again. You can repeat this for additional portgroups and VMkernel NICs.
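
For that last part, a hedged pyVmomi sketch of flipping the dvPortgroup to IP hash once all vmnics are attached to dvSwitch0 could look like the snippet below. It assumes the connection from the previous sketch and that dv_portgroup is the dvPortgroup object you already looked up by name.

from pyVmomi import vim

def set_dvportgroup_ip_hash(dv_portgroup):
    # Reconfigure a dvPortgroup to "Route based on IP hash". Only run this
    # once every uplink of the team lives on dvSwitch0.
    cfg = vim.dvs.DistributedVirtualPortgroup.ConfigSpec()
    cfg.configVersion = dv_portgroup.config.configVersion  # required for a reconfigure
    port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy()
    teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy()
    teaming.policy = vim.StringPolicy(value="loadbalance_ip")  # IP hash
    port_cfg.uplinkTeamingPolicy = teaming
    cfg.defaultPortConfig = port_cfg
    # Returns a Task; wait for it to complete before re-creating the
    # etherchannel on the physical switch ports.
    return dv_portgroup.ReconfigureDVPortgroup_Task(spec=cfg)

Just as on a standard vSwitch, IP hash on the dvPortgroup expects all uplinks to be active (no standby) and all of them to be part of the channel on the physical side.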

I do want to point out that I am personally not a huge fan of etherchannel configurations in virtual environments. One of the reasons is the complexity, which often leads to problems when things are misconfigured; the scenario described above is a good example of how things can go wrong. If you don’t have any direct requirement to use IP hash… use Load Based Teaming (LBT) on your VDS instead. Believe me, it will make your life easier in the long run!
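
If you want to check where you stand before (or after) making that switch, a quick pyVmomi sketch along these lines can report which load balancing policy each dvPortgroup actually uses; “loadbalance_loadbased” is the value behind “Route based on physical NIC load” (LBT), “loadbalance_ip” is IP hash and “loadbalance_srcid” is the originating virtual port ID. The session object si from the earlier sketch and the dvSwitch name are assumptions.

from pyVmomi import vim

def report_teaming_policies(si, dvs_name="dvSwitch0"):
    # List every dvPortgroup on the given dvSwitch with its teaming policy,
    # handy for spotting anything that still depends on IP hash.
    view = si.content.viewManager.CreateContainerView(
        si.content.rootFolder, [vim.DistributedVirtualSwitch], True)
    dvs = next(d for d in view.view if d.name == dvs_name)
    view.Destroy()
    for pg in dvs.portgroup:  # note: this includes the uplink portgroup
        teaming = pg.config.defaultPortConfig.uplinkTeamingPolicy
        print(pg.name, teaming.policy.value)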

 

Storage 5.0, distributed vswitch, vds, vSphere


Comments

  1. Manish says

    21 February, 2012 at 16:18

    LBT in conjunction with NIOC (if the NIC supports the feature) is the best combination on a vDS, rather than using IP Hash, and most likely this situation happens in the blade scenario with only 2 x 10 GbE NICs. So Duncan has rightly suggested using LBT on the vDS.

  2. Sean says

    21 February, 2012 at 16:31

    Duncan,
    I encountered this exact problem when switching to VDS. I have 2 10Gb NICs teamed and passing traffic for both VMkernel and VMs in a DR site that is yet to be completed (using SRM). As soon as I would migrate one vmnic to the VDS I would lose all communication with the host and its VMs. This was a particularly nasty problem because I first tried this on the host where vCenter lived, so you can imagine the kind of headache I had. For the uninitiated, you need vCenter to manage the VDS in ESXi 4.1. I had our networking guys test the etherchannel on their end by shutting ports down and we never lost connectivity. I came to the conclusion that maybe we don’t need etherchannel but have yet to convince everyone else. Too bad you didn’t write this about two weeks ago. 🙂

    At some point we will migrate to the Nexus 1000V and I remember seeing in a feature matrix that LACP is fully supported; will an ether/port channel on the switch be required for that setup? I tried engaging VMware support and they sent me some Cisco links to read up on, but I really haven’t had the time to read them fully.

    Keep up the great work with the blog, you’re a permanent fixture in my RSS reader!

    Thanks,
    Sean

  3. Rob Quast says

    22 February, 2012 at 06:55

    I would also add the following step between #2 and #3: “put the host in maintenance mode”. It should be assumed, but someone may overlook it when reading. It’s pretty much a guarantee that some traffic will drop between those two steps, which would be disastrous for production traffic (especially iSCSI/NFS).

    And if you are migrating the management portgroup you’ll need to put all but one vmnic in the unused state or there is a good chance you will lose management connectivity after step 2.

    I’ll echo the “down with etherchannel” notion. It has its place, which is in the traditional network. I’m not sold on LBT for all traffic though.

    Thanks as always for the post.

  4. Michael Webster says

    26 February, 2012 at 13:31

    @Rob, the problem with having the host in maintenance mode is that you can’t move the VMs over to the vDS. This will cause you problems later if you only have 2 NICs in the hosts; it is fine if you’re only moving VMkernel ports. The Migrate Networking wizard that is part of the vDS configuration will migrate VMs without any interruption of service and is the supported method to do the migration. Of course, based on this article, you’ll be disabling etherchannel before doing it.

    I totally agree with Duncan. Etherchannel doesn’t offer any benefit in the vast majority of cases, is overly complicated, and can cause issues such as this. LBT and NIOC are far better and far simpler solutions.

    That being said, @Sean, the N1KV includes full LACP support and, if I recall correctly, 14 or 17 different load balancing algorithms. Even if you’re not on vSphere 5 this allows load balancing of vMotion streams (if there is more than one concurrent stream). However, any individual stream is limited to the capacity of a single link. But with a hash based on src/dst IP and port, and potentially MAC, you get a pretty good balance in most cases.

  5. Shan says

    7 June, 2012 at 01:05

    Thanks a lot Duncan, I encountered the same issue with our new HP DL360 Gen8 with 10GbE FlexLOM. I was scratching my head for a few days and luckily saw your article. Excellent post!!!
