When I was writing my “Configuring VXLAN” post I was trying to dig up some details around the VXLAN requirements and recommendations for running a full “VMware” implementation. Unfortunately I couldn’t find much, or at least not a single place with all the details. I figured I would gather everything I could find and put it into a single post to make it easier for everyone.
Virtual:
- vSphere 5.1
- vShield Manager 5.1
- vSphere Distributed Switch 5.1.0
- Portgroups will be configured by vShield Manager; the recommendation is to use either “LACP Active Mode”, “LACP Passive Mode” or “Static Etherchannel”
- When “LACP” or “Static Etherchannel” (Cisco only) is configured, note that a port channel / EtherChannel will need to be created on the physical side
- “Fail Over” is supported, but not recommended
- You cannot configure the portgroup with “Virtual Port ID” or “Load Based Teaming”; these are not supported
- Requirement for MTU size of 1600 (Kamau explains why here)
Physical:
- It is recommended to have DHCP available on the VXLAN transport VLANs, although fixed IP addresses also work!
- VXLAN port (UDP 8472) is opened on firewalls (if applicable)
- Port 80 is opened from vShield Manager to the Hosts (used to download the “vib / agent”)
- For Link Aggregation Control Protocol (LACP), 5-tuple hash distribution must be enabled
- MTU size requirement is 1600
- It is strongly recommended to enable IGMP snooping on the L2 switches to which VXLAN participating hosts are attached, and to enable an IGMP querier on the router or L3 switch with connectivity to the multicast-enabled networks
- If VXLAN traffic is traversing routers, multicast routing must be enabled
- The recommended multicast protocol to deploy for this scenario is Bidirectional Protocol Independent Multicast (PIM-BIDIR), since the hosts act as both multicast speakers and receivers at the same time
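To make the MTU requirement a bit more tangible, here is a rough sketch of the arithmetic, assuming an IPv4 transport network and the standard VXLAN encapsulation (inner Ethernet frame wrapped in VXLAN, UDP and IP headers). The function name and defaults are mine and purely illustrative; see Kamau's primer for the actual explanation.

```python
# Bytes added inside the outer IP packet when a guest frame is VXLAN-encapsulated.
# Assumption: IPv4 transport, no IP options.
INNER_ETHERNET = 14   # the guest's own Ethernet header travels as payload
INNER_VLAN_TAG = 4    # only present if the inner frame keeps an 802.1Q tag
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20


def required_transport_mtu(guest_mtu: int = 1500, inner_vlan_tag: bool = False) -> int:
    """Return the smallest transport MTU that carries a full-size guest packet."""
    overhead = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4
    if inner_vlan_tag:
        overhead += INNER_VLAN_TAG
    return guest_mtu + overhead


if __name__ == "__main__":
    print(required_transport_mtu())            # 1550
    print(required_transport_mtu(1500, True))  # 1554
```

With a standard 1500-byte guest MTU this comes out at 1550 (or 1554 with an inner VLAN tag), so the 1600 requirement simply leaves some headroom on top of the encapsulation overhead.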
That should capture most requirements and recommendations. If anyone has any additions please leave a comment and I will add it.
** Please note: Proxy ARP is not a requirement for a VXLAN / VDS implementation; it is only a requirement when the Cisco Nexus 1000v is used **
References:
VXLAN Primer by Kamau
vShield Administration Guide
Internal training ppt


Is there any documentation available that specifies why Virtual Port ID and LBT are not supported?
Not that I have seen. By the way, when configuring VXLAN they are not even presented as an option.
Right. I think the confusion lies in the fact that LACP is tied to the Uplink Port Group, whereas the failover mode for the VXLAN port group is initially set to “Use explicit failover order” when created. If I enable LACP on the Uplink Port Group, then I’d want to update the VXLAN port group to use “Route based on IP Hash”.
I’m curious whether LACP is absolutely required on the Uplink Port Profile, because so far there have been no warnings saying it needs to be enabled when setting up vCloud.
@JustinG: Not supported (as in not an option), as Virtual Port ID and LBT only apply when you have more than one active NIC attached to a portgroup. You do have to run IP Hash load balancing when using teamed NICs, though, to get a good distribution of traffic across your team.
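For anyone wondering what IP Hash means for traffic distribution, here is a toy model in Python. It assumes the uplink is picked by XOR-ing the source and destination IPv4 addresses and taking the result modulo the number of uplinks; that is a simplification for illustration, the function name is made up, and the real vDS behaviour may differ in detail.

```python
import ipaddress


def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Pick an uplink index from a source/destination IPv4 pair (toy model)."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    # XOR the two 32-bit addresses, then fold the result onto the uplinks.
    return (src ^ dst) % uplink_count


if __name__ == "__main__":
    # Hypothetical VTEP addresses on a two-uplink team.
    print(ip_hash_uplink("10.0.0.1", "10.0.0.2", 2))  # 1
    print(ip_hash_uplink("10.0.0.1", "10.0.0.3", 2))  # 0
```

In this model a given pair of addresses always lands on the same uplink, so the spread across the team comes from talking to many different peers rather than from any single conversation.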
@Duncan, what I don’t understand yet is that Enterprise Plus appears to be a major requirement for all the vCloud Networking and Security editions. Some customers will have Enterprise Plus, so the software services will work, but other customers might have Standard or Enterprise editions. I am saying this because all of the documentation talks about the distributed switch requirement and setup.
I am just thinking about the support compatibility.
Distributed vSwitch is a requirement for VXLAN, so you need Enterprise Plus. Not sure what the confusion is?
Do you know what the fallback/consequences may be with upping the MTU to 1600 on all switches? Consider a flat network and only 1 VLAN and subnet. You’d have to up the MTU on all production switches and firewall routers on both sides (each datacenter) – I feel like this may cause some packet fragmentation with all the other devices (pc, thin client, printers) that share some of these switches/firewalls.
I am not sure I am following it. But you can set MTU on a port level. I also think it is very unlikely that a datacenter with a flat VLAN and a single subnet is going to implement VXLAN. VXLAN is targeted at large environments.
Some more details about fragmentation can be found here: https://supportforums.cisco.com/thread/2062337
Well, the gateway port on the firewall coming into the network will have to be 1600, and a lot of data passes through this port, right? Both user traffic and VXLAN traffic. We are a small shop with a lot of sensitive data, so this is more or less our setup, and we will be implementing VXLAN.
Another requirement if VXLAN is traversing routers (providing L2 adjacency over L3 networks) at least today, is that you enable Proxy ARP on the first hop routers. This is because the VXLAN does not use the host routing table to ensure the VXLAN vmknic is used for traffic between VTEPs.
After some internal discussion and testing I have to update the statement above. Proxy ARP is only required in the Nexus 1000v implementation of VXLAN. VXLAN on a VMware VDS does not require Proxy ARP if VTEPs are in different L2 networks.
Thanks Ray!
I guess a better question is – can this even be used over a VPN?
I am not sure I am following it. You want to do VXLAN from where to where across what using what?
Assume my stretched cluster runs from OfficeA to a colo over a 100 Mbit link, but the link is not Metro-E, it’s a VPN.
VXLAN offers the following benefits:
Flexibility: Datacenter server and storage utilization and flexibility is maximized through the support of “stretched clusters” that cross switching and pod boundaries
Streamlined Network Operations: VXLAN runs on standard Layer 3 IP networks, eliminating the need to build and manage a large Layer 2 underlying transport layer.
Investment Protection: VXLAN runs over standard switching hardware, with no need for software upgrades or special code versions on the switches.
Hi Duncan;
Correct me if I’m wrong, but the recommendation to have DHCP on the VXLAN transport VLANs only applies to the VMkernel interfaces (vmknics) that get auto-magically added to each ESXi host participating in the VXLAN. Correct?
Also, the VXLAN attributes, such as “LACP”, “Static Etherchannel” and “Failover”, affect the “vDS” Teaming Policy (not the Port Group) and are configured within vShield Manager, whereas the requirement to use “explicit failover order” or “route based on IP hash” is really part of the “Port Group” Teaming and Failover Policy and is configured via the Web/vSphere Client interface only.
Thx
Hi Ron,
With regards to DHCP you are correct, this is for the vmknics that carry the VXLAN traffic. They are configured with DHCP by default, but static also works… it is just a matter of correcting the config at that point.
With regards to the teaming policy, the selected teaming policy (lacp / static / failover) dictates (as far as I have seen) how the VDS is configured. I will validate that tomorrow when I have the time.
Meaning you fix it directly on the vmkernel interface… right? Just checking, as I’m configuring this now and want to make sure I understand. We’re not going to do cross-site VXLAN (yet), or probably ever, so to get all these things talking it’s probably best to use a /8 network, since we have the freedom of doing it that way. In our case, it doesn’t need to be routable…
Hi again Duncan;
I have one more question related to VXLAN requirements that I’m hoping you can answer as well.
When values for the “Pool of Segment IDs” and “Multicast Address Range” are identified within vShield Manager, can this be an arbitrary pool and range or do “static/pre-determined” entries need to be identified by the Network Ops Team? Reason I ask is because we, as vCloud Admins, do not have visibility or access to the physical network and merely want to ensure we are aware of and communicate all backend requirements in advance.
Thx.
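Since the physical network has to carry these multicast groups (see the IGMP and multicast routing items in the post), it seems sensible to agree the values with the network team rather than pick them blindly. A quick sanity check before sending a proposal could look like the sketch below. The function name and the example values are hypothetical; the only hard facts used are that a VXLAN segment ID is a 24-bit value and that IPv4 multicast lives in 224.0.0.0/4.

```python
import ipaddress

MAX_SEGMENT_ID = 2**24 - 1  # the VXLAN segment ID (VNI) field is 24 bits wide


def check_vxlan_pools(seg_start: int, seg_end: int,
                      mcast_start: str, mcast_end: str) -> list[str]:
    """Return a list of problems found in a proposed pool and range (empty if none)."""
    problems = []
    if not (0 <= seg_start <= seg_end <= MAX_SEGMENT_ID):
        problems.append("segment ID pool must be ordered and fit in 24 bits")
    first = ipaddress.IPv4Address(mcast_start)
    last = ipaddress.IPv4Address(mcast_end)
    if first > last:
        problems.append("multicast range start is after its end")
    if not (first.is_multicast and last.is_multicast):
        problems.append("multicast range must lie within 224.0.0.0/4")
    return problems


if __name__ == "__main__":
    # Hypothetical values, not a recommendation.
    print(check_vxlan_pools(5000, 5999, "239.1.1.0", "239.1.1.255"))  # []
```

This only checks that the numbers are well-formed; whether a given multicast range is actually free and permitted on the switches and routers is still something only the network team can confirm.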
Hi Duncan, another invaluable article, thanks.
I’ve just configured VXLAN on our setup, a couple of observations/comments:
1. We are using a pair of 10Gb adapters for our dVS which also carry iSCSI. As the requirement on an iSCSI vmk is to only have a single NIC, this means our dVS/adapters cannot participate in any kind of switch adapter teaming. Therefore the *only* possible configuration is “Fail Over”. I do hope it stays supported.
2. Why DHCP on the new vmk used for transport? DHCP doesn’t really go hand in hand with datacentre infrastructure, and what if your DHCP server was unavailable? I have configured static IP, again I do hope I don’t get caught out !
1) as far as I am aware “failover” will stay supported
2) this is the default they selected, but if no DHCP is available fixed is fully supported / tested and should not give any problems
Many thanks. I just found an update from Kendrick “I had some DHCP lease issues today. Changing them to Static fixed the issue right up.”.
Static addressing FTW !
Great post Duncan. Just a couple of points I wanted to throw out there. An all-software VXLAN, while certainly doable, does cause a level of performance degradation (look for the VXLAN VMware ESX performance evaluation paper by VMware), and hence in some cases it makes perfect sense to introduce hardware VTEPs. Arista 7150 series switches support VXLAN and provide a twofold benefit: 1. offloading compute resource consumption from ESX hosts, and 2. bringing bare metal servers, appliances etc. that are not VXLAN-aware into the VXLAN domain by serving as a gateway.
Interesting joint demo between VMware & Arista at VMworld (http://www.aristanetworks.com/media/system/pdf/VMworld_Demo_Brief.pdf)
Hi Duncan.
Here’s a good’un. When you create a virtual wire you’re allowed to name it “nicely”, then a dvs portgroup is created based on that name. You can later on rename the virtual wire if you wish inside VCNS. I have been asked whether it’s OK to rename the dvs portgroups though. Certainly in my testing I’ve found no issues with renaming them, and it makes life easier for vSphere Admins when choosing networks to hook vNICs up to. But I wondered if you had any thoughts on this, like “Hell no!! Leave them alone!!”