
Yellow Bricks

by Duncan Epping



VXLAN basics and use cases (when / when not to use it)

Duncan Epping · Nov 2, 2012 ·

I have been getting so many hits on my blog for VXLAN that I figured it was time to expand a bit on what I have written so far. My first blog post was about Configuring VXLAN, the steps required to set it up in your vSphere environment. As I had many questions about the physical requirements, I followed up with an article about exactly that: VXLAN Requirements. Now I am seeing more and more questions around where and when VXLAN would be a great fit, so let's start with some VXLAN basics.

The first question that I would like to answer is what does VXLAN enable you to do?

In short, and I am trying to make it as simple as I possibly can here… VXLAN allows you to create a logical network for your virtual machines across different networks. More technically speaking, you can create a layer 2 network on top of layer 3. VXLAN does this through encapsulation. Kamau Wanguhu wrote some excellent articles about how this works, and I suggest you read those if you are interested. (VXLAN Primer Part 1, VXLAN Primer Part 2) On top of that I would also highly recommend Massimo's Use Case article, there is some really useful info in there! Before we continue, I want to emphasize that you could potentially create around 16 million networks using VXLAN; compare this to the roughly 4000 VLANs and you understand why this technology is important for the software defined datacenter.
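Those two numbers come straight from the size of the identifier fields: a VLAN ID is 12 bits, while the VXLAN Network Identifier (VNI) carried in the VXLAN header is 24 bits. A quick back-of-the-envelope calculation in Python:

  # VLAN IDs are 12-bit, VXLAN Network Identifiers (VNIs) are 24-bit
  vlans = 2 ** 12   # 4096 (a few of these are reserved in practice)
  vnis = 2 ** 24    # 16777216, the ~16 million segments mentioned above

  print(vlans, vnis, vnis // vlans)  # 4096 16777216 4096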

Where does VXLAN fit in and where doesn’t it (yet)?

First of all, let's start with a diagram.

[Diagram: vxlan basics - 01]

In order for the VM in Cluster A, which has "VLAN 1" as its virtual machine network, to talk to the VM in Cluster B (using VLAN 2), a router is required. This by itself is not overly exciting, and typically everyone will be able to implement it using a router or a layer 3 switching device. In my example I have two hosts per cluster just to simplify the picture, but imagine this being a huge environment, which is exactly why many VLANs are created: to restrict the failure domain / broadcast domain. But what if I want VMs in Cluster A to be in the same broadcast domain as the VMs in Cluster B? Would I go around and start plumbing all my VLANs to all my hosts? Just imagine how complex that would get, and fairly quickly too. So how would VXLAN solve this?

Again, diagram first…

[Diagram: vxlan basics - 02]

Now you can see a new component in there; in this case it is labeled "vtep". This stands for VXLAN Tunnel End Point. As Kamau explained in his post, and I am going to quote him here as it is spot on…

The VTEPs are responsible for encapsulating the virtual machine traffic in a VXLAN header as well as stripping it off and presenting the destination virtual machine with the original L2 packet.
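To make the encapsulation a bit more concrete, below is a minimal Python sketch of what a VTEP conceptually does. The 8-byte VXLAN header layout (a flags byte plus a 24-bit VNI, with the remaining bits reserved) comes from the VXLAN specification; the function names are purely illustrative, and the real work of course happens in the hypervisor kernel, not in a script:

  import struct

  def vtep_encapsulate(inner_l2_frame: bytes, vni: int) -> bytes:
      # VXLAN header: flags (0x08 = VNI is valid) + 24 reserved bits,
      # then the 24-bit VNI + 8 reserved bits. The result is carried in
      # an outer UDP/IP/Ethernet envelope (UDP 8472 in this vSphere release).
      header = struct.pack("!II", 0x08 << 24, vni << 8)
      return header + inner_l2_frame

  def vtep_decapsulate(vxlan_payload: bytes) -> tuple:
      # Strip the header; the original L2 frame is presented to the
      # destination virtual machine untouched.
      flags_word, vni_word = struct.unpack("!II", vxlan_payload[:8])
      return vni_word >> 8, vxlan_payload[8:]

The sending VTEP performs the first function, and the VTEP in front of the destination VM performs the second.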

This encapsulation allows you to create a new network segment: layer 2 over layer 3. But what if you have multiple VXLAN wires? How does a VM on VXLAN Wire A communicate with a VM on VXLAN Wire B? Traffic will flow through an Edge device, vShield Edge in this case, as you can see in the diagram below.

[Diagram: vxlan basics - 03]

So how about applying this cool new VXLAN technology to an SRM infrastructure or a stretched cluster infrastructure? Well, there are some caveats and constraints (right now) that you will need to know about; some of you might have already spotted one in the previous diagram. I have had these questions come up multiple times, which is why I want to get this out in the open.

  1. In the current version you cannot “stitch” VXLAN wires together across multiple vCenter Servers, or at least this is not supported.
  2. In a stretched cluster environment a VXLAN implementation could lead to traffic tromboning.

So what do I mean by traffic tromboning? (Also explained in this article by Omar Sultan.) Traffic tromboning means that you could potentially have traffic flowing between sites because of the placement of a vShield Edge device. Let's depict it to make it clear; I stripped this down to the bare minimum, leaving VTEPs, VLANs and so on out of the picture, as it is complicated enough.

In this scenario we have two VMs, both sitting in Site A, and in Cluster A to be more specific… even on the same host! When these VMs want to communicate with each other, they will need to go through their Edge device as they are on different wires, represented by different colors in this diagram. However, the Edge device sits in Site B. This means that for these VMs to talk to each other, traffic will flow through the Edge device in Site B and then come back to Site A, to the exact same host. Yes indeed, there is an overhead associated with that. With two VMs that is probably minor, but with thousands of VMs it could be substantial. Hence the reason I wouldn't recommend it in a stretched environment.

[Diagram: vxlan basics - 04]

Before anyone asks though: yes, VMware is fully aware of these constraints and caveats and is working very hard towards solving them, but for now… I personally would not recommend using VXLAN for SRM or stretched infrastructures. So where does it fit?

I think a few use cases have already been mentioned in this post, but let's recap. First and foremost, the software defined datacenter. Being able to create new networks on the fly (for instance through vCloud Director, or vCenter Server) adds a level of flexibility which is unheard of. Then there are those environments which are closing in on the 4000 VLAN limitation. (On some platforms this limit is even lower.) Another option is sites where each cluster has a given set of VLANs assigned that are not shared across clusters, but where there is a requirement to place VMs across clusters in the same segment.

I hope this helps…

Database clustering support for vCloud Director added in version 5.1!

Duncan Epping · Oct 18, 2012 ·

Those who have been architecting vCloud Director environments from the early days know that this has always been a pain point. I personally have had many discussions with product management and engineering to get support for database clustering like Oracle RAC or Microsoft Cluster Service for MS SQL. Unfortunately, neither 1.0 nor 1.5 supported it. So the big question always was: when will database clustering support for vCloud Director be added?

I had a couple of discussions around this again last week and noticed it was still not listed until someone pointed me to the vCAT 3.0 documents. Hidden on page 110 of document “3a Architecting a VMware vCloud.pdf” I found the following statement:

VMware vCloud component database resiliency is provided through database clustering. Microsoft Cluster Service for SQL and Oracle RAC are supported.

Yes I do realize that this is not a KB article, or even mentioned in the vCloud Director documentation. I have requested the docs to be revised and a KB to be created. Hopefully those will follow soon, for now this statement is all we needed! When the docs are revised or a KB is published I will add the references to this article.

<update – 18/Oct/2012> KB just got added – http://kb.vmware.com/kb/2037802 </update>

CloudPhysics adds functionality: VM reservations/limits and Snapshots

Duncan Epping · Oct 5, 2012 ·

CloudPhysics just announced two new cards. One card is titled "Snapshots Gone Wild", the other is titled "VM Reservations & Limits". This is the direct result of the contest that CloudPhysics held right before VMworld US. I guess that is the nice thing about being a start-up: being able to respond to community / customer requests quickly. However, it is also due to the nature of the CloudPhysics solution.

All the cards CloudPhysics offers are objects by themselves, making it easy to add new cards or change existing ones based on customer requests without the need to QA the whole platform. Flexibility / agility right there.

So what exactly was added? The first card, "Snapshots Gone Wild", is all about… yes, you guessed it, VMware snapshots. Which virtual machines have snapshots? How many snapshots? How old is the snapshot? That is the kind of data it reveals. Considering the many problems I have seen out in the field with snapshots, I would say this is one you will want to check regularly.

The second card is all about VM reservations and limits. Frank and I have written about this many times, and warned people about the impact many times. I guess most of you are aware of the impact by now, but you would be surprised to see what comes up when you run this card in your environment. I have done many, many health checks in the past, and VM limits always kept popping up randomly. I definitely recommend taking a look at this one.

Of course, besides these two new cards there are various others which are very useful, like the Cluster Health card or the VMware Tools card. I suggest you head over to CloudPhysics.com, sign up, and give it a try.

VXLAN requirements

Duncan Epping · Oct 4, 2012 ·

When I was writing my "Configuring VXLAN" post I was trying to dig up some details around VXLAN requirements and recommendations to run a full "VMware" implementation. Unfortunately I couldn't find much, or at least not a single place with all the details. I figured I would gather all I could find and throw it into a single post to make it easier for everyone.

Virtual:

  • vSphere 5.1
  • vShield Manager 5.1
  • vSphere Distributed Switch 5.1.0
  • Portgroups will be configured by vShield Manager; it is recommended to use either "LACP Active Mode", "LACP Passive Mode" or "Static Etherchannel"
    • When "LACP" or "Static Etherchannel" (Cisco only) is configured, note that a port channel / EtherChannel will need to be created on the physical side
    • "Fail Over" is supported, but not recommended
    • You cannot configure the portgroup with "Virtual Port ID" or "Load Based Teaming"; these are not supported
  • Requirement for an MTU size of 1600 (Kamau explains why here)

Physical:

  • It is recommended to have DHCP available on the VXLAN transport VLANs, although fixed IP addresses also work!
  • The VXLAN port (UDP 8472) is opened on firewalls (if applicable)
  • Port 80 is opened from vShield Manager to the hosts (used to download the "vib / agent")
  • For Link Aggregation Control Protocol (LACP), 5-tuple hash distribution is highly recommended but not a hard requirement
  • The MTU size requirement is 1600 (see the sketch below this list for a quick way to verify this)
  • It is strongly recommended to have IGMP snooping enabled on the L2 switches to which the VXLAN participating hosts are attached. An IGMP querier must be enabled on the router or L3 switch with connectivity to the multicast enabled networks when IGMP snooping is enabled.
  • If VXLAN traffic is traversing routers, multicast routing must be enabled
    • The recommended multicast protocol to deploy for this scenario is Bidirectional Protocol Independent Multicast (PIM-BIDIR), since the hosts act as both multicast speakers and receivers at the same time.
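As a quick sanity check of two of the transport requirements above (UDP 8472 reachable, path MTU of at least 1600), something like the following Python sketch can be run between two machines on the transport VLAN. This is a rough test, not a replacement for proper switch validation; it assumes Linux on the sending side (the two socket constants come from <linux/in.h>) and a hypothetical peer address:

  import socket

  IP_MTU_DISCOVER = 10   # from <linux/in.h>
  IP_PMTUDISC_DO = 2     # always set the Don't Fragment bit

  PEER = ("192.168.1.12", 8472)  # hypothetical host on the transport VLAN

  s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
  s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)

  # 1572 bytes of payload + 28 bytes of IP/UDP headers = 1600 on the wire
  try:
      s.sendto(b"\x00" * 1572, PEER)
      print("Datagram sent; a listener on the far end confirms delivery")
  except OSError as exc:
      print("Path cannot carry a 1600-byte frame (or is blocked):", exc)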

That should capture most requirements and recommendations. If anyone has any additions please leave a comment and I will add it.

** Please note: proxy ARP is not a requirement for a VXLAN / VDS implementation; it is only a requirement when the Cisco Nexus 1000V is used **

References:
VXLAN Primer by Kamau
vShield Administration Guide
Internal training ppt
KB 2050697 (note my article was used as the basis for this KB)

Limit the number of eggs in a single basket through vSphere 5.1 DRS

Duncan Epping · Oct 1, 2012 ·

A while back I had a discussion with someone who asked me if it was possible to limit the number of eggs in a single basket, in other words, to limit the number of VMs per host. The reason this customer wanted to do this was to limit the impact of a host failure. They had roughly 1500 VMs in their cluster, and some hosts carried 50 VMs while others had 20 or 80. This is the nature of DRS though, and totally expected.

If one of these hosts failed, and let's say it carried 80 VMs, the impact of that would be substantial. To minimize the risk they wanted to limit the number of VMs per host. I had thought about this before and had already asked the HA and DRS team if they could do anything around this. The DRS team started looking into it and, to my surprise, they managed to get it in quickly.

In the VMworld 2012 session "VSP2825: DRS: Advanced Concepts, Best Practices and Future Directions" by Ajay Gulati and Aashish Parikh a solution is presented. (You can watch this session for free on YouTube, highly recommended!) The solution is a new vSphere DRS advanced setting introduced in vSphere 5.1:

 LimitVMsPerESXHost

Note that when you configure this setting it might impact the performance of your virtual machines, as it could limit the load balancing mechanism of your cluster. If you have no requirement to limit the number of VMs per ESXi host, don't do it. When this setting is configured, vSphere DRS will not allow migrations to a host that has reached the threshold, and it will also not admit new VMs to a host that has reached the threshold.
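For those who prefer to script it, a DRS advanced option can also be pushed to a cluster programmatically. Below is a minimal pyVmomi sketch, assuming you already have a connection to vCenter and a cluster object in hand; the threshold of 40 is just an example value:

  from pyVmomi import vim

  def set_drs_vm_limit(cluster, limit):
      # Add the LimitVMsPerESXHost advanced option to the cluster's DRS config
      spec = vim.cluster.ConfigSpecEx()
      spec.drsConfig = vim.cluster.DrsConfigInfo()
      spec.drsConfig.option = [
          vim.option.OptionValue(key="LimitVMsPerESXHost", value=str(limit))
      ]
      # modify=True merges the change instead of replacing the full config
      return cluster.ReconfigureComputeResource_Task(spec, modify=True)

  # task = set_drs_vm_limit(cluster, 40)

Once the task completes, the setting shows up under the cluster's DRS advanced options, just as it would when added through the client.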

