
Yellow Bricks

by Duncan Epping


dvswitch

Configuring VXLAN…

Duncan Epping · Oct 3, 2012 ·

Yesterday I got an email about configuring VXLAN. I was in the middle of re-doing my lab, so I figured this would be a nice exercise. First I downloaded vShield Manager and migrated from regular virtual switches to a Distributed Switch environment. I am not going to go into any depth on how to do this, as it is fairly straightforward: just right-click the Distributed Switch, select “Add and Manage Hosts” and follow the steps. If you are wondering what the use case for VXLAN would be, I recommend reading Massimo’s post.

VXLAN is an overlay technique that encapsulates layer 2 in layer 3. If you want to know how this works technically, you can find the specs here. I wanted to create a virtual wire in my cluster. Just assume this is a large environment with many clusters and many virtual machines. To provide some form of isolation I would need to create a lot of VLANs and make sure these are all plumbed to the respective hosts… As you can imagine, that is not as flexible as one would hope. To solve this problem VMware (and partners) introduced VXLAN. VXLAN enables you to create a virtual network, aka a virtual wire. This virtual wire is a layer 2 segment: even when the hosts sit in different networks, the VMs can still belong to the same segment.

I deployed the vShield virtual appliance, as it is a requirement for using VXLAN. After deploying it you will need to configure the network, which is fairly simple (a rough sketch of the console session follows the steps):

  • Log in to the console of the vShield Manager (admin / default)
  • Type “enable” (the password is “default”)
  • Type “setup” and provide all the required details
  • Log out
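
For reference, the console interaction looks roughly like the sketch below. This is my reconstruction rather than a verbatim capture, so the prompts and the wizard questions may differ slightly per version:

manager login: admin
Password: default
manager> enable
Password: default
manager# setup
(the wizard now asks for IP address, netmask, default gateway and DNS)
manager# exit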

Now the vShield Manager virtual appliance is configured and you can go to “https://<ip address of vsm>/”. You can log in using admin / default. Next you will need to link this vShield Manager to vCenter Server:

  • Click “Settings & Reports” in the left pane
  • Now you should be on the “Configuration” tab in the “General” section
  • Click edit on the “vCenter Server” section and fill out the details (ip or hostname / username / password)

Now you should see some shiny new objects in the left pane when you start unfolding the inventory tree:

Now let’s get VXLAN’ing

  • Click your “datacenter object” (in my case that is “Cork”)
  • Click the “Network virtualization” tab
  • Click “Preparation” –> “Connectivity“
  • Click “Edit” and tick your “cluster(s)” and click “Next“
  • I changed the teaming policy to “failover” as I have no port channels configured on my physical switches; depending on your infrastructure, make the changes required and click “Finish”

An agent will now be installed on the hosts in your cluster. This is a “vib” package that handles the VXLAN traffic, and a new vmknic is created. This vmknic is created with DHCP enabled; if needed in your environment you can change this to a static address.
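
If you want to move that vmknic from DHCP to a static address, something along these lines should work from the ESXi shell. This is a sketch: the vmknic name (vmk1) and the addresses are just examples, so first check which vmknic was actually created on your host:

esxcli network ip interface ipv4 get
esxcli network ip interface ipv4 set -i vmk1 -I 192.168.50.10 -N 255.255.255.0 -t static

Now let’s continue with finalizing the preparation.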

  • Click “Segment ID“
  • Enter a pool of segment IDs. Note that if you have multiple vShield Managers this range needs to be unique, as a segment ID is assigned to each virtual wire and you don’t want virtual wires with the same ID. I used “5000 – 5900” (a quick sizing check follows this list)
  • Fill out the “Multicast address range“, I used 225.1.1.1-225.1.4.254
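
As a quick sanity check on those pool sizes (my arithmetic, not part of the original setup): 5000 – 5900 gives you 901 segment IDs, while 225.1.1.1 – 225.1.4.254 spans 1022 multicast addresses, so with one multicast group per virtual wire the two pools are comfortably matched.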

Now that we have prepped the hosts we can begin creating a virtual wire. First we will create a network scope; the scope is the boundary of your virtual network. If you have 5 clusters and want them all to have access to the same virtual wires, you will need to make them part of the same network scope.

  • Click “network scopes“
  • Click the “green plus” symbol to “add a network scope“
  • Give the scope a name and select the clusters you want to add to this network scope
  • Click “OK“

Now that we have defined our virtual network boundaries aka “network scope” we can create a virtual wire. The virtual wire is what it is all about, a “layer 2” segment.

  • Click “networks“
  • Click the “green plus” symbol to “create a VXLAN network“
  • Give it a name
  • Select the “network scope“

In the example below you see two virtual wires…

Now you have created a new virtual wire, aka a VXLAN network. You can add virtual machines to it by simply selecting the network in the NIC configuration section. The question of course remains: how do you get in and out of the network? You will need a vShield Edge device, so let’s add one…

  • Click “Edges“
  • Click the “green plus” symbol to “add an Edge“
  • Give it a name
  • I would suggest, if you have this functionality, ticking the “HA” checkbox so that the Edge is deployed in an “active/passive” fashion
  • Provide credentials for the Edge device
  • Select the uplink interface for this Edge
  • Specify the default gateway
  • Add the HA options; I would leave these set to the defaults
  • And finish the config

If you have a virtual wire that needs to be connected to an Edge (more than likely), make sure to connect it by going back to “Networks”. Select the wire, open the “actions” dial, click “Connect to Edge” and select the correct Edge device.

Now that you have a couple of wires you can start provisioning VMs or migrating VMs to them. Simply add them to the right network during the provisioning process.

Digging deeper into the VDS construct

Duncan Epping · Feb 23, 2012 ·

The following comment was made on my VDS blog and I figured I would investigate it a bit further:

It seems like the ESXi host only tries to sync the vDS state with the storage at boot and never again afterward. You would think that it would keep trying, but it does not.

Now let’s look at the “basics” first. When an ESXi host boots, it gets the data required to recreate the VDS structure locally by reading /etc/vmware/dvsdata.db and esx.conf. You can view the dvsdata.db file yourself by running:

net-dvs -f /etc/vmware/dvsdata.db
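
The esx.conf side can be inspected just as easily. A sketch, assuming the DVS-related keys carry a recognizable “dvs” string in your build (adjust the pattern if not):

grep -i dvs /etc/vmware/esx.conf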

But is that all that is used? If you check the dvsdata.db dump you will see that all the data required for a VDS configuration to work is actually stored in there, so what about those files stored on a VMFS volume?

Each VMFS volume that holds a working directory (the place where the .vmx is stored) for at least one virtual machine connected to a VDS will have the following folder:

drwxr-xr-x    1 root     root          420 Feb  8 12:33 .dvsData

If you go into this folder you will see another folder. Its name appears to be some sort of unique identifier; comparing the string to the output of “net-dvs” (see the check below the listing) reveals it to be the identifier of the dvSwitch that was created:

drwxr-xr-x    1 root     root         1.5k Feb  8 12:47 6d 8b 2e 50 3c d3 50 4a-ad dd b5 30 2f b1 0c aa
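
To convince yourself that this folder really maps to the dvSwitch, you can grep the net-dvs dump for the same string. A sketch, using the first bytes of the identifier from my listing above:

net-dvs | grep -i "6d 8b 2e 50"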

Within this folder you will find a collection of files:

-rw------- 1 root root 3.0k Feb 9 09:00 106
-rw------- 1 root root 3.0k Feb 9 09:02 136
-rw------- 1 root root 3.0k Feb 9 09:00 138
-rw------- 1 root root 3.0k Feb 9 09:05 152
-rw------- 1 root root 3.0k Feb 9 09:00 153
-rw------- 1 root root 3.0k Feb 9 09:05 156
-rw------- 1 root root 3.0k Feb 9 09:05 159
-rw------- 1 root root 3.0k Feb 9 09:00 160
-rw------- 1 root root 3.0k Feb 9 09:00 161

It is no coincidence that these file names are numbers: they correspond to the port IDs of the virtual machines stored on this volume. In other words, this is the port information of the virtual machines that have their working directory on this particular datastore. This port information is also what HA uses when it needs to restart a virtual machine that uses a dvPort. Let me emphasize that: this is what HA uses when it needs to restart a virtual machine! Is that all?

Well, I am not sure. When I tested the original question I powered on the host without access to the storage system and only powered on my storage system when the host was fully booted. I have not had this confirmed, but it seems to me that access to the datastore holding these files is somehow required during the boot process of your host, at least in the case of “static port bindings”. (Port bindings are described in more depth here.)

Does this imply that if your storage is not available during the boot process, virtual machines cannot connect to the network when they are powered on? Yes, that is correct. I tested it, and when you have a full power outage and your hosts come up before your storage, you will have a “challenge”. As soon as the storage is restored you will probably want to restart your virtual machines, but if you do, you will not get a network connection. I’ve tested this 6 or 7 times in total and not once did I get a connection.

One workaround is to simply reboot your ESXi hosts: after a reboot the problem is solved and your virtual machines can be powered on and will get access to the network. Rebooting a host can be a painfully slow exercise though, as I noticed during my test runs in my lab. Fortunately there is a much quicker workaround: restarting the management agents. Before you power on your virtual machines, and after your storage connection has been restored, run the following from the ESXi shell:

services.sh restart

After the services have been restarted you can power on your virtual machines and network connectivity will be restored!
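
To double-check before powering anything on, you can list the dvPort state from the ESXi shell. A sketch; this is the same command that shows up in the tests further down this page:

esxcli network vswitch dvs vmware list   # look for "In Use: true" on the VM ports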

On a side note: on my earlier article there was a question about the auto-expand property of static port groups, whether it is officially supported and where it is documented. Yes, it is fully supported. There is a KB article about how to enable it, and William Lam recently blogged about it here. That is it for now on VDS…

Distributed vSwitches and vCenter outage, what’s the deal?

Duncan Epping · Feb 8, 2012 ·

Recently my colleague Venky Deshpande released a whitepaper on VDS best practices. This white paper describes various architectural options when adopting a VDS-only strategy, a strategy of which I can see the benefits. On Facebook multiple people commented on why this would be a bad practice instead of a best practice; here are some of the comments:

“An ESX/ESXi host requires connectivity to vCenter Server to make vDS operations, such as powering on a VM to attach that VM’s network interface.”

“The issue is that if vCenter is a VM and changes hosts during a disaster (like a total power outage) and then is unable to grant itself a port to come back online.”

I figured the best way to debunk all these myths was to test them myself. I am confident that there is no problem, but I wanted to make sure that I could convince you. So what will I be testing?

  • Network connectivity after powering on a VM connected to a VDS while vCenter is down.
  • Restoring network connectivity of a vCenter VM attached to a VDS after a host failure.
  • Restoring network connectivity of a vCenter VM attached to a VDS after HA has moved the VM to a different host and restarted it.

Before we start, it is useful to rehash the different types of port binding, which are described in more depth in this KB:

  • Static binding – A port is immediately assigned and reserved for the VM when it is connected to the dvPortgroup through vCenter. This happens during the provisioning of the virtual machine!
  • Dynamic binding – A port is assigned to a virtual machine only when the virtual machine is powered on and its NIC is in a connected state. The port is disconnected when the virtual machine is powered off or its NIC is disconnected. (Deprecated in 5.0)
  • Ephemeral binding – A port is created and assigned to a virtual machine when the virtual machine is powered on and its NIC is in a connected state. The port is deleted when the virtual machine is powered off or its NIC is disconnected. Ephemeral port assignments can be made through ESX/ESXi as well as vCenter.

Hopefully this makes it clear straight away that there should be no problem at all: “static binding” is the default, and even when vCenter is down, a VM that was provisioned before vCenter went down can easily be powered on and will have network access. I don’t mind spending some lab hours on this, so let’s put it to the test using the defaults and see what the results are.

First I made sure all VMs were connected to a dvSwitch. I powered off a VM and checked its network settings, and this is what it revealed… a port already assigned, even when powered off:

This is not the only place you can see port assignments; you can also verify them on the VDS’s “Ports” tab:

Now let’s test this, as that is ultimately what it is all about. First test: network connectivity after powering on a VM connected to a VDS while vCenter is down:

  • Connected the VM to a dvPortgroup with static binding (the default and best practice)
  • Power off VM
  • Power off vCenter VM
  • Connect vSphere Client to host
  • Power on VM
  • Ping VM –> Positive result
  • You can even see on the command line that this VM uses its assigned port:
    esxcli network vswitch dvs vmware list
    Client: w2k8-001.eth0
    DVPortgroup ID: dvportgroup-516
    In Use: true
    Port ID: 137

Second test: restoring network connectivity of the vCenter VM attached to a VDS after a host failure:

  • Connected the vCenter VM to a dvPortgroup with static binding (the default and best practice)
  • Power off vCenter VM
  • Connect vSphere Client to host
  • Power on vCenter VM
  • Ping vCenter VM –> Positive result

Third test: restoring network connectivity of the vCenter VM attached to a VDS after HA has moved the VM to a different host and restarted it:

  • Connected the vCenter VM to a dvPortgroup with static binding (the default and best practice)
  • Yanked the cable out of the ESXi host on which vCenter was running
  • Opened a ping to the vCenter VM
  • HA re-registered the vCenter VM on a different host and powered it on
    • The re-register / power-on took roughly 45 – 60 seconds
  • Ping vCenter VM –> Positive result

I hope this debunks some of the myths floating around. I am the first to admit that there are still challenges out there, and these will hopefully be addressed soon, but I can assure you that your virtual machines will regain their connection as soon as they are powered on, whether through HA or manually… yes, even when your vCenter Server is down.

Network loss after HA initiated failover

Duncan Epping · Mar 25, 2010 ·

I had a discussion with one of my readers last week and just read this post on the VMTN community, which triggered this article.

When you create a highly available environment, take into account that you will need enough vSwitch ports available when a failover occurs. By default a vSwitch is created with 56 ports, and in general that is sufficient for most environments. However, when two hosts fail in a 10-host cluster you might end up with 60 or more VMs running on a single host. If that happens, several VMs will not have a vSwitch port assigned.

The most commonly used command when creating an automated build procedure is probably:

esxcfg-vswitch -a vSwitch1

This results in a vSwitch named “vSwitch1” with the default 56 ports. It is just as easy to create it with 128 ports, for instance:

esxcfg-vswitch -a vSwitch1:128

Always design for the worst-case scenario. Also be aware of the overhead: some ports are reserved for internal use, so you might want to factor in some additional ports. In the example above, for instance, you will have 120 ports available for your VMs, not the 128 you specified.
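
To see the effect, esxcfg-vswitch can also list the switches along with their total and used port counts. A quick sketch:

esxcfg-vswitch -a vSwitch1:128   # create the vSwitch with 128 ports
esxcfg-vswitch -l                # compare the "Num Ports" and "Used Ports" columns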

dvSwitch?

Duncan Epping · Sep 24, 2009 ·

I receive the same question about dvSwitches almost every week: should I only use dvSwitches or go for a hybrid model? The whitepaper that was released a couple of months ago clearly states that a hybrid model is a supported configuration, but would I recommend it? Or would a pure vDS model make more sense?

Let me first start with the most obvious answer: it depends. Let’s break it down and create two categories:

  1. Hosts with two NIC ports
  2. Hosts with more than two NIC ports

Now most of you would probably say: who the hell would only have two NIC ports? Think 10GbE in blade environments, for instance. With only two physical NIC ports available you would not have many options. You would have exactly two (if not using Flex-10, of course):

  1. Pure vDS
  2. Pure vSwitch

Indeed, no hybrid option, as you would still want full redundancy, which means you need at least two physical ports for any virtual switch. Now what would I recommend when there are only two physical NIC ports available? I guess it depends on the customer. There are multiple pros and cons for both models, but I will pick the two most obvious and relevant for now:

  1. PRO vDS: Operational benefits. Updating port groups, consistency and increased flexibility with vDS.
  2. CON vDS: If vCenter fails there’s no way to manage your vDS

There it is, probably the most important argument for or against running your Service Console on a vDS: if vCenter fails, there is no way to manage your vDS. For me personally this is the main reason why I would most likely not recommend running your Service Console/VMkernel portgroups on a dvSwitch. In other words: hybrid is the way to go…

<update 21-April-2011>
I guess it all comes down to what you are comfortable with and having proper operational procedures! But why? Why not just stick to hybrid? I guess you could, but then again, why not benefit from what dvSwitches have to offer? Especially in a converged network environment, being able to use dvSwitches will make your life a bit easier from an operational perspective. On top of that you will have that great dvSwitch-only Load Based Teaming at your disposal: load balancing without the need to resort to IP hash. I guess my conclusion is: go distributed… There is no need to be afraid if you understand the impact and risks and mitigate them with solid operational procedures.
</update 21-April-2011>

 
