Virtualization networking strategies…

I was asked a question on LinkedIn about the different virtualization networking strategies from a host point of view. The question came from someone who recently had 10GbE infrastructure introduced into his data center. The network was originally architected with 6 x 1Gbps NICs carved up into three bundles of 2 x 1Gbps, with each type of traffic (Management, vMotion and VM) using its own pair of NICs. 10GbE was added to this infrastructure, and the question which came up was: should I use 10GbE while keeping my 1Gbps links for things like management? The classic model has a nice separation of network traffic, right?

Well, I guess from a visual point of view the classic model is nice as it provides a lot of clarity around which type of traffic uses which NIC and which physical switch port. However, in the end you typically still end up leveraging VLANs, so on top of the physical separation you also provide a logical separation. This logical separation is the most important part if you ask me. Especially when you leverage Distributed Switches and Network IO Control you can create a simple architecture which is fairly easy to implement and maintain, both from a physical and a virtual point of view. Yes, from a visual perspective it may be a bit more complex, but I think the flexibility and simplicity you get in return definitely outweigh that. I would recommend, in almost all cases, keeping it simple: converge physically, separate logically.

vSwitch Traffic Shaping, what is what?

I was troubleshooting an issue where vMotion would constantly time out. I had no clue where it was coming from, so I started digging. In this case the environment was using a regular vSwitch and 10GbE networking, as unfortunately the Distributed vSwitch was not an option for this environment. When I took a closer look I noticed that some form of traffic shaping was applied: traffic shaping was enabled, the peak value was specified and the rest was left at the default values… and unfortunately this is exactly what caused the problem.

So when it comes to vSwitch Traffic Shaping, what is what? There are three settings you can configure per portgroup:

  • Average Bandwidth – specified in Kbps
  • Peak Bandwidth – specified in Kbps
  • Burst Size – specified in KB

So if you have a 10Gbps NIC port for your traffic, this means you have a total of 10,485,760 Kbps. When you enable vSwitch Traffic Shaping, by default “Average Bandwidth” is set to 100,000 Kbps, “Peak Bandwidth” to 100,000 Kbps and “Burst Size” to 102,400 KB. So what does that mean? Well, it means that if you enable it and do not change the values, the traffic is limited to 100,000 Kbps. 100,000 Kbps is… yes, roughly 100Mbps, even less to be more precise: 97.6Mbps. Which is not a lot indeed, and not even a supported configuration for vMotion.
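To put those defaults in perspective, here is a quick back-of-the-envelope calculation (a minimal Python sketch; the values are the defaults mentioned above, the variable names are mine):

  # Sanity check of the vSwitch Traffic Shaping defaults mentioned above.
  NIC_SPEED_GBPS = 10
  AVERAGE_BANDWIDTH_KBPS = 100_000   # default Average Bandwidth
  BURST_SIZE_KB = 102_400            # default Burst Size

  # A 10Gbps NIC port expressed in Kbps (1Gbps = 1024 * 1024 Kbps)
  nic_kbps = NIC_SPEED_GBPS * 1024 * 1024
  print(f"10Gbps NIC port          : {nic_kbps:,} Kbps")

  # What the default shaping values actually limit you to
  print(f"Default average bandwidth: {AVERAGE_BANDWIDTH_KBPS / 1024:.2f} Mbps")
  print(f"Default burst size       : {BURST_SIZE_KB / 1024:.0f} MB")

Running this prints 10,485,760 Kbps for the NIC port and confirms that the defaults boil down to roughly 100Mbps with a 100MB burst.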

So what if I simply bump up the Peak Bandwidth to, let’s say, 5Gbps, as I do not want vMotion to ever consume more than half of the NIC port? (Note: vSwitch traffic shaping only applies to egress, aka outbound, traffic.) Well, setting the peak bandwidth sounds like it may do something, but probably not what you would hope for, as this is how the settings are applied:

By default the traffic stream will get what is specified by “Average Bandwidth”. However, it is possible to exceed this when needed by specifying a higher “Peak Bandwidth” value. Your traffic will be allowed to burst until the value of “Burst Size” has been exceeded. In other words, in the above example, when only Peak Bandwidth is increased this leads to the following: by default the traffic is limited to 100Mbps; it can peak to 5Gbps, but only for 100MB worth of data traffic. As you can imagine, in the case of vMotion, where the full memory content of a VM is transferred, that 100MB is hit within a second, after which the vMotion process is throttled back to 100Mbps, the remainder of the VM memory takes ages to copy, and the vMotion eventually times out.
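To make the vMotion scenario concrete, here is a small sketch of the math (assuming, purely for illustration, a VM with 32GB of memory to transfer; the shaping values are the ones from the example above):

  # How long the burst lasts at 5Gbps, and what happens once it runs out.
  PEAK_BANDWIDTH_KBPS = 5 * 1024 * 1024   # Peak Bandwidth bumped to 5Gbps
  AVERAGE_BANDWIDTH_KBPS = 100_000        # Average Bandwidth left at the default
  BURST_SIZE_KB = 102_400                 # Burst Size left at the default (100MB)
  VM_MEMORY_KB = 32 * 1024 * 1024         # hypothetical VM with 32GB of memory

  # The burst allowance drains in: burst size (KB) / peak rate (KB per second)
  burst_seconds = BURST_SIZE_KB / (PEAK_BANDWIDTH_KBPS / 8)
  print(f"Burst at 5Gbps lasts roughly {burst_seconds:.2f} seconds")

  # Everything after that crawls along at the average rate of ~100Mbps
  remaining_kb = VM_MEMORY_KB - BURST_SIZE_KB
  remaining_seconds = remaining_kb / (AVERAGE_BANDWIDTH_KBPS / 8)
  print(f"Remaining {remaining_kb / (1024 * 1024):.1f}GB takes roughly "
        f"{remaining_seconds / 60:.0f} minutes at the average rate")

In this example the 5Gbps burst is exhausted in roughly 0.16 seconds, and the remaining ~32GB would take around 45 minutes at ~100Mbps, which explains why the vMotion times out.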

So if you apply traffic shaping on your vSwitch, make sure to think the numbers through. In the above scenario, for instance, specifying 5Gbps for both the Average and the Peak Bandwidth would have given the desired result.

New beta of the vSphere 5.5 U1 Hardening Guide released

Mike Foley just announced the new release of the vSphere 5.5 U1 Hardening Guide. Note that it is still labeled as a “beta” as Mike is still gathering feedback; the document should be finalized in the first week of June.

For those concerned about security, this is an absolute must-read! As always, before implementing ANY of these recommendations, make sure to test them on a test cluster and verify the expected functionality of both the vSphere platform and the virtual machines and applications running on top of it.

Nice work Mike!

Updating LSI firmware through the ESXi commandline

I received an email this week from one of my readers / followers on Twitter who had gone through the effort of upgrading his LSI controller firmware. He shared the procedure with me, as unfortunately it wasn’t well documented. I hope this will help others in the future; I know it will help me, as I was about to look into the exact same thing for my VSAN environment. Thanks for sharing this, Tom!

— copy / paste from Tom’s document —

We do quite a bit of virtualization and storage validation and performance testing in the Taneja Group Labs (http://tanejagroup.com/). Recently, we were performing some tests with VMware’s VSAN and, due to some performance issues we were having with the AHCI controllers on our servers, we needed to revise our environment to add some LSI SAS 2308 controllers and attach our SSDs and HDDs to the LSI card. However, our new LSI SAS controllers didn’t come with the firmware mandated by the VSAN HCL (they had v14 and the HCL specifies v18) and didn’t recognize the attached drives. So we set about updating the LSI 2308 firmware. Updating the LSI firmware is a simple process and can be accomplished from an ESXi 5.5 U1 server, but isn’t very well documented. After updating the firmware and rebooting the system the drives were recognized and could be used by VSAN. Below are the steps I took to update my LSI controllers from v14 to v18. [Read more…]

The Compatibility Guides are now updated with VSAN and vFlash info!

For those wanting to play with Virtual SAN (VSAN) and vSphere Flash Read Cache (vFRC / vFlash), the compatibility guides are being updated at the moment. Hit the following URL to find out what is currently supported and what is not:

  • vmware.com/resources/compatibility/
  • For vSphere Flash Read Cache:
    • Select “VMware Flash Read Cache” from the drop-down list titled “What are you looking for”.
    • Hit “Update and View Results”.
  • For Virtual SAN:
    • Select “Virtual SAN (beta)” from the drop-down list titled “What are you looking for”.
    • Select “ESXi 5.5” and click “Next”.
    • Select a category (Server, I/O Controller, HDD, SSD); at the time of writing only Server was available.
    • Select the type of server and click “Next”.
    • A list of supported servers is now presented.

I know both lists are short today, but this is an ongoing effort. I know many vendors are now wrapping up and submitting their test reports, so more will be added over the course of the next couple of weeks. Keep coming back to the compatibility guide!