Server

EZT Disks with VSAN, why would you?

Duncan Epping · Jan 26, 2015 ·

I noticed a tweet today which made a statement about the use of eager zero thick (EZT) disks in a VSAN setup for running applications like SQL Server. The reason this user felt this was needed was to avoid the hit on the “first write to block on VMDK”. It is not the first time I have heard this, and I have even seen some FUD around it, so I figured I would write something up. On a traditional storage system, or at least in some cases, this first write to a new block takes a performance penalty. The main reason for this is that when the VMDK is thin, or lazy zero thick, the hypervisor needs to allocate the new block that is being written to and zero it out.

First of all, this was indeed true with a lot of the older (non-VAAI) storage system architectures. However, even in 2009 the idea that this formed a huge problem was dispelled, and with the arrival of all-flash arrays the problem disappeared completely. VSAN isn’t an all-flash solution (yet), but for VSAN there is something different to take into consideration. I want to point out that by default, when you deploy a VM on VSAN, you typically do not even touch the disk format and it will get deployed as “thin”, potentially with a space reservation setting which comes from the storage policy! But what if you use an old template which has a zeroed out disk, deploy that and compare it to a regular VSAN VM, will it make a difference? For VSAN, eager zero thick vs thin will (typically) make no difference to your workload at all. You may wonder why. Well, it is fairly simple… just look at this diagram:

If you look at the diagram then you will see that the acknowledgement to the application happens as soon as the write to flash has happened. So in the case of thick vs thin you can imagine that it would make no difference, as the allocation (and zero out) of that new block would happen minutes (or longer) after the application has received the acknowledgement. A person paying attention would now come back and say: hey, you said “typically”, what does that mean? Well, that means that the above is based on the understanding that your working set will fit in cache. Of course there are ways to manipulate performance tests to prove that the above is not always the case, but having seen customer data I can tell you that this is not a typical scenario… in fact, it is extremely unlikely.
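To make the reasoning above concrete, here is a minimal toy model of the acknowledgement path. It is only a sketch: the `WritePath` class, its field names and all latency numbers are made up for illustration and are not measurements or VSAN internals. It assumes the working set fits in the write buffer, which is the same assumption the “typically” above rests on.

```python
from dataclasses import dataclass


@dataclass
class WritePath:
    media_write_ms: float    # time to land the write on the acknowledging tier
    allocate_zero_ms: float  # extra cost to allocate and zero a new block
    write_buffered: bool     # True when a flash write buffer acknowledges writes

    def ack_latency_ms(self, first_write: bool, eager_zeroed: bool) -> float:
        """Latency the application observes before the write is acknowledged."""
        if self.write_buffered:
            # Allocation and zeroing happen later, off the acknowledgement path.
            return self.media_write_ms
        # No buffer: a first write to a block that was not pre-zeroed pays inline.
        pays_penalty = first_write and not eager_zeroed
        return self.media_write_ms + (self.allocate_zero_ms if pays_penalty else 0.0)


# Illustrative numbers only, not measurements.
buffered = WritePath(media_write_ms=0.2, allocate_zero_ms=2.0, write_buffered=True)
unbuffered = WritePath(media_write_ms=5.0, allocate_zero_ms=2.0, write_buffered=False)

thin_first = buffered.ack_latency_ms(first_write=True, eager_zeroed=False)
ezt_first = buffered.ack_latency_ms(first_write=True, eager_zeroed=True)
print(thin_first == ezt_first)  # the disk format does not change the ack latency
```

In this model the thin and eager zero thick cases only differ on the unbuffered path, which is the traditional-array scenario where the first-write penalty came from.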

So if you deploy Virtual SAN… and have “old” templates floating around and they have “EZT” disks, I would recommend overhauling them as it doesn’t add much, well besides a longer waiting time during deployment.

Lego VSAN EVO:RACK

Duncan Epping · Jan 24, 2015 ·

I know a lot of you guys have home labs and are always looking for that next cool thing. Every once in a while you see something cool floating by on twitter and in this case it was so cool I needed to share it with you guys. Someone posted a picture of his version of “EVO:RACK” leveraging Intel NUC, a small switch and Lego… How awesome is a Lego VSAN EVO:RACK?! Difficult to see indeed in the pics below, but if you look at this picture then you will see how the top of rack switch was included.

Lego VSAN EVO Rack NUC style… Version 2.. Note top of rack switch!!@pdxvmug @vmwarevsan @IntelNUC @vExpert pic.twitter.com/SYFa6leLxX

— Nicholas Farmer (@vmnick0) January 9, 2015

Besides the awesome tweet, Nick also shared how he has built his lab in a couple of blog posts which are worth reading for sure!

Enjoy,

E1000 VMware issues

Duncan Epping · Jan 23, 2015 ·

Lately I’ve noticed that various people have been hitting my blog through the search string “e1000 VMware issues”. I want to make sure people end up in the right spot, so I figured I would write a quick article that points people there. I’ve hit the issues described in the various KB articles myself, and I know how frustrating it can be. The majority of problems seen with the E1000 and E1000E drivers have been solved with the newer releases. I always run the latest and greatest version so it isn’t something I encounter any longer, but you may potentially witness the following:

vmkernel.log entries with “Heap netGPHeap already at its maximum size. Cannot expand.”
PSOD with “E1000PollRxRing@vmkernel#nover+”
vmware.log entries with “[msg.ethernet.e1000.openFailed] Failed to connect ethernet0.”

These problems are witnessed with vSphere 5.1 U2 and earlier, and patches have been released to mitigate them. If you are running one of those versions, either apply the patch or, preferably, upgrade to vSphere 5.1 Update 3 at a minimum when you are still running 5.0 or 5.1, or move up to the latest 5.5 release.
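If you want to check quickly whether a log you have collected contains any of the three symptoms listed above, a small script will do. This is a minimal sketch: the function name `find_e1000_symptoms` and the labels in the dictionary are made up for illustration, and only the three message strings come from this article.

```python
# The three symptom strings quoted in the list above.
E1000_SIGNATURES = {
    "heap_exhausted": "Heap netGPHeap already at its maximum size. Cannot expand.",
    "psod_poll_rx_ring": "E1000PollRxRing@vmkernel#nover+",
    "open_failed": "[msg.ethernet.e1000.openFailed] Failed to connect ethernet0.",
}


def find_e1000_symptoms(lines):
    """Return {label: [line numbers]} for every known signature found in lines."""
    hits = {label: [] for label in E1000_SIGNATURES}
    for number, line in enumerate(lines, start=1):
        for label, needle in E1000_SIGNATURES.items():
            if needle in line:  # plain substring match, no regex needed
                hits[label].append(number)
    return hits
```

To use it, pass any iterable of lines, for example an open file handle for a copy of vmkernel.log or vmware.log, and look at which labels come back with line numbers.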

KB articles with more details can be found here:

You wanted VMTN back? VMUG to the rescue!

Duncan Epping · Jan 15, 2015 ·

I’ve written about VMTN in the past and discussed the return of VMTN many times within VMware with various people all the way up to our CTO. Unfortunately due to various reasons it never happened, but fortunately the VMUG organization jumped on to it not too long ago and managed to get it revamped. If you are interested in it then see the blurb below, visit the VMUG website and sign up. I can’t tell you how excited I am about this and how surprised I was that the VMUG team has managed to pull this off in a relatively short time frame. Thanks VMUG!

Source: VMUG – EVALExperience!
VMware and VMUG have partnered with Kivuto Solutions to provide VMUG Advantage Subscribers a customized web portal that provides VMUG Advantage Subscribers with self-service capability to download software and license keys. Licenses to available VMware products are regularly updated and posted to the self-service web portal. The licenses available to VMUG Advantage Subscribers are 365-day evaluation licenses that require a one-time, annual download. Annual product downloads ensure that Subscribers receive the most up-to-date versions of products.

Included products are:

VMware vCenter Server™ 5 Standalone for vSphere 5

VMware vSphere® with Operations Management™ Enterprise Plus

VMware vCloud Suite® Standard

VMware vRealize™ Operations Insight™

VMware vRealize Operations™ 6 Enterprise

VMware vRealize Log Insight™

VMware vRealize Operations for Horizon®

VMware Horizon® Advanced Edition

VMware Virtual SAN™

A new 365 entitlement will be offered with the renewal of your yearly VMUG Advantage Subscription. Software is provided to VMUG Advantage Subscribers with no associated entitlement to support services, and users may not purchase such services in association with the EVALExperience licenses.

DRS is just a load balancing solution…

Duncan Epping · Jan 15, 2015 ·

Recently I’ve been hearing this comment more and more: DRS is just a load balancing solution. It seems that some folks spread this FUD to diminish what DRS really is and does. Let me start by saying that DRS is not a load balancing solution. The ultimate goal of DRS is to ensure all workloads receive the resources they demand. Frank Denneman has a great post on this topic as this has led to some confusion in the past. I would advise reading it if you want to understand why exactly VMs are not moved while the cluster seems imbalanced. In short: why balance VMs when the VMs are not constrained? In other words, DRS has a VM-centric view of the virtual world and not a host-centric one… In the end, it is all about your applications and how they perform and not necessarily about the infrastructure they are hosted on. DRS cares about VM/application happiness. Also, keep in mind that there is a risk and a cost involved with every move you make.
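The difference between “balanced” and “not constrained” is easy to show with a toy model. To be clear, this is not the DRS algorithm. It is a minimal sketch with made-up host names and numbers, in which a host is treated as constrained only when the demand of its VMs exceeds its capacity.

```python
from statistics import pstdev


def utilization(capacity_mhz, demands_mhz):
    """Fraction of a host's capacity that its VMs are demanding."""
    return sum(demands_mhz) / capacity_mhz


def imbalance(hosts):
    """Host-centric view: spread of utilization across hosts."""
    return pstdev([utilization(cap, demands) for cap, demands in hosts.values()])


def constrained_hosts(hosts):
    """VM-centric view: hosts where VMs cannot get what they demand."""
    return [name for name, (cap, demands) in hosts.items() if sum(demands) > cap]


def should_consider_move(hosts):
    """Only think about a (risky, costly) move when some VM is constrained."""
    return bool(constrained_hosts(hosts))


# Hypothetical cluster: lopsided, yet every VM gets the resources it demands.
lopsided = {"esx-a": (10000, [3000, 3000, 2000]), "esx-b": (10000, [1000])}
# Hypothetical cluster: esx-a cannot satisfy the demand of its VMs.
contended = {"esx-a": (10000, [6000, 5000]), "esx-b": (10000, [1000])}
```

The first cluster looks imbalanced from a host point of view, but nothing is gained by moving a VM, so the sketch leaves it alone. Only the second cluster gives a reason to accept the risk and cost of a move.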

Of course there is a lot of functionality that you leverage without thinking about it and take for granted. Things like Resource Pools (limits / reservations / shares), DRS Maintenance Mode (fully automated), VM Placement, Admission Control (yes DRS has one) and last but not least the various types of (anti) affinity rules. Also, before anyone starts shouting about active memory vs consumed (PercentIdleMBInMemDemand solves this) or %RDY being taken into account… DRS has many knobs you can twist.

But besides that, there is more. Something not a lot of people realize is that, for instance, HA and DRS are loosely coupled but tightly integrated. When you have both enabled on your cluster, HA will be able to call upon DRS for making the right placement decision and defragmenting resources when needed. What does that mean? Well, let’s assume for a second that you are running at full (or almost full) capacity and a host fails, while you have taken a host failure into account by leveraging HA admission control. When the host fails HA will need to restart your VMs, but what if at some point there is not enough spare capacity left to restart a VM on a given host? Well, in that case HA will call upon DRS to make space available so that these VMs can be restarted. That is nice, right?! And there is more smartness coming around HA / DRS admission control; hopefully I can tell you all about it soon.
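Resource fragmentation is easier to picture with numbers. The sketch below is a deliberately naive illustration, not how HA or DRS actually decide: the function `place_with_defrag` is made up, it only looks at memory, and it only ever tries a single migration. It shows a case where the cluster has enough free memory in total, but no single host can take the VM until one small VM is moved out of the way.

```python
def place_with_defrag(free_gb, running_gb, vm_gb):
    """Find a host for a VM that needs vm_gb of memory.

    free_gb:    host name -> free memory in GB
    running_gb: host name -> list of sizes (GB) of VMs running on that host
    Returns (target_host, migrations) where each migration is
    (vm_size_gb, from_host, to_host), or None if no placement is found.
    """
    # 1. Direct fit: some host already has enough free memory.
    for host, free in free_gb.items():
        if free >= vm_gb:
            return host, []

    # 2. Defragment: move one running VM elsewhere to free up enough room.
    for target, free in free_gb.items():
        shortfall = vm_gb - free
        for size in sorted(running_gb[target]):
            if size < shortfall:
                continue  # moving this VM would not free enough memory
            for other, other_free in free_gb.items():
                if other != target and other_free >= size:
                    return target, [(size, target, other)]
    return None


# Hypothetical state after a host failure: 10 GB free in total, 5 GB per host.
free = {"esx-a": 5, "esx-b": 5}
running = {"esx-a": [3, 4], "esx-b": [6]}
print(place_with_defrag(free, running, vm_gb=8))
```

Here the 8 GB VM fits nowhere directly, but moving the 3 GB VM from esx-a to esx-b frees up exactly enough space on esx-a for the restart.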

Then of course there is also the case where resource pools are implemented. vSphere HA and DRS work in conjunction to ensure that when VMs are failed over, shares are flattened to avoid strange prioritisation during times of contention. HA and DRS do this as VMs always fail over to the root resource pool of a host, but of course DRS will place the VMs back where they belong when it runs the first time after the failover has occurred. This is especially important when you have set shares on VMs individually in a resource pool model.
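To see why unflattened shares would lead to strange prioritisation, consider a quick calculation. This is only one way to picture it: the formula in `flattened_shares` is an assumption made for illustration (scale the VM's shares to its slice of its own pool), not the documented computation HA and DRS perform, and all share values are made up.

```python
def flattened_shares(pool_shares, vm_shares, all_vm_shares_in_pool):
    """Assumed model: the VM's slice of its pool, expressed at root level."""
    return pool_shares * vm_shares / sum(all_vm_shares_in_pool)


def fraction_under_contention(own_shares, sibling_shares):
    """Fraction of contended resources a sibling with own_shares would receive."""
    return own_shares / (own_shares + sum(sibling_shares))


# Hypothetical setup: two pools at root level with 4000 and 1000 shares.
# The failed-over VM has 8000 shares inside the 1000-share pool, next to
# one other VM that also has 8000 shares.
root_siblings = [4000, 1000]

raw = fraction_under_contention(8000, root_siblings)
flat = fraction_under_contention(
    flattened_shares(1000, 8000, [8000, 8000]), root_siblings
)
print(f"raw shares at root: {raw:.1%}, flattened: {flat:.1%}")
```

With its raw 8000 shares the VM would suddenly outweigh both resource pools at the root level, while the flattened value keeps it in line with what it was entitled to inside its own pool.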

So when someone says DRS is just a simple load balancing solution, take their story with a grain of salt…