VMware

New whitepaper available: vSphere Metro Storage Cluster Recommended Practices (6.5 update)

Duncan Epping · Oct 24, 2017 ·

I had many requests for an updated version of this paper, so I have been working hard on it for the past couple of weeks. The paper was outdated: it was last updated around the vSphere 6.0 timeframe, and even that was only a minor update. I looked at every single section and added new statements and guidance, for instance around vSphere HA Restart Priority. So for those running a vSphere Metro Storage Cluster / Stretched Cluster of some kind, please read the brand new vSphere Metro Storage Cluster Recommended Practices (6.5 update) white paper.

It is available on storagehub.vmware.com as a PDF and for reading within your browser. If you have any questions or comments, please do not hesitate to leave them here.

The difference between an isolation and a partition with vSphere

Duncan Epping · Oct 10, 2017 ·

I have a lot of discussions with customers on the topic of stretched clusters, but also regular vSphere clusters. Something that often comes up is what happens in an isolation or partition scenario. Fairly often customers (but also VMware employees) use those words interchangeably. However, a partition is not the same as an isolation. They are two different scenarios, and as a result each has a different type of response associated with it. Before I explain the difference between the two responses, what is a partition and what is an isolation?

An isolation event is a situation where a single host cannot communicate with the rest of the cluster. Note: single host!
A partition is a situation where two (or more) hosts can communicate with each other, but no longer can communicate with the remaining two (or more) hosts in the cluster. Note: two or more!
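
To make the two definitions concrete, here is a minimal sketch that labels each host based purely on which hosts can still reach each other. It is a toy model, not how vSphere HA is implemented: the function name `classify_hosts`, the host names and the `links` input are all made up for illustration, and real HA also uses datastore heartbeats and the isolation address before declaring anything.

```python
from collections import deque


def classify_hosts(hosts, links):
    """Label each host as 'connected', 'isolated' or 'partitioned'.

    hosts: list of host names (hypothetical)
    links: pairs of hosts that can still communicate with each other
    """
    neighbours = {host: set() for host in hosts}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Find the groups of hosts that can still reach each other (breadth-first).
    groups = []
    seen = set()
    for host in neighbours:
        if host in seen:
            continue
        group, queue = [], deque([host])
        seen.add(host)
        while queue:
            current = queue.popleft()
            group.append(current)
            for peer in neighbours[current]:
                if peer not in seen:
                    seen.add(peer)
                    queue.append(peer)
        groups.append(group)

    multi_host_groups = sum(1 for group in groups if len(group) > 1)
    status = {}
    for group in groups:
        for host in group:
            if len(group) == 1 and len(groups) > 1:
                # A single host cut off from everyone else: isolation.
                status[host] = "isolated"
            elif len(group) > 1 and multi_host_groups > 1:
                # Two or more hosts cut off from two or more others: partition.
                status[host] = "partitioned"
            else:
                status[host] = "connected"
    return status
```

With four hosts where only esx01/esx02 and esx03/esx04 can still talk to each other, every host comes back as "partitioned". If instead esx04 alone loses its links, esx04 is "isolated" and the other three stay "connected".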

Why is that such a big deal? Well, the response is different in each of these two scenarios. The response/result is also determined by the type of configuration you have. Let’s break down the scenarios one by one, including the type of infrastructure used (when it is relevant).

Isolation Event

When a host is isolated it will:

start an election process
declare itself primary
ping the isolation address
declare itself isolated
power off / shut down VMs (when this is configured)
communicate through the connected datastores that it is isolated
the VMs will be restarted on the remaining hosts in the cluster

And then of course vSphere HA will be able to restart the VMs. Note that in the case of vSAN, it isn’t possible to write to the datastore when a host is isolated, so the host won’t do that. Yet the workloads will still have been powered off / shut down, so it is safe for vSphere HA to restart them.

Partition (traditional storage)

When two or more hosts are partitioned (they can still communicate with each other) and the vSphere HA primary is not part of the partition, the hosts in that partition will:

start an election process
declare a primary in the partition
figure out what has happened to the hosts and VMs in the other partition
restart any VMs that were somehow impacted, or that now appear to be powered off while their last known state was powered on
if all VMs are running, vSphere HA won’t try to restart any; this is the expected result!
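
The restart decision in the last two steps can be sketched as a simple comparison of the last known power state with what the new primary observes. This is only an illustration of the logic described above: the function name, the state strings and the choice to treat a VM that cannot be observed as powered off are all assumptions, not vSphere HA internals.

```python
def vms_to_restart(last_known_state, observed_state):
    """Return the VMs that were powered on but no longer appear to be.

    Both arguments map a VM name to "poweredOn" or "poweredOff".
    A VM missing from observed_state is treated as powered off (assumption).
    """
    return sorted(
        vm
        for vm, state in last_known_state.items()
        if state == "poweredOn"
        and observed_state.get(vm, "poweredOff") != "poweredOn"
    )
```

If every VM that was powered on is still observed as powered on, the result is an empty list and nothing is restarted, which matches the expected result described above.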

Partition (vSAN stretched)

When the partition scenario happens in a stretched vSAN environment there’s an extra (potential) step. Along the way, vSAN will identify all VMs which have no accessible components and kill those VMs so they can be restarted in the partition which has quorum. In this scenario, you have three locations: two for data and one for the witness. If a data site loses access to the other locations then the data site is partitioned (the hosts can still communicate with each other within the site), and as such the isolation response is not triggered. However, vSAN will still kill these VMs as they are rendered useless (they have lost access to disk).

I know it is just semantics, but nevertheless, I do feel it is important to understand the difference between an isolation and a partition, especially as the response (and who responds) is different in these situations. Hope it helps.

Which disk controller to use for vSAN

Duncan Epping · Sep 28, 2017 ·

I have many customers going through the plan and design phase for implementing a vSAN based infrastructure. Many of them have conversations with OEMs and this typically results in a set of recommendations in terms of which hardware to purchase. One recurring theme is the question of which disk controller a customer should buy. The typical recommendation seems to be the beefiest disk controller on the list. I wrote about this a while ago as well, and want to re-emphasize my thinking. Before I do, I understand why these recommendations are being made. Traditionally, with local storage devices, selecting the high-end disk controller made sense. It provided a lot of the options you needed for decent performance and availability of your data. With vSAN, however, this is not needed, as all of this is provided by our software layer.

When it comes to disk controllers my recommendation is simple: go for the simplest device on the list that has a good queue depth. Just to give an example, the Dell H730 disk controller is often recommended, but if you look at the vSAN Compatibility Guide then you will also see the HBA330. The big difference between these two is the RAID functionality offered on the H730 and the cache on the controller. Again, this functionality is not needed for vSAN, so by going for the HBA330 you will save money. (For HP I would recommend the H240 disk controller.)

Having said that, I would at the same time recommend that customers consider NVMe for the caching tier instead of SAS or SATA connected flash. Why? Well, for the caching layer it makes sense to avoid the disk controller. Place the flash as close to the CPU as you can get for low latency and high throughput. In other words, invest the money you are saving on the more expensive disk controller in NVMe connected flash for the caching layer.

Sharing the “Top 10 things to know about vSAN” slides…

Duncan Epping · Sep 19, 2017 ·

I was asked by a few people to share the slides for our Top 10 vSAN session at VMworld. Instead of sending the slides around via email I figured I would simply put them up on SlideShare and share them here.

List all “thick” swap files on vSAN

Duncan Epping · Sep 6, 2017 ·

As some may know, on vSAN by default the swap file is a fully reserved file. This means that if you have a VM with 8GB of memory, vSAN will reserve 16GB of capacity in total for it. 16GB? Yes, 16GB, as the FTT=1 policy is also applied to it, which means the swap file is stored twice. In vSAN 6.2 we introduced the ability to have swap files created “thin”, or “unreserved” I should probably say. You can simply do this by setting an advanced setting on each host in your cluster (SwapThickProvisionDisabled). Once you have set this and power off / power on your VMs, the swap file is recreated and will be thin. Jase McCarty wrote a script that will set the setting for you on each host of your cluster, but the problem of course is: how do you know which VM has the “new unreserved” swap file and which VM still has the fully reserved swap file? This is what a customer asked me last week.
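
The 8GB to 16GB arithmetic can be written down as a small helper. This is a back-of-the-envelope sketch, not a vSAN API: the function name and parameters are made up, it assumes a mirrored (RAID-1) policy where FTT=1 stores two copies, and it assumes the swap file size equals configured memory minus any memory reservation.

```python
def reserved_swap_capacity_gb(memory_gb, memory_reservation_gb=0, ftt=1, unreserved=False):
    """Estimate the vSAN capacity reserved for one VM's swap file, in GB.

    Assumptions (for illustration only):
    - swap file size = configured memory - memory reservation
    - mirroring, so the number of copies is ftt + 1
    - an unreserved ("thin") swap file reserves nothing up front
    """
    if memory_reservation_gb > memory_gb:
        raise ValueError("memory reservation cannot exceed configured memory")
    if unreserved:
        return 0
    swap_file_gb = memory_gb - memory_reservation_gb
    return swap_file_gb * (ftt + 1)
```

For the example above, `reserved_swap_capacity_gb(8)` gives 16, and the same VM with `unreserved=True` gives 0 reserved up front.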

I was sitting next to William at a session and I asked him this question. William went at it and knocked out a Python script which lists all VMs in a cluster which have a fully reserved swap file. Very useful for those who are moving to “unreserved / sparse” swap. This way you can figure out which VMs still need a power cycle so you can reclaim that (unused) disk capacity.

Note, the “sparse” / “unreserved” swap files are only intended for environments which do not overcommit on memory. If you do overcommit on memory please ensure you have disk capacity available, as you will need the disk capacity as soon as the hypervisor wants to place memory pages in the swap file. If there’s no disk capacity available it will result in the VM failing.

Thanks William for knocking out this script so fast…