
Yellow Bricks

by Duncan Epping



Cleaning up old vSAN File Services OVF files on vCenter Server

Duncan Epping · Oct 3, 2022 · 1 Comment

There was a question last week about the vSAN File Services OVF files: where exactly are they stored? I did some digging on this in the past, but I don't think I ever shared it. The vSAN File Services OVFs are stored on vCenter Server (VCSA) in a folder, one directory per version. The folder structure is shown below; each OVF version has its own directory containing the required OVF files.

[email protected] [ ~ ]# ls -lha /storage/updatemgr/vsan/fileService/
total 24K
vsan-health users 4.0K Sep 16 16:09 .
vsan-health root  4.0K Nov 11  2020 ..
vsan-health users 4.0K Nov 11  2020 ovf-7.0.1.1000
vsan-health users 4.0K Mar 12  2021 ovf-7.0.2.1000-17692909
vsan-health users 4.0K Nov 24  2021 ovf-7.0.3.1000-18502520
vsan-health users 4.0K Sep 16 16:09 ovf-7.0.3.1000-20036589

[email protected] [ ~ ]# ls -lha /storage/updatemgr/vsan/fileService/ovf-7.0.1.1000/
total 1.2G
vsan-health users 4.0K Nov 11  2020 .
vsan-health users 4.0K Sep 16 16:09 ..
vsan-health users 179M Nov 11  2020 VMware-vSAN-File-Services-Appliance-7.0.1.1000-16695758-cloud-components.vmdk
vsan-health users 5.9M Nov 11  2020 VMware-vSAN-File-Services-Appliance-7.0.1.1000-16695758-log.vmdk
vsan-health users  573 Nov 11  2020 VMware-vSAN-File-Services-Appliance-7.0.1.1000-16695758_OVF10.mf
vsan-health users  60K Nov 11  2020 VMware-vSAN-File-Services-Appliance-7.0.1.1000-16695758_OVF10.ovf
vsan-health users 998M Nov 11  2020 VMware-vSAN-File-Services-Appliance-7.0.1.1000-16695758-system.vmdk

I’ve asked the engineering team, and yes, you can simply delete obsolete versions if you need the disk capacity.
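If you want to script the cleanup, it is worth rehearsing it first. The sketch below is my own illustration, not a VMware-provided tool: it rebuilds a mock copy of the directory layout in a temp directory (names taken from the listing above) and removes everything except the newest OVF directory. On a real VCSA you would point it at /storage/updatemgr/vsan/fileService/ only after confirming which version your cluster actually runs.

```shell
#!/bin/sh
# Dry-run sketch: rehearse the cleanup against a mock copy of the layout.
# On a real VCSA the path would be /storage/updatemgr/vsan/fileService/.
set -eu

base=$(mktemp -d)
for d in ovf-7.0.1.1000 ovf-7.0.2.1000-17692909 \
         ovf-7.0.3.1000-18502520 ovf-7.0.3.1000-20036589; do
  mkdir -p "$base/$d"
done

# Keep only the newest version directory; plain lexicographic sort happens
# to be enough for this naming scheme. Everything older gets removed.
keep=$(ls "$base" | sort | tail -n 1)
for d in "$base"/ovf-*; do
  [ "$(basename "$d")" = "$keep" ] || rm -rf "$d"
done

ls "$base"   # only the newest OVF directory should remain
```

Swap `$base` for the real path once you trust the selection logic, and only delete versions that are no longer deployed anywhere.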

How to convert a standard cluster to a stretched cluster while expanding it!

Duncan Epping · Sep 27, 2022 · Leave a Comment

On VMTN a question was asked about how you could convert a 5-node standard cluster to a stretched cluster. It is not covered in our regular documentation, probably because the process is pretty straightforward, so I figured I would write it down. When you create a stretched cluster you will need a Witness Appliance in a third location, and I would recommend deploying that Witness Appliance before doing anything else.

After you have deployed the Witness Appliance, add the additional hosts to vCenter Server. DO NOT add them to the cluster yet though! First, configure each host separately. Once a host is configured, place it into maintenance mode, then move it into the cluster while leaving it in maintenance mode!

Now, when all hosts are part of the cluster, you can create the stretched cluster. This process is simple: you pick the hosts that belong to each location, and then you select the witness. After the cluster has been created, you simply take the hosts out of maintenance mode and you should be good! Note that you take the hosts out of maintenance mode only after the stretched cluster has been created; this ensures that no rebalancing happens while you are creating the stretched cluster, avoiding unneeded resyncs.
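The ordering matters more than the tooling here. As an illustration only, the sketch below prints a per-host plan using govc (the open-source vSphere CLI); the hostnames, cluster name, and inventory paths are placeholders I made up, the exact govc invocations are assumptions for your environment, and with DRYRUN=1 (the default) nothing is executed, the plan is just printed for review.

```shell
#!/bin/sh
# Hypothetical sketch of the host-preparation order using govc.
# Hostnames, cluster name, and inventory paths are placeholders; DRYRUN=1
# (default) only prints the plan so the ordering can be reviewed.
set -eu
DRYRUN=${DRYRUN:-1}
CLUSTER="StretchedCluster"
HOSTS="esx-05.lab.local esx-06.lab.local"

run() {
  if [ "$DRYRUN" = "1" ]; then echo "PLAN: $*"; else "$@"; fi
}

for h in $HOSTS; do
  run govc host.add -hostname "$h" -noverify      # add to vCenter, NOT to the cluster
  # ... per-host configuration (vmkernel ports, NTP, etc.) goes here ...
  run govc host.maintenance.enter "$h"            # maintenance mode first
  run govc object.mv "/dc/host/$h" "/dc/host/$CLUSTER"  # then move into the cluster
done
# Only after the stretched cluster has been created:
# for h in $HOSTS; do run govc host.maintenance.exit "$h"; done
```

The point of the dry run is to make the ordering explicit: hosts enter maintenance mode before joining the cluster, and leave it only once the stretched cluster exists.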

Do note, all VMs will still have the same storage policy assigned, so you will need to change that policy to ensure that the vSAN objects are placed and replicated according to your requirements (for instance, RAID-1 across locations and RAID-1/5/6 within a location).

New book: VMware vSAN 7.0 U3 Deep Dive

Duncan Epping · May 9, 2022 · 8 Comments

Yes, we’ve mentioned a few times already on Twitter that we were working on it, but today Cormac and I are proud to announce that the VMware vSAN 7.0 U3 Deep Dive is available via Amazon, both as an ebook and in paper! We had the pleasure of working with Pete Koehler again as a technical editor, the foreword was written by John Gilmartin (SVP and GM for Cloud Storage and Data), the cover was created by my son (Aaron Epping), and it is once again fully self-published! We changed the format (physical dimensions) of the book to be able to increase the size of the screenshots; as we realize that most of us are middle-aged by now, we feel it really made a huge difference in readability.

VMware’s vSAN has rapidly proven itself in environments ranging from hospitals to oil rigs to e-commerce platforms and is the top player in the hyperconverged space. Along the way, it has matured to offer unsurpassed features for data integrity, availability, space efficiency, stretched clustering, and cloud-native storage services. vSAN 7.0 U3 has radically simplified IT operations and supports the transition to hyperconverged infrastructures (HCI). The authors of the vSAN Deep Dive have thoroughly updated their definitive guide to this transformative technology. Writing for vSphere administrators, architects, and consultants, Cormac Hogan and Duncan Epping explain what vSAN is, how it has evolved, what it now offers, and how to gain maximum value from it. The book offers expert insight into preparation, installation, configuration, policies, provisioning, clusters, architecture, and more. You’ll also find practical guidance for using all data services, stretched clusters, two-node configurations, and cloud-native storage services.

Although we pressed publish, it sometimes takes a while before the book is available in all Amazon stores; it should trickle in over the upcoming 24-48 hours. The book is priced at 9.99 USD (ebook) and 29.99 USD (paper) and is sold through Amazon only. Get it while it is hot! We would appreciate it if you would use our referral links and leave a review when you finish it. Thanks, and we hope you will enjoy it!

  • Paper book – 29.99 USD
  • Ebook – 9.99 USD

Of course, we also have the links to other major Amazon stores:

  • United Kingdom – Kindle – Paper
  • Germany – Kindle – Paper
  • Netherlands – Kindle – Paper
  • Canada – Kindle – Paper
  • France – Kindle – Paper
  • Spain – Kindle – Paper
  • India – Kindle
  • Japan – Kindle – Paper
  • Italy – Kindle – Paper
  • Mexico – Kindle
  • Australia – Kindle – Paper
  • Or just do a search!

Stretched cluster witness failure resilience in vSAN 7.0

Duncan Epping · Mar 17, 2022 · 2 Comments

Cormac and I have been busy the past couple of weeks updating the vSAN Deep Dive to 7.0 U3. Yes, there is a lot to update and add, but we are actually going through it at a surprisingly rapid pace. I guess it helps that we had already written dozens of blog posts on the various topics we need to update or add. One of those topics is “witness failure resilience” which was introduced in vSAN 7.0 U3. I have discussed it before on this blog (here and here) but I wanted to share some of the findings with you folks as well before the book is published. (No, I do not know when the book will be available on Amazon just yet!)

In the scenario below, we failed the secondary site of our stretched cluster completely. We can examine the impact of this failure through RVC on vCenter Server. This will provide us with a better understanding of the situation and how the witness failure resilience mechanism actually works. Note that the below output has been truncated for readability reasons. Let’s take a look at the output of RVC for our VM directly after the failure.

VM R1-R1:
Disk backing:
[vsanDatastore] 0b013262-0c30-a8c4-a043-005056968de9/R1-R1.vmx
RAID_1
RAID_1
Component: 0b013262-c2da-84c5-1eee-005056968de9 , host: 10.202.25.221
votes: 1, usage: 0.1 GB, proxy component: false)
Component: 0b013262-3acf-88c5-a7ff-005056968de9 , host: 10.202.25.201
votes: 1, usage: 0.1 GB, proxy component: false)
RAID_1
Component: 0b013262-a687-8bc5-7d63-005056968de9 , host: 10.202.25.238
votes: 1, usage: 0.1 GB, proxy component: true)
Component: 0b013262-3cef-8dc5-9cc1-005056968de9 , host: 10.202.25.236
votes: 1, usage: 0.1 GB, proxy component: true)
Witness: 0b013262-4aa2-90c5-9504-005056968de9 , host: 10.202.25.231
votes: 3, usage: 0.0 GB, proxy component: false)
Witness: 47123362-c8ae-5aa4-dd53-005056962c93 , host: 10.202.25.214
votes: 1, usage: 0.0 GB, proxy component: false)
Witness: 0b013262-5616-95c5-8b52-005056968de9 , host: 10.202.25.228
votes: 1, usage: 0.0 GB, proxy component: false)

As can be seen, the witness component holds 3 votes, the components on the failed site (secondary) hold 2 votes, and the components on the surviving data site (preferred) hold 2 votes. After the full site failure has been detected, the votes are recalculated to ensure that a witness host failure does not impact the availability of the VMs. Below shows the output of RVC once again.

VM R1-R1:
Disk backing:
[vsanDatastore] 0b013262-0c30-a8c4-a043-005056968de9/R1-R1.vmx
RAID_1
RAID_1
Component: 0b013262-c2da-84c5-1eee-005056968de9 , host: 10.202.25.221
votes: 3, usage: 0.1 GB, proxy component: false)
Component: 0b013262-3acf-88c5-a7ff-005056968de9 , host: 10.202.25.201
votes: 3, usage: 0.1 GB, proxy component: false)
RAID_1
Component: 0b013262-a687-8bc5-7d63-005056968de9 , host: 10.202.25.238
votes: 1, usage: 0.1 GB, proxy component: false)
Component: 0b013262-3cef-8dc5-9cc1-005056968de9 , host: 10.202.25.236
votes: 1, usage: 0.1 GB, proxy component: false)
Witness: 0b013262-4aa2-90c5-9504-005056968de9 , host: 10.202.25.231
votes: 1, usage: 0.0 GB, proxy component: false)
Witness: 47123362-c8ae-5aa4-dd53-005056962c93 , host: 10.202.25.214
votes: 3, usage: 0.0 GB, proxy component: false)

As can be seen, the votes for the various components have changed: the data site now has 3 votes per component instead of 1, the witness on the witness host went from 3 votes to 1, and on top of that, the witness stored in the surviving fault domain now has 3 votes instead of 1. This results in a situation where quorum would not be lost even if the witness component on the witness host is impacted by a failure. A very useful enhancement to vSAN 7.0 Update 3 for stretched cluster configurations if you ask me.
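The effect of the recalculation can be checked with simple majority arithmetic. The snippet below is just back-of-the-envelope maths on the vote counts from the RVC output above, not a vSAN tool, and the placement of the two 1-vote witnesses is inferred from that output:

```shell
#!/bin/sh
# An object stays accessible while the surviving votes form a strict
# majority of all votes. Vote counts are taken from the RVC output above.
quorum_held() {  # usage: quorum_held SURVIVING_VOTES TOTAL_VOTES
  [ $(( $1 * 2 )) -gt "$2" ]
}

# Original layout, 9 votes total. If the secondary site failed AND the
# witness host (3 votes) failed, only the preferred-site components (1+1)
# and a 1-vote witness would remain: 3 of 9 votes, no majority.
quorum_held 3 9 || echo "old layout: double failure loses quorum"

# Recalculated layout, 12 votes total. The preferred components (3+3), the
# in-site witness (3), and the witness host (1) survive the site failure;
# losing the witness host afterwards still leaves 9 of 12 votes.
quorum_held 9 12 && echo "new layout: still accessible"
```

This is exactly why the recalculation matters: it turns a quorum-losing second failure into one the object can ride out.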

vSAN 7.0 U3 enhanced stretched cluster resiliency, what is it?

Duncan Epping · Oct 4, 2021 · 9 Comments

I briefly discussed the enhanced stretched cluster resiliency capability in my vSAN 7.0 U3 overview blog. Of course, immediately questions started popping up. I didn’t want to go too deep in that post as I figured I would do a separate post on the topic sooner or later. What does this functionality add, and in which particular scenario?

In short, this enhancement to stretched clusters prevents downtime for workloads in a particular failure scenario. So the question then is, what failure scenario? Let’s take a look at this diagram first of a typical stretched vSAN cluster deployment.

If you look at the diagram you see the following: Datacenter A, Datacenter B, and the Witness. One of the situations customers have found themselves in is that Datacenter A would go down (unplanned). This would of course lead to the VMs in Datacenter A being restarted in Datacenter B. Unfortunately, when things go wrong, they sometimes go wrong badly: in some cases, the Witness would fail or disappear next. Why? Bad luck, networking issues, etc. Bad things just happen. If and when this happens, there would only be one location left, which is Datacenter B.

Now you may think that because Datacenter B typically holds a full RAID set of the VMs, they will remain running, but that is not true. vSAN looks at the quorum of the top layer, so if 2 out of 3 datacenters disappear, all impacted objects become inaccessible simply because quorum is lost! Makes sense, right? And we are not just talking about failures: it could also be that Datacenter A has to go offline for maintenance (planned downtime) and at some point the Witness fails for whatever reason. That would result in exactly the same situation: objects inaccessible.

Starting with 7.0 U3 this behavior has changed. If Datacenter A fails, and a few (let's say 5) minutes later the witness disappears, all replicated objects would still be available! Why is this? Well, in this scenario, when Datacenter A fails, vSAN will create a new votes layout for each of the impacted objects. It basically assumes that the witness can fail next: it gives all components on the witness 0 votes, and on top of that it gives the components in the surviving site additional votes so that the objects can survive that second failure. If the witness then fails, the objects are not rendered inaccessible, as quorum is not lost.

Now, do note: when a failure occurs and Datacenter A is gone, vSAN has to create a new votes layout for each object, and objects are processed one at a time. Typically this takes a few seconds per object, so if you have a lot of VMs (and each VM consists of multiple objects) it will take some time, potentially five minutes or more. If anything happens in between, not all objects may have been processed yet, which would result in downtime for those VMs when the witness goes down, as for those objects quorum would be lost.

What happens if Datacenter A (and the Witness) return for duty? Well at that point the votes would be restored for the objects across locations and the witness.

Pretty cool right?!



About the author

Duncan Epping is a Chief Technologist in the Office of the CTO of the Cloud Platform BU at VMware. He is a VCDX (#007), the author of the "vSAN Deep Dive" and the "vSphere Clustering Technical Deep Dive" series, and the host of the "Unexplored Territory" podcast.
