Yellow Bricks

Deleting the vCLS VMs using Retreat Mode starting with vSphere 8.0 U2

Duncan Epping · Sep 22, 2023 · 2 Comments

A while back I posted about “retreat mode” and how to delete the vCLS VMs when needed, including a quick demo. Back then you needed to configure an advanced setting for a cluster if you wanted to delete the VMs for whatever reason. (Usually people would do a delete/recreate for troubleshooting purposes.) Starting with vSphere 8.0 U2 you can now use the UI to enable retreat mode on a per-cluster level. How do you do this? Well, it is fairly straightforward:

Click on the cluster for which you want to delete the VMs
Click on Configure
Click on “General” under “vSphere Cluster Services”
Click on “EDIT VCLS MODE”
Click on “Retreat Mode” and click “OK”

Now the VMs will be deleted. If you want to recreate the VMs, follow the same procedure, but change “Retreat Mode” to “System Managed”. I tested the process yesterday and created a quick demo for you.
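For reference, on releases before vSphere 8.0 U2 the advanced setting mentioned above was a vCenter Server setting keyed on the managed object ID of the cluster. The snippet below is a minimal Python sketch that only builds that key/value pair. The function name and the example cluster ID are hypothetical, and the key pattern `config.vcls.clusters.<domain-id>.enabled` is how I recall it from the VMware KB on retreat mode, so verify it against the documentation before relying on it.

```python
def retreat_mode_setting(cluster_moref: str, retreat: bool = True) -> tuple:
    """Build the (key, value) pair for the pre-8.0 U2 advanced setting.

    cluster_moref is the cluster's managed object ID, for example
    "domain-c1006" (this example ID is made up).
    """
    if not cluster_moref.startswith("domain-c"):
        raise ValueError(f"not a cluster managed object ID: {cluster_moref!r}")
    # Assumed key pattern; check it against the VMware KB before use.
    key = f"config.vcls.clusters.{cluster_moref}.enabled"
    # "false" puts the cluster in retreat mode (the vCLS VMs are deleted),
    # "true" hands control back so the VMs are recreated.
    value = "false" if retreat else "true"
    return key, value


if __name__ == "__main__":
    print(retreat_mode_setting("domain-c1006"))
```

With 8.0 U2 and later none of this is needed, as the UI procedure above does the same thing per cluster.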

Scalable Snapshots demo with the vSAN 8.0 Express Storage Architecture

Duncan Epping · Sep 5, 2023 · 1 Comment

Starting with vSAN 8 a brand new architecture was introduced called “Express Storage Architecture”. Over the last year or so a lot of information has been shared about ESA and the benefits of ESA. One of the things which ESA introduces is much-improved snapshot scalability.

With vSAN OSA, and with VMFS, when you create a snapshot you typically see a performance degradation immediately. This is because both VMFS and vSAN OSA still use the redo-log based snapshot mechanism. This means that with vSAN OSA when you create a snapshot a new object is created and writes are redirected to it. It also means that, if you have one or more snapshots, reads will be coming from various files. This mechanism is, unfortunately, not very efficient. Let me borrow a diagram that is part of a post John Nicholson wrote to demonstrate that old logic.

With vSAN 8 ESA the mechanism has changed, and vSAN, or vSphere for that matter, no longer creates an additional object. vSAN ESA handles this on a metadata level. In other words, instead of redirecting writes and traversing files for reads, vSAN now leverages a highly efficient B-Tree structure and pointers to keep track of which block is associated with which snapshot.
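To make the difference between the two read paths a bit more tangible, here is a toy Python model. It is a sketch for illustration only and not how vSAN is implemented: the class names are made up, a plain dictionary stands in for the B-Tree, and a snapshot is modeled as a saved copy of the block map, whereas a real implementation would share structure instead of copying.

```python
class RedoLogDisk:
    """Toy redo-log disk: every snapshot adds a delta layer that takes new writes."""

    def __init__(self, base_blocks):
        self.layers = [dict(base_blocks)]  # layers[0] is the base disk

    def snapshot(self):
        self.layers.append({})  # writes are redirected to this new layer

    def write(self, block, data):
        self.layers[-1][block] = data

    def read(self, block):
        """Return (data, layers_visited), searching the newest layer first."""
        for visited, layer in enumerate(reversed(self.layers), start=1):
            if block in layer:
                return layer[block], visited
        raise KeyError(block)


class PointerMapDisk:
    """Toy metadata-based disk: one live block map, snapshots are saved maps."""

    def __init__(self, base_blocks):
        self.live = dict(base_blocks)
        self.snapshots = []

    def snapshot(self):
        self.snapshots.append(dict(self.live))  # metadata only, no new layer

    def write(self, block, data):
        self.live[block] = data

    def read(self, block):
        """Return (data, lookups); always a single lookup in the live map."""
        return self.live[block], 1


def read_after_snapshots(disk, snapshot_count, block=0):
    """Take snapshot_count snapshots, then read a block that never changed."""
    for _ in range(snapshot_count):
        disk.snapshot()
    return disk.read(block)


if __name__ == "__main__":
    print(read_after_snapshots(RedoLogDisk({0: "a"}), 5))
    print(read_after_snapshots(PointerMapDisk({0: "a"}), 5))
```

In this model a read of an unchanged block on the redo-log disk has to walk through every delta layer before it reaches the base disk, while the pointer-map disk answers with one lookup regardless of how many snapshots exist.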

Not only is this more efficient from a capacity perspective, but more importantly it is very efficient from a performance standpoint. I ran half a dozen tests in my lab, and what I saw was a performance impact of less than 2% between a VM without a snapshot and a VM with one or multiple snapshots. I could NOT see a significant difference between the first and the fifth snapshot. I do want to point out that my lab is not officially certified to run vSAN ESA. Nevertheless, I was very impressed with the results.

During the last run, I actually recorded the whole exercise. In this demo, I show the creation of one snapshot, while the VM is running a benchmark (HCIBench). Now, during the testing, I created not one but various snapshots, and of course I deleted all of them as well. You have all probably experienced extensive stun times during the deletion of a snapshot at times, and this is where vSAN ESA shines. The stun times have been reduced by a factor of 100, and that is something I am sure each of you will appreciate. Why have they been reduced so drastically? Well, simply because we no longer have to copy data from one vSAN object to another. This makes a huge difference, not just for stun times, but also for performance in general (latency, IOPS, throughput). If you are interested, have a look at the demo!
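The point about no longer copying data on deletion can also be sketched in a few lines of Python. Again, this is a simplified toy model under my own assumptions (hypothetical function names, dictionaries instead of vSAN objects), not the actual vSAN code path, and it ignores the space reclamation a real system still has to do.

```python
def consolidate_redo_log(parent, delta):
    """Delete a redo-log snapshot by merging its delta into the parent.

    Every block written since the snapshot has to be copied, so the cost
    grows with the amount of changed data. Returns the blocks copied.
    """
    copied = 0
    for block, data in delta.items():
        parent[block] = data
        copied += 1
    delta.clear()
    return copied


def drop_metadata_snapshot(snapshots, name):
    """Delete a metadata-based snapshot by dropping its pointer map.

    No block data moves in this toy model. Returns the blocks copied (0).
    """
    del snapshots[name]
    return 0


if __name__ == "__main__":
    parent = {0: "a", 1: "b"}
    delta = {1: "B", 2: "C"}
    print(consolidate_redo_log(parent, delta), parent)
    print(drop_metadata_snapshot({"snap-1": {0: "a", 1: "b"}}, "snap-1"))
```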

MAXimizing vSAN’s potential with the Express Storage Architecture (vSAN Max)

Duncan Epping · Aug 31, 2023 · 4 Comments

Last week at VMware Explore a few vSAN features and offerings were announced, one of them being vSAN Max! All week I have been having conversations with customers who were highly excited about the new solution. For those who did not read the announcements, or listen to the Unexplored Territory Podcast episode on the topic, let me go over what was announced and what vSAN Max is.

As most of you know, vSAN is a hyperconverged storage platform delivered via VMware’s flagship product vSphere. This means that if you have vSphere running, vSAN is literally two clicks away from being enabled. You will need local storage devices, and those local devices then will be formed into a shared datastore on top of which you can run your VMs. Although HCI solutions work for most customers, at certain levels of scale it may be preferred to have a disaggregated solution and share a dedicated storage platform with one or multiple vSphere clusters. This is what vSAN Max brings to the table.

Looking at the above diagram a few things stand out when it comes to vSAN Max. First of all, it says “Storage Only” and secondly it mentions “Supports high-density ESA ReadyNodes”. There are a few things to unpack here. Firstly, vSAN Max is based on vSAN Express Storage Architecture, aka vSAN ESA. This means that it is a single tier of storage, based on NVMe flash devices. On top of that, it also means that all available data services will also be available on vSAN Max: Fault Domains, Stretched Clustering, vSAN File Services, iSCSI, Compression, Encryption, etc. All of these are also included by default in the license, by the way. It is just a single edition from a licensing point of view, and it will include vSphere. In other words, vSphere + vSAN Enterprise by default, and licensed on capacity instead of CPU/Cores.

Secondly, it mentions “high-density”. vSAN Max starts at 200TB per host, and has a minimum of 6 hosts per cluster. This means that the starting capacity is 1.2 Petabytes for a vSAN Max cluster. The maximum number of hosts within a cluster is 32 at the time of writing (with 24 hosts being the recommended maximum), and it will support up to 8.6 Petabytes and around 3.4 million IOPS.
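The starting capacity is simple arithmetic, which you could sketch as follows in Python. The function name is made up, and the sketch assumes decimal units (1 PB = 1000 TB) and raw capacity, without taking any overhead into account.

```python
TB_PER_PB = 1000  # assumption: decimal units


def cluster_capacity_pb(hosts=6, tb_per_host=200):
    """Raw cluster capacity in PB for a number of hosts with equal capacity."""
    return hosts * tb_per_host / TB_PER_PB


if __name__ == "__main__":
    # Minimum configuration from the text: 6 hosts at 200TB each.
    print(cluster_capacity_pb())
```

For the minimum configuration this works out to 6 x 200TB = 1200TB, or 1.2 Petabytes.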

It also mentions ReadyNodes, and let me stress this, ReadyNodes! We still see a lot of customers picking random components for their vSAN cluster and then being surprised that Skyline Health reports the cluster is not supported. For vSAN Max there will be a separate set of vSAN ReadyNode configurations. These configurations will have for instance 100Gbps network cards, and as mentioned a minimum of 200TB per host.

Now, this doesn’t mean that the connecting clusters need to be running 100GbE. They can even be connected at 1Gbps; that’s up to you and the requirements you have from a performance perspective. The 100GbE connections will be used for intra-cluster communications, so the switching architecture also needs to cater to this.

Knowing all of this, you may wonder what the use cases are for vSAN Max. As Pete Koehler mentioned, it can be used for anything, but it is primarily targeted at those who have high capacity requirements and prefer a centralized model, but still want to manage their storage platform through vCenter Server and use all the bells and whistles that come with it (and with vROps for instance).

Hopefully, that provides some insights in terms of what to expect when vSAN Max goes “general availability”. I will follow up with some short demos showing what it will look like, although that will probably be relatively boring as it will look very similar to vSAN ESA. In the meantime, there’s a bunch of material on the VMware website that you can check out.

VMware Explore 2023, my personal top recommended sessions!

Duncan Epping · Jul 13, 2023 · Leave a Comment

This year I wasn’t planning on creating a list with my personal favorite sessions at Explore, as it just takes so much time to go through the agenda, read the outlines and then pick what I feel would be interesting for me personally. However, this week I had a handful of people asking if I was going to post a list again as they typically share the list with their customers. So I figured that I would go through the exercise again. First, let me share the three sessions that I will be part of:

Tech Deep Dive: vSAN 8 ESA–Performance and Resilience Without Compromise [CEIB1101LV]
Pete Koehler and I will be diving deep into all things vSAN 8 ESA, and will make sure to bring some demos as well to this 45-minute breakout session. This just may include some details of … well I can’t talk about that really.
Increasing Availability and Resilience with VMware vSAN Stretched Clusters [CEIB1099LV]
Vivek Pamadi and I won the People’s Choice Award, and as such we get to present on the topic of vSAN Stretched Clusters at Explore. In this session we will not only cover the current capabilities, but we will also reveal some potential future capabilities which will make your life as a stretched cluster customer easier!
Career Advancement Panel Discussion: Tips for Success from Tech Leaders [VIB2845LV]
I submitted a personal/career development session, but as the number of speaking slots is limited I was asked to join this panel session to talk about my experiences within the IT industry.

Those are the three sessions I will be part of, and hopefully, you will sign up for these. The vSAN sessions historically have been packed, usually repeated, so sign up early before they fill up!

Now, let’s list some of the sessions I am going to try to attend personally, that is if they are not full. Yes, the list is focussed on areas I am interested in, and is not necessarily a representation of the broad diversity in terms of topics discussed at the event. But I am sure a few others will produce a similar list at some point.

Technology Innovation Showcase [K2906LV]
This was one of my favorite sessions in 2022. Kit Colbert and Chris Wolf will be talking futures/innovation and VMware. This means they will be sharing insights on what our R&D teams are currently working on and exploring. In my opinion, this is a must see!
State of Alaska: Rapid Cloud Migration from the Last Frontier [CEIB2447LV]
I do not know this speaker, Niel Smith, but the first sentence of the abstract just caught my interest: “If you think you can’t use public cloud because of latency, old applications, storage performance, network constructs, firewall policies, and backups, think again.” And I love to hear good customer stories.
State of Union for VMware Home Labs [CODEB2757LV]
William Lam will be talking for 45 minutes extremely fast on the topic of Home Labs. You all know his name, you all know his blog, and you all know his passion for home labs. This will be good, and I would encourage all the lab geeks to join this one!
Cloud Data Infrastructure: Strategy and Product Portfolio [VIB1643LV]
My former manager, Christos Karamanolis, is hosting this session and talking about the Cloud Data Infrastructure strategy. Last year they introduced Project Moneta, so it will be interesting to see what we can expect in the upcoming year(s).
Cyber Secure Storage and Data: VMware’s Next Storage Frontier? [CEIB2594LV]
I am very intrigued by the title, cyber secure storage and data, hosted by Sazzala Reddy and Vijay Ramachandran. I’ve seen some previews of the content, and I suspect that this will be one of my favorite sessions this year. I am very excited about what is going to be shared.
Scaling and Deploying ESXi at the Edge with Desired State Management and GitOps [VIB2169LV]
Last year Project Keswick was introduced by Alan Renouf (listen to the UT podcast on this topic). It seems that this session is the follow-up to that: how to deploy ESXi at the edge at scale. Alan Renouf will be hosting this session with Sadaat Malik and Elliott Davis.
Unleashing vSAN ESA – An Exploration of HCI and dHCI Deployment Options [CEIB1777LV]
Kris and Kalyan are diving deep into the world of disaggregated HCI in this session, with a focus on how vSAN ESA has evolved this space for VMware, and will continue to evolve.
NSX Security Mindset: The Architect’s Forum Part 1 and 2
Chris McCain (and friends) is hosting a two-part workshop where they will be talking about securing your environment and how to respond to attacks. This is top of mind for many organizations out there, and although these sessions are lengthy (90 minutes each), I highly encourage you to attend these.
Balancing Risk and Performance for Sustainable ICT Networks [CEIB2839LV]
Sustainability is on the radar of most CIOs and CTOs these days, but what can YOU as an architect or administrator contribute? Frank and Valentin are going to be discussing exactly that!
45 Minutes of NUMA – A CPU Is Not a CPU Anymore [CODEB2761LV]
This session has been a top-rated session for years, and for good reason: Frank Denneman and Yu Wang will dive deep into CPU architectures and discuss how you can get the most out of your infrastructure by understanding the CPU architecture and NUMA topology!

Unexplored Territory #049 and #050, all about multi-cloud and cloud native workloads!

Duncan Epping · Jul 12, 2023 · Leave a Comment

I was working on my VMware Explore presentations, so I forgot to post #049. I figured I would post both at the same time for those who hadn’t seen these yet. In episode 049 we had two guests for the very first time, Gerrit Lehr and Andrea Siviero. Andrea and Gerrit talked us through the Multi-Cloud Adoption Framework and explained why customers are interested in this service and how it helps them meet their business goals. Listen to the full episode via Spotify (bit.ly/3Ny1EXE), Apple (bit.ly/449s2xA), or via the embedded player below.

Episode 050 focusses on Self-Managed Tanzu Mission Control, and we had Corey Dinkens as our guest. Corey discussed what Tanzu Mission Control is about, what the use case is, how customers are consuming it today, and why a self-managed solution makes sense for some customers compared to the SaaS offering. Interesting stuff if you ask me. Listen via Spotify (bit.ly/3XHU3dE), Apple (bit.ly/3XLm7g5), or use the embedded player below.