
Yellow Bricks

by Duncan Epping



New book: VMware vSAN 6.7 U1 Deep Dive

Duncan Epping · Dec 12, 2018 ·

Cormac Hogan and I have been working late nights and weekends over the past months to update our vSAN book material. Thanks Cormac, it was once again a pleasure working with you on this project! As you may know, we released two versions of a vSAN-based book through VMware Press, titled vSAN Essentials. As mentioned before, after restructuring and rewriting a lot of the content we felt that the title of the book no longer matched the content, so we decided to rebrand it to vSAN 6.7 U1 Deep Dive. After receiving very thorough reviews from Frank Denneman and Pete Koehler (thanks guys!) and adding a great foreword by our business unit’s SVP and General Manager, Yanbing Li, we managed to complete it this week.

Cormac and I decided to take the self-publishing route for this book, which allows us to set a great price for the ebook and enable the Amazon Matchbook option, giving everyone who buys the paper version through Amazon the option to buy the ebook at a nice discount! As prices will vary based on location, I am only going to list the USD prices; please check your local Amazon website for localized prices. Oh, and before I forget, I would like to recommend buying the ebook flavor! Why? Well:

“On average, each printed book releases 8.85 pounds of carbon dioxide into the environment. Together, the newspaper and book-printing industries cut down 125 million trees per year and emit 44 million tons of CO2.”

We appreciate all support, but we prefer the cleanest option from an environmental stance, which is also the reason we priced the ebook a lot cheaper than the paper version. Anyway, here are the links to the US store. We hope you enjoy the content, and of course, as always, an Amazon review would be appreciated! Interestingly, it seems we already reached number 1 in the Virtualization and Storage categories before this announcement; thanks everyone, we really appreciate it! (Please note, as an Amazon Associate I earn from the qualifying purchases below.)

  • Paper version – 39.95 USD
  • Ebook version – 9.99 USD
  • Match book price – 2.99 USD for the ebook!
    (you need to buy the paper edition first before you see this discount, and this may not be available in all regions, unfortunately.)

 

UPDATE:

It appears that some Amazon stores take a bit longer to index the content, so I am listing the different versions below for each of the stores that sell it:

  • Germany – Paper
  • Germany – ebook
  • UK – Paper
  • UK – ebook
  • FR – Paper
  • FR – ebook
  • ES – Paper
  • ES – ebook
  • IT – Paper
  • IT – ebook
  • JP – Paper
  • JP – ebook
  • NL – ebook
  • BR – ebook
  • CA – ebook
  • MX – ebook
  • AU – ebook
  • IN – ebook

CTO3509BU: Embracing the Edge: Automation, Analytics, and Real-Time Business

Duncan Epping · Sep 7, 2018 ·

This is the last post in this VMworld Sessions series. Although the title lists “CTO3509BU: Embracing the Edge: Automation, Analytics, and Real-Time Business”, which is by Chris Wolf and Daniel Beveridge, I would also highly recommend watching Daniel’s other session titled “CTO2161BU Smart Placement of Workloads in Tomorrow’s Distributed Cloud”. Both sessions discuss a similar topic: Edge vs Cloud, and where workloads and data should be placed. Both are very interesting topics if you ask me, and definitely topics I am starting to explore more.

Chris discussed the various use cases around Edge Computing and the technology drivers, some of these very obvious, some of them not so much. What often gets skipped is the business continuity aspect of edge, but also things like network costs, limitations, and even data gravity, so it is good to see that Chris addressed these. Some people still seem to be under the impression that every workload can run in the cloud, but in many cases it simply isn’t possible to send data to the cloud. It could be that the volume is too high, that the time it takes to transfer and analyze the data is too long (transaction execution time), or that it is physically impossible. It could also be that the application is mission-critical, meaning that the service can’t rely on a connection to the internet.

As a company, VMware is aiming to provide a solution for Edge and IoT while working closely with the very rich partner ecosystem, and the main focus is providing a “native experience” for developers. This gives customers choice as it avoids lock-in. Now, I don’t want to start a lock-in discussion here, as one could claim that it is always difficult to migrate between platforms, if only because of the operational aspects (tooling/processes). A diagram explaining the different initiatives was then presented, and I like this diagram a lot as it differentiates between “device edge” and “compute edge”; on top of that, it shows a differentiation between a device edge focused on things vs one focused on people (big difference).

Next discussed is IoT management: Chris explains how Pulse 2.0 will be capable of managing up to 500 million managed objects. Pulse provides central management across different IoT device manufacturers; instead of having a point solution for each manufacturer, we introduced an abstraction layer and automate things like updates, etc. (Sounds familiar?) Then ESXi for ARM is briefly touched upon; as Chris mentioned, this is not intended for general-purpose use. VMware is looking for very specific use cases, so if you are one of those partners/customers that has a use case for ESXi on ARM, please reach out to us and let’s discuss the use case and opportunity!

First, a new project is introduced by Daniel, called Project Nebula. Nebula brings an IoT marketplace; in this marketplace you can select various IoT services (which come in the form of containers), which are then sent to the IoT gateways. It looks pretty cool, as Daniel shows how he simply pushed various IoT services down to capable IoT gateways, so there is validation of whether the edge services can run on the specific devices. On top of that, a connection to specific cloud services can also be provided so that certain metrics can be sent up and analyzed when needed. Pretty smooth, and I also like the fact that it provides monitoring, even down to the sensor and individual service, as shown in the demo.

Next, it is briefly discussed why vSphere/VMware is the right platform, and then they jump into the momentum there is around cloud services and edge computing today. A brief overview of Amazon RDS on VMware is given and, more importantly, why this is a valuable solution, especially the replication of instances from on-premises to cloud and across regions. Of course, AWS Greengrass is mentioned; VMware also has a story in this space. You can run Greengrass on-premises in a highly available manner, and it is dead simple to implement. For those who have not seen the announcements around that, read up here. Next, Chris and Daniel go over various use cases. I guess Daniel likes wine, as he explains how a winery leverages AWS Lambda and Greengrass to analyze data coming from sensors, which then drives control systems. On top of that, based on customer (and sommelier) ratings of wine, and by leveraging the data provided by sensors and matching that with customer behavior, the winery can predict which barrels will score higher and most likely sell better. Very interesting story.

Compute edge is discussed next, which is where Project Dimension comes into play, but first Chris defines the difference between the options people have for consuming certain services. When does Cloud, Compute Edge or Device Edge make sense? It is all about “time” or “latency”: how fast do you, or the process, need a response from the system? Transaction time window within 500ms and latency lower than 5ms? Then you need to process at the “device edge” layer. If a transaction time below 1s is acceptable and latency of around 20ms, then the “compute edge” would work. Is a transaction time larger than 1s okay and latency higher than 20ms acceptable? Then the cloud may be an option. As I said, it all revolves around how long you can wait.
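
To make those thresholds a bit more tangible, here is a minimal sketch of the decision logic as I understood it from the session; the function and tier names are purely illustrative, only the numbers come from the talk:

    # Illustrative sketch only: the thresholds are the ones quoted in the session,
    # the function itself is hypothetical.
    def placement_tier(transaction_time_s: float, latency_ms: float) -> str:
        """Pick a placement tier based on transaction time and latency needs."""
        if transaction_time_s <= 0.5 and latency_ms < 5:
            return "device edge"    # response within 500ms, <5ms latency
        if transaction_time_s <= 1.0 and latency_ms <= 20:
            return "compute edge"   # sub-second transactions, ~20ms latency
        return "cloud"              # anything slower can tolerate cloud round trips

    # Example: a sensor-driven control loop that needs a response within 300ms
    print(placement_tier(0.3, 2))   # -> "device edge"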

Project Dimension delivers a compute edge solution which runs on-premises but is managed by VMware and delivered as a service. What I also liked is that the “micro” and “nano” data center are discussed, meaning that there will potentially be an option in the future to buy small form factor solutions which allow you to run a handful of VMs. More importantly, these solutions will consume less power and require less cooling. These things can make a big difference, especially as many edge locations don’t have a data center room. Again ESXi for ARM is mentioned; this sounds very interesting, and it would be interesting to see if there are plans to mix this with Project Dimension over time, but that is just me thinking out loud.

From a networking perspective, VeloCloud is of course discussed, along with some very cool projects where cloud networks can be utilized and, per traffic type, certain routes can be used based on availability and performance (I probably should say QoS).

That was it for now, as I don’t want to type out the whole session verbatim. For more specifics, please watch the two sessions, they are worth your time: “CTO3509BU: Embracing the Edge: Automation, Analytics, and Real-Time Business” and/or “CTO2161BU Smart Placement of Workloads in Tomorrow’s Distributed Cloud”.

HCI2164BU – HCI Management, current and futures

Duncan Epping · Sep 5, 2018 ·

This session by Christian Dickmann and Junchi Zhang is usually one of my favorites in the HCI track, mainly because they show a lot of demos and in many cases show you what ends up being part of the product in 6-12 months. The session revolved around management, or as they called it in the session, “providing a holistic HCI experience”.

After a short intro, Christian showed a demo of what we currently have for the installation of the vCenter Server Appliance and how we can deploy it to a vSAN datastore, followed by the Quickstart functionality. I posted a demo of Quickstart earlier this week; let me post it here as well so you have an idea of what it is/does.

In the next demo, Christian showed how you can upgrade the firmware of a disk controller using Update Manager. Pretty cool, but as far as I know still limited to a single disk controller; hopefully more will follow soon. More importantly, after that demo ended he started talking about “Guided SDDC Update & Patching”, and this is where it got extremely interesting. We all know that it isn’t easy to upgrade a full stack, and what Christian was describing would do exactly that. Do you have Horizon? Sure, we will upgrade that as well when we do vCenter / ESXi / vSAN, etc. Do you have NSX as part of your infra? Sure, that is also something we will take into account and upgrade when required. This would also include firmware upgrades for NICs, disk controllers, etc.

Next, Christian showed the Support Insight feature, which is enabled through the Customer Experience Improvement Program. His demo then showed how to create a support request right from the H5 client. The process shows that the solution understands the situation and files the ticket, and then it shows what the support team sees. It allows the support team to quickly analyze the environment and, more importantly, inform the customer about the solution. No need to upload log bundles or anything like that; that all happens automatically. And that’s not where it stops: you will be informed in the H5 client about the solution as well. Cool, right?

Next, Junchi was up and he discussed capacity management first. As he mentioned, it appears to be difficult for people to understand the capacity graphs provided by vSAN. Junchi proposes a new model where it is instantly clear what the usable space is and what the current capacity is being consumed by, not just at a cluster level but also at a VM level. This should also include what-if scenarios for usage projection, as sketched below. Junchi then quickly demoed the tools available that help with sizing and scaling.
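
To give an idea of the kind of what-if projection this refers to, here is a hypothetical sketch; the multipliers are simply the standard vSAN storage policy overheads, and the function and names are made up rather than part of the actual tooling:

    # Hypothetical what-if projection: how much raw vSAN capacity a new VM would
    # consume under different storage policies. Multipliers are the standard
    # vSAN policy overheads (mirroring/erasure coding), not output of any tool.
    POLICY_OVERHEAD = {
        "RAID-1 (FTT=1)": 2.0,    # mirrored: 2x the provisioned size
        "RAID-5 (FTT=1)": 4 / 3,  # erasure coded: ~1.33x
        "RAID-6 (FTT=2)": 1.5,    # erasure coded: 1.5x
    }

    def projected_usage_tb(vm_size_tb: float, policy: str) -> float:
        """Return the raw capacity a VM of the given size would consume."""
        return vm_size_tb * POLICY_OVERHEAD[policy]

    print(projected_usage_tb(2.0, "RAID-5 (FTT=1)"))  # a 2 TB VM consumes ~2.67 TB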

Next, Native File Services, Data Protection, and Cloud Native Storage were briefly discussed. What does the management of these services look like? The file services demo that Junchi showed was really slick: fill out IP details and domain details and have File Services running natively on vSAN in a minute or two. The only thing you would need to do is create file shares and give folks access to them. Also, monitoring will go through the familiar screens like the health check, etc.

Last but not least, Junchi discusses the integration with vRealize Automation, both on-premises and SaaS-based: a very cool demo showing how Cloud Assembly (but also vRA) will be able to leverage storage policies, with new applications provisioned using blueprints that have these policies associated with them.

That was it. If you would like to know more, watch the session online, or attend it in EMEA!

HCI1603BU – Tech Preview of Native vSAN Data Protection

Duncan Epping · Sep 4, 2018 ·

The second session I watched was HCI1603BU, Tech Preview of Native vSAN Data Protection, by Michael Ng. I already discussed vSAN Data Protection last year, but considering that the upcoming vSAN beta includes this functionality, I felt it was worth covering again. Note that the beta will be a private beta, so if you are interested please sign up; you may be one of the customers selected for the beta.

Michael started out with an explanation of what an SDDC brings to customers, and how a digital foundation is crucial for any organization that wants to be competitive in the market. vSAN, of course, is a big part of the digital foundation, and for almost every customer data protection and data recovery are crucial. Michael went over the various vSAN use cases and also the availability and recoverability mechanisms before introducing Native vSAN Data Protection.

Now it is time for the vSAN Native Data Protection introduction. Michael first explains that we will potentially have a solution in the future where we can simply create snapshots locally by specifying the number of local snapshots you want in a policy. On top of that, in the future we will potentially provide the option to specify that the snapshots (plus a full copy) need to be offloaded to secondary storage. Secondary storage could be NFS or S3 object storage (both on-premises and in the cloud). Also, it should be possible to replicate VMs and snapshots to a DR location through policies.

What I think is very compelling is the fact that the native protection comes as part of vSAN/vSphere; there’s no need to install an appliance or additional software. vSAN Data Protection will be baked into the platform, easy to enable and easy to consume through policy. The first focus is vSAN Local Data Protection.

vSAN Local Data Protection will provide crash-consistent and application-consistent snapshots at an RPO of 5 minutes and with a low RTO. On top of that, it will be possible to instant clone a snapshot, meaning that you can restore the snapshot as an “instant clone”; this could be interesting when you want to test a certain patch or upgrade, for instance. You can even specify during the recovery that the NIC doesn’t need to be connected. Application consistency is achieved by leveraging VSS providers on Windows, while on Linux the VMware Tools pre- and post-scripts are used.
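
Pulling the pieces from the last few paragraphs together, a policy-driven definition could conceptually look something like the sketch below; the field names are invented for illustration and do not reflect the actual vSAN Data Protection policy format:

    # Purely hypothetical sketch of a policy-driven protection definition.
    # Field names are made up; only the values reflect what the session mentioned.
    protection_policy = {
        "name": "gold-protection",
        "local_snapshots": {
            "rpo_minutes": 5,                 # snapshot interval mentioned in the session
            "max_snapshots": 100,             # 100 per VM tested for the first release
            "consistency": "application",     # VSS on Windows, pre/post scripts on Linux
        },
        "archive": {                          # potential future offload to secondary storage
            "target": "nfs-or-s3-endpoint",   # NFS or S3, on-premises or in the cloud
            "include_full_copy": True,
        },
        "replication": {                      # potential future DR replication via policy
            "target_site": "dr-site",
        },
    }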

What enables vSAN Data Protection is a new snapshotting technology. This new technology provides much better performance than traditional vSphere (or vSAN) snapshots. It also provides better scale, meaning that you can go way above the limit of 32 snapshots per VM that we currently have.

Next, Michael demoed vSAN Data Protection, which is something I have done on various occasions; if you are interested in what it looks like, just watch the session. If I have time, I may record a demo myself just so it is easier to share with you.

What I personally hadn’t seen yet were the additional performance views that were added. Very useful, as they allow you to quickly check what the impact of snapshots is on general performance. Is there an impact? Do I need to change my policy?

Last but not least, various questions were asked; the most interesting parts were the following:

  • “File-level restore” is on the roadmap, but the first feature they will tackle is offloading to secondary storage.
  • “Consistency groups” are being planned for; these are especially useful when you have applications or services spanning multiple VMs.
  • Integration with vRealize Automation: some of it is planned for the first release, and everything is SPBM-based, which already has APIs. “Self-service restore” is also being planned for.
  • 100 snapshots per VM have been tested for the first release.

Good session, worth watching!

HCI1998BU – Enable High-Capacity Workloads with Elastic vSAN on VMware Cloud

Duncan Epping · Sep 4, 2018 ·

I just watched the session by Rakesh and Peng on Elastic vSAN, also known as “EBS Backed vSAN”. This session was high on my list to watch live at VMworld, but unfortunately I couldn’t attend it due to various other obligations. If you are interested in the full session, make sure to watch it here; it is free. If you want to read a short summary, have a look below.

EBS backed vSAN is exactly what you expect it to be. Having said that, I do want to point out that EBS backed vSAN is supported for vSAN in VMware Cloud on AWS only. On top of that, it is recommended to run workloads on it which require high capacity. You could, for instance, consider leveraging EBS backed vSAN as a high-capacity target for DR as a Service. But of course this could also be used in cases where there is sufficient CPU/memory capacity available but only storage needs to scale in VMware Cloud on AWS. 10TB is the capacity limit per host in VMC today; EBS backed vSAN removes this limit. With EBS backed vSAN you can increase the capacity per host to 15, 20, 25, 30 or 35TB, which means you can deliver up to 140TB of capacity in a single 4-node cluster; for 16 nodes that is 560TB!
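
Those maximums are straightforward multiplication; here is a quick sanity check of the numbers quoted in the session (nothing more than arithmetic, the variable names are mine):

    # Quick check of the capacity numbers quoted in the session.
    per_host_tb_options = [15, 20, 25, 30, 35]   # EBS backed capacity increments per host
    max_per_host_tb = max(per_host_tb_options)   # 35 TB

    print(4 * max_per_host_tb)    # 140 TB for a 4-node cluster
    print(16 * max_per_host_tb)   # 560 TB for a 16-node cluster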

What is great about this solution is that it also solves another problem. Everyone knows that a host failure results in resyncing data, and depending on how much capacity the host was delivering, this could take a long time. With EBS backed vSAN this problem no longer exists: when a host fails, the EBS volumes will simply be mounted to another host, or to a new host when it is introduced. This is a huge benefit if you ask me, even when there’s a high change rate, as this happens within seconds.

One thing to point out as a constraint though is that today in VMC you can’t run the management workloads on EBS backed vSAN just yet. Rakesh did mention that this is being tested.

Next, the architecture was discussed; this is where Peng took over. He mentioned that the IOPS limit is set to 10K (regardless of the volume size) and throughput is limited to 160MBps. All of this is typically delivered with sub-millisecond latency, which is very impressive. Also, Peng mentioned that EBS backed vSAN provided very consistent and predictable performance in all tests. On top of that, EBS backed vSAN is also very reliable and highly available, even when compared to flash devices.

What I found interesting is the architecture: vSAN gets presented a SCSI device, but EBS is network attached, so an EBS protocol client was implemented and presented as an NVMe target through the PCI-e interface. The PCI-e interface allows for multi-volume, hot-add, and hot-remove. This is what allows the EBS devices to be removed from a host which has failed and then added to a healthy host.

When EBS backed vSAN is enabled, each host will have 3 disk groups, and each disk group will have 3-7 capacity disks. Note that it is recommended to use RAID-5 for space efficiency, and “compression only” mode is enabled on these disk groups. Considering the target workloads and the architecture (and the EBS performance constraints), it didn’t make sense to use deduplication; hence the vSAN team implemented a solution where it is possible to have only compression enabled. Some I/O amplification is not an issue when you run all-flash and have hundreds of thousands of IOPS per device, but as stated, EBS is limited to 10K IOPS per device, which means you need to be smart about how you use those resources.
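
As a back-of-the-envelope feel for what the RAID-5 recommendation means for usable space, here is a rough sketch; it simply applies the standard vSAN 3+1 erasure coding overhead to the largest per-host increment and ignores compression gains, slack space, and other overheads:

    # Back-of-the-envelope: usable capacity with RAID-5 (3+1 erasure coding) on a
    # 4-node EBS backed cluster. Ignores compression gains, slack space, and overheads.
    raw_per_host_tb = 35
    hosts = 4
    raid5_efficiency = 3 / 4          # 3 data + 1 parity component per stripe

    raw_tb = raw_per_host_tb * hosts            # 140 TB raw
    usable_tb = raw_tb * raid5_efficiency       # ~105 TB usable before compression
    print(raw_tb, usable_tb)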

During the Q&A one thing that was mentioned, which I found interesting, is that although today EBS backed vSAN needs to be introduced in certain increments across the whole cluster, that will not be the case in the future. In the future, according to Peng, it should be possible to add EBS volumes to disk groups even on particular hosts, allowing for full and optimal flexibility.

And for those who didn’t know, the VMworld Hands-On Labs were running on top of EBS backed vSAN, and performance was above expectations!

