A while ago I had the pleasure of joining David S. Linthicum from GigaOm on their Voices in Cloud Podcast. It is a 22-minute podcast where we discuss various VMware efforts in the cloud space, edge computing and of course HCI. You can find the episode here, where they also have the full transcript for those who prefer to read instead of listening to a guy with a Dutch accent. It was a fun experience for sure, I always enjoy joining podcasts and talking tech… So if you run a podcast and are looking for a guest, don’t hesitate to reach out!
A few weeks ago VMware announced Project nanoEDGE on their blog, Virtual Blocks. In the following days I had a whole bunch of questions from customers and partners interested in understanding what it is and what it does. I personally prefer to call Project nanoEDGE “a recipe”. The recipe states which configuration would be supported for both vSAN and vSphere. Let’s be clear: this is not a tiny version of VxRail or VMware Cloud Foundation. This is a hardware recipe that should help customers deploy tiny supported configurations to thousands of locations around the world.
Project nanoEDGE is a project by VMware principal system engineer Simon Richardson. The funny thing is that right around the time Simon started discussing this with customers to see if there would be interest in something like this, I had similar discussions within the vSAN organization. When Simon mentioned he was going to work on this project with support from the VMware OCTO organization I was thrilled. I personally believe there’s a huge market for this. I have had dozens of conversations over the years with customers who have 1000s of locations and are currently running single-node solutions. Many of those customers need to deliver new IT services to these locations and the requirements for those services have changed as well in terms of availability, which makes it a perfect play for vSAN and vSphere (with HA).
So first of all, what would nanoEDGE look like?
As you can see, these are tiny “desktop-like” boxes. These boxes are the Supermicro E300-9D and they come in various flavors. The recipe currently explains the solution as 2 full vSAN servers and 1 host which is used for the vSAN Witness for the 2-node configuration. Of course, you could also run the witness remotely, or even throw in a switch and go with a 3-node configuration. The important part here is that all components used are on both the vSphere and the vSAN compatibility guide! The benefit of the 2-node approach is that you can use cross-over cables between the vSAN hosts and, as a result, avoid the cost of a 10GbE switch! So what is in the box? The bill of materials is currently as follows:
- 3x Supermicro E300-9D-8CN8TP
  - The box comes with 4x 1GbE NIC ports and 2x 10GbE NIC ports
  - 10GbE can be used for direct connect
  - It has an Intel® Xeon® processor D-2146NT (8 cores)
- 6x 64GB RAM
- 3x Intel SSD M.2 960GB
- 3x Toshiba NVMe M.2 256GB
  - I would recommend the Intel P4801 100/200/375GB devices instead as I can’t find the Toshiba device on the vSAN HCL!
- 3x Supermicro SATADOM 64GB
- 1x Managed 1GbE switch
From a software point of view, the paper lists that they tested with 6.7 U2, but of course, if the hardware is on the VCG for 6.7 U3, then running that configuration will also be supported. The team also did some performance tests, and they showed some pretty compelling numbers (40,000+ read IOPS and close to 20,000 write IOPS), especially when you consider that these types of configurations would usually run 15-20 VMs in total. One thing I do want to add: the bill of materials lists M.2 form factor flash devices. This allows nanoEDGE to avoid the use of the internal unsupported AHCI disk controller, which is key in the hardware configuration! Note that although the original BoM lists Toshiba M.2 devices, I would highly recommend using the Intel P4801 instead, as already mentioned above, so my recommendation for a supported configuration would be:
- 3 x Intel S4510 M.2 device for capacity (as per vSAN HCL)
- 3 x Intel P4801 M.2 device for caching (as per vSAN HCL)
There are many other options on the vSAN HCL for both caching as well as capacity, so if you prefer to use a different device, make sure it is listed here.
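To put the numbers above in perspective, here is a rough back-of-the-envelope sketch in Python. The IOPS figures, the VM counts and the 960GB capacity device come from the paper and the bill of materials. The function names are mine, and the capacity calculation assumes plain mirroring across the 2 data nodes while ignoring all vSAN overhead and slack space, so treat the result as an upper bound, not a sizing guide.

```python
# Figures quoted in this post (the paper's test results and bill of materials).
READ_IOPS = 40_000          # "40,000+ read IOPS"
WRITE_IOPS = 20_000         # "close to 20,000 write IOPS", so this is an upper bound
VM_COUNTS = (15, 20)        # typical number of VMs for this type of configuration
DATA_NODES = 2              # the third box only hosts the witness
CAPACITY_GB_PER_NODE = 960  # one 960GB M.2 capacity device per data node


def per_vm_iops(total_iops: int, vm_count: int) -> int:
    """Evenly divide the measured IOPS over the VMs (rounded down)."""
    return total_iops // vm_count


def mirrored_usable_gb(nodes: int, gb_per_node: int, copies: int = 2) -> int:
    """Raw capacity divided by the number of data copies.

    Assumption: every object is mirrored (2 copies); metadata,
    overhead and slack space are ignored.
    """
    return nodes * gb_per_node // copies


for vms in VM_COUNTS:
    print(f"{vms} VMs: ~{per_vm_iops(READ_IOPS, vms)} read IOPS and "
          f"up to ~{per_vm_iops(WRITE_IOPS, vms)} write IOPS per VM")

print(f"Usable capacity (mirrored, before overhead): "
      f"~{mirrored_usable_gb(DATA_NODES, CAPACITY_GB_PER_NODE)} GB")
```

Even with 20 VMs, each VM would have roughly 2,000 read IOPS to work with under these assumptions, which is plenty for the type of workload that typically runs at an edge location.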
I would recommend reading the paper, and if you have an interest in this solution please reach out to your local VMware representative for more detail/help.
Yesterday I tweeted something and I want to reiterate it to make sure that those who only follow the blog, and not my Twitter account, are aware as well. On the vSAN Compatibility Guide (VCG) there were already a number of single-socket servers, but most of these were limited in terms of CPU/MEM resources. Last week two new servers were added to the VCG. These servers are based on the AMD EPYC Rome CPUs and can have up to 64 cores. Yes, 64 cores per CPU. They can go up to 2TB worth of memory, depending on the DIMMs used. While on the topic of memory: the NUMA implementation completely changed with AMD EPYC Rome, but I am sure Frank Denneman will have something to say about that soon. Why would I bring these servers up? Well, for those looking to do 2-node vSAN configurations or smaller vSAN clusters, they could be a great alternative solution! Heck, I think I would consider them in general.
Two new Dell – AMD EPYC Rome based ReadyNode configs were recently added to the vSAN HCL. Single socket, 32 or 64 cores. Pretty sweet! https://t.co/FwppsLfWMQ
— Duncan Epping (@DuncanYB) October 7, 2019
I’ve had this question over a dozen times now, so I figured I would add a quick pointer to my blog. What is causing the error “vSphere HA agent on this host could not reach isolation address” to pop up on a 2-node direct connect vSAN cluster? The answer is simple: when you have vSAN enabled, HA uses the vSAN network for communication. With 2-node Direct Connect the vSAN network is not connected to a switch, and the only reachable IP addresses are those of the vSAN VMkernel interfaces.
As a result, when HA tests whether the isolation address (by default the default gateway of the management interface) is reachable, the ping will fail. You can solve this simply by disabling the isolation response, as described in this post here.
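To make the reasoning a bit more concrete, below is a tiny Python model of the situation. It does not talk to vCenter or ESXi; it only mimics the reachability check described above. All IP addresses and function names are hypothetical and only there for illustration.

```python
from ipaddress import ip_address


def can_reach(target: str, reachable_ips: list[str]) -> bool:
    """True if the target is one of the addresses reachable on the HA network."""
    return ip_address(target) in {ip_address(ip) for ip in reachable_ips}


def isolation_warning(isolation_addresses: list[str],
                      reachable_ips: list[str]) -> bool:
    """HA complains when none of the isolation addresses can be reached."""
    return not any(can_reach(addr, reachable_ips)
                   for addr in isolation_addresses)


# Hypothetical addresses: with direct connect, the vSAN network only
# contains the two vSAN VMkernel interfaces, nothing else.
vsan_vmkernel_ips = ["172.16.0.1", "172.16.0.2"]
# The default isolation address: the default gateway of the management network.
default_isolation_address = "192.168.1.1"

# The gateway is not on the cross-over link, so the check fails.
print(isolation_warning([default_isolation_address], vsan_vmkernel_ips))
```

The model returns `True` for the warning because the management gateway simply does not exist on the cross-over link, which is exactly why the error shows up.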
I noticed this question on Reddit about .PNG files which were located in VM folders on a datastore. The user wanted to remove the datastore from the cluster but didn’t know where these files were coming from and whether the VM required those files to be available in some shape or form. I can be brief about it: you can safely delete these .PNG files. These files are typically created by VM Monitoring (part of vSphere HA) when a VM is rebooted by VM Monitoring. This ensures you can still troubleshoot the problem after the reboot has occurred: VM Monitoring takes a screenshot of the VM to, for instance, capture the blue screen of death. This feature has been in vSphere for a while, but I guess most people have never really noticed it. I wrote an article about it when vSphere 5.0 was released and below is the screenshot from that article where the .PNG file is highlighted. For whatever reason I had trouble finding my own article on this topic so I figured I would write a new one on it. Of course, after finishing this post I found the original article. Anyway, I hope it helps others who find these .PNG files in their VM folders.
Oh, and I should have added: these screenshots can also be created by vCloud Director or be triggered through the API, as described by William in this post from 2013.
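If you want an overview of where these screenshot files live before cleaning up a datastore, a small script can list them for you. The Python sketch below assumes the datastore is browsable as a regular directory tree from wherever you run it. The path in the usage note is a made-up example and `find_png_files` is my own helper name, not a VMware tool. It only lists files, it does not delete anything.

```python
from pathlib import Path


def find_png_files(datastore_root: str) -> list[Path]:
    """Return all .png/.PNG files below the given directory, sorted by path.

    The extension is compared case-insensitively, as the files may show up
    as either .png or .PNG depending on how you browse the datastore.
    """
    root = Path(datastore_root)
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() == ".png"
    )
```

For example, `for png in find_png_files("/vmfs/volumes/datastore1"): print(png)` would print one line per screenshot, where `/vmfs/volumes/datastore1` is a hypothetical mount point you would replace with your own.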