This week I had the pleasure of talking to fellow Dutchman Harold Buter. Harold is the CTO of PeaSoup, and we had a lively discussion about Virtual SAN and why PeaSoup decided to incorporate it in their architecture. What struck me was the fact that PeaSoup Hosting was brought to life partly as a result of the Virtual SAN release: when we introduced Virtual SAN, Harold and his co-founder realized that this was a unique opportunity to build something from the ground up while avoiding the big upfront costs typically associated with legacy arrays. How awesome is that, a new product that leads to new ideas and, in the end, a new company and product offering.
The conversation of course didn't end there; let's get into some more details. We discussed the use case first. PeaSoup is a hosting / cloud provider. Today they have two clusters running on Virtual SAN: a management cluster, which hosts all components needed for a vCloud Director environment, and a resource cluster. The great thing for PeaSoup was that they could start out with a relatively low investment in hardware and scale fast when new customers come on board or when existing customers require new hardware.
Talking about hardware: PeaSoup looked at many different configurations and vendors, and for their compute platform decided to go with Fujitsu RX300 rack-mount servers. Harold mentioned that these were by far the best choice for them in terms of price, build quality, and service. Personally it surprised me that Fujitsu came out as the cheapest option; it didn't surprise me that Fujitsu's service and build quality were excellent, though. Spec-wise, the servers have 800GB SSDs, 7200 RPM NL-SAS disks, 256GB of memory, and of course two CPUs (Intel Xeon E5-2620 v2, 6 cores each).
Harold pointed out that the only downside of this particular Fujitsu configuration was that it came with a disk controller limited to RAID-0 only, with no passthrough. I asked him if they had experienced any issues around that, and he mentioned that they have had one disk failure so far, which resulted in having to reboot the server in order to recreate a RAID-0 set for the new disk. Not too big a deal for PeaSoup, but of course he would prefer to avoid the need for that reboot if possible. The disk controller, by the way, is based on the LSI 2208 chipset, and it is one of the things PeaSoup was very thorough about: making sure it was supported and that it had a high queue depth. The HCL came up multiple times during the conversation, and Harold felt that although doing a lot of research up front and creating a scalable and repeatable architecture takes time, it also results in a very reliable environment with predictable performance. For a cloud provider, reliability and user experience are literally your bread and butter; they couldn't afford to "guess". That was also one of the reasons they selected a VSAN Ready Node configuration as a foundation and tweaked it where their environment and anticipated workload required.
Key take away: RAID-0 works perfectly fine during normal usage; only when disks need to be replaced is a slightly different operational process required.
Anticipated is a keyword once again, as it has been in many of the conversations I've had before: it is often unknown what kind of workloads will run on top of these infrastructures, which means you need to be flexible in terms of scaling up versus scaling out. Virtual SAN provides just that to PeaSoup. We also spoke about the networking aspect. For a cloud provider running vCloud Director and Virtual SAN, networking is a big part of the overall architecture. I was interested in what kind of switching hardware was being used: PeaSoup uses Huawei 10GbE switches (CE6850), and each server is connected to these switches with at least 4 x 10GbE ports. PeaSoup dedicated two of these ports to Virtual SAN, which wasn't a requirement from a load perspective (or from VMware's point of view), but they preferred this level of redundancy and performance while having a lot of room to grow. Resiliency and future-proofing are key for PeaSoup. Price versus quality was also a big factor in the decision to go with Huawei switches; in this case Huawei had the best price/quality ratio.
Key take away: It is worth exploring different network vendors and switch models. Prices vary greatly between vendors and models, which can lead to substantial cost savings without impacting service or quality.
Their host and networking configuration is well documented and can easily be repeated when more resources are needed. They even have discounts and pricing documented with their suppliers, so they can quickly assess what is needed, when, and what it will cost. I also asked Harold whether they offer different storage profiles to give their customers a choice in terms of performance and resiliency. So far they offer two different policies to their customers:
- Failures to tolerate = 1 // Stripe Width = 2
- Failures to tolerate = 1 // Stripe Width = 4
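As a rough sketch, the capacity and component footprint of mirroring policies like these can be estimated with a small helper (hypothetical, not PeaSoup's tooling; it assumes the usual VSAN rules that FTT = n keeps n + 1 mirror copies and needs 2n + 1 hosts, and that a stripe width of s splits each mirror into s components):

```python
# Back-of-the-envelope math for VSAN mirroring policies.
# Hypothetical helper, assuming standard VSAN behaviour:
#   FTT = n       -> n + 1 mirror copies, 2n + 1 hosts for quorum
#   stripe width  -> each mirror copy is split into that many components

def vsan_policy_footprint(ftt: int, stripe_width: int, vmdk_gb: float) -> dict:
    replicas = ftt + 1                        # mirror copies kept
    return {
        "replicas": replicas,
        "raw_capacity_gb": replicas * vmdk_gb,          # raw space consumed
        "data_components": replicas * stripe_width,      # excluding witnesses
        "min_hosts": 2 * ftt + 1,                        # hosts needed for quorum
    }

# PeaSoup's two offered policies, applied to a 100 GB disk:
for sw in (2, 4):
    print(vsan_policy_footprint(ftt=1, stripe_width=sw, vmdk_gb=100))
```

So an FTT=1 policy doubles the raw capacity consumed regardless of stripe width, while the stripe width only changes how many components each mirror is spread across.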
So far it appears that not too many customers are asking about higher availability; they recently had their first request, and it looks like the offering will include "FTT=2" alongside "SW=2 / 4" in the near future. On the topic of customers, Harold mentioned they have a variety of different customers on the platform, ranging from companies in the media-conversion business and law firms to a company that sells "virtual private servers" on top of it.
Before we wrapped up I asked Harold what their biggest challenge with Virtual SAN had been. Harold mentioned that although they were a very early adopter and use it in combination with vCloud Director, they have had no substantial problems. The most challenging part in the first months was figuring out the operational processes around monitoring. PeaSoup is a happy Veeam customer and decided to use Veeam ONE to monitor Virtual SAN for now, but in the future they will also look at the vRealize Operations Virtual SAN management pack, and potentially create some custom dashboards in combination with Log Insight.
Key take away: Virtual SAN is not like a traditional SAN, new operational processes and tooling may be required.
PeaSoup is an official reference customer for Virtual SAN by the way, you can find the official video below and the slide deck of their PEX session here.
Just a thought, but when I create a RAID-0 for a vSAN deployment on a Cisco server (using an LSI controller), I use the "storcli" command as described in KB 2111266.
Wouldn't that be enough to avoid the reboot needed to recreate the RAID-0 when replacing a failed hard drive?
Yes, technically. LSI has "vmware-esx-storcli", and you can also use "sas2flash" to update LSI firmware; these tools are available for most operating systems. Then there are things like Dell OpenManage (DRAC), and many other iLOs are capable of array management. An iLO is an indispensable tool for out-of-band (OOB) management where applicable: if an action could take down a device in a location where physical access is difficult, you can still recover by reaching the device through a console server…
Since pass-through (IT mode) hands disk management to the host, adding a manual task like array management is unnecessary. RAID-0 adds complexity where there need not be any. Things are complex enough! 🙂
Martin Gavanda says
Well, RAID-0 gives you write-back caching, which typically benefits latency. MegaCli works on ESXi too.
If you look at the KB I mentioned, you will see that the VMware best practice is to disable the write cache on the controller, so there is no write-back with RAID-0.
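For reference, the KB-style workflow to bring a replacement disk back online with storcli looks roughly like this. This is a sketch only: the controller number, enclosure, and slot (`/c0`, `252:2`) are example values, and you should verify the exact device paths from the `show` output on your own system before running anything.

```shell
# Sketch of an online RAID-0 recreate with storcli after a disk swap.
# IDs below (/c0, enclosure 252, slot 2) are examples only.

# Inspect the controller and locate the replaced drive
storcli /c0 show
storcli /c0/e252/s2 show

# Create a single-drive RAID-0 virtual drive on the new disk
storcli /c0 add vd type=raid0 drives=252:2

# Per the VMware guidance discussed above: force write-through
# (no controller write-back cache) on the virtual drives
storcli /c0/vall set wrcache=wt
```

If the controller firmware supports online virtual-drive creation, this avoids the reboot that PeaSoup currently needs after a disk replacement.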
John Nicholson says
Which makes sense, especially given that if power is not restored for long enough to a system with a BBU, the write cache (and data) could be lost, versus a persistent flash cache. The write-latency benefit of DRAM over NAND flash is negligible considering the write has to go over Ethernet for an ack anyway. For reads, objects have (I think) a 1MB RAM cache anyway, which is in theory faster than a PCI-attached device, so having a more direct path to the flash (i.e. cut-through) may in theory actually yield better results (beyond just being more resilient).
Duncan Epping says
Best practice is to disable write-back caching. As we already have a write-caching mechanism, we don't want the two to interfere, and we need to be absolutely certain that what we think we wrote to persistent media is actually stored on persistent media.
Duncan Epping says
Yes, potentially these tools could be used.