
Yellow Bricks

by Duncan Epping


devops

Underlying Infrastructure for your pets and cattle

Duncan Epping · Oct 10, 2014 ·

Last week on Twitter, Maish asked a question that got me thinking; actually, I have been thinking about this for a while now. The question deals with how you design your infrastructure for the various types of workloads, whether they fall in the “pet” category or the “cattle” category. (If you are not familiar with the pets/cattle terminology, read this article by Massimo.)

Pets vs. cattle – I think that people are mistaken as to what that exactly means for the underlying infrastructure.

— Maish Saidel-Keesing (@maishsk) October 5, 2014

I asked Maish what it actually means for your infrastructure, and I gave it some more thought over the last week. Cattle is the type of application architecture that handles failures by being distributed: it typically scales out instead of up, and the VMs are disposable as they usually don’t hold state. With pets this is different: they typically scale up, resiliency is often provided by either a third-party clustering mechanism or the infrastructure underneath, in many cases they contain state, and recoverability is key. As you can imagine, both types of workloads have different requirements of the infrastructure. Going back to Maish’s question, I guess the real question is whether you can afford the “what it means for the underlying infrastructure”. What do I mean by that? (The sketch below makes the contrast concrete.)
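To make that contrast concrete, here is a minimal illustrative sketch in Python. The two toy services are invented for this post and are not code from any real platform; they only show where failure handling lives in each model.

```python
import random

# Illustrative sketch only: toy services invented to show the
# failure-handling difference, not code from any real platform.

class CattleService:
    """Stateless scale-out tier: any replica can answer, and a failed
    replica is disposed of and replaced rather than repaired."""

    def __init__(self, replicas=5):
        self.replicas = [f"web-{i:02d}" for i in range(replicas)]

    def handle(self, request):
        return f"{random.choice(self.replicas)} served {request!r}"

    def replace(self, failed):
        self.replicas.remove(failed)                     # dispose, don't repair
        self.replicas.append(f"web-{random.randrange(100):02d}")  # fresh VM


class PetService:
    """Stateful scale-up system: there is exactly one node, it holds state,
    and the infrastructure (HA/clustering) must bring *it* back."""

    def __init__(self):
        self.node, self.state = "db-01", {}

    def handle(self, request):
        self.state[request] = "committed"    # state lives inside the node
        return f"{self.node} served {request!r} ({len(self.state)} rows)"


cattle, pet = CattleService(), PetService()
print(cattle.handle("GET /home"))
cattle.replace("web-00")                     # a failed web VM is just replaced
print(cattle.handle("GET /home"))
print(pet.handle("INSERT INTO orders"))      # db-01 cannot be thrown away
```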

If you look at the requirements of both architectures, you could say that “pets” will typically demand more from the underlying infrastructure when it comes to resiliency and recoverability. Cattle demands less from that perspective, but flexibility and agility are more important. You could imagine implementing two different infrastructure architectures for these specific workloads, but does that make sense? If you are Netflix, Google, YouTube, etc., it may, due to the scale at which they operate and the fact that IT is their core business. In those cases “cattle” is what drives the business, and the pets are the back-end systems. The reality, though, is that for the majority this is not the case. Your environment will be a hybrid, and more than likely “pets” will have the upper hand, as that is simply the state of the world today.

That does not mean they cannot co-exist. That, I believe, is the true strength of virtualization: it allows you to run many different types of workloads on the same infrastructure. Whether it is your Exchange environment or your in-house-developed scale-out web application serving hundreds of thousands of customers makes no difference to your virtualization platform. From an operational perspective, the big benefit is that you will not have to maintain different run books to manage your workloads. From an ops perspective they will look the same on the outside, although they may differ on the inside. What may change is the set of services required for those systems, but with the rich ecosystem available for virtualization platforms these days that should not be a problem. Need extra security or micro-segmentation? VMware NSX can provide the security isolation needed to run these applications smoothly. Sub-millisecond latency requirements? There are plenty of storage and caching solutions out there that can deliver this!

Will the application architecture shift that is happening right now impact your underlying infrastructure? We have made huge steps in operational efficiency in the last five years, and with the SDDC we are about to take the next big step. Although I do believe the application architecture shift will result in infrastructure changes, let’s not make the same mistakes we made in the past by creating infrastructure silos per workload. I strongly believe that repeatability, consistency, reliability and predictability are key, and that starts with a solid, scalable and trusted foundation: the infrastructure.

Project Fargo aka VMFork – What is it?

Duncan Epping · Oct 7, 2014 ·

I have seen various people talking about Project Fargo (also known as VMFork or Instant Clone), and what struck me is that many are under the impression that Project Fargo is the result of the CloudVolumes acquisition. Let’s set that straight first: Project Fargo is not based on any technology developed by the CloudVolumes team. Project Fargo was developed in-house and, as far as I can tell, is an implementation of SnowFlock (University of Toronto / Carnegie Mellon University), although I know that in-house they have been looking at techniques like these for a long time. Okay, now that we have that out of the way, what is Project Fargo?

Simply said: Project Fargo is a solution that enables you to rapidly clone a running VM. When I say “rapidly clone”, I mean RAPIDLY… within seconds. Yes, that is extremely fast for a running VM. What should be noted, of course, is that it is not a full clone. I guess this is where the “VMFork” name comes into play: the “parent” virtual machine is quiesced and forked, and a “child” VM is born. This child VM leverages the disk and memory of the parent (for reads), which is why it is so extremely fast to instantiate… as I said, literally seconds, as it “only” needs to create empty delta files, create a VMX, instantiate the process, and do some networking magic, as you do not want VMs popping up on the network with the same MAC address. Note that the child VM starts where the parent VM left off, so there is no boot process; it is instant-on, just as if you suspended and resumed it. I can’t reveal too much about how this works yet, but you can imagine that a technique like “fast suspend resume” (FSR), which is the cornerstone of features like Storage vMotion, is leveraged.
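For those who want a feel for the mechanism, here is an OS-level analogy in Python. A process fork() is obviously not what the hypervisor does, but it shows the same “pick up where the parent left off” behavior, with memory shared copy-on-write; this is purely an analogy, not vSphere code.

```python
import os

# OS-level analogy (not vSphere code): a process fork also starts the child
# exactly where the parent left off, sharing memory copy-on-write, which is
# roughly the trick VMFork applies to an entire running VM.

state = bytearray(b"parent state")   # "memory" the parent has already built up

pid = os.fork()                      # requires a Unix-like OS
if pid == 0:                         # child: no "boot", it simply continues
    state[:6] = b"forked"            # a write triggers a private page copy
    print("child sees :", state.decode())
    os._exit(0)

os.waitpid(pid, 0)                   # the child's write never touched us
print("parent sees:", state.decode())
```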

The question then arises: what if the child wants to write data to memory or disk? This is where the “copy-on-write” technique comes into play. The child won’t be allowed to overwrite shared memory pages (or disk, for that matter), so a new page will be allocated instead. For those having a hard time visualizing it, the diagram below should help; note that it is conceptual and not how it is actually implemented, as drawing the different layers would have made it too complex. In this scenario you see a single parent with a single child, but you can imagine there could also be 10 child VMs or more; you can see how efficient that would be in terms of resource sharing! And even for the pages that are unique compared to the parent, if you clone many similar VMs there is a significant chance that TPS will be able to collapse those as well. One thing to point out is that the parent VM is quiesced; in other words, its sole purpose is to allow for the quick creation of child VMs.

[Figure: conceptual diagram of a quiesced parent VM and a forked child VM sharing memory and disk, with copy-on-write deltas for the child]
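To tie the diagram to something concrete, here is a toy model in Python of the parent/child relationship just described. The class names and structure are purely illustrative assumptions and bear no relation to the actual hypervisor implementation.

```python
import uuid

# Toy model mirroring the diagram: invented for illustration only,
# not how the hypervisor actually implements VMFork.

class ParentVM:
    """A running VM that gets quiesced so it can serve as a fork template."""

    def __init__(self, memory, disk):
        self.memory, self.disk = memory, disk    # become the shared base
        self.quiesced = False

    def quiesce(self):
        self.quiesced = True                     # sole purpose: spawn children


class ChildVM:
    """Forked child: shares the parent's pages, owns only its deltas."""

    def __init__(self, parent):
        assert parent.quiesced, "fork requires a quiesced parent"
        self.base = parent                       # shared by reference, no copy
        self.memory_delta = {}                   # the only things created...
        self.disk_delta = {}                     # ...hence forking takes seconds
        self.mac = "00:50:56:%s:%s:%s" % tuple(  # networking magic: fresh MAC
            uuid.uuid4().hex[i:i + 2] for i in (0, 2, 4))

    def read(self, page):
        # Reads fall through to the shared parent pages unless overwritten.
        return self.memory_delta.get(page, self.base.memory[page])

    def write(self, page, value):
        # Copy-on-write: allocate a private page, never touch the shared one.
        self.memory_delta[page] = value


parent = ParentVM(memory={0: "booted kernel", 1: "warm app"}, disk={})
parent.quiesce()
children = [ChildVM(parent) for _ in range(10)]  # ten forks, one shared base
children[0].write(1, "child-local state")
print(children[0].read(1), "|", children[1].read(1), "|", parent.memory[1])
```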

A cool piece of technology, I agree, but what would the use cases be? Well, there are multiple, and those attending VMworld should definitely visit the sessions which will discuss this topic, or watch them online (SDDC3227, SDDC3281, EUC2551, etc.). I think there are two major use cases: virtual desktops and test/dev.

The virtual desktop (just-in-time desktops) use case is pretty obvious… You create the parent VM, spin it up, it gets quiesced, and you can start forking that parent when needed. This will be almost instant, very efficient, and it also reduces the required resource capacity for VDI environments.

With test/dev scenarios, you can imagine that when testing software you don’t want to wait for lengthy cloning processes to finish. Forking a VM allows you to rapidly test what has been developed: within seconds you have a duplicate environment which you can use and abuse any way you like, and destroy when done. As the disk footprint is small, create/destroy will have minimal impact on your existing infrastructure, both from a resource and a “stress” point of view. It basically means that your testing will take less time end-to-end.
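As a sketch of what such a test loop could look like, assuming a hypothetical fork_vm/destroy_vm API (placeholder names invented here, not a real vSphere interface):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical sketch of the test/dev cycle; fork_vm and destroy_vm are
# invented placeholders for whatever instant-clone API eventually ships,
# not a real vSphere interface.

@dataclass
class VM:
    name: str
    delta: Dict[str, str] = field(default_factory=dict)  # small COW overlay

def fork_vm(parent: VM) -> VM:
    return VM(name=f"{parent.name}-fork")    # seconds, tiny disk footprint

def destroy_vm(vm: VM) -> None:
    vm.delta.clear()                         # only the small delta goes away

def run_tests(parent: VM,
              tests: Dict[str, Callable[[VM], bool]]) -> Dict[str, bool]:
    results = {}
    for name, test in tests.items():
        clone = fork_vm(parent)              # duplicate environment on demand
        try:
            results[name] = test(clone)      # use and abuse it freely
        finally:
            destroy_vm(clone)                # then simply throw it away
    return results

golden = VM("golden-image")
print(run_tests(golden, {"smoke": lambda vm: vm.name.endswith("-fork")}))
```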

I can’t wait for it to become available and to start testing it; especially when combined with products like CloudVolumes and Virtual SAN, this feature has a lot of potential.

** UPDATE: Various people asked what would happen with VMFork now that TPS is disabled by default in upcoming versions of vSphere. I spoke with the lead engineer on this topic and he assured me there is no impact on VMFork: the disabling of TPS is overruled per VMFork group, so the parent and the children belonging to the same group will still be able to leverage TPS and share pages. **

vFabric Application Director

Duncan Epping · May 29, 2012 ·

I am not going to pretend I am the devops expert here, but I was playing around with vFabric Application Director last week and I thought it was a really cool solution. I can really see the value for development teams, but also for large support services teams and even education organizations. Being able to create deployment/configuration plans for application stacks by simply dragging and dropping is something I would have loved to have when I supported various development and support teams in a previous life.

I could have saved a lot of time during physical and virtual machine provisioning, installation of software (automated, if I was lucky), configuration, and figuring out how it all worked with this weird database flavor or that new operating system. With App Director you still need to figure all of it out once, but after that you can easily repeat the same steps over and over again. You can select different guest OSes, different databases, different apps, and so on.

If you are in the same boat as I once was, I would suggest watching this video and giving App Director a test run, just to figure out whether it can simplify your life!

