VMworld 2015: Site Recovery Manager 6.1 announced

This week Site Recovery Manager 6.1 was announced. There are many enhancements in SRM 6.1, such as the integration with NSX and policy-driven protection, but personally I feel that support for stretched storage is huge. When I say stretched storage I am referring to solutions like EMC VPLEX, Hitachi Virtual Storage Platform and IBM SAN Volume Controller, among others. In the past, and you can still do this today, when you had one of these solutions deployed you would have a single vCenter Server with a single cluster and move VMs around manually when needed, or let HA take care of restarts in failure scenarios.

As of SRM 6.1, running these types of stretched configurations with SRM is now also supported. So how does that work, what does it allow you to do, and what does it look like? In contrast to a vSphere Metro Storage Cluster solution, with SRM 6.1 you will be using two vCenter Server instances. Each of these vCenter Server instances will have an SRM server attached to it, which will use a Storage Replication Adapter to communicate with the array.

But why would you want this? Why not just stretch the compute cluster as well? Many have deployed these stretched configurations for disaster avoidance purposes. The problem, however, is that there is no form of orchestration whatsoever. This means that all workloads will typically come up in a random order. In some cases the application knows how to recover from situations like that; in most cases it does not, leaving you with a lot of work: after a failure you will need to restart services, or VMs, in the right order. This is where SRM comes in. Orchestration is the strength of SRM.
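To make the ordering problem concrete, here is a minimal sketch of how a restart order can be derived from dependencies. This is not SRM's API or its recovery plan format; the VM names and the `depends_on` mapping are hypothetical, and the grouping into "waves" is simply a topological sort from Python's standard library.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical VMs, for illustration only. Each key maps to the set of
# VMs that must be running before that VM is started.
depends_on = {
    "db01": set(),
    "app01": {"db01"},
    "app02": {"db01"},
    "web01": {"app01", "app02"},
}


def restart_waves(dependencies):
    """Group VMs into waves; a VM only depends on VMs in earlier waves."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything startable right now
        waves.append(ready)
        sorter.done(*ready)  # mark as started so dependants become ready
    return waves


waves = restart_waves(depends_on)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Without something doing this for you, the same ordering has to be worked out and executed by hand after every failure.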

Besides orchestrating a full failover, what SRM can also do in the 6.1 release is evacuate a datacenter using vMotion in an orchestrated, automated way. If there is a disaster about to happen, you can now use the SRM interface to move virtual machines from one datacenter to another with just a couple of clicks. This is called a planned migration, as can be seen in the screenshot above.

Personally I think this is a great step forward for stretched storage and SRM. I am very excited about this release!

Rubrik 2.0 release announced today

Today the Rubrik 2.0 release was announced. I’ve written about who they are and what they do twice now, so I am not going to repeat that. If you haven’t read those articles, please read them first (article 1 and article 2). Chris Wahl took the time to brief me, and the first thing that stood out to me was the newly coined term: Converged Data Management. Considering what Rubrik does and has planned for the future, I think that term is spot on.

Version 2.0 introduces a bunch of features. I will list them and then discuss some of them in a bit more detail:

  • New Rubrik appliance model r348
    • Same 2U/4Node platform, but leveraging 8TB disks instead of 4TB disks
  • Replication
  • Auto Protect
  • WAN Efficient (global deduplication)
  • AD Authentication – No need to explain
  • OpenStack Swift support
  • Application aware backups
  • Detailed reporting
  • Capacity planning

Let’s start at the top: a new model is introduced next to the two existing models. The two other models are also 2U/4Node solutions, but they use 4TB drives instead of the 8TB drives the r348 will be using. This will boost capacity for a single Brik up to roughly 300TB. For 2U that is not bad at all, I would say.

Of course the hardware isn’t the most exciting part; fortunately the software changes are. In the 2.0 release Rubrik introduces replication between sites / appliances, and global dedupe, which ensures that replication is as efficient as it can be. The great thing here is that you back up data and replicate it to other sites straight after it has been deduplicated. All of this is again policy driven, by the way, so you can define when you want to replicate, how often, and for how long data needs to be saved on the destination.
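Why does deduplicating before replicating matter for the WAN? The sketch below shows the general idea only, assuming fixed-size chunks and SHA-256 fingerprints. It says nothing about how Rubrik actually chunks, fingerprints or ships data; the `replicate` function and the `remote` dictionary are hypothetical stand-ins.

```python
import hashlib


def chunk(data: bytes, size: int = 4):
    """Split data into fixed-size chunks (tiny size, for illustration)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def replicate(data: bytes, remote_store: dict, size: int = 4):
    """Send only chunks the destination does not hold yet.

    Returns the list of fingerprints needed to rebuild the data on the
    destination, and the number of bytes that crossed the wire.
    """
    recipe = []
    bytes_sent = 0
    for piece in chunk(data, size):
        digest = hashlib.sha256(piece).hexdigest()
        if digest not in remote_store:  # unknown chunk: must be sent
            remote_store[digest] = piece
            bytes_sent += len(piece)
        recipe.append(digest)
    return recipe, bytes_sent


remote = {}  # stands in for the chunk store at the destination site
recipe1, sent1 = replicate(b"AAAABBBBAAAACCCC", remote)
recipe2, sent2 = replicate(b"AAAABBBBDDDD", remote)
print(sent1, sent2)  # the second backup sends far less than its full size

# The destination can rebuild each backup from its recipe.
restored = b"".join(remote[d] for d in recipe1)
```

The second backup shares most of its chunks with the first, so only the new chunk travels to the other site.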

Auto Protect is one of those features which you will quickly take for granted, but it is very valuable. It allows you to set a default SLA at the vCenter level, or on a Cluster, Resource Pool or Folder, you get the drift. Set and forget is basically what this means: no more risk of newly provisioned VMs that have not been added to the backup schedule. Something really simple, but very useful.
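A default SLA set on a container implies some form of inheritance. As a rough sketch, assuming the nearest ancestor with an SLA wins (my assumption, not something Rubrik confirmed), the lookup could look like this. The inventory names and SLA names are made up.

```python
# Hypothetical inventory: each object points to its parent container.
parent_of = {
    "vm-new": "folder-prod",
    "vm-test": "cluster-a",
    "folder-prod": "cluster-a",
    "cluster-a": "vcenter-1",
    "vcenter-1": None,
}

# SLAs set explicitly on containers (names are made up).
sla_assigned = {
    "vcenter-1": "Bronze",
    "folder-prod": "Gold",
}


def effective_sla(obj, parents, assigned):
    """Walk up the inventory tree and return the first SLA found.

    Assumes the nearest ancestor wins; returns None if nothing is set.
    """
    node = obj
    while node is not None:
        if node in assigned:
            return assigned[node]
        node = parents.get(node)
    return None


print(effective_sla("vm-new", parent_of, sla_assigned))
print(effective_sla("vm-test", parent_of, sla_assigned))
```

A VM provisioned into folder-prod picks up Gold without anyone touching it, while anything else under the vCenter Server falls back to Bronze.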

When it comes to application awareness, Rubrik in version 2.0 will also leverage a VSS provider to allow for transactionally consistent backups. This applies today to Microsoft Exchange, SQL Server, SharePoint and Active Directory. More can be expected in the near future. Note that this applies to backups. For restores there is no option (yet) to restore a specific mailbox, for instance, but Chris assured me that this is on their radar.

When it comes to usability a lot of improvements have been made, starting with things like reporting and capacity planning. One of the reports which I found very useful is the SLA compliance report. It simply shows you whether VMs are meeting the defined SLA or not. Capacity planning is also very helpful, as it informs you what the growth rate is locally and in the cloud, and also when you will be running out of space. A nice trigger to buy an additional appliance, or to change your retention period or archival policy. On top of that, things like object deletion, task cancellation, progress bars and many more usability improvements have made it into the 2.0 release.
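The "when will I run out of space" question boils down to simple arithmetic. Here is a minimal sketch, assuming linear growth and one usage sample per day. The numbers are invented, and this is not how Rubrik computes its forecast.

```python
import math


def daily_growth(samples_tb):
    """Average growth per day from a list of daily used-capacity samples."""
    if len(samples_tb) < 2:
        raise ValueError("need at least two samples")
    return (samples_tb[-1] - samples_tb[0]) / (len(samples_tb) - 1)


def days_until_full(used_tb, capacity_tb, growth_tb_per_day):
    """Days left at the current growth rate; None if usage is not growing."""
    if growth_tb_per_day <= 0:
        return None
    remaining = capacity_tb - used_tb
    if remaining <= 0:
        return 0
    return math.ceil(remaining / growth_tb_per_day)


# Invented example: five daily samples on a 100TB system.
samples = [40.0, 40.5, 41.0, 41.5, 42.0]
growth = daily_growth(samples)
days_left = days_until_full(samples[-1], 100.0, growth)
print(f"growing {growth} TB/day, full in about {days_left} days")
```

Changing the retention period or archival policy shows up in this picture as a lower growth rate, which pushes the "full" date out.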

All in all an impressive release, especially considering that 1.0 was released less than 6 months ago. It is great to see a high release cadence in an industry which has been moving extremely slowly for the past decades. Thanks Rubrik for stirring things up!

Automating vCloud Director Resiliency whitepaper released

About a year ago I wrote a whitepaper about vCloud Director resiliency, or better said, I developed a disaster recovery solution for vCloud Director. This solution allows you to fail over vCloud Director workloads between sites in the case of a failure. Immediately after it was published, various projects started to implement this solution. As part of our internal project, our PowerCLI gurus Aidan Dalgleish and Alan Renouf started looking into automating the solution. Those who read the initial case study have probably seen the manual steps required for a fail-over; those who haven’t should read this white paper first.

The manual steps in the vCloud Director Resiliency whitepaper are exactly what Alan and Aidan addressed. So if you are interested in implementing this solution, it is useful to read this new white paper about Automating vCloud Director Resiliency as well. Nice work Alan and Aidan!

VMware View Infrastructure Resiliency whitepaper published

One of the white papers I worked on in 2012, when I was part of Technical Marketing, was just published. This white paper is about VMware View infrastructure resiliency. It is a common question from customers, and with this white paper you can now explore the different options and understand their impact. Below is a link to the paper and the description it has on the VMware website.

VMware View Infrastructure Resiliency: VMware View 5 and VMware vCenter Site Recovery Manager
“This case study provides insight and information on how to increase availability and recoverability of a VMware View infrastructure using VMware vCenter Site Recovery Manager (SRM), common disaster recovery (DR) tools and methodologies, and vSphere High Availability.”

I want to thank Simon Richardson, Kris Boyd, Matt Coppinger and John Dodge for working with me on this paper. Glad it is finally available!

Demo time – vCloud Director 5.1 disaster recovery demo

When I was playing with the new vCloud Director 5.1 and Site Recovery Manager 5.1, I figured I would record a demo of the DR solution that Chris Colotti and I developed. The demo is fairly straightforward and hopefully helps you in the process of building a resilient cloud infrastructure. In this demo I have included:

  • vSphere 5.1
    • vSphere Replication
  • vCloud Director 5.1
  • Site Recovery Manager 5.1