DRS rules still active when DRS disabled?

I just received a question about DRS rules and why they are still active when DRS is disabled. I was under the impression this was something I had already blogged about, but I cannot find it. I know some others did, but they reported this behaviour as a bug… which it actually isn't.

Below is a screenshot of the VM/Host Rules screen for vSphere 6.0, which allows you to create rules for clusters… Note that I said “clusters”, not DRS specifically. In 6.0 the wording in the UI has changed to align with the functionality vSphere offers. These are not DRS rules, but rather cluster rules: whether you use HA or DRS, these rules can be used when either of the two is configured.

Note that not all types of rules will automatically be respected by vSphere HA. One thing you can now also do in the UI is specify whether HA should respect or ignore rules, which is very useful if you ask me and makes life a bit easier.

Cloud native inhabitants

Whenever I hear the term “cloud native” I think about my kids. It may sound a bit strange, as many of you will probably think of “apps” first when “cloud native” is dropped. Cloud native to me is not about an application, but about a problem which has been solved and a solution which is offered in a specific way. A week or so ago someone made a comment on Twitter about how “Generation X” will adopt cloud faster than the current generation of IT admins…

Some even say that “Generation X” is more tech-savvy; just look at how a 3-year-old handles an iPad, they are growing up with technology. To be blunt… that has nothing to do with the technical skills of the 3-year-old, but is more about the intuitive user interface that took years to develop. It comes naturally to them, as that is what they are exposed to from day 1. They see their mom or dad swiping a screen daily, and mimicking them doesn't require a deep technical understanding of how an iPad works; they move their finger from right to left… but I digress.

My kids don't know what a videotape is, and even a CD to play music is so 2008, which for them is a lifetime. My kids are cloud native inhabitants. They use Netflix to watch TV, Spotify to listen to music, Facebook to communicate with friends, and YouTube, Gmail and many other services running somewhere in the cloud. They are native inhabitants of the cloud. They won't adopt cloud technology faster; for them it is a natural choice, as it is what they are exposed to day in, day out.

Startup intro: Rubrik. Backup and recovery redefined

Some of you may have seen the article by The Register last week about this new startup called Rubrik. Rubrik just announced what they are working on and their funding at the same time:

Rubrik, Inc. today announced that it has received $10 million in Series A funding and launched its Early Access Program for the Rubrik Converged Data Management platform. Rubrik offers live data access for recovery and application development by fusing enterprise data management with web-scale IT, and eliminating backup software. This marks the end of a decade-long innovation drought in backup and recovery, the backbone of IT. Within minutes, businesses can manage the explosion of data across private and public clouds.

The Register made a comment which I want to briefly touch on. They mentioned it was odd that a venture capitalist is now the CEO of a startup, and how it normally is the person with the technical vision who heads up the company. I can't agree more with The Register. For those who don't know Rubrik and their CEO, the choice of Bipul Sinha may come as a surprise and may seem a bit odd. Then there are some who may say that it is a logical choice considering they are funded by Lightspeed… Truth of the matter is that Bipul Sinha is the person with the technical vision. I had the pleasure of seeing his vision evolve from a couple of scribbles on the whiteboard to what Rubrik is right now.

I still recall having a conversation with Bipul about the state of the “backup industry”, and we agreed that the different components of a datacenter had evolved over time but that the backup industry was still very much stuck in the old world. (We agreed backup and recovery solutions suck in most cases…) Back when we had this discussion there was nothing yet, no team, no name, just a vision. Knowing what is coming in the near future and knowing their vision, I think this quote from the press release best captures what Rubrik is working on and will do:

Today we are excited to announce the first act in our product journey. We have built a powerful time machine that delivers live data and seamless scale in a hybrid cloud environment. Businesses can now break the shackles of legacy and modernize their data infrastructure, unleashing significant cost savings and management efficiencies.

Of course Rubrik would not be possible without a very strong team of founding members. Arvind Jain, Arvind Nithrakashyap and Soham Mazumdar are probably the strongest co-founders one can wish for. The engineering team has deep experience in building distributed systems, such as Google File System, Google Search, YouTube, Facebook Data Infrastructure, Amazon Infrastructure, and Data Domain File System. Expectations just went up a couple of notches, right?!

I agree that even the statement above is still a bit fluffy, so let's add some more details: what are they working on? Rubrik is working on a solution which combines backup software and a backup storage appliance into a single solution, and it will initially target VMware environments. They are building (and I hate using this word) a hyperconverged backup solution, and it will scale from 3 to 1000s of nodes. Note that this solution will be up and running in 15 minutes and includes the option to age out data to the public cloud. What impressed me most is that Rubrik can discover your datacenter without any agents, it scales out in a fully automated fashion, and it will be capable of deduplicating / compressing data but also offer the ability to mount data instantly. All of this through a slick UI, or you can leverage the REST APIs, fully programmable end-to-end.

I just went over “instant mount” quickly, but I want to point out that this is not just for “restoring VMs”. Considering the REST APIs, you can also imagine that this would be a perfect solution to enable test/dev environments or run Tier 2/3 workloads; a rough sketch of what that could look like follows below. How valuable is it to have instant copies of your production data available to test your new code against, without any interruption to your current environment? To throw a buzzword in there: perfectly fit for a devops world and continuous development.
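To make that “programmable end-to-end” idea a bit more concrete, below is a rough sketch of what driving an instant mount from a script could look like. Rubrik has not published its API yet, so the host name, endpoints and JSON fields in this example are purely hypothetical and only meant to illustrate the kind of workflow you could automate:

  # Purely hypothetical sketch: endpoint paths and fields are made up for illustration
  # and do not reflect Rubrik's actual (not yet published) API.
  RUBRIK="https://rubrik.example.local"

  # Authenticate and grab a session token (hypothetical endpoint)
  TOKEN=$(curl -sk -u admin:secret "$RUBRIK/api/session" | sed 's/.*"token":"\([^"]*\)".*/\1/')

  # Request an instant mount of the latest snapshot of a VM as a test/dev copy
  curl -sk -X POST "$RUBRIK/api/vm/web-01/snapshot/latest/mount" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"targetHost": "esxi-02.lab.local", "powerOn": false}'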

That is about all I can say for now, unfortunately… For those who agree that backup/recovery has not evolved and are interested in a backup solution for tomorrow, there is an early access program and I urge you to sign up to learn more but also help shape the product! The solution is targeting environments of 200 VMs and upwards, so make sure you meet those requirements. Read more here and/or follow them on Twitter (or Bipul).

Good luck Rubrik, I am sure this is going to be a great journey!

What does support for vMotion with active/active (a)sync mean?

Having seen so many cool features released by VMware over the last 10 years, you sometimes wonder what more they can do. It is amazing to see the level of integration we've seen between the different datacenter components. Many of you have seen the announcements around Long Distance vMotion support by now.

When I saw this slide, one part instantly stood out to me:

  • Replication Support
    • Active/Active only
      • Synchronous
      • Asynchronous

What does this mean? Well, first of all, “active/active” refers to “stretched storage”, aka vSphere Metro Storage Cluster. So when it comes to long distance vMotion, some changes have been introduced for synchronously replicated stretched storage. With stretched storage, writes can come in on both sides of a volume at any time and will be replicated synchronously. Some optimizations have been made to the vMotion process to avoid writes during the switchover, so that replication traffic does not delay the process.

For active/active asynchronous the story is a bit different. Here again we are talking about “stretched storage”, but in this case the asynchronous flavour. One important aspect which was not mentioned in the deck is that async requires Virtual Volumes. At the time of writing there is no vendor yet with a VVol-capable solution that offers active/active async. But more importantly, is this process any different from the sync process? Yes it is!

During the migration of a virtual machine which uses Virtual Volumes, backed by an “active/active async” configuration, the array is informed that a migration of the virtual machine is taking place and is requested to switch from asynchronous to synchronous replication. This is to ensure that the destination is in sync with the source when the VM is switched over from side A to side B. Besides switching from async to sync, the array is also informed when the migration has completed. This allows the array to, for instance, switch the “bias” of the VM, which is especially important in a stretched environment to ensure availability.

I can’t wait for the first vendor to announce support for this awesome feature!

Another way to fix your non-compliant host profile

I found out there is another way to fix your non-compliant host profile problems with vSphere 6.0 when you have SAS drives which are detected as shared storage while they are not. This method is a bit more complicated though, and there is a command-line script that you will need to use: /bin/sharedStorageHostProfile.sh. It works as follows:

  • Run the following to dump the local storage details of your first host into a folder on that host
    /bin/sharedStorageHostProfile.sh local /folder/youcreated1/
  • Run the following to dump the storage details of your second host into a folder; you can run this from your first host if you have SSH enabled
    /bin/sharedStorageHostProfile.sh remote /folder/youcreated2/ <name or ip of remote host>
  • Copy the outcome of the second host to the folder where the outcome of your first host is stored. You will need to copy the file “remote-shared-profile.txt”.
  • Now you can compare the outcomes by running:
    /bin/sharedStorageHostProfile.sh compare /folder/youcreated1/
  • After comparing you can run the configuration as follows:
    /bin/sharedStorageHostProfile.sh configure /folder/youcreated1/
  • Now the disks which are listed as cluster-wide resources but are not shared between the hosts will be configured as non-shared resources. If you want to check what will be changed before running the command, you can simply “more” the file the info is stored in (and after the configure step you can verify the result per device, see the example after this list):
    more esxcli-sharing-reconfiguration-commands.txt
    esxcli storage core device setconfig -d naa.600508b1001c2ee9a6446e708105054b --shared-clusterwide=false
    esxcli storage core device setconfig -d naa.600508b1001c3ea7838c0436dbe6d7a2 --shared-clusterwide=false
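To double-check the result for a specific device after running the configure step, you can look at the device details with esxcli. This is a small sketch, using the example naa identifier from the output above; on a 6.0 host the device list should include a field along the lines of “Is Shared Clusterwide”:

  # Show the sharing flag for one of the devices touched by the configure step.
  # The naa identifier is the example device from the output above; the exact field
  # name may vary per build, but should read along the lines of
  # "Is Shared Clusterwide: false" once the device is marked as non-shared.
  esxcli storage core device list -d naa.600508b1001c2ee9a6446e708105054b | grep -i shared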

You may wonder by now if there isn't an easier way… well, yes, there is. You can do all of the above by running the following simple command. I preferred to go over the steps so at least you know what is happening.

/bin/sharedStorageHostProfile.sh automatic <name-or-ip-of-remote-host>

After you have done this (first method or second method) you can create your host profile from your first host. Although the other methods I described in yesterday's post are a bit simpler, I figured I would share this as well, as you never know when it may come in handy!