An introduction to Storage DRS

Today someone asked for a Storage DRS intro. I wrote one for our book a year ago and figured I would share it with the world. I still feel that Storage DRS is one of the coolest features in vSphere 5.0, and I think everyone should be using it! I know there are some caveats (1, 2) when you are using specific array functionality or, for instance, SRM, but nevertheless… this is one of those features that will make an admin’s life that much easier! If you are not using it today, I highly suggest evaluating this cool feature.

*** outtake from the vSphere 5.0 Clustering Deepdive ***

vSphere 5.0 introduces many great new features, but everyone will probably agree with us that vSphere Storage DRS is the most exciting new feature. vSphere Storage DRS helps resolve some of the operational challenges associated with virtual machine provisioning, migration and cloning. Historically, monitoring datastore capacity and I/O load has proven to be very difficult. As a result, it is often neglected, leading to hot spots and over- or underutilized datastores. Storage I/O Control (SIOC) in vSphere 4.1 solved part of this problem by introducing a datastore-wide disk scheduler that allocates I/O resources to virtual machines based on their respective shares during times of contention.

Storage DRS (SDRS) brings this to a whole new level by providing smart virtual machine placement and load balancing mechanisms based on space and I/O capacity. In other words, where SIOC reactively throttles hosts and virtual machines to ensure fairness, SDRS proactively makes recommendations to prevent imbalances from both a space utilization and latency perspective. More simply, SDRS does for storage what DRS does for compute resources.

There are five key features that SDRS offers:

  • Resource aggregation
  • Initial Placement
  • Load Balancing
  • Datastore Maintenance Mode
  • Affinity Rules

Resource aggregation enables grouping of multiple datastores into a single, flexible pool of storage called a Datastore Cluster. Administrators can dynamically populate Datastore Clusters with datastores. The flexibility of separating the physical from the logical greatly simplifies storage management by allowing datastores to be efficiently and dynamically added to or removed from a Datastore Cluster to deal with maintenance or out-of-space conditions. The load balancer will take care of initial placement as well as future migrations based on actual workload measurements and space utilization.

The goal of Initial Placement is to speed up the provisioning process by automating the selection of an individual datastore and leaving the user with the much smaller-scale decision of selecting a Datastore Cluster. SDRS selects a particular datastore within a Datastore Cluster based on space utilization and I/O capacity. In an environment with multiple seemingly identical datastores, initial placement can be a difficult and time-consuming task for the administrator. Not only will the datastore with the most available disk space need to be identified, but it is also crucial to ensure that the addition of this new virtual machine does not result in I/O bottlenecks. SDRS takes care of all of this and substantially lowers the amount of operational effort required to provision virtual machines; that is the true value of SDRS.
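To make the idea of initial placement a bit more concrete, here is a minimal sketch in Python. It is not the actual SDRS algorithm: the `Datastore` class, the `pick_datastore` function and the rule "skip anything that would cross the thresholds, then take the candidate with the most free space" are assumptions made purely for illustration, and the threshold values are simply borrowed from the load balancing defaults.

```python
from dataclasses import dataclass


@dataclass
class Datastore:
    # Hypothetical, simplified view of a datastore in a Datastore Cluster.
    name: str
    capacity_gb: float
    used_gb: float
    latency_ms: float  # observed I/O latency


def pick_datastore(cluster, vm_size_gb,
                   space_threshold=0.80, latency_threshold_ms=15.0):
    """Return the datastore that stays under both thresholds after the VM is
    added and is left with the most free space, or None if none qualifies.

    Illustrative assumption only, not how SDRS really ranks datastores.
    """
    candidates = []
    for ds in cluster:
        used_after = (ds.used_gb + vm_size_gb) / ds.capacity_gb
        if used_after <= space_threshold and ds.latency_ms < latency_threshold_ms:
            free_after = ds.capacity_gb - ds.used_gb - vm_size_gb
            candidates.append((free_after, ds))
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]


cluster = [
    Datastore("ds01", 1000, 750, 5.0),   # would end up too full
    Datastore("ds02", 1000, 400, 22.0),  # latency too high
    Datastore("ds03", 1000, 500, 8.0),
    Datastore("ds04", 1000, 600, 6.0),
]
chosen = pick_datastore(cluster, vm_size_gb=100)
print(chosen.name)
```

The point of the sketch is the one made above: the administrator only picks the cluster, and both space and I/O are weighed before an individual datastore is selected.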

However, it is probably safe to assume that many of you are most excited about the load balancing capabilities SDRS offers. SDRS can operate in two distinct modes: No Automation (manual mode) or Fully Automated. Where initial placement reduces complexity in the provisioning process, load balancing addresses imbalances within a datastore cluster. Prior to vSphere 5.0, placement of virtual machines was often based on current space consumption or the number of virtual machines on each datastore. I/O capacity monitoring and space utilization trending were often regarded as too time consuming. Over the years, we have seen this lead to performance problems in many environments, and in some cases even result in downtime because a datastore ran out of space. SDRS load balancing helps prevent these unfortunately common scenarios by making placement recommendations based on both space utilization and I/O capacity when the configured thresholds are exceeded. Depending on the selected automation level, these recommendations will be automatically applied by SDRS or will need to be applied by the administrator.

Although we see load balancing as a single feature of SDRS, it actually consists of two separately configurable options. When either of the configured thresholds, Utilized Space (80% by default) or I/O Latency (15 milliseconds by default), is exceeded, SDRS will make recommendations to prevent problems and resolve the imbalance in the datastore cluster. I/O load balancing can even be explicitly disabled.
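The two thresholds can be pictured as a simple check per datastore. The sketch below is a simplification under stated assumptions: the function name `balancing_triggers` and its parameters are made up for this example, and only the default values (80% and 15 milliseconds) and the option to disable I/O load balancing come from the text above.

```python
def balancing_triggers(used_gb, capacity_gb, latency_ms,
                       space_threshold_pct=80.0,
                       latency_threshold_ms=15.0,
                       io_balancing_enabled=True):
    """Return which thresholds a datastore exceeds: 'space', 'latency', both, or none.

    Hypothetical helper; it only models the threshold check, not the
    recommendations SDRS would generate afterwards.
    """
    triggers = []
    if used_gb / capacity_gb * 100.0 > space_threshold_pct:
        triggers.append("space")
    # The I/O metric is only evaluated when I/O load balancing is enabled.
    if io_balancing_enabled and latency_ms > latency_threshold_ms:
        triggers.append("latency")
    return triggers


print(balancing_triggers(850, 1000, 10))   # space threshold exceeded
print(balancing_triggers(500, 1000, 20))   # latency threshold exceeded
print(balancing_triggers(500, 1000, 20, io_balancing_enabled=False))
```

An empty list means neither threshold is exceeded, so no recommendation would be needed for that datastore.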

Before anyone forgets, SDRS can be enabled on fully populated datastores and environments. It is also possible to add fully populated datastores to existing datastore clusters. It is a great way to solve actual or potential bottlenecks in any environment with minimal required effort or risk.

Datastore Maintenance Mode is one of those features that you will typically not use often, but you will appreciate it when you need it. Datastore Maintenance Mode can be compared to Host Maintenance Mode: when a datastore is placed in Maintenance Mode, all virtual machines registered on that datastore are migrated to the other datastores in the datastore cluster. Typical use cases are data migration to a new storage array or maintenance on a LUN, such as migration to another RAID group.
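As a rough illustration of what evacuating a datastore involves, the sketch below plans where each virtual machine could go. Everything in it is an assumption for the sake of the example: the dictionary layout, the `plan_evacuation` name, and the greedy "largest VM first, to the datastore with the most free space" rule are not taken from SDRS itself.

```python
def plan_evacuation(datastores, source):
    """Plan moves for every VM on `source` to the other datastores in the cluster.

    datastores: dict of name -> {"free_gb": float, "vms": {vm_name: size_gb}}
    Returns a list of (vm_name, target_datastore) tuples.
    Greedy, illustrative strategy only.
    """
    targets = {name: ds["free_gb"] for name, ds in datastores.items()
               if name != source}
    if not targets:
        raise ValueError("no other datastore in the cluster")

    moves = []
    # Place the largest virtual machines first.
    vms = sorted(datastores[source]["vms"].items(),
                 key=lambda kv: kv[1], reverse=True)
    for vm, size_gb in vms:
        best = max(targets, key=targets.get)
        if targets[best] < size_gb:
            raise RuntimeError(f"no datastore has room for {vm}")
        targets[best] -= size_gb
        moves.append((vm, best))
    return moves


cluster = {
    "ds01": {"free_gb": 50, "vms": {"web01": 40, "db01": 120}},
    "ds02": {"free_gb": 200, "vms": {}},
    "ds03": {"free_gb": 150, "vms": {}},
}
moves = plan_evacuation(cluster, "ds01")
print(moves)
```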

Affinity Rules enable control over which virtual disks should or should not be placed on the same datastore within a datastore cluster in accordance with your best practices and/or availability requirements. By default, a virtual machine’s virtual disks are kept together on the same datastore.
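One way to think about these rules is as constraints that a given disk placement either satisfies or violates. The following is a small sketch of such a check; the function name, the rule representation (sets of disk names) and the disk and datastore names are all hypothetical and not part of any vSphere API.

```python
def rule_violations(placement, keep_together, keep_apart):
    """Check a disk placement against simple affinity and anti-affinity rules.

    placement:     dict of disk name -> datastore name
    keep_together: list of sets of disks that must share one datastore
    keep_apart:    list of sets of disks that must all be on different datastores
    Returns a list of (rule_type, sorted_disk_names) for every violated rule.
    """
    problems = []
    for group in keep_together:
        if len({placement[disk] for disk in group}) > 1:
            problems.append(("affinity", sorted(group)))
    for group in keep_apart:
        used = [placement[disk] for disk in group]
        if len(set(used)) < len(used):
            problems.append(("anti-affinity", sorted(group)))
    return problems


placement = {
    "vm1-disk1": "ds01",
    "vm1-disk2": "ds01",
    "db1-data": "ds02",
    "db1-log": "ds02",
}
problems = rule_violations(
    placement,
    keep_together=[{"vm1-disk1", "vm1-disk2"}],  # models the default behaviour
    keep_apart=[{"db1-data", "db1-log"}],        # e.g. separate data and log disks
)
print(problems)
```

In this example the default "keep the disks together" rule for vm1 is satisfied, while the anti-affinity rule for the database disks is flagged because both ended up on ds02.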

For those who want more details, Frank Denneman wrote an excellent series about Datastore Clusters which might interest you:

Part 1: Architecture and design of datastore clusters.
Part 2: Partially connected datastore clusters.
Part 3: Impact of load balancing on datastore cluster configuration.
Part 4: Storage DRS and Multi-extents datastores.
Part 5: Connecting multiple DRS clusters to a single Storage DRS datastore cluster.
Part 6: Aggregating datastores from multiple storage arrays into one Storage DRS datastore cluster.

The following video will give an overview of the above mentioned features… worth checking.

Why is my pathing policy limited to “fixed” or “MRU” with things like MSCS cluster?

Yesterday I received an email from someone who wanted to know why he was limited to using either the “fixed” or “MRU” pathing policy for the LUNs attached to his MSCS cluster. In his environment they used round-robin for everything, and not being able to configure all LUNs with the same policy went against their internal policy. The thing is that if round-robin were used and the path switched (by default every 1000 I/Os), the SCSI-2 reservation would need to be re-acquired on this LUN (MSCS uses SCSI-2 reservations for its cluster devices). As you can imagine, that could cause a lot of stress on your array and could lead to all sorts of problems. So please do not ignore this recommendation! Some extra details can be found in the related KB articles.
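To put a rough number on the path switching described above, here is a back-of-the-envelope sketch. It assumes, as a simplification, that round-robin moves to the next path after every 1000 I/Os and that each switch forces exactly one reservation re-acquisition; the function name and the example I/O count are made up for illustration.

```python
def path_switches(total_ios, ios_per_path=1000):
    """Number of round-robin path switches while issuing `total_ios` I/Os.

    I/Os 1..1000 go down the first path, I/O 1001 triggers the first switch,
    and so on. Simplified model, not a measurement.
    """
    if total_ios <= 0:
        return 0
    return (total_ios - 1) // ios_per_path


# Hypothetical workload: 50,000 I/Os against a single clustered LUN.
print(path_switches(50_000))
```

Under these assumptions even a modest workload results in a steady stream of re-acquisitions, which is the stress on the array mentioned above.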

KB article about SvMotion / VDS / HA problem republished with script to mitigate!

Just a second ago the GSS/KB team republished the KB article that explains the vSphere 5.0 problem around SvMotion / vDS / HA. I wrote about this problem various times and would like to refer to those posts for more details. What I want to point out here, though, is that the KB article now has a script attached which will help prevent problems until a full fix is released. This script is basically the script that William Lam wrote, but it has been fully tested and vetted by VMware. For those running vSphere 5.0 and using SvMotion on VMs attached to distributed switches, I urge you to use the script. I expect that the PowerCLI version will also be attached soon.

http://kb.vmware.com/kb/2013639

vCenter Infrastructure Navigator throws the error: “an unknown discovery error has occurred”

I was deploying vCenter Infrastructure Navigator (VIN) in my lab today and the following error came up after I wanted to check dependencies for a virtual machine:

Access failed, an unknown discovery error has occurred

I restarted several services but nothing seemed to solve it. Internally I came across a thread which had the fix for this problem: DNS. Yes, I know, it is always DNS, right? Anyway, I used DHCP for my VIN appliance, and this DHCP server pointed to a DNS server which did not have the IP/name of my ESXi hosts listed. Because of this the discovery didn’t work, as VIN tries to resolve the names of the hosts as they were added to vCenter Server. I configured VIN with a fixed IP and pointed the VIN appliance to the right DNS server. Problem solved.
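If you want to rule out DNS quickly, a check along these lines can help. This is only a sketch using Python's standard `socket` module; the host names are hypothetical, and the stand-in resolver is there so the example runs without a network. In practice you would leave out the `resolve` argument and run the check from a machine that uses the same DNS server as the appliance.

```python
import socket


def unresolved_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the host names that the resolver cannot turn into an IP address."""
    missing = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing


# Stand-in resolver with a hypothetical record, so the example works offline.
known = {"esxi-01.testlab.local": "192.168.1.11"}


def fake_resolve(name):
    try:
        return known[name]
    except KeyError:
        raise socket.gaierror(f"cannot resolve {name}")


missing = unresolved_hosts(
    ["esxi-01.testlab.local", "esxi-02.testlab.local"],
    resolve=fake_resolve,
)
print(missing)
```

Use the host names exactly as they were added to vCenter Server, since those are the names that need to resolve.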

Navigating your application landscape…

I was on holiday for the last two weeks and am slowly catching up on everything that happened. Some of you might think it wasn’t a lot, but in the world of cloud and virtualization it was. Not only was there a huge EUC launch event, but a new version of vCenter Infrastructure Navigator was also released. Somehow it has been amazingly quiet around this product, which I don’t really understand, especially after reading the release notes of version 1.1 of vCenter Infrastructure Navigator. Two things stood out:

  • vCloud Director support
  • Infrastructure Navigator discovers VMware services, such as Site Recovery Manager (SRM) Server, VMware View Server, VMware vCloud Director Server, and VMware vShield Manager Server.

For those who don’t know, Infrastructure Navigator is an application awareness plugin for vCenter Server. It enables you to get a better understanding of what is running on top of your virtual infrastructure. A lot of you may say, well, why would I care? Think about DR for a second. What is the most challenging part of creating a DR plan? Indeed, figuring out all dependencies. That is exactly where vCenter Infrastructure Navigator comes into play, as shown in the screenshot below, which I stole from Ben Scheerer. Ben wrote an excellent blog post about some of the cool new features in vCenter Infrastructure Navigator. I am not going to repeat those, so just read his. It is worth it if you are serious about providing the best service to your (internal) customers!

Problems using the vCenter Web Client

I was doing some upgrades in my lab and ran into an issue. Whenever I started the vCenter Web Client I got a message that the vCenter Inventory Service wasn’t running. I looked at the Services section in Windows 2008 and found that it wasn’t started. Starting it gave me a new error: 1067. This is very generic, but I figured I would google it anyway. That actually brought me to our own documentation (yes, I should check that first next time), which mentioned I could reset the inventory service as follows:

  • Stop the service (was already stopped)
  • Delete the entire contents of the Inventory_Service_Directory/data directory
  • Change directory to Inventory_Service_Directory/scripts
  • Run the createDB.bat command, with no arguments, to reset the vCenter Inventory Service database
  • Run the register.bat command to update the stored configuration information of the Inventory Service
    register.bat vcenter-tm01.testlab.local 443
  • Restart the vCenter Inventory Service
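The steps above can be written down as a small dry-run plan, which is handy if you want to review the order before touching anything. The sketch below only builds the list of actions and does not execute them. `Inventory_Service_Directory` is kept as the placeholder from the documentation, the function name is made up, and stopping and starting the service are left as manual steps because the registered service name is not given here. `ntpath` is used so the paths are rendered Windows style no matter where the sketch runs.

```python
import ntpath


def inventory_reset_plan(service_dir, vcenter_fqdn, port=443):
    """Build the ordered list of actions for resetting the Inventory Service.

    Returns (action, detail) tuples; nothing is executed or deleted here.
    """
    data_dir = ntpath.join(service_dir, "data")
    scripts_dir = ntpath.join(service_dir, "scripts")
    return [
        ("manual", "stop the vCenter Inventory Service"),
        ("delete-contents", data_dir),
        ("run", [ntpath.join(scripts_dir, "createDB.bat")]),
        ("run", [ntpath.join(scripts_dir, "register.bat"), vcenter_fqdn, str(port)]),
        ("manual", "restart the vCenter Inventory Service"),
    ]


plan = inventory_reset_plan("Inventory_Service_Directory",
                            "vcenter-tm01.testlab.local")
for action, detail in plan:
    print(action, detail)
```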

I also had to re-register the Web Client to vCenter Server. This is what I had to do:

  • admin-cmd.bat register https://vcenter-tm01.testlab.local:9443/vsphere-client https://vcenter-tm01.testlab.local administrator password

Hope it helps,

 

Octopus Desktop Background

I’ve been asked by many to share this but never got around to it. I have had it as my wallpaper for months and still love it! Today, on the last day of my two-week holiday, I figured I would make some time and share it with the world… Just click the pic below to get the 1920×1200 version. Definitely the best VMware logo ever!
