Scott Drummonds just posted a new blog article about an upcoming VMware PSO offering. When Scott Drummonds is involved, you know the topic of the offering is performance. In this case it’s performance related to SQL databases and I/O bottlenecks, which are probably the most commonly reported issue. As Scott briefly explains, they were able to identify the issue rather quickly by monitoring both the physical servers and the virtual environment.
I guess this quote from Scott’s article captures the essence:
In the customer’s first implementation of the virtual infrastructure, both SQL Servers, X and Y, were placed on RAID group A. But in the native configuration SQL Server X was placed on RAID group B. This meant that the storage bandwidth of the physical configuration was approximately 1850 IOPS. In the virtual configuration the two databases shared a single 800 IOPS RAID volume. It does not take a rocket scientist to realize that users are going to complain when a critical SQL Server instance goes from 1050 IOPS to 400. And this was not news to the VI admin on-site, either. What we found as we investigated further was that virtual disks requested by the application owners were used in unexpected and undocumented ways and frequently demanded more throughput than originally estimated. In fact, through vscsiStats analysis (Using vscsiStats for Storage Performance Analysis), my contact and I were able to identify an “unused” VMDK with moderate sequential IO that we immediately recognized as log traffic. Inspection of the application’s configuration confirmed this.
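To make the arithmetic in the quote concrete, here is a minimal Python sketch. The IOPS figures come from the quote; the 1050 IOPS for RAID group B is implied by the 1850 total and the "1050 to 400" remark, and the even split of RAID group A between the two databases is my simplifying assumption, not something Scott measured.

```python
# IOPS figures taken from the quoted article.
RAID_GROUP_A_IOPS = 800
# Implied by the quote: 1850 total minus 800 for group A (assumption).
RAID_GROUP_B_IOPS = 1050

# Physical layout: each SQL Server has its own RAID group.
physical = {"X": RAID_GROUP_B_IOPS, "Y": RAID_GROUP_A_IOPS}

# Virtual layout: both databases share RAID group A.
# An even split is assumed here purely for illustration.
share = RAID_GROUP_A_IOPS / 2
virtual = {"X": share, "Y": share}

total_physical = sum(physical.values())
total_virtual = sum(virtual.values())

# Fraction of its original IOPS budget that SQL Server X loses.
drop_x = 1 - virtual["X"] / physical["X"]

print(f"Physical total: {total_physical} IOPS")
print(f"Virtual total:  {total_virtual:.0f} IOPS")
print(f"SQL Server X:   {physical['X']} -> {virtual['X']:.0f} IOPS "
      f"({drop_x:.0%} less)")
```

Under those assumptions SQL Server X loses well over half of its storage budget, before counting the undocumented log traffic on the "unused" VMDK.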