My former colleague and virtualization buddy Ernst Cozijnsen over at Capgemini Outsourcing has done some extensive testing on virtual machine deployments with VAAI-enabled storage. The storage he used for the deployment tests was an EMC Symmetrix VMAX. Here are his findings:
A while ago a colleague of mine pointed me to a blog post which included a link to a PDF document released by EMC about VAAI efficiency with EMC Symmetrix VMAX arrays. The document basically covers 3 new features released in the “Enginuity 5875” firmware for the VMAX (also known as the “75 code”) in conjunction with vSphere 4.1.
These features are:
- Hardware accelerated Full Copy
- Hardware accelerated Block Zero
- Hardware accelerated locking
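As a quick reference, the three features can be mapped to the host-side advanced settings that switch them on or off. This is a minimal sketch and not part of the EMC document or the tests: the setting paths and the `esxcfg-advcfg -g` query are what we know from ESX(i) 4.1, and the `VAAI_SETTINGS` and `query_commands` names are made up for illustration. It only builds the command strings, it does not talk to a host.

```python
# Assumed mapping of the three VAAI features to ESX(i) 4.1 advanced settings.
# A value of 1 on the host means the primitive is enabled.
VAAI_SETTINGS = {
    "Full Copy": "/DataMover/HardwareAcceleratedMove",
    "Block Zero": "/DataMover/HardwareAcceleratedInit",
    "Hardware-assisted locking": "/VMFS3/HardwareAcceleratedLocking",
}


def query_commands(settings=VAAI_SETTINGS):
    """Return the esxcfg-advcfg commands that read each setting."""
    return [f"esxcfg-advcfg -g {path}" for path in settings.values()]


if __name__ == "__main__":
    # Print one command per feature, ready to paste into the ESX(i) console.
    for feature, command in zip(VAAI_SETTINGS, query_commands()):
        print(f"{feature}: {command}")
```

Running the printed commands on the ESX(i) console before a test run is a cheap way to confirm the host is actually allowed to offload to the array.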
Without repeating all the technical details already mentioned in the PDF (which is a must-read for all nerds), we’ve put the pedal to the metal to see if VAAI really makes a difference. To make a good comparison between old and new we have set up 3 (almost) identical THICK templates: W2K8 R2 Enterprise with 30GB + 5GB disks containing +/- 12GB of actual data. The source and target datastores are 1TB tier 2 (VMAX) LUNs connected via 4 storage paths (PowerPath/VE enabled) and formatted with a 2MB block size.
The 1st template “Ernst-oldtemplate” is virtual hardware version 4 created on ESX 3.5, with an aligned OS disk and default storage allocation (this template was cloned and freshly installed to mimic template 2 and 3).
The 2nd “Ernst-nonzero” has hardware version 7 created on ESXi4.1, with an aligned OS disk and eagerzeroedthick formatted vmdk’s.
The 3rd “Ernst-zeroed” has hardware version 7 created on ESXi4.1, with an aligned OS disk and zeroedthick formatted vmdk’s.
To quote EMC from their document: “The improvement in deployment times range from 7 to 18 times faster”.
Deploying the 1st template:
The job completed in +/- 9 minutes. During this time we observed the ESXi host itself via esxtop. The number of SCSI read commands issued sometimes spiked to 6550 per second, meaning all data blocks are read with some form of intelligence.
Thursday, June 16. 2011
VAAI and Deployment - a Practical Example
Deploying the 2nd template:
The job completed in exactly 2 minutes (pretty impressive). Here too we observed the ESXi host, and we saw the number of SCSI commands issued decrease dramatically to a few commands per second.
Deploying the 3rd template:
The job completed in 1 minute 37 seconds, very nice ;-)
Here we again observed a decline in the commands being issued from the ESXi host towards the VMAX array. Besides these tests we also did much smaller deployments, which resulted in vCenter not picking up the performance data at all because the job finished within 20 seconds (the refresh rate of the graphs).
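To put the three deployment times next to each other, the speedup can be worked out with a few lines of Python. This is a rough sketch: the first time is approximate ("+/- 9 minutes"), the `DEPLOY_SECONDS` table and the `speedup` helper are names made up for this example, and the numbers apply only to this particular setup.

```python
# Deployment times reported in the tests above, in seconds.
# The first value is approximate ("+/- 9 minutes").
DEPLOY_SECONDS = {
    "Ernst-oldtemplate": 9 * 60,
    "Ernst-nonzero": 2 * 60,
    "Ernst-zeroed": 1 * 60 + 37,
}


def speedup(baseline, other, times=DEPLOY_SECONDS):
    """How many times faster 'other' deployed compared to 'baseline'."""
    return times[baseline] / times[other]


if __name__ == "__main__":
    for name in ("Ernst-nonzero", "Ernst-zeroed"):
        factor = speedup("Ernst-oldtemplate", name)
        print(f"{name}: {factor:.1f}x faster than Ernst-oldtemplate")
```

With the times measured here that works out to 4.5x for the 2nd template and roughly 5.6x for the 3rd.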
So yes… VAAI delivers big time in this area (Block Zero + hardware-assisted locking) when combined with a VMAX array. The tests above clearly show that ESXi 4.1 offloads a lot of disk-related I/O to the array if configured correctly.
Technicians saying: “We need high performing heavy workload I/O” … I say “Bring it ON!”