Automate everything!

HOWTO: Speed up vRA template deployments

[ Since flowgrab is no longer amongst us, you can get it from Dropbox now. ]

I recently did some scalability and load testing on a vRA deployment. One of the problems I encountered was that the template deployments, which had seemed reasonably fast, suddenly became a bottleneck. So obviously I created a workflow to fix that. So here is the howto: speed up vRA template deployments.

TL;DR: if you’re too lazy to read and just want the workflow, click here.

Let me explain the scenario first. Usually when you deploy a virtual machine from a vSphere template it is reasonably quick. After all, it just copies a file from disk to disk. If you happen to have an all-flash array (my customer does) it is probably pretty fast, especially when your array is VAAI compatible, because that offloads the whole copy action to the array. A typical VAAI-accelerated template deployment takes around 4 seconds on an EMC X-IO.

But what happens if your template is in one cluster and needs to be deployed to another cluster which has a separate storage array? Then VAAI can’t do anything for you and the template will be copied over the network, which can be rather slow compared to a VAAI-accelerated deployment: more in the region of 2 minutes instead of 4 seconds.

So before we get to the solution let me answer a few questions:

1: Why do you have the template on another cluster? Well… this customer has more than 10 different clusters, each with its own storage. Why that is, is a completely different story. Anyhow, I wasn’t going to copy the template to each of them manually. Keeping them up to date would be a nightmare. The second reason is that this strategy would require a separate blueprint for each cluster, which is another nightmare to maintain.

2: Why don’t you use a template LUN shared between all clusters? This was actually not physically possible and also otherwise unwanted. It also wouldn’t fix the VAAI issue since it would still require copying from one storage box to another. It would be faster than copying over the network, that’s for sure.

3: 2 minutes isn’t that long. Why bother? Actually, consider the fact that vRA by default only does 2 deployments in parallel. That means that if you kick off 50 deployments, the first 48 go through in 24 rounds of about 2 minutes each, so it takes at least 48 minutes before it even starts deploying the last request. That is an unacceptably long delay and even causes time-outs in vRA. I tried bumping up the number of parallel deployments, but that slowed the deployments down so much that they never finished within the vRA time-outs.

So… I didn’t want a separate blueprint for each target cluster and I didn’t want to copy the template manually. The solution that remains is having the template copied by a workflow and then overwriting the template that the blueprint is going to use. Can we do that? Yes we can! :) Turns out there is a custom property called __clonefrom which contains the template name. If you overwrite this property during the buildingMachine state, vRA will just use that machine to clone from. To automate this process I created a workflow that:

  1. gets the template name from the __clonefrom property
  2. gets the cluster name of the cluster where vRA is going to deploy the machine
  3. adds the cluster name to the template name and checks if such a template exists
  4. if it doesn’t exist, clones the template that is configured in the blueprint to the target cluster and adds the cluster name to the template name, so we can find it next time we deploy something to the same cluster
  5. overwrites the __clonefrom property and then lets vRA do its job
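
The five steps above can be sketched in a few lines. The real workflow runs inside the orchestrator against vCenter; this is only a plain-Python sketch of the decision logic, with an in-memory dictionary standing in for the inventory. The `Inventory` class, the `<template>-<cluster>` naming scheme and names like `rhel7-base` are assumptions made for illustration and are not taken from the downloadable workflow. The sketch writes the new name back to `__clonefrom`, the property described above as holding the template name.

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """In-memory stand-in for the vCenter inventory (hypothetical, not a real API)."""
    # Maps template name -> name of the cluster whose storage holds it.
    templates: dict = field(default_factory=dict)
    clone_count: int = 0

    def clone_template(self, source: str, new_name: str, cluster: str) -> None:
        # In a real environment this is the slow, one-time copy over the network.
        if source not in self.templates:
            raise KeyError(f"source template {source!r} not found")
        self.templates[new_name] = cluster
        self.clone_count += 1


def select_template(properties: dict, target_cluster: str, inventory: Inventory) -> dict:
    """Return a copy of the machine properties that points at a cluster-local template."""
    base_name = properties["__clonefrom"]           # step 1: template from the blueprint
    # Step 2: target_cluster is passed in by the caller.
    local_name = f"{base_name}-{target_cluster}"    # step 3: assumed naming scheme
    if local_name not in inventory.templates:       # step 3: does a local copy exist?
        inventory.clone_template(base_name, local_name, target_cluster)  # step 4
    updated = dict(properties)
    updated["__clonefrom"] = local_name             # step 5: redirect the clone source
    return updated


# Two requests for the same cluster: only the first one triggers a clone.
inventory = Inventory(templates={"rhel7-base": "cluster-a"})
props = {"__clonefrom": "rhel7-base"}
first = select_template(props, "cluster-b", inventory)
second = select_template(props, "cluster-b", inventory)
print(first["__clonefrom"])   # rhel7-base-cluster-b
print(inventory.clone_count)  # 1
```

The point of the design is that the slow network copy happens once per cluster, and every later deployment to that cluster finds the local copy by name.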

[Image: selectTemplate workflow]

That’s it. This will make sure you have VAAI-enabled deployments on each cluster. In my case it decreased the template deployment time from around 2 minutes to 4 seconds. This is so fast that the vSphere deployment is done before vRA can kick off the next one.

You can download the workflow from Dropbox. Use at your own risk! It’s tested with vRA 6.2.1 and vSphere 5.5. It should work on any vRA 6.x version or even 5.x, but I didn’t test that. Not sure about vRA 7. I’ll let you know when I’ve had a chance to test that.
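
For reference, the queueing arithmetic behind the 50-deployment example can be checked with a few lines of Python. This is a rough model under simplifying assumptions: requests run in fixed rounds of equal length, and vRA's own overhead between requests is ignored. The function name is made up for this sketch.

```python
import math


def wait_before_last_start(requests: int, parallel: int, seconds_per_clone: float) -> float:
    """Seconds until the last request starts, assuming fixed rounds of equal duration."""
    rounds_before_last = math.ceil(requests / parallel) - 1
    return rounds_before_last * seconds_per_clone


slow = wait_before_last_start(50, 2, 120)  # network copy, about 2 minutes per clone
fast = wait_before_last_start(50, 2, 4)    # VAAI-offloaded copy, about 4 seconds per clone
print(slow / 60)  # 48.0 (minutes)
print(fast)       # 96 (seconds)
```

With the same default of 2 parallel deployments, the wait for the last request drops from 48 minutes to about a minand a half in this model.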