Implementing VMware SRM: Pay Attention to that Man Behind the Curtain

Now that I’ve sat the VMware Site Recovery Manager (SRM) class, done the labs, and had some design and implementation time with the product, I am reminded of a scene from the movie The Wizard of Oz. “Pay no attention to that man behind the curtain” is the famous line from the scene in which Dorothy and her companions discover that the mighty and powerful Wizard they fear is really just an elaborate machine controlled by an ordinary man.

I am not suggesting that SRM is a sham. In fact, it automates virtual infrastructure failover between sites in a way that is truly wizard-like. Understand, however, that VMware SRM software is just the last piece of the total data center recovery “machine”. Many organizations may be seeking the semblance of automated site failover, but have they really considered in detail what it takes to start up their business-critical systems at a secondary location?

A simple test of readiness for SRM’s wizardry is answering this question: “Can you create (or have you already created) a document listing the complete shutdown and startup of your business infrastructure?” Many call this a disaster recovery playbook or runbook. Better yet, have you provisioned and tested the physical resources you need to actually fail over to another location based on that runbook? If and when that’s done, there are numerous business continuity technologies to consider other than SRM. Those who have already done this work realize that SRM, along with consolidation to virtual infrastructure, will replace several sections of their runbook and several pieces of hardware. The point, however, is that SRM does not replace the contents of the entire runbook.

For those who are considering SRM, take the time to at least put on paper every step you would need to restart your business in another data center. This includes the mundane, like providing power and cooling, racks, internet access, and telephones, as well as the mission-critical, like employee access, email, and business processing. Analyze how much time it would take to rebuild and restart systems, and what recovery point objectives (RPO) and recovery time objectives (RTO) are acceptable for each system or service. When it’s time to map your disaster site failover solution to the runbook, you will clearly see the efficiencies and the speed that VI 3.5 and SRM allow.
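
As a rough illustration of that paper exercise, here is a minimal Python sketch that lists runbook steps next to an estimated restore time and an acceptable RTO, then flags the steps where the estimate does not fit. The `RunbookStep` structure, the step descriptions, and every number are made up for illustration; none of this is part of SRM or any VMware tool.

```python
from dataclasses import dataclass


@dataclass
class RunbookStep:
    service: str          # system or service being recovered
    description: str      # what has to happen at the recovery site
    restore_minutes: int  # estimated time to rebuild and restart (assumed)
    rto_minutes: int      # acceptable recovery time objective (assumed)


def steps_exceeding_rto(steps):
    """Return the steps whose estimated restore time is longer than their RTO."""
    return [s for s in steps if s.restore_minutes > s.rto_minutes]


# Hypothetical runbook entries; replace with your own inventory and estimates.
runbook = [
    RunbookStep("email", "Restore mail servers and verify delivery", 240, 120),
    RunbookStep("telephones", "Redirect phone lines to the recovery site", 60, 480),
    RunbookStep("business processing", "Rebuild and restart application servers", 600, 240),
]

for step in steps_exceeding_rto(runbook):
    print(f"{step.service}: estimated {step.restore_minutes} min, "
          f"acceptable RTO {step.rto_minutes} min")
```

The steps this flags are the sections of the runbook where a manual rebuild is too slow, and so the first places to look at what automated failover could change.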

Therefore, unlike the line from The Wizard of Oz, you had better pay close attention to the man and machine behind the curtain in order to achieve the prodigious expectations and pyrotechnic results of an SRM implementation.

About the author

Rich Brambley
