Thursday, October 31, 2013

The Importance of BCP (Backup and Continuity Plan)

(To the tune of "American Pie")

"A few short hours ago
I can still remember how my heart sank when I clicked that file
But I knew that this was the chance
To show we'd planned this in advance
And we'd be up and running in a while"
 
"Though monitoring made me shiver
With every alert it delivered
Bad news on the wire
Some rumours of a fire"
 
"I can't remember if we tried
To contact kit that was inside
But BCP got justified
The day the DC died"
 
"So fail over to the secondary site
All the data replicated through the hours of night
Mirrored storage holds it all secure and tight
Proving planning for disaster was right
Planning for disaster was right"
 
My deepest apologies to the good Mr D. McLean, but the topic of Business Continuity Planning is not usually one that makes people's hearts leap with joy. In fact it quite often fills them with dread. But, like an insurance policy, you only realise how important a proper BCP is when you actually need it. By then, it is usually too late.
 
So, what is a Business Continuity Plan?
 
Quite simply, it is a plan that sets out, in detail, how you intend to continue running your business in the event of a disaster. It covers everything from where staff would work if they were unable to access their normal offices and how they would communicate (telephones, fixed or mobile), to which parts of the business need to be working first and, of course, access to, or recovery of, the computer-held data and systems vital to the business function. The data aspect is what I intend to concentrate on here.
 
What constitutes a disaster?
 
This varies, depending on your business, but could be anything from a burst water pipe flooding your offices, to a fire in your data centre or even terrorist activity (on July 7, 2005 a number of businesses in The City invoked their BCPs in response to the sad events of that day elsewhere in London).
 
What can you do to ensure your business can survive a disaster?
 
That depends on how long your business can function without access to its most valuable resource: data. If you can keep going with manual systems whilst your IT team source, build and recover replacement servers and storage then that is great, but you will be in the minority. Most organisations would be hard-pressed to run for a day without their IT systems and many would be in trouble after a few hours.
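One way to make that question concrete is to write down, for each system, how long the business could cope without it and how long a rebuild and restore is expected to take. The sketch below is only an illustration: the `System` class, the `systems_at_risk` function and all the figures are hypothetical, not taken from any real assessment.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    tolerable_outage_hours: float    # how long the business can cope without it
    estimated_recovery_hours: float  # how long a rebuild and restore is expected to take


def systems_at_risk(systems):
    """Return the names of systems that would take longer to recover
    than the business can tolerate being without them."""
    return [
        s.name
        for s in systems
        if s.estimated_recovery_hours > s.tolerable_outage_hours
    ]


# Hypothetical figures, for illustration only.
inventory = [
    System("finance database", tolerable_outage_hours=4, estimated_recovery_hours=24),
    System("email", tolerable_outage_hours=8, estimated_recovery_hours=2),
    System("archive file share", tolerable_outage_hours=72, estimated_recovery_hours=48),
]

print(systems_at_risk(inventory))  # ['finance database']
```

Anything that appears in that list is a candidate for replication or a standby system rather than a rebuild from backup.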
 
The first step in protecting your data is to ensure regular backups are taken. These backups could be to physical tape, or to a virtual tape library (VTL). If data is backed up to tape then you will need to ensure that the tapes are stored somewhere safe and secure. It is no good just putting them on a shelf next to the system they are intended to restore one day. Tapes should be stored off-site, either with a specialist third party, of which there are many, or at your disaster recovery (DR) site, where they will be readily available in the event of a disaster.
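"Regular" is easy to say and easy to let slip, so it is worth checking. As a minimal sketch, assuming you can obtain the time of the last successful backup for each system from your backup software's reports, the hypothetical `stale_backups` function below flags anything that has not been backed up recently enough. The system names, dates and the 26-hour threshold are made up for illustration.

```python
from datetime import datetime, timedelta


def stale_backups(last_backup, max_age, now):
    """Return, in sorted order, the names of systems whose most recent
    successful backup is older than max_age."""
    return sorted(
        name for name, taken in last_backup.items() if now - taken > max_age
    )


# Hypothetical data, for illustration only.
now = datetime(2013, 10, 31, 9, 0)
last_backup = {
    "finance database": datetime(2013, 10, 31, 1, 0),  # last night
    "file server": datetime(2013, 10, 28, 1, 0),       # three nights ago
}

# Allow a little slack over 24 hours for a nightly schedule.
print(stale_backups(last_backup, timedelta(hours=26), now))  # ['file server']
```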
 
If you back up to VTL or use some other disk-based backup process then consider mirroring or replicating this to your DR site. Tivoli Storage Manager V6.3 offers node replication to a secondary server, thus ensuring backup data is available from more than one source in the event of a disaster. IBM's ProtecTIER can be clustered, with nodes at multiple sites replicating their data within the cluster.
 
So, you have all that in place. All your data gets backed up to multiple, separate locations overnight, critical data is replicated in real time over redundant fibre links, and you have got all the bases covered. Congratulations! Now, have you tried restoring some of that data? Do you know for certain that your finance database can be recovered? Have you tried pulling the connection on your fibre switch to make sure it fails over to the secondary link?
 
It is all very well investing the time and effort to build resilience, but you need to know it works, and that means testing it.
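For file data, one simple test is to restore a sample to a scratch location and compare checksums against the originals. The sketch below assumes the source files have not changed since the backup was taken; in a real disaster the originals are gone, so in practice you would compare against checksums recorded at backup time. The function names `sha256_of` and `verify_restore` are hypothetical and not part of any backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Compare every file under source_dir with its restored copy.

    Returns a dict mapping relative paths to "missing" or
    "checksum mismatch". An empty dict means the restore matched.
    """
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = {}
    for original in sorted(source_dir.rglob("*")):
        if not original.is_file():
            continue
        relative = original.relative_to(source_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems[relative.as_posix()] = "missing"
        elif sha256_of(original) != sha256_of(restored):
            problems[relative.as_posix()] = "checksum mismatch"
    return problems
```

Calling `verify_restore("/data/finance", "/restore-test/finance")` (both paths hypothetical) after a trial restore gives you a list of anything that did not come back intact. It only covers files; a database needs its own test, such as starting it from the restored copy and running a known query.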
 
Schedule a specific date (or dates) to fail over specific, critical systems to the DR site and make sure you can recover them. Make sure all the people involved know what is expected of them. Make sure everything is documented! I cannot stress the importance of documentation enough. If your one and only SAP guru is in hospital after a skiing accident when your disaster strikes, you are going to be hard pushed to get your CRM system up and running without detailed documentation.
 
Testing lets you find the bugs, omissions and plain, simple mistakes in your processes, without the pressure of a CEO breathing down your neck. It gives you time to perfect the procedures and build confidence in both your staff and your systems.
 
If you do not test, then you might be lucky and get away with it, but chances are you will be digging a hole for yourself and your business. If you do not even have a plan then that hole will be about 6 feet deep, with a headstone at one end!
 
So set up that plan, make sure everyone understands their role and, most importantly, test it regularly. That way you will not end up like Mr McLean's "good old boys", drinking whisky and lamenting the death of the "music".
 
Should you require more information on backup and continuity plans, please do not hesitate to contact Celerity.
 
Tony Lloyd - Technical Consultant - Celerity Limited
