September 3, 2009



Adequate disaster recovery (DR) and business continuity planning (BCP) is no
longer a prerequisite only for large enterprises with mission-critical business
operations. It is now accepted worldwide as a basic requirement for every
organization and business. We look at the key steps you need to take alongside
your DR plan to ensure you are able to recover data after a disaster has struck.

Disaster recovery and business continuity planning should reflect the real
and ongoing needs of the business activity or function. Every company today
needs a disaster recovery plan designed in accordance with its corporate vision
and business needs. This involves choosing and implementing processes, policies
and infrastructure as per the plan, which in turn covers the application and
network elements that are essential to DR and, consequently, to the business.
However, having a DR plan is not enough. Disaster can strike in any form: it can
be natural (fire, earthquake, etc.), technology related (e.g. a malicious virus
attack, power issues, UPS blowouts) or human related (bomb blasts, pandemics,
etc.).

Risk evaluation
Make sure you do a complete risk evaluation, i.e. identify the events that can
have a negative impact on your organization. This should cover every kind of
threat: natural, man-made and technological, intentional or accidental, internal
as well as external. Along with the type of threat, a risk evaluation should also
cover the dangers associated with it, how it can happen, the probability of
occurrence of each threat, the probability of impact and the factors that can be
controlled. You also need to identify single points of failure, which can be
anything from Internet lines to electrical equipment.
You also need to determine the frequency of these threats, which might
vary according to your geographical area. For instance, if your company is
based in Mumbai, then rains and floods are one type of threat for which you must
prepare. In 2005, when the Mumbai rains created havoc, many companies
were caught unprepared and their offices were flooded.
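
As a rough illustration of how such an evaluation can be recorded, the sketch below keeps a small risk register and ranks threats by probability multiplied by impact. That scoring rule is a common convention rather than something prescribed here, and the class name, field names, threat entries and numbers are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    category: str        # "natural", "technological" or "human"
    probability: float   # assumed chance of occurrence per year, 0.0 to 1.0
    impact: int          # assumed business impact, 1 (minor) to 5 (severe)
    controllable: bool   # can the organization reduce the likelihood?

    @property
    def score(self) -> float:
        # Simple risk score: likelihood times severity.
        return self.probability * self.impact


def rank_threats(threats):
    """Return threats ordered from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.score, reverse=True)


# Illustrative entries only; real figures come from your own evaluation.
register = [
    Threat("Monsoon flooding", "natural", 0.30, 5, False),
    Threat("UPS failure", "technological", 0.20, 3, True),
    Threat("Virus outbreak", "technological", 0.50, 4, True),
    Threat("Pandemic absenteeism", "human", 0.10, 4, False),
]

ranked = rank_threats(register)
for threat in ranked:
    print(f"{threat.name:22} {threat.category:14} score={threat.score:.2f}")
```

A register like this is only as good as its estimates, so the probabilities should be revisited for each location, since the frequency of threats varies by geography.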

We know of a company that flew its employees to its other location to ensure
that its business was not impacted. Many companies create risk assessment
forms, which are sent to key people in each department, asking them to identify
possible risks in their department. Of course, many of these will not be IT
related but might still play an important role in the BCP.

You also need to prepare a list of applications and the duration of downtime
that is affordable for each. This can be a very difficult task, as in every case
there will be users who are affected by application downtime. At the same time,
there are applications like email without which business will come to a
standstill.
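
One simple way to hold such a list is a mapping from application to tolerable downtime, which can then be sorted to suggest a recovery order. The following is a minimal sketch under that assumption; the application names, the durations and the two helper functions are hypothetical.

```python
from datetime import timedelta

# Hypothetical applications with the downtime each business unit says it can afford.
tolerable_downtime = {
    "email": timedelta(hours=1),
    "order processing": timedelta(hours=4),
    "payroll": timedelta(days=3),
    "intranet wiki": timedelta(days=7),
}


def recovery_order(tolerances):
    """Applications sorted so the least downtime-tolerant come first."""
    return sorted(tolerances, key=tolerances.get)


def breached(tolerances, outage):
    """Applications whose tolerable downtime is exceeded by the given outage."""
    return [app for app in recovery_order(tolerances) if outage > tolerances[app]]


print(recovery_order(tolerable_downtime))
print(breached(tolerable_downtime, timedelta(hours=6)))
```

With these example figures, a six-hour outage already exceeds what email and order processing can tolerate, which is the kind of result that helps decide what the DR site must bring up first.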

M Sudhir Reddy, CIO, MindTree, advises CIOs on what not to do in DR
Always build your DR ‘technology upwards’ and ‘business downwards’

Fundamentally, DR is an extension of your company’s technology backbone. It
is not something that flows out of business demand. That said, DR can
effectively be used to gauge the optimum requirement of IT resources for
your company. In the unfortunate event of a disaster, your IT operations
would run on a mirror image that may not be as capable as the real
technology backbone. If the business still runs acceptably, that is a signal
that you have actually over-provisioned.

Do not clog your DR
Over-provisioning your DR is suicidal in today’s times. A certain percentage
of degradation is acceptable when you are aware that your operations are
running in DR mode. For enterprises in industry domains that do not demand
100% agility, DR is a necessary investment, but they can live with a little
degradation temporarily.

Backing up to tape is not DR
Many small enterprises, especially in India, are under the impression that
disaster recovery only means having a mechanism of tape backups in place.
The investment does not end there. Tape backups only ensure the first half
of DR: to document and archive data. The second half of DR is to seamlessly
retrieve data and continue day-to-day operations until complete recovery
happens.
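
To make the "second half" concrete, a test restore can be checked against the original data. The sketch below is one possible approach, not a method described by the speaker: it assumes the backup has been restored into a directory and compares SHA-256 checksums file by file. The function names and the directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list:
    """Return relative paths that are missing or differ after a test restore."""
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(relative.as_posix())
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append(relative.as_posix())
    return problems
```

An empty list from `verify_restore` means every source file came back intact. It says nothing about how long the restore took, which matters just as much for continuing day-to-day operations.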

Have regular DR drills
An enterprise needs to carry out regular DR drills to ensure that its DR
plan actually works and that any flaws are rectified in time. DR drills can be
carried out anywhere from every six months to every weekend, depending on the
complexity of the organization. Many companies ensure that work done on
weekends is done from their DR site, so that the primary and secondary sites
stay in sync. However, this is possible only if you have a hot DR site. If you
have outsourced DR, then DR drills become even more important, as they tell you
how ready your service provider is to handle a disaster.

C Kajwadkar, Chief Architect and Vice President, Availability Services,
Netmagic Solutions

Technology Disasters vs Human Disasters
Disasters can be classified into two kinds: technology and human. The former
arise out of unexpected faults in IT performance across various components,
from server overloads and sudden spikes to power issues and even software
malfunctions. Some of these can be prevented in an automated manner by using
customized tools and services that predict mishaps based on past experience
and present technology performance. In other cases, the CIO will have to
intervene and do crisis management, but the tools and services would reduce
the time taken for this.

The second kind is human disasters, which include environmental mishaps
like drought, socio-economic unrest, pandemics like swine flu and other
events that suddenly affect an organization’s manpower. For instance, in a
BPO environment, if staff strength drops by 50% because swine flu is
spreading, the IT equipment is not affected, but the absence of the
individuals who work with IT will indirectly affect business, and this might
demand that IT goes into DR mode, unless of course the company can put up
with the productivity loss. In a mission-critical environment, DR can perform
routine tasks in an ‘auto pilot’ mode for a reasonable time until the
pandemic clears. In this case, the IT team or the DR service provider does
not directly face a technology challenge, but rather a manpower challenge
that affects IT and demands DR-mode operation. Preparedness for human
disasters is important in the current scenario.

However, when holding DR drills, it’s important to make sure that no business
is affected by a drill, be it a regular or a surprise one.

Countering the human factor
There are times when IT and everything else is in place, yet the cause of a
disaster is human. Currently, with swine flu already declared a pandemic,
enterprises are revisiting the human factor and drawing up strategies to make
sure it doesn’t impact their businesses. Most companies are finding the answer
in workforce mobility, i.e. if the situation gets worse, employees can do most
of their work from home. While that might answer some of the woes, remote access
cannot work for all industries, such as BPO and manufacturing.

This is where the people part of DR kicks in. One strategy is to have enough
bench strength to counter the situation. But with the recent economic slowdown,
companies are looking to get the most out of every employee, and there aren’t
enough people on the bench. The most commonly used strategy now is to train
employees to have multiple skills. Another strategy is to have another location
ready in case one location is hit. This can be easy for industries such as BPO,
where call load can be shifted to another site.

A plan for swine flu cannot be created in isolation, because the whole
organization is vulnerable to it. Therefore, the CIO needs to find out from all
business unit heads what is important to them in an emergency, and which parts
of their business can be taken care of remotely or automated using existing
technologies. Companies are also creating separate teams that spread awareness
among their groups and draw up plans for what to do if they get into such a
situation.

Sunil Chandna, CEO, Stellar Information Systems, on the possibility of data
recovery after disaster strikes
These are some of the tricky situations you might land in after a disaster
has taken its toll:

Flooded Drives
Here, the probability of recovery is very high, as drives are
hermetically sealed. However, if a drive lies under salt (sea) water for
a long time (over one week), its seals get punctured and the degree of
difficulty increases exponentially. Recovery time for such jobs is longer.
For example, the recent cyclone Gonu in Oman caused havoc in some low-lying
areas, and we did recover many drives that had been submerged for over two
weeks under sea water. The tsunami in South India caused similar problems
for many hotel resorts, but there the drives were salvaged quickly and all
data was successfully recovered.

Fire
Hard drive platters can take a reasonable amount of heat during a fire.
The heat resistance varies with the platter type (aluminum, toughened
glass, etc.). However, melted drives cannot be recovered.

These people are not wearing surgical masks to protect against swine flu!
This is Stellar’s clean room in the Netherlands, used to recover data from
hard disks.

Earthquake
Sometimes the drive may get twisted or even broken into pieces.
These situations are beyond help.

A hard drive is a very delicate and complex piece of equipment that needs
utmost care after damage. The best thing a user can do is not to try any
self-help methods, and instead go to a professional company that has the
infrastructure (a class 100 clean room), expertise and trained manpower.
There are specific procedures to clean, open and process such drives.

What if you were to lose data?
Often in the aftermath of a disaster you lose data despite having backups,
because, God forbid, the backup media itself (tapes or disks) is affected.
In some cases it is possible to retrieve data after environmental disasters (see
box ‘possibility of data recovery after disaster strikes’). Whenever you are
confronted with such a situation, it is better to contact a data recovery
company. Most data recovery companies also provide on-site technical assistance
to ensure that media is not damaged further in handling and transportation.
Recovery itself is done in a clean room, which is basically a work area with
controlled environmental parameters such as temperature, humidity, etc.

For hard drive data recovery, a class 100 clean room is the standard, wherein
there are no more than 100 particles of 0.5 microns or larger in a cubic foot of
air. Similarly, there are often situations where backup tapes get broken or
corrupted due to some operational error. Data recovery companies also offer tape
restoration services, which are again carried out in a clean room.
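
To put the class 100 figure in metric terms, the short sketch below converts a per-cubic-foot particle count to a per-cubic-metre count and checks a sample against the limit. The function names are hypothetical, the conversion factor is the standard one (1 cubic metre is about 35.3147 cubic feet), and treating the limit of 100 as inclusive is an assumption made for the example.

```python
CUBIC_FEET_PER_CUBIC_METRE = 35.3147  # 1 cubic metre is about 35.3147 cubic feet
CLASS_100_LIMIT = 100                 # particles of 0.5 microns or larger per cubic foot


def particles_per_cubic_metre(per_cubic_foot: float) -> float:
    """Convert a particle concentration from per cubic foot to per cubic metre."""
    return per_cubic_foot * CUBIC_FEET_PER_CUBIC_METRE


def meets_class_100(particles_counted: int, sample_volume_cubic_feet: float) -> bool:
    """Check a particle-counter sample against the class 100 limit (inclusive, by assumption)."""
    return particles_counted / sample_volume_cubic_feet <= CLASS_100_LIMIT


print(round(particles_per_cubic_metre(CLASS_100_LIMIT)))
print(meets_class_100(250, 3.0))
```

In other words, 100 particles per cubic foot works out to roughly 3,500 particles per cubic metre of air.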

Swapnil Arora and Vishnu Anand
