Business Continuity Planning

PCQ Bureau

07 Jan 2004 06:29 IST

Taking the first step, identifying the risks and the business impact of a disaster, is often difficult. The second hurdle is overcoming the mindset that nothing like this will ever happen to us. Even though 9/11 brought the stark reality home, critics still question whether such a threat is real for India.

Imagine a major newspaper carrying the story of how the entire customer database of a leading bank was posted on the Internet. For that bank, this alone is enough to lose more than one-fourth of its business in the first two weeks, face several hundred lawsuits, see its share value decline by more than 20%, and suffer an intangible yet significant loss of market goodwill and credibility, among a host of other things. The failure of just one of your servers, or even a misconfigured server, can cause this havoc.

The foundation of BCP for any organization should be laid on three cornerstones: prepare, respond and recover.

PREPARE

This is the phase in which you decide to lay down a formal continuity plan. Each organization has to determine its own objectives for initiating BCP. It is also important to find a suitable executive-level sponsor for the project. In most cases, the prepare phase is about people's mindsets and process re-engineering, and it will require all the senior management support you can get.

Inventory and risk analysis

The first step of BCP is to create a top-level critical inventory of the resources of the organization. Start with facilities, processes, applications, people and intellectual property.

Facilities would include plant, machinery, physical locations, network and other infrastructure owned or managed by the company.

Processes are the key to any organization, and it is the key role of a BCP to provide resilience to business against process failures. Identify the key processes of your organization–for a service organization, the process of customer service, customer acquisition and/or complaint management would be key, likewise for a manufacturing organization the sourcing, plant maintenance and/or the part manufacturing process could be key. You must identify and document such processes.

Application inventory comes next. While for some organizations the ERP or CRM application may be key, a recruitment company may identify its candidate database management application as the critical one.

Coming to people, common sense tells us that the more senior a person is, the more valuable he or she becomes to a company. However, an objective assessment of skill sets is critical. Further, this is a dynamic list that varies with time. A project manager at a multi-million-dollar company may be extremely critical while a project is being executed, while a finance whiz may be critical just before an important IPO.

For some long-term resources, such as the people who hold the company's IPRs, a long-term view may need to be drafted. In today's world of crime and terror, it is not difficult to see why organizations should be prepared for the kidnapping or detention of their CEO, which would leave him or her unavailable to run the business. The risk analysis document should also state the potential impact of losing any of the inventoried items.
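One way to keep such an inventory comparable across categories is a small risk register. The sketch below is only an illustration: the `InventoryItem` class, the 1-to-5 scales and the likelihood-times-impact score are assumptions made for this example, not a method prescribed in this guide, and the sample entries are made up.

```python
from dataclasses import dataclass
from typing import List

# Inventory categories named in the text above.
CATEGORIES = ("facility", "process", "application", "people", "intellectual property")


@dataclass
class InventoryItem:
    name: str
    category: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (minor) to 5 (business-threatening)

    def __post_init__(self) -> None:
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        for value in (self.likelihood, self.impact):
            if not 1 <= value <= 5:
                raise ValueError("likelihood and impact must be between 1 and 5")

    def risk_score(self) -> int:
        # Likelihood x impact is an assumed scoring rule for this sketch.
        return self.likelihood * self.impact


def rank_by_risk(items: List[InventoryItem]) -> List[InventoryItem]:
    """Return items ordered from highest to lowest risk score."""
    return sorted(items, key=lambda item: item.risk_score(), reverse=True)


# Illustrative entries only; the names and ratings are hypothetical.
SAMPLE_INVENTORY = [
    InventoryItem("Head office building", "facility", likelihood=1, impact=5),
    InventoryItem("Complaint management", "process", likelihood=3, impact=4),
    InventoryItem("Customer database server", "application", likelihood=2, impact=5),
]

for item in rank_by_risk(SAMPLE_INVENTORY):
    print(f"{item.risk_score():>2}  {item.category:<12} {item.name}")
```

Because the people list in particular changes with time, the ratings in such a register would need to be revisited regularly rather than set once.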

Gap analysis and resource enhancement

The risk analysis and business impact document will lead us to a list of resources that are critical to our business operations. This is the point at which an alternate strategy is laid out to replace a resource if a disaster makes it unavailable. If a specific location is critical, an alternate location is mapped, staffed with a skeleton crew, and equipped with the infrastructure needed to conduct operations. I know of one company that had created a pool of Wi-Fi cards and access points to be used if the LAN became unavailable. Its claim to fame was that it could get the LAN up within 60 minutes of a failure, irrespective of the complex switching requirements. In the same manner, most enterprises now take two links from two different service providers to mitigate the risk of losing one circuit. Many companies are also lining up vendor resources as backup. One company has placed an order with a PC rental company to hire the equivalent of about 20% of its installed computer base, just in case the need arises.

Likewise, there is a necessity to identify alternate processes in case of disasters.

The document should answer questions such as: If the HO becomes unavailable, how do we continue to provide approvals to the regional offices? How would we take care of our customers if the complaint management process fails?

People are the most difficult resource to manage, yet the most valuable. It is important to identify an alternate for every key person. This is also the time to proactively review the Key Man insurance policies that many insurance companies offer. For many organizations, the only solution lies in building an alternate skill bank through fresh recruitment or by enhancing the skills of current employees.

As you can see from above, the focus in this segment of the project is to look for alternatives.
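The search for alternatives can be tracked with a very simple check: list each critical resource against its documented alternates and flag the ones that have none. The following is a minimal sketch under that assumption; the `find_gaps` helper and the sample entries are hypothetical, with the alternates loosely borrowed from the examples in this section.

```python
from typing import Dict, List


def find_gaps(alternates: Dict[str, List[str]]) -> List[str]:
    """Return the critical resources that have no documented alternate."""
    return sorted(name for name, options in alternates.items() if not options)


# Illustrative mapping of critical resource -> documented alternates.
SAMPLE_ALTERNATES = {
    "Office LAN": ["Wi-Fi cards and access points"],
    "Internet link": ["Second link from a different service provider"],
    "Desktop PCs": ["PC rental company"],
    "Head office approvals": [],         # no alternate process defined yet
    "Complaint management process": [],  # no alternate process defined yet
}

for name in find_gaps(SAMPLE_ALTERNATES):
    print(f"No alternate identified for: {name}")
```

Every resource this check reports is a gap that the plan still has to close.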

Quality assurance, testing and maintenance

Even the best-laid plans may go haywire if they are not prepared with meticulous care and tested several times. It is important to invite external consultants to give an additional view on readiness and gap analysis. This is also the right time to appoint suitable auditors to ensure that the process implementation is appropriate and that the alternate resources really exist. Dry runs of the proposed plans are definitely recommended. Additionally, the plans need to be revisited, their currency re-evaluated and the documentation updated as the business scenario changes.

RESPOND

Even when appropriate preparations have been made, real-life situations will put them to the test.

Declaration triggers

A disaster could be a partial or total failure of the identified resources. It is extremely important to categorize the disaster so that the right reactive steps are triggered. There has to be a well-defined policy for declaring a disaster for any specific resource–covering both the type of disaster and its level.
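A declaration policy of this kind can be written down as a lookup table keyed by resource type and disaster level. The sketch below is only an illustration of that idea: the `Level` enum, the `DECLARATION_POLICY` table and every step listed in it are assumptions for this example, not an actual policy.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Level(Enum):
    PARTIAL = "partial"
    TOTAL = "total"


# Hypothetical policy table: (resource type, level) -> reactive steps.
DECLARATION_POLICY: Dict[Tuple[str, Level], List[str]] = {
    ("network", Level.PARTIAL): ["notify IT operations", "switch to backup link"],
    ("network", Level.TOTAL): ["declare disaster", "activate incident response team"],
    ("head office", Level.TOTAL): [
        "declare disaster",
        "activate incident response team",
        "relocate to alternate site",
    ],
}


def reactive_steps(resource: str, level: Level) -> List[str]:
    """Look up the pre-agreed steps for a categorized disaster."""
    try:
        return DECLARATION_POLICY[(resource, level)]
    except KeyError:
        # An undefined combination is itself a gap in the policy.
        raise LookupError(
            f"no declaration policy for {resource!r} at level {level.value!r}"
        ) from None


for step in reactive_steps("head office", Level.TOTAL):
    print(step)
```

Raising an error for an undefined combination, rather than guessing a default, keeps missing policy entries visible during dry runs.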

Communication

The next step in addressing any disaster is to communicate. Communicate with all the necessary stakeholders: employees, specific senior management, shareholders, customers, vendors and associates, and the general public, in that order. It is, of course, assumed that the authorized communicator has been designated in advance.

Incident response

An incident response team should be pressed into service and the necessary alternate resources brought to life. The actual steps will differ between organizations and circumstances, but the basic principle remains the same: the incident response team makes alternate resources available for the failed resource. If the disaster is IT-related, it is generally considered good practice to launch a forensic analysis to gather information on, for example, a hack attempt.

Resource mobilization

The incident response allows us to analyze the situation and trigger the specific processes for resource deployment. If the HO premises have been rendered non-functional, the HO organization needs to be relocated. Space, facilities and infrastructure have to be re-created. Some forms of such mobilization are listed below.

Activating DR (Disaster Recovery) sites. For organizations that can afford to create a total DR site, this is the time to activate it.

Kicking in the backup support system as the main system. This may be an off-site backup, a warm backup and/or a cold backup.

Additional hardware deployment.

Reciprocal agreements with other players in the market.

Vendor-supplied equipment. In some cases, where vendors hold the key to mobilization, the appropriate documented processes should kick in and the resources should be furnished.

RECOVER

Identify return path

Most alternate infrastructure is not designed to run forever. A company cannot afford to work permanently from a makeshift arrangement; hence, once basic operations are in motion, a clear-cut roadmap needs to be drawn up to document the return path. Typically, organizations take 7 to 10 days or more to come up with such a recommendation from a state of total corruption. This plan needs to be approved by the respective stakeholders as well as the senior executives of the organization.

Identify permanent changes

There are situations where the damage is permanent, such as the permanent loss of data, people or infrastructure. In these circumstances, the changes must be documented and the dependencies of other resources or processes managed. Sometimes this may result in re-engineering other processes.

Implement the recovery

The organization implements the return path or changes to the current systems and procedures to execute the recovery.

Enhance existing plans and documentation

Every incident should be treated as a source of learning. Whether the incident is minor or major, formally documenting it should be an integral part of the process. Further, for a large corporation it is also important to keep this documentation under proper version control, so that what is learned in one region can be shared with other regions.

BCP is not a project that you execute within a time frame; it is a process that has roots in almost all segments of the business. Also, BCP does not belong to the IT department; it is a cross-department function. Remember, it is wrong to believe that disasters happen to our neighbors and not to us.

Alok Sinha, Chief of Information Security, Bharti Group