Doing Business on the Cloud: A Checklist

PCQ Bureau

01 Mar 2012 06:36 IST

While virtual and cloud environments differ in significant ways from physical systems, the overall approach to designing a fail-safe private or hybrid cloud is not considerably different from the approach for a traditional IT environment. Here's a checklist:

1. A continuity plan: A surprising number of organizations fail to implement a basic business continuity plan when moving to virtual and cloud environments. According to the India findings of the Symantec 2011 Virtualization and Evolution to the Cloud Survey, 77 percent of organizations listed security as a significant or extreme challenge to implementing server virtualization, and 83 percent of organizations that have implemented hybrid or private clouds cited performance as a significant or extreme challenge. Often, the virtual environments are not covered by disaster recovery (DR) plans at all.

So when making your plan, start at the beginning: think through your business needs and make sure you build the cloud to address these practical matters. Companies affected by public cloud outages had not identified potential single points of failure or developed fallback plans. IT organizations need to work with their business teams to understand the upstream and downstream dependencies of each critical business service. Only after pinpointing these critical junctures and the organization's acceptable amounts of downtime, or recovery time objective (RTO), and data loss, or recovery point objective (RPO), can you put the right solution in place. Finally, treat mission-critical data and applications the same across all environments (physical, virtual, cloud) in DR assessments and planning, including checking that the data and applications will meet the RTO and RPO requirements.
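As a rough illustration of the RTO and RPO check described above, the sketch below compares each service's objectives with what was actually measured, regardless of where the service runs. The service names, the figures, and the `Service` and `dr_gaps` names are all hypothetical; a real assessment would take these values from DR tests and replication monitoring.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    environment: str                  # "physical", "virtual" or "cloud"
    rto_minutes: float                # acceptable downtime
    rpo_minutes: float                # acceptable data loss
    measured_recovery_minutes: float  # assumed: taken from the last DR test
    replication_lag_minutes: float    # assumed: age of the newest recoverable copy


def dr_gaps(services):
    """Return (service name, objective) pairs for every objective that is not met."""
    gaps = []
    for s in services:
        if s.measured_recovery_minutes > s.rto_minutes:
            gaps.append((s.name, "RTO"))
        if s.replication_lag_minutes > s.rpo_minutes:
            gaps.append((s.name, "RPO"))
    return gaps


# Illustrative data only; the same rule is applied in every environment.
services = [
    Service("billing", "physical", 30, 5, 20, 2),
    Service("web-store", "virtual", 15, 5, 45, 1),
    Service("reports", "cloud", 240, 60, 120, 90),
]

for name, objective in dr_gaps(services):
    print(f"{name}: {objective} not met")
```

With this sample data, the virtual service misses its RTO and the cloud service misses its RPO, which is exactly the kind of gap that goes unnoticed when only physical systems are assessed.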

2. Visibility: Availability is only possible with visibility. Which database does the application server point to? Can you “see” the health of each application running on virtual machines? When designing your cloud, deploy tools that give you visibility across both your physical and virtual infrastructures.

3. Centralized control: For your cloud to deliver centralized IT operations, administrators must be able to monitor, manage, and report on multiple business services across different platforms, ideally from a single location. The challenge is that a typical data center is composed of different servers running different operating systems (OSes) and, now, various virtual platforms including VMware® and Hyper-V®. Implementing availability solutions that work across all platforms reduces complexity and increases reliability, with the additional benefit of minimizing training and administration costs. Ensure centralized reporting so that you understand the leading causes of downtime in your environment; proper reporting tools make it easier to resolve problems and streamline the operations of your virtualized systems.
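To make the idea of centralized reporting concrete, here is a minimal sketch that pools incident records from several platforms and ranks causes by total downtime. The record format, the sample incidents, and the `downtime_by_cause` name are assumptions made for illustration, not the output of any particular monitoring product.

```python
from collections import Counter

# Hypothetical incident records collected from every platform in one place.
incidents = [
    {"platform": "VMware", "cause": "storage failure", "minutes": 42},
    {"platform": "Hyper-V", "cause": "patch error", "minutes": 15},
    {"platform": "physical", "cause": "storage failure", "minutes": 30},
    {"platform": "VMware", "cause": "network outage", "minutes": 20},
]


def downtime_by_cause(records):
    """Total downtime minutes per cause, largest first, across all platforms."""
    totals = Counter()
    for record in records:
        totals[record["cause"]] += record["minutes"]
    return totals.most_common()


for cause, minutes in downtime_by_cause(incidents):
    print(f"{cause}: {minutes} min")
```

Because the records are aggregated across platforms, a cause that looks minor on each platform in isolation (storage failures here) shows up as the leading source of downtime overall.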

4. Automated recovery: In a world where minutes of downtime mean lost customers and hours of downtime mean negative news headlines, an effective business continuity plan must include automated recovery of your business services. In addition to speeding up fault detection and recovery, automated solutions reduce your reliance on personnel during an emergency. An effective solution will also detect faults in an application and all its dependent components, including the associated database, OS, network, and storage resources. In the event of an outage, the solution must be able to restart the application, connect it to the appropriate resources, and resume normal operations even if, as in a typical cloud environment, the various components sit on separate virtual and physical infrastructures. Look for solutions that can orchestrate and automate recovery across these OS and virtual platform chasms and get recovery times down to seconds without human intervention. This best practice holds true for DR too.
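One way to picture dependency-aware recovery is as an ordering problem: when a component fails, it and everything that depends on it must come back in the right sequence. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to work out that sequence. The dependency map and the `recovery_plan` name are hypothetical, and a real availability product would also perform the fault detection and the restarts themselves.

```python
from graphlib import TopologicalSorter

# Hypothetical business service: each component maps to what it depends on.
dependencies = {
    "storage": set(),
    "network": set(),
    "os": {"storage"},
    "database": {"os", "network", "storage"},
    "application": {"database", "network"},
}


def recovery_plan(deps, faulted):
    """List the components to restart, dependencies first.

    A component is restarted if it is faulted itself or if anything it
    depends on is being restarted, so it can reconnect to its resources.
    """
    restarting = set()
    plan = []
    # static_order() yields every component after all of its dependencies.
    for component in TopologicalSorter(deps).static_order():
        if component in faulted or deps[component] & restarting:
            restarting.add(component)
            plan.append(component)
    return plan


print(recovery_plan(dependencies, {"database"}))
print(recovery_plan(dependencies, {"storage"}))
```

A database fault leads to a plan that restarts the database and then the application, while a storage fault cascades through the OS, the database, and the application; the network, which is unaffected, is left alone.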

5. DR testing: With server and application builds constantly changing, it is important to periodically test your DR strategy; that is the only way to guarantee a successful recovery in the event of a system- or site-wide outage. Because testing can be expensive and cause the downtime you are trying to prevent, IT organizations should consider non-disruptive testing tools. These solutions can simulate failovers, test configurations, verify that patches have been installed, and even notify you of errors that may cause recovery problems, including alerting you that volumes for a new application are not being replicated to the DR site. These non-disruptive testing tools will help you maintain productivity while identifying potential issues.
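The replication check mentioned above is a good example of a test that needs no failover at all. The following sketch simply compares the volumes each application uses with the set of volumes being replicated to the DR site. The volume names, the two inventories, and the `unreplicated_volumes` name are assumptions; in practice both lists would come from your storage and replication tools.

```python
def unreplicated_volumes(app_volumes, replicated):
    """Map each application to the volumes it uses that are not replicated."""
    missing = {}
    for app, volumes in app_volumes.items():
        gap = sorted(set(volumes) - set(replicated))
        if gap:
            missing[app] = gap
    return missing


# Hypothetical inventories: what each application uses, and what the DR site receives.
app_volumes = {
    "billing": ["vol-billing-data", "vol-billing-logs"],
    "new-crm": ["vol-crm-data"],
}
replicated = {"vol-billing-data", "vol-billing-logs"}

for app, volumes in unreplicated_volumes(app_volumes, replicated).items():
    print(f"WARNING: {app} has unreplicated volumes: {', '.join(volumes)}")
```

Run on a schedule, a check like this would flag the newly added application before an outage does, without touching production.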

It is therefore essential to build it correctly from the start. To get more from our investment and benefit from what technology and innovation have to offer, we need to be fully aware of these requirements and prepared to keep our business running smoothly and efficiently.