Developments such as Web 2.0, mobile computing and wireless hotspots mean that application and system availability requirements are becoming ever more critical. In turn, the processes and tools required to protect those applications have evolved as well.
Today, there are a myriad of technologies offering different approaches to data protection, application availability, high availability and disaster recovery. These technologies typically have at least one thing in common: they are IT-based solutions that are built to protect IT assets. When it comes to business continuity, however, it is imperative that the choice of solution be a business decision, based on the level of risk and disruption that the different parts of the business can tolerate.
For example, email is ubiquitous, so preserving access to it through any type of disruption should be a priority, with 100 percent uptime the goal. Database applications such as sales order processing, or online collaboration and content management, may also require 100 percent uptime because downtime poses too great a risk to the business. Other applications, such as purchase order processing, may demand no data loss but tolerate a recovery time in the region of one hour. There may also be non-critical applications, where data can be recreated from original sources, and low-risk applications for which downtime measured in hours, or even days, is acceptable.
No one size fits all
Business continuity requirements will vary according to business type and function. There is unlikely to be a “one size fits all” solution for all applications used in business.
Ultimately, the risk to the business will be the driving factor. Assessing business need requires taking into account multiple factors. Data protection with extended recovery times may be acceptable for some functions, immediate data access for others. Protection through planned maintenance may be vital in some instances, 100 percent availability through disasters for others. Technology selection must address gaps between business expectations and existing IT capability. Closing the business continuity gap ensures IT delivers what business expects.
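One way to make the gap concrete is to record, for each application, how much data loss and downtime the business will tolerate, and compare that with what IT currently delivers. The sketch below is illustrative only: the Objective class, the continuity_gaps function and the "delivered" figures are hypothetical, while the email and purchase order requirements follow the examples given earlier.

```python
from dataclasses import dataclass

HOUR = 3600  # seconds


@dataclass(frozen=True)
class Objective:
    """Tolerances for one application, in seconds (hypothetical model)."""
    max_data_loss_s: float  # how much recent data may be lost
    max_downtime_s: float   # how long the service may be unavailable


def continuity_gaps(required, delivered):
    """Return applications whose delivered protection misses the requirement."""
    gaps = {}
    for app, need in required.items():
        have = delivered.get(app)
        if have is None:
            gaps[app] = "no protection in place"
            continue
        problems = []
        if have.max_data_loss_s > need.max_data_loss_s:
            problems.append("data loss")
        if have.max_downtime_s > need.max_downtime_s:
            problems.append("downtime")
        if problems:
            gaps[app] = ", ".join(problems)
    return gaps


# What the business expects (from the examples in the text).
required = {
    "email": Objective(0, 0),
    "purchase_orders": Objective(0, 1 * HOUR),
    "reporting": Objective(24 * HOUR, 48 * HOUR),
}

# What IT delivers today (made-up figures for illustration).
delivered = {
    "email": Objective(0, 4 * HOUR),
    "purchase_orders": Objective(0, 30 * 60),
    "reporting": Objective(24 * HOUR, 24 * HOUR),
}

gaps = continuity_gaps(required, delivered)
print(gaps)
```

With these made-up figures only email shows a gap: its data is protected, but a four-hour recovery misses the 100 percent uptime goal.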
What are the options?
There are two approaches to business continuity: recovery centric or availability centric. Quite different technology is used to deliver the two approaches.
Today there are two classes of technology which can be adopted in a recovery centric strategy: back-up or replication. Both are typically focused on data protection.
Ranging from legacy tape technology to continuous data protection, there is a complete set of back-up technologies that will protect data. Whether the back-up is held on tape or on disk, recovering from it will require rebuilding databases and file systems, then reconnecting them with applications, which themselves may need rebuilding. Although back-up technology can approach a Recovery Point Objective (RPO) of zero data loss, a Recovery Time Objective (RTO) measured in seconds will not be achievable. This is because back-up focuses on data protection and treats application protection separately, or not at all. Of course, back-up provides great flexibility for disaster recovery, as tapes can easily be stored off site and shipped to alternative sites on demand, but recovery of the business service will likely take days.
Today, replication is rapidly becoming an alternative approach for availability. Host or storage-based replication allows exact copies of operational data to be taken. Synchronous replication provides for no data loss, but considerations such as performance, cost and bandwidth requirements for off site protection must be taken into account. More widely used is asynchronous replication, which has a much lower operational impact and provides near zero data loss. The only loss would be of transactions in flight at the time a failure occurred.
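The difference between the two modes can be shown with a toy model. This is a minimal sketch, not how any particular replication product works: the ReplicatedStore class and its methods are hypothetical, and network delay is reduced to a queue of transactions that have not yet reached the replica.

```python
from collections import deque


class ReplicatedStore:
    """Toy primary/replica pair (hypothetical, for illustration only)."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = []
        self.replica = []
        self.in_flight = deque()  # written on primary, not yet on replica

    def write(self, txn):
        self.primary.append(txn)
        if self.synchronous:
            # Synchronous: the write completes only once the replica has it.
            self.replica.append(txn)
        else:
            # Asynchronous: the write completes now and is shipped later.
            self.in_flight.append(txn)

    def ship(self, count):
        """Deliver up to `count` queued transactions to the replica."""
        for _ in range(min(count, len(self.in_flight))):
            self.replica.append(self.in_flight.popleft())

    def fail_primary(self):
        """Simulate losing the primary; return the transactions that are lost."""
        lost = list(self.in_flight)
        self.in_flight.clear()
        return lost


sync_store = ReplicatedStore(synchronous=True)
async_store = ReplicatedStore(synchronous=False)
for txn in range(5):
    sync_store.write(txn)
    async_store.write(txn)

async_store.ship(3)  # only the first three reach the replica before the failure

sync_lost = sync_store.fail_primary()
async_lost = async_store.fail_primary()
print(sync_lost, async_lost)
```

In the synchronous case nothing is lost because every write waits for the replica, which is where the performance and bandwidth cost comes from. In the asynchronous case the two transactions still queued at the moment of failure are the "in flight" loss.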
Why choose replication?
The big attraction of replication is that data recovery is not required. The online copy of data can be used immediately for failover. However, failover is likely to require manual intervention or significant scripting, and applications may need to be rebuilt. There is also a risk that application datasets will be missing from the replica copy if administrative processes have broken down and administrators have not been told about application upgrades.
Protecting data off site for disaster recovery also requires closer consideration. There will be bandwidth considerations, and remote systems must be available to hold an operational copy of the data.
A recovery centric strategy will, by definition, be disruptive to the business. Recovery centric approaches are applicable to less important applications as business services will stop while recovery takes place. Although the level of disruption will be reduced with a replication/failover solution, it will still not be suitable for delivering an acceptable level of availability for mission critical applications. For such applications, an application or user centric approach is required.
Clustering technology
Historically, such approaches have depended on clustering technology. Clustering allows several machines to run the same copy of the application, which accesses its data on shared storage. Clusters may consist of multiple physical and/or virtual machines and provide a platform that protects against physical or virtual machine failure. In some situations, clustering may also address availability during planned operations, where individual machines in the cluster can be disconnected to allow maintenance to take place.
Cluster centric approaches are limited to protecting against application and processor failure. Failure situations that affect the whole site, such as natural disasters, power outages and facility upgrades, are not covered. Because clusters rely on shared storage and shared facilities, it is important to guard against failures at that level. In turn, this means preventing the storage from being a single point of failure. This can be costly, requiring storage virtualisation and/or replication to be implemented concurrently. Additionally, virtual clusters may suffer from corruption of shared application images.
Provisioning applications across machines from the same virtual image will not guard against application corruption, nor will it allow application maintenance, thus limiting the level of high availability that can be delivered.
As noted earlier, there is a growing realisation of a disconnect between businesses' reliance on business critical applications and the IT approach to business continuity. The business continuity gap exists because the solutions discussed above ignore the need of the end user: uninterrupted access to applications regardless of the cause of failure.
Continuous availability
Results of a recent survey indicate that, with regard to email, over half of organisations depend on users to notify IT of an issue. By that time, email access has already been interrupted. Addressing the needs of the user has resulted in a new discipline: continuous availability.
Continuous availability solutions typically use redundancy of data and hardware, combined with data replication, in a “shared nothing” approach. While replication solutions share this approach, the difference comes when looking at the impact on the user, and hence the business. Continuous availability solutions will provide pro-active application awareness.
Application availability will be monitored through embedded best practice facilities, with a degree of self-healing provided. Changes in application configuration and data dependencies will be catered for, and automation will be an option to avoid the need for manual intervention. The level of protection will embrace the end-to-end service, not just an individual software component such as Exchange.
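The monitor, self-heal, then fail over sequence described above might look something like the following. It is a sketch under stated assumptions, not a description of any vendor's product: the supervise function, the FakeService class and the limit of two restart attempts are all hypothetical.

```python
def supervise(check, restart, failover, max_restarts=2):
    """Check a service, try to heal it in place, and fail over as a last resort.

    check, restart and failover are callables supplied by the caller
    (hypothetical hooks for this sketch).
    """
    if check():
        return "healthy"
    for _ in range(max_restarts):
        restart()  # self-healing attempt on the current site
        if check():
            return "restarted"
    failover()     # automated switch, no manual intervention needed
    return "failed over"


class FakeService:
    """Stand-in for a monitored application (illustration only)."""

    def __init__(self, restarts_needed):
        self.restarts_needed = restarts_needed  # restarts required to recover
        self.active_site = "primary"

    def healthy(self):
        return self.restarts_needed <= 0

    def restart(self):
        self.restarts_needed -= 1

    def failover(self):
        self.active_site = "secondary"


def run(service):
    return supervise(service.healthy, service.restart, service.failover)


fine = FakeService(restarts_needed=0)
flaky = FakeService(restarts_needed=1)
broken = FakeService(restarts_needed=5)

results = [run(fine), run(flaky), run(broken)]
print(results, broken.active_site)
```

The point of the design is the ordering: the cheapest remedy is tried first, and the user-visible switch to a second site happens automatically only when in-place recovery has failed.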
The choice of availability strategy will depend on many factors. Once complexity in operation, total cost of ownership, available skills and the risk to the business of failure are taken into account, a combination of the technologies above may be required to address business risk.
—Nick Ogle is Regional Sales Director for Neverfail Group (www.neverfailgroup.com) in Australia.