Wednesday, January 16, 2008

CDP Makes Backup Better, Faster, Cheaper

Continuous Data Protection Makes Backup Better, Faster, Cheaper
By Eran Farajun

The data backup world has changed dramatically in recent years. No change has been more dramatic or rapid than the shift from traditional tape-based backup technology to disk-to-disk (D2D) backup. Disk-based backup has enabled shorter backup windows and more rapid data recovery, which has opened the way for more sophisticated backup and recovery software technologies that were not possible with tape backup systems. Software vendors have responded to the potential of disk-based backup with enhanced functionality, such as point-in-time snapshots and local and remote replication, in an effort to reduce the risk of data loss between scheduled backup sessions.

Beyond the pure speed advantages, disk backup is also the right technology at the right time to address the convergence of two business trends: the necessity for 24/7 data access in a global wired economy and the increasing use and importance of remote offices. According to the Enterprise Strategy Group, an estimated 60 percent to 70 percent of mission-critical data is stored and used at offsite locations. Enterprise IT managers face the challenge of protecting and managing all remote data in an era of tight budget constraints, when geographically distributed locations typically lack the IT staff to manage, monitor and verify backup operations.

Continuous Data Protection (CDP) Gains Momentum

CDP is the disk-based backup and recovery strategy gaining traction in data centers of various sizes. The traction is especially visible among users of Exchange, where the management and compliance challenges are driving elements of the CDP marketplace.

A CDP product is one that continuously monitors an object for changes and preserves copies of all prior versions of the object. The user can view and access these prior versions as required. Recovery time shifts from hours or days to seconds or minutes. The backup window is no longer a problem because the concept of a backup window no longer exists.

CDP is a cross between disk-based backup and replication. CDP continually captures all changes made to a file and tags (versions) objects so that they can be rolled back to a particular point in time. The business value of CDP lies in the ability to restore data objects to a point before a data corruption or interruption event takes place. CDP captures data as it is written to disk. One of the great myths of CDP is the unspoken assertion that CDP is for every kind of data, all the time. This is of course untrue, since the value of data changes with time, urgency and business dynamics.
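To make the versioning idea concrete, here is a minimal sketch in Python of a journal that keeps every version of an object and rolls back to a chosen point in time. The `VersionJournal` class, its method names and the sample file are hypothetical, chosen only for illustration; a real CDP product works against disk blocks, files or application data rather than an in-memory dictionary.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class VersionJournal:
    """Hypothetical in-memory journal that keeps every version of each object."""

    # object name -> list of (timestamp, content), kept in ascending time order
    _versions: dict = field(default_factory=dict)

    def record_write(self, name: str, timestamp: float, content: bytes) -> None:
        """Capture a change as it is written; earlier versions are preserved."""
        history = self._versions.setdefault(name, [])
        if history and timestamp < history[-1][0]:
            raise ValueError("writes must arrive in time order")
        history.append((timestamp, content))

    def restore(self, name: str, point_in_time: float) -> bytes:
        """Return the content of the object as it stood at point_in_time."""
        history = self._versions.get(name, [])
        times = [t for t, _ in history]
        # Index of the first version written strictly after point_in_time.
        index = bisect.bisect_right(times, point_in_time)
        if index == 0:
            raise KeyError(f"no version of {name!r} at or before {point_in_time}")
        return history[index - 1][1]


journal = VersionJournal()
journal.record_write("report.doc", 100.0, b"draft")
journal.record_write("report.doc", 200.0, b"final")
journal.record_write("report.doc", 300.0, b"corrupted")

# Roll back to just before the corrupting write at t=300.
recovered = journal.restore("report.doc", 299.0)
```

Because no version is ever overwritten, the corrupted write at t=300 does not destroy the good copy written at t=200.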

One important scenario to keep in mind when considering the implementation of CDP is that of centralized backup for the remote or branch office. Too often, basic IT tasks like monitoring the backup server and changing tapes can be missed when assigned to remote office clerical staff not skilled in IT. Using a CDP strategy over the WAN to protect branch office file servers removes the requirement for tape drive and media handling at the remote site.

What about Recovery?

There are two general principles that govern all recovery policy-making: the recovery point objective (RPO) and the recovery time objective (RTO). The RPO defines how much data you are willing to lose when you recover data.

The RTO defines how long it will take to recover your business processes from a data failure. This includes not only the data recovery, but restarting the servers or applications that depend on that data. These recovery considerations must also be applied to local and remote recovery strategies.

A true CDP product protects every data change as it takes place, so the RPO approaches zero. On the other hand, with such a vast amount of data being recoverable, how you choose the recovery point affects your RTO.
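The difference in RPO can be illustrated with simple arithmetic. The sketch below, in Python, computes how much work is lost when restoring to the latest recovery point at or before a failure. The function name and all timestamps are made-up values for illustration, not measurements from any product.

```python
def data_loss_seconds(recovery_points, failure_time):
    """Seconds of changes lost when restoring to the latest point before failure."""
    usable = [point for point in recovery_points if point <= failure_time]
    if not usable:
        raise ValueError("no recovery point exists before the failure")
    return failure_time - max(usable)


HOUR = 3600
failure_time = 39 * HOUR  # failure at 3 p.m. on the second day (illustrative)

# Scheduled backup: one recovery point per night, at midnight.
nightly_points = [0, 24 * HOUR]
nightly_loss = data_loss_seconds(nightly_points, failure_time)

# CDP: every write is captured; assume the last write landed 0.5 s before failure.
cdp_points = [0, 24 * HOUR, failure_time - 0.5]
cdp_loss = data_loss_seconds(cdp_points, failure_time)
```

In this example the nightly schedule loses 15 hours of changes (54,000 seconds), while the CDP case loses half a second.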

Some recovery points are based on time: a particular hour or minute. More usefully, however, they can be event-based. Since every data change is protected, a loss event can be absorbed and yield a recovery event.
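An event-based recovery point can be sketched as a lookup: given the times of all captured changes and the time of a known event, pick the last change captured strictly before that event. The function name, event labels and timestamps below are hypothetical and serve only to illustrate the idea.

```python
import bisect


def recovery_point_before_event(change_times, event_time):
    """Return the latest captured change strictly before the event.

    change_times must be sorted in ascending order.
    """
    index = bisect.bisect_left(change_times, event_time)
    if index == 0:
        raise ValueError("no change was captured before the event")
    return change_times[index - 1]


# Every write is captured, so the list of candidate recovery points is dense.
change_times = [10.0, 12.5, 13.0, 13.2, 15.0]

# Hypothetical events an administrator might tag.
events = {"nightly-batch-start": 12.5, "corruption-detected": 13.2}

point = recovery_point_before_event(change_times, events["corruption-detected"])
```

Here the write at 13.2 is the corrupting one, so the chosen recovery point is the write at 13.0.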

Implementing CDP

CDP solutions are designed to be block-based, file-based or application-based. Block- and file-based CDP solutions have the advantage of functioning with a range of different applications, while application-based CDP is optimized and tightly integrated with a specific application, such as Microsoft Exchange. Potential CDP buyers should also be aware of the level of recovery granularity a particular CDP solution provides, as all CDP products are not created equal on this issue. Some products only support recovery of servers, volumes or folders and lack the granularity to recover a single file or email message.

CDP is deployed most frequently as an appliance or as a software solution running on a server or switch with agents. A dedicated CDP appliance can deliver good performance without impacting application servers, but the hardware cost can be extremely high, especially when an enterprise needs to scale its CDP capabilities and add more appliances. The software solutions are billed using various licensing strategies, frequently per server. The software solutions also involve agents that must reside on each server to be protected. The more servers a user has, the more agents have to be purchased and managed. This is a stumbling block for the SMB and a potential struggle for the enterprise, which might have to manage agents on hundreds (or thousands) of servers. There is a significantly better way.

Host-based CDP software eliminates the hardware expense of a CDP appliance but comes with its own set of cost and complexity issues. The software solutions require that agents be installed on each server to be protected, creating management overhead and additional costs. The pricing model for this type of CDP is typical of most enterprise backup software that charges a license fee for each server or database that is protected, regardless of how often the CDP functionality is actually used by each server.

The third CDP architectural alternative is to simply incorporate the CDP functionality as a feature in a full-featured backup and recovery software suite, which has proven itself to be the simplest, most cost-effective and most practical approach to CDP.

CDP as a Feature

CDP as a feature should be designed as a remote office CDP with the capability to work over a WAN. The CDP functionality should simply be integrated as a feature with no additional cost to customers or separate CDP application or appliance to purchase.

Apart from offering the CDP functionality in the software, CDP as a feature should include a robust feature set with retention policy management and the ability to perform data restores without interrupting CDP backups.
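As a rough illustration of what retention policy management involves, the Python sketch below decides which stored versions to keep under a simple time-window policy. The rule used here (keep everything inside the window, and always keep the newest version) is an assumption made for this example; actual products offer their own policy options.

```python
def apply_retention(version_times, now, retention_seconds):
    """Return the version timestamps to keep under a time-window policy.

    Assumed policy: keep every version newer than the cutoff, and always
    keep the most recent version even if it is older than the cutoff.
    """
    if not version_times:
        return []
    cutoff = now - retention_seconds
    newest = max(version_times)
    return sorted(t for t in version_times if t >= cutoff or t == newest)


# Versions captured at t=100, 500 and 900; the current time is t=1000.
kept_short = apply_retention([100, 500, 900], now=1000, retention_seconds=200)
kept_long = apply_retention([100, 500, 900], now=1000, retention_seconds=600)
```

With a 200-second window only the version at t=900 survives; with a 600-second window the versions at t=500 and t=900 are kept.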

CDP as a feature should include a two-stage continuous backup. Backup starts with a change event, and recovery granularity follows the pace at which data is written to disk in a consistent state. Local servers then aggregate the changes and deduplicate, compress and encrypt them.
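The aggregation step on the local server can be sketched as follows in Python, using only the standard library. Changed blocks are deduplicated by content hash and each unique block is compressed once. Encryption is deliberately left out because the standard library has no cipher for it; the function names and the 4 KB block size are assumptions for illustration, not a description of any vendor's implementation.

```python
import hashlib
import zlib


def aggregate_changes(blocks):
    """Deduplicate changed blocks by content hash, then compress unique blocks.

    Returns (store, manifest): store maps digest -> compressed block, and
    manifest lists digests in the original write order.
    """
    store = {}
    manifest = []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            # First time this content is seen: compress and keep one copy.
            store[digest] = zlib.compress(block)
        manifest.append(digest)
    return store, manifest


def rebuild(store, manifest):
    """Reverse the process: decompress blocks back into write order."""
    return [zlib.decompress(store[digest]) for digest in manifest]


# Three changed blocks, two of which carry identical content.
changed_blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096]
store, manifest = aggregate_changes(changed_blocks)
```

Only two compressed blocks need to cross the WAN for the three writes, and `rebuild(store, manifest)` recovers the original sequence on the other side.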

An agentless design is significant. The disadvantage of an agent-based architecture is that you have to manage agents installed on perhaps hundreds or even thousands of client machines. Agentless solutions do not require another application running in the background that greedily consumes IT resources, such as memory and CPU cycles, or that could be exploited in an attack.

Selecting a CDP Product at a Glance:

One way to determine whether CDP as a feature is right for your remote or branch office location is to ask yourself a set of qualifying questions, such as: Are you worried about meeting remote site service level agreements (SLAs) established by the CIO? Do you need to measure the business impact of downtime at remote sites? Do you have rapidly changing data that is critical to business operations? Are you worried about shrinking backup windows to protect that data? If you answered yes to one or more of these questions, you should be seriously investigating CDP technology for your remote sites.

There are several features a robust CDP product brings to the market. These include:

* Support of heterogeneous storage and server environments. Today's customers are refusing to be locked into a single vendor for their storage and server solution. Users should select a CDP product that doesn't restrict them to only a subset of their possible storage and server environments.
* Awareness of applications and their environments. Application recovery is becoming more complex and time consuming, so users should choose a product that integrates application specifics into the CDP recovery process.
* Non-invasive to the application or server that is being protected. A CDP product should attempt to minimize any impact to the application’s I/O throughput or CPU load. This is best done by keeping the CDP footprint on the application server to a minimum, and moving any 'heavy-lifting' to an external server or appliance.
* Built on a scalable, reliable platform. If the CDP product is hosted on an appliance platform, the user should have the ability to add additional appliances that can scale their CDP capacity as data protection needs grow.
* Supports a federated application environment. Many of today's complex applications (such as SAP R/3) use servers and storage that span multiple hosts. Customers should choose a CDP product that supports these systems, as it provides the user with a consistent, federated image for recovery.
* Supports business policies and SLAs. Companies assign different values to their different applications. A CDP product that is flexible in its support of differing protection and recovery policies can provide a better overall solution.
* Can be extended. Look for a CDP product that has functionality that can be easily extended by the customer to meet business needs.
* Tightly integrated with business continuity technologies. A CDP product that supports application clusters and remote replication provides a stronger solution than a CDP product that only provides a stand-alone solution.

Conclusion

Continuous Data Protection is the latest piece in an enterprise’s data protection arsenal. The CDP model has been redefined and the cost of CDP deployment has been driven to zero by integrating CDP as a feature in a distributed backup software platform. CDP is not a replacement for other data protection technology. Instead, it complements existing backup, replication and snapshot technology to bring advanced backup and recovery capabilities to improve the protection of customer data.

Eran Farajun is executive vice president at Asigra.
www.asigra.com
