Data Protection and Availability Service Catalogue – New Year, New Division, New Value Proposition
Posted on January 20, 2014 by TheStorageChap in Application Availability, Data Protection
So, a New Year and a new Division: VPLEX and RecoverPoint move out of what was the Enterprise Storage Division (now the Enterprise & Mid-Tier Storage Division, EMSD) and into the Data Protection and Availability Division (DPAD), the new name for last year's BRS Division. DPAD now has solutions for Archiving, Backup, Continuous Data Protection, Replication and Continuous Availability, and the question becomes: how do you articulate a clear and concise strategy to customers around data protection and availability? The answer is a Data Protection and Availability Service Catalogue.
For me, Data Protection and Availability go hand in hand, and customers need both, in most cases for the same applications. They need to be considered in the context of a Data Protection and Availability Service Catalogue, with Application Owners or the Business choosing an appropriate level of service for their data and applications.
Traditionally, when considering the requirements for Protection and Availability, we talk about the Recovery Point Objective (RPO), the maximum amount of data you can lose, and the Recovery Time Objective (RTO), the maximum time it can take to recover. But when building a Data Protection and Availability Service Catalogue, these two metrics alone do not provide all of the information required to design and deploy appropriate solutions.
It is instead necessary to break RPO and RTO into additional metrics.
The RPO should be considered in two terms:-
- Period of Recovery (POR) – the time frame over which recovery is required
- Granularity of Recovery (GOR) – the granularity of recovery during the POR
The RTO should also be considered in two terms:-
- Time to Decision (TTD) – the time it takes the business to decide to recover
- Time to Recovery (TTR) – the time it takes to actually recover
Finally, one additional metric needs to be considered: the Backup Time Objective (BTO), the backup window available to the application.
These five items enable application owners to adequately express their business requirements for Data Protection and Availability, and enable IT or Service Providers to design solutions that meet them.
So how does this work in practice? In the simplest context, consider creating a Data Protection and Availability Service Catalogue of Platinum, Gold, Silver and Bronze that adheres to something like the following:-
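To make the five metrics concrete, here is a minimal sketch of how one recovery level might be recorded. The class and field names are hypothetical, and two relationships are assumptions rather than anything stated outright: that the headline RPO equals the GOR, and that the headline RTO is TTD plus TTR. The values are purely illustrative.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ProtectionRequirement:
    """One recovery level of an application's requirement (hypothetical model)."""
    por: timedelta  # Period of Recovery: how far back recovery must reach
    gor: timedelta  # Granularity of Recovery: spacing between recovery points
    ttd: timedelta  # Time to Decision: time the business takes to decide
    ttr: timedelta  # Time to Recovery: time IT takes to actually recover
    bto: timedelta  # Backup Time Objective: treated here as the backup window

    @property
    def rpo(self) -> timedelta:
        # Assumption: worst-case data loss equals the gap between recovery points.
        return self.gor

    @property
    def rto(self) -> timedelta:
        # Assumption: total recovery time is decision time plus recovery time.
        return self.ttd + self.ttr

    @property
    def recovery_points(self) -> int:
        # Number of recovery points that must be retained across the POR.
        return self.por // self.gor


# Illustrative values only.
req = ProtectionRequirement(
    por=timedelta(days=7),
    gor=timedelta(hours=24),
    ttd=timedelta(hours=3),
    ttr=timedelta(hours=9),
    bto=timedelta(hours=8),
)
print(req.rpo, req.rto, req.recovery_points)
```

Keeping RPO and RTO as derived values rather than stored fields means the headline numbers cannot drift away from the detailed metrics behind them.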
|Business impact|Protection|
|---|---|
|Widespread business stoppage with significant revenue impact. Risk to human health/environment. Public, widespread damage to organisation's reputation.|Synchronously Replicated – Automatic Failover, Granular Corruption Protection|
|Direct revenue impact. Direct negative customer satisfaction. Non-public damage to organisation's reputation.|Synchronously Replicated – Automated Failover, Granular Corruption Protection|
|Indirect revenue impact. Indirect negative customer satisfaction. Significant employee productivity degradation.|Asynchronously Replicated – Automated Failover|
|Moderate employee productivity degradation|Backup|
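As a rough illustration, the catalogue above could be held as a simple lookup so that a level can be found from the capability it must offer. This is only a sketch: pairing Platinum, Gold, Silver and Bronze with the four rows in order is an assumption, the names `CATALOGUE` and `levels_offering` are hypothetical, and only the capabilities that appear in the table are listed.

```python
# Assumption: the four rows of the table map, in order, to these level names.
CATALOGUE = {
    "Platinum": ["Synchronously Replicated", "Automatic Failover",
                 "Granular Corruption Protection"],
    "Gold": ["Synchronously Replicated", "Automated Failover",
             "Granular Corruption Protection"],
    "Silver": ["Asynchronously Replicated", "Automated Failover"],
    "Bronze": ["Backup"],
}


def levels_offering(capability: str) -> list[str]:
    """Return the service levels that list the given capability."""
    return [level for level, caps in CATALOGUE.items() if capability in caps]


print(levels_offering("Granular Corruption Protection"))
```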
The Business dictates that certain levels require the added protection of off-site replication, and the Application Owner chooses the most appropriate level of service from the Data Protection and Availability Service Catalogue. But this alone is only part of the solution; it is still necessary to define the application requirement, which is where the metrics POR, GOR, TTD, TTR and BTO come into play.
Below are examples of four applications, one for each of the above categories.
In the first example, Business Supporting, the business has defined an RPO of 24 hours and an RTO of 12 hours, but alone this only shows part of the requirement. The application owner wants to be able to recover over a period of 7 days at a granularity of every 24 hours. The business can take 3 hours to make a decision to recover, and IT can take up to 9 hours to actually recover the data. There is a backup window of 8 hours. There is no requirement for replication within the Business Supporting classification.
In this example of Business Core, there is an RPO of 6 hours and an RTO of less than 60 minutes. Without any further insight this could result in a number of different solutions. In fact the application owner requires two levels of Data Protection and Availability. Firstly, they wish to be able to recover at a granularity of every 6 hours over a 24 hour period. During this period a decision to recover needs to be made within 15 minutes and recovery needs to be completed within 45 minutes. Secondly, they want to be able to recover over a 6 month window with a granularity of every 24 hours. The solution for the first requirement could be array-based snapshots, and the solution to the second requirement would be traditional backup.
The business has decided that off-site replication is required to support recovery in the event of array or site failure. Asynchronous replication is the most appropriate technology to meet the TTD of 15 minutes and TTR of 45 minutes.
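The Business Core numbers can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: "6 months" is approximated as 180 days, the headline RPO is taken to be the finest granularity on offer, and the headline RTO is taken to be TTD plus TTR. The variable names are hypothetical.

```python
from datetime import timedelta

# (POR, GOR) for each recovery level of the Business Core example.
levels = {
    "short term": (timedelta(hours=24), timedelta(hours=6)),
    # Assumption: "6 months" approximated as 180 days.
    "long term": (timedelta(days=180), timedelta(hours=24)),
}

# Headline RPO: the finest granularity across all levels.
rpo = min(gor for _, gor in levels.values())

# Headline RTO: time to decision plus time to recovery for the short-term level.
rto = timedelta(minutes=15) + timedelta(minutes=45)

# Recovery points each level has to retain (POR divided by GOR).
points = {name: por // gor for name, (por, gor) in levels.items()}

print(rpo, rto, points)
```

Under these assumptions the short-term level needs only 4 retained recovery points, which fits array-based snapshots, while the long-term level needs 180, which is more naturally a backup workload.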
In this example of Business Essential, the Application Owner has defined an RPO of less than 15 minutes and an RTO of less than 60 minutes. However, using the expanded metrics, they actually have two requirements: firstly, to be able to recover over a period of 48 hours at intervals of less than 15 minutes and within less than 60 minutes, and secondly, to be able to recover at 24 hour intervals over a period of 1 year. The backup window is only 4 hours for this application, and the business has decided that replication is a requirement.
Taking a snapshot more often than every 15 minutes is probably not practical, so a Continuous Data Protection (CDP) solution would be required locally. The same granularity of recovery over the first 48 hours is also required in the event of a site failure, so a technology that provides site to site replication with granular recovery at the remote site would also be required, for example RecoverPoint.
Again, traditional backup technology would be required to meet the longer term RPO and RTO requirement.
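The reasoning in these examples, from required granularity to a class of technology, can be sketched as a small decision function. The thresholds are assumptions loosely drawn from the examples in this post (below 15 minutes, 6 hours, 24 hours), not fixed rules, and `suggest_technology` is a hypothetical name.

```python
from datetime import timedelta


def suggest_technology(gor: timedelta) -> str:
    """Map a Granularity of Recovery to a class of protection technology.

    Thresholds are illustrative assumptions, not vendor guidance.
    """
    if gor < timedelta(minutes=15):
        # Snapshots this frequent are assumed impractical, so use CDP.
        return "continuous data protection"
    if gor < timedelta(hours=24):
        return "array-based snapshots"
    return "traditional backup"


for gor in (timedelta(0), timedelta(hours=6), timedelta(hours=24)):
    print(gor, "->", suggest_technology(gor))
```

A real catalogue would also weigh POR, TTD, TTR and the replication requirement, so this covers only one dimension of the decision.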
In this example of a Business Critical application, the Application Owner has defined an RPO and RTO of Zero. But by leveraging the expanded metrics it is clear that there are a number of requirements in regard to Data Protection and Availability.
Firstly, they need to be able to recover to any write over a 24 hour period, so they will need a Continuous Data Protection solution locally. For remote site replication they require zero data loss and fully automatic failover, which removes the need for the business to make a failover decision; this is best served by a solution that enables Active/Active sites, such as VPLEX Metro.
Secondly, they also have a backup requirement, and thirdly, a long term archive requirement.
These are only examples, but by leveraging an expanded set of metrics alongside a Data Protection and Availability Service Catalogue it becomes easier to classify an application’s requirements and then define the Data Protection and Application Availability Service Catalogue necessary to support the business.
What is clear is that Data Protection and Availability go hand in hand and should always be considered together.