Business Continuity and Disaster Recovery for SAP on AWS
** Not all content covered in the course introduction has been added to the course at this time. Additional content is scheduled to be added to this course in the future. **
In this section of the AWS Certified: SAP on AWS Specialty learning path, we introduce you to strategies for configuring high availability and disaster recovery for SAP workloads on AWS.
- Understand how to configure high availability with Amazon RDS
- Identify backup and disaster recovery strategies using the AWS Cloud
- Describe various approaches for business continuity and disaster recovery for SAP workloads on AWS
The AWS Certified: SAP on AWS Specialty certification has been designed for anyone who has experience managing and operating SAP workloads. Ideally you’ll also have some exposure to the design and implementation of SAP workloads on AWS, including migrating these workloads from on-premises environments. Many exam questions will require a solutions architect level of knowledge for many AWS services. All of the AWS Cloud concepts introduced in this course will be explained and reinforced from the ground up.
General Backup and Recovery in AWS
A big part of reliability is backing up all your data to durable storage and implementing recovery procedures according to your predefined Recovery Time Objective (RTO), Recovery Point Objective (RPO), and Mean Time To Recovery (MTTR) metrics. The main guiding principle is to assume everything fails, and then design backwards with enough redundancy to eliminate single points of failure that can bring down an entire system.
Your applications should continue to deliver results even if the underlying infrastructure fails or is replaced. It is important to define requirements around your expected Recovery Time Objective, Recovery Point Objective, and Mean Time To Recovery metrics. RTO answers the question of how quickly your systems must recover. It can be measured in days, hours, minutes, seconds, or even fractions of a second. RPO answers the question of how much data you can afford to lose.
In other words, for how long can data collection stop on your systems and not impact your business severely? Note that RPO and RTO are separate targets: RPO looks back from the failure to the last recoverable copy of your data, while RTO looks forward from the failure to the moment service is restored. The Mean Time To Recovery, as the name suggests, represents the mean amount of time for the recovery of your systems after a major failure. These expectations in terms of metrics will dictate how you need to invest in your architecture to meet them.
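To make the RTO and RPO questions concrete, here is a minimal sketch of checking a backup plan against its targets. The names `RecoveryTargets`, `worst_case_data_loss`, and `meets_targets`, as well as the sample figures, are hypothetical and not part of the course. The sketch assumes that the worst-case data loss equals the interval between backups and that you already have an estimate of how long a restore takes.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryTargets:
    """Hypothetical container for the business requirements."""
    rto: timedelta  # how quickly service must be restored
    rpo: timedelta  # how much recent data may be lost


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    # Assumption: a failure just before the next backup loses
    # one full interval of data.
    return backup_interval


def meets_targets(targets: RecoveryTargets,
                  backup_interval: timedelta,
                  estimated_restore_time: timedelta) -> bool:
    # Both targets must hold: data loss within RPO, restore within RTO.
    return (worst_case_data_loss(backup_interval) <= targets.rpo
            and estimated_restore_time <= targets.rto)


# Illustrative numbers only.
targets = RecoveryTargets(rto=timedelta(hours=4), rpo=timedelta(minutes=15))

nightly = meets_targets(targets, timedelta(hours=24), timedelta(hours=2))
frequent = meets_targets(targets, timedelta(minutes=10), timedelta(hours=2))
print("Nightly backups meet targets:", nightly)
print("10-minute backups meet targets:", frequent)
```

With these sample values, the nightly schedule fails the check because a full day of data could be lost, while the 10-minute schedule passes. A tighter RPO generally means more frequent backups or replication, which is where the cost trade-off comes in.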
The architecture, backup schedules, frequency, and data retention periods are dictated by the RTO, RPO, and MTTR metrics. Recoverability is often an item that requires improvement. In the event of a natural disaster, some components can become unavailable and, as a result, your primary data source can become unavailable. In this case, you need to be able to restore service quickly and without losing data. The most important detail is to test your recovery procedures before you actually need them in a real situation. The Mean Time To Recovery metric is obtained during testing or "game day" scenarios. You will want to make sure to schedule and perform enough tests to have a reliable metric.
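As a rough illustration of how the Mean Time To Recovery metric could be derived from game day results, the following sketch averages a set of measured recovery durations. The function name `mean_time_to_recovery` and the sample durations are hypothetical. A real exercise would record its own measurements.

```python
from datetime import timedelta
from statistics import mean


def mean_time_to_recovery(recovery_durations: list[timedelta]) -> timedelta:
    """Average the recovery times measured across game day tests."""
    if not recovery_durations:
        raise ValueError("at least one recovery test is required")
    average_seconds = mean(d.total_seconds() for d in recovery_durations)
    return timedelta(seconds=average_seconds)


# Hypothetical recovery times recorded during four game day exercises.
game_day_results = [
    timedelta(minutes=42),
    timedelta(minutes=55),
    timedelta(minutes=38),
    timedelta(minutes=61),
]

mttr = mean_time_to_recovery(game_day_results)
print("MTTR:", mttr)  # 0:49:00
```

The more tests you feed into the average, the less a single unusually fast or slow recovery skews the result, which is why a handful of game days is rarely enough for a reliable metric.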
The guiding principle is to: "Test beyond destruction to make sure recovery procedures are automatic, successful, and as expected."
Backup and Recovery of SAP workloads on AWS assumes that you are familiar with implementing and operating SAP solutions, including familiarity with the general SAP backup and restore recommendations as explained in the SAP technical operations manual. The most significant change in backing up an SAP system on AWS, compared to a traditional implementation, is the backup destination. AWS uses Amazon S3 instead of tape as the final resting place for your datasets.
As such, by leveraging S3 for storage, you get highly durable storage designed to provide 11 nines of durability and, by default, 4 nines of availability over a given year. The levels of availability are represented as shown on your screen. Notice that 1 nine of availability represents 90% uptime over a given year, a maximum downtime of about 36.5 days per year, and an equivalent downtime per day of about 2.4 hours. 4 nines of availability represents a maximum downtime of about 52.6 minutes across a year, or an equivalent downtime per day of about 8.6 seconds, less than 10 seconds. The gold standard in terms of availability is 5 nines of availability, 99.999%. This equates to a maximum downtime per year of about 5.26 minutes and an equivalent downtime per day of less than a second, about 0.86 seconds.
All SAP on AWS backup and restore strategies rely on Amazon S3 as the storage service. Using S3 fulfills the offsite storage requirement automatically. There are two basic ways for you to use S3 for backups. The first is to store data directly. The second is to back up your data indirectly, using mechanisms built into some AWS services, such as EBS snapshots, the AWS Backup service, or third-party backup solutions that are able to read and write to Amazon S3 as the final resting place for your datasets.
AWS allows for High Availability deployments across multiple Availability Zones in the same Region and across different Regions. You can implement High Availability and Disaster Recovery using multiple Availability Zones and Regions whenever you need to. Amazon S3 is able to support both types of topologies.
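The availability figures quoted earlier for 1, 4, and 5 nines can be reproduced with a little arithmetic. This is a minimal sketch, assuming a 365-day year. The function name `allowed_downtime` is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365  # assumption: leap years are ignored


def allowed_downtime(nines: int) -> tuple[float, float]:
    """Return (seconds per year, seconds per day) of permitted downtime.

    `nines` is the number of nines in the availability figure,
    for example 4 for 99.99%.
    """
    unavailability = 10 ** -nines  # for example 0.0001 for 4 nines
    per_day = SECONDS_PER_DAY * unavailability
    per_year = per_day * DAYS_PER_YEAR
    return per_year, per_day


for nines in (1, 4, 5):
    per_year, per_day = allowed_downtime(nines)
    print(f"{nines} nines: {per_year / 60:.2f} minutes per year, "
          f"{per_day:.2f} seconds per day")
```

For 1 nine, the yearly figure works out to 52,560 minutes, which is the 36.5 days mentioned in the transcript. Each additional nine divides the permitted downtime by ten.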
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.