With an on-premises data backup solution within your data center, it's critical to build a disaster recovery plan into your business continuity plans, so that you have a plan in place should a disaster occur that affects the operation of your business. The same is true when you start to leverage the cloud to store your backed-up data.
This course explains how cloud storage fits in with DR and the considerations to bear in mind when designing a solution to back up your on-premises data to AWS. It explains how Amazon S3, AWS Snowball, and AWS Storage Gateway can all be used to help with the transfer and storage of your backup data.
You should not assume that backing data up to the cloud will solve your every need: there are many points to consider when planning a DR backup solution in a cloud such as AWS. However, the cloud also provides opportunities that may not be possible with a standard on-premises backup solution, and it's these opportunities that many enterprises focus on to gain a significant advantage when it comes to disaster recovery.
AWS has a number of different services available to help you architect the best solution for your needs. To set up the solution that works for you, you must first understand how each of these services can be of benefit to you.
To help you implement effective solutions, you must first have answers to the following:
- What is your RTO (Recovery Time Objective)?
- What is your RPO (Recovery Point Objective)?
- How quickly do you need to retrieve your data?
- How much data do you need to import/export?
- What durability is required for your data?
- How sensitive is your data?
- What security mechanisms are required to protect your data?
- Do you have any compliance controls that you need to abide by?
When you have answers to these questions, you will be able to start working towards a cost-efficient, highly reliable, durable, and secure data backup storage solution.
Learning Objectives
- Gain an understanding of how your storage solution can affect your business continuity and DR plans.
- Learn when to use specific AWS storage services to your advantage, choosing between Amazon S3, Amazon Glacier, AWS Snowball, and AWS Storage Gateway.
- Understand how each of these services can provide a DR solution to fit your specific needs.
Intended Audience
This course has been designed for:
- Engineers who need to manage and maintain AWS storage services
- Architects who are implementing effective data backup solutions from on-premises to AWS
- Business continuity managers
- Anyone looking to prepare for the AWS Solutions Architect - Professional certification
Prerequisites
As a prerequisite for this course, you should have a basic understanding of the following:
- Business continuity
- Disaster recovery
- Data backup terms and methodologies
- Amazon S3
- Amazon EC2
- Elastic Block Store (EBS)
This course includes
7 lectures
Feedback
If you have thoughts or suggestions for this course, please contact Cloud Academy at support@cloudacademy.com.
Resources referenced within this lecture
https://thecloudcalculator.com/calculators/file-transfer/
Transcript
Hello and welcome to this lecture, where I shall be covering a number of points of consideration when designing your data storage solutions for your infrastructure when looking at it from a DR perspective.
There is a fine line in how you architect your data storage: it must be fit for purpose for the data it holds, but it may also have to conform to specific governance and compliance regulations for DR.
So determining which solution or service to use to store the data, and which to use to ensure you can recover that data effectively in the event of a disaster, is a balancing act.
From a DR perspective, this is largely down to the particular RTO and RPO for the environment you are designing.
As a quick refresher, Recovery Time Objective, RTO, is defined as the maximum amount of time in which a service can remain unavailable before it's classed as damaging to the business.
Recovery Point Objective, RPO, is defined as the maximum amount of time for which data could be lost for a service.
These values can help you select the most appropriate storage method. For example, if your RTO was an hour, then restoring data from Amazon Glacier may not be effective, as a retrieval can take a number of hours to process, depending on your retrieval method.
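To make that concrete, here is a minimal sketch, using the boto3 SDK with a hypothetical bucket and key, of how a restore of an object held in the Glacier storage class might be initiated; the retrieval tier chosen here is what drives the retrieval time, and therefore your RTO.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and key used purely for illustration.
BUCKET = "example-dr-backups"
KEY = "backups/weekly/db-dump.tar.gz"

# Request a temporary restore of an object stored in the Glacier
# storage class. The retrieval tier affects how long the data takes
# to come back:
#   "Expedited" - typically minutes (smaller archives)
#   "Standard"  - typically a few hours
#   "Bulk"      - typically up to around 12 hours
s3.restore_object(
    Bucket=BUCKET,
    Key=KEY,
    RestoreRequest={
        "Days": 2,  # how long the restored copy remains available
        "GlacierJobParameters": {"Tier": "Standard"},
    },
)

# Check the restore status before attempting to download the object.
head = s3.head_object(Bucket=BUCKET, Key=KEY)
print(head.get("Restore", "No restore in progress"))
```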
Another huge element of this will depend on you as a business, how you are operating within AWS, and your connectivity to your AWS infrastructure. You may just be using AWS as a backup solution, and retaining all of your production data on-site at your own data center. Or you might be operating entirely within AWS, and utilizing AWS storage services for data backup of your AWS environment, such as snapshots of EBS volumes or of RDS instances.
As this course is focused on how best to utilize AWS storage services when backing up data from your on-premises data center, it's important to look at the following elements when selecting your storage service, starting with how you will get your data in and out of AWS.
The method you choose to move your data from on-premises into the cloud can vary depending on your own infrastructure and circumstances.
If you have a Direct Connect connection to AWS, then you can use this to move data in and out of the environment, which can support connectivity of up to 10 gigabits per second. If you don't have a direct connection link between your data center and AWS then you may have a hardware or software VPN connection which could also be used.
Now if you don't have either of these connectivity options, then you can use your own internet connection from the data center to connect and transfer the data to AWS. Depending on how much data you need to move or copy to AWS, these lines of connectivity may not have the bandwidth required to cope with the amount of data being transferred.
In this instance, there are physical disk appliances available through the AWS Snowball service, whereby AWS will send an appliance, either 50 terabytes or 80 terabytes in size, to your data center, where you can copy your data to it before it is shipped back to AWS for uploading onto S3. You can use multiple Snowballs at a time to transfer petabytes of data if required.
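As a rough sketch of how a Snowball import job can be requested programmatically, assuming the boto3 SDK and placeholder values for the bucket ARN, shipping address ID, and IAM role, the call might look something like this:

```python
import boto3

snowball = boto3.client("snowball")

# Placeholder identifiers - replace with your own bucket ARN, a
# shipping address created beforehand, and an IAM role that allows
# Snowball to write into the target bucket.
response = snowball.create_job(
    JobType="IMPORT",
    Resources={
        "S3Resources": [
            {"BucketArn": "arn:aws:s3:::example-dr-backups"}
        ]
    },
    Description="On-premises backup import",
    AddressId="ADID00000000-0000-0000-0000-000000000000",
    RoleARN="arn:aws:iam::123456789012:role/ExampleSnowballRole",
    SnowballCapacityPreference="T80",  # 80 TB appliance
    ShippingOption="SECOND_DAY",
)

# The returned job ID can then be used to track the appliance's progress.
job = snowball.describe_job(JobId=response["JobId"])
print(job["JobMetadata"]["JobState"])
```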
In extreme circumstances, AWS offers an even larger data transfer solution known as Snowmobile. This is an exabyte-scale data transfer service, where you can transfer up to 100 petabytes per Snowmobile, which is a 45-foot long shipping container pulled by a semi-trailer truck.
The AWS Storage Gateway service is another method, acting as a gateway between your data center and your AWS environment. A software appliance is configured on-site at your data center and offers a range of options for moving data into AWS.
More on both AWS Snowball and AWS Storage Gateway will be discussed in greater detail in upcoming topics within this course.
So how quickly do you need your data back? This closely relates to your RTO requirements. You'll need to define the boundaries of how quickly you need to get your data back, which will depend on its criticality to the business. This will vary greatly from solution to solution. Some storage services offer immediate access to your data, such as Amazon S3, while others may require several hours to retrieve, such as Amazon Glacier Standard Retrieval.
Your connectivity to AWS also plays an important part in this timeframe, as discussed earlier. You need to understand how much data you need to import and export. Determining how much data you need to get in and out of AWS is essential. This can greatly affect your chosen solution.
You should also calculate your target transfer rate: that is, how long it would take to perform a copy of your data over your connection to AWS.
To help you calculate this, you can use the file transfer calculator referenced with this lecture by specifying the amount of data you have and your connection speed.
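If you want a quick estimate without the calculator, the arithmetic is straightforward. The following sketch, which assumes a simple utilization factor to account for real-world overhead, estimates how long a transfer would take for a given amount of data and connection speed:

```python
def transfer_time_hours(data_gb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Estimate transfer time in hours for data_gb gigabytes over a link
    of link_mbps megabits per second, assuming only `utilization` of the
    theoretical bandwidth is actually achieved."""
    data_megabits = data_gb * 1000 * 8        # gigabytes -> megabits
    effective_mbps = link_mbps * utilization  # usable throughput
    return data_megabits / effective_mbps / 3600

# Example: 10 TB of backup data over a 100 Mbps internet connection
print(f"{transfer_time_hours(10_000, 100):.1f} hours")     # roughly 278 hours (~11.5 days)

# The same 10 TB over a 10 Gbps Direct Connect link
print(f"{transfer_time_hours(10_000, 10_000):.1f} hours")  # roughly 2.8 hours
```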
You need to ensure that your data backup solution offers the required capacity, that you have the means to transfer the required amount of data, and that you understand how long this process will take. This closely relates to the previous point of how you will get your data in and out of AWS.
Durability. When looking at the durability of a data backup, you'll need to ascertain the criticality of that data and ensure your chosen storage offers the most suitable resiliency and redundancy. For example, if I look at the Amazon S3 service, it has the following classes available.
The Standard Class, which provides 11 nines of durability and 4 nines of availability.
The Infrequent Access Class, known as IA, provides 11 nines of durability, but only 3 nines of availability, and this is often used as a backup store over the Standard Class.
And Amazon Glacier. This also provides 11 nines of durability, and is used as a cold storage solution. This offers the cheapest storage cost of the three, but does not allow immediate access to the files.
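As an illustration of how a class is selected in practice, here is a minimal sketch, assuming the boto3 SDK and placeholder bucket and key names, that uploads a backup object directly into a chosen storage class; a lifecycle rule could equally be used to transition older backups to cheaper classes automatically.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "example-dr-backups"  # placeholder bucket name

with open("db-dump.tar.gz", "rb") as body:
    # Standard-IA suits backups that are rarely read but must be
    # retrievable immediately; the Glacier class suits cold archives
    # where a retrieval delay is acceptable.
    s3.put_object(
        Bucket=BUCKET,
        Key="backups/weekly/db-dump.tar.gz",
        Body=body,
        StorageClass="STANDARD_IA",  # or "GLACIER" for cold storage
    )
```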
AWS also offers specific service-level agreements for its services, so it's also worth taking a look at these to understand the durability and availability for your data.
Security. A key focus for any data you store in the Cloud is security. Ensuring that your data has the right level of security safeguarding it from unauthorized access is fundamental, especially if it contains sensitive information such as customer data.
You may need to abide by specific governance and compliance controls and so you need to ensure that where you store your data in the Cloud is able to offer the correct functionality to ensure your data remains compliant.
When working with sensitive information, you must ensure that you have a means of encryption both in-transit and when at rest. You should understand how your selected storage method operates and manages data encryption if this level of security is required.
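As a simple illustration, assuming the boto3 SDK and placeholder bucket and KMS key identifiers, the sketch below requests server-side encryption at rest when writing a backup object; the SDK communicates with S3 over HTTPS by default, which covers encryption in transit.

```python
import boto3

s3 = boto3.client("s3")  # boto3 talks to S3 over HTTPS by default (in transit)

with open("customer-data.tar.gz", "rb") as body:
    s3.put_object(
        Bucket="example-dr-backups",                   # placeholder bucket
        Key="backups/sensitive/customer-data.tar.gz",
        Body=body,
        ServerSideEncryption="aws:kms",                # encrypt at rest with KMS
        SSEKMSKeyId="alias/example-backup-key",        # placeholder key alias
    )
```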
A sound understanding of Cloud storage access security is a must for your support engineers, who will be maintaining the environment.
If security is not configured and implemented correctly at this stage, it could have devastating and damaging effects on your business should the data be compromised and exposed in any way; this has already happened to many organizations that failed to understand the implications of their security controls.
And finally, compliance. As I just mentioned, compliance comes into play specifically when looking at the security of your data. There are a number of different certifications, attestations, regulations, laws, and frameworks with which you may need to remain compliant.
To check how AWS storage services stack up against this governance, AWS has released a service called AWS Artifact, which allows customers to view and access AWS Compliance Reports. These are freely available to issue to your own auditors, to help you meet your controls.
The service itself is accessed through the AWS Management Console, and all of the reports available are issued by external auditors of AWS. Each report contains a scope indicating which services and regions the report is associated with.
That now brings me to the end of this lecture. Coming up next I will discuss and explain how Amazon S3 can be effective as a data backup solution.
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.