Amazon S3 Replication
Difficulty
Intermediate
Duration
22m
Description

This course explores two different Amazon S3 features: the replication of data between buckets and bucket key encryption when working with SSE-KMS to protect your data. You will learn how Amazon S3 replication works, when to use it, and some of the configurable options. We'll also look at how S3 Bucket Keys can be used to reduce costs when using SSE-KMS.

If you have any feedback relating to this course, please contact us at support@cloudacademy.com.

Learning Objectives

The objectives of this course are to explain:

  • How Amazon S3 replication works, when you might use it, and some of the configurable options
  • How S3 Bucket Keys can be used to reduce costs when using SSE-KMS

Intended Audience

This course has been designed for those who support, operate, and architect solutions involving Amazon S3.

Prerequisites

As a prerequisite to this course, it would be advantageous to have a working knowledge of Amazon S3, including some basic understanding of S3 security and management features. 

Transcript

Hello and welcome to this lecture covering S3 replication. During this lecture, I want to provide an overview of how Amazon S3 replication works and when you might use it and some of the configurable options.

So what is S3 replication? As you have probably already guessed, S3 replication allows you to copy your objects asynchronously from one bucket to one or more destination buckets, in either the same or different regions. You can even replicate your objects to buckets owned by a totally different AWS account, so it is a very flexible way of duplicating your data where you need it across your AWS infrastructure.

If you are replicating your data within a single region, then it is known as same-region replication. And you might want to do this for a number of reasons. For example, you might have to meet stringent compliance standards and regulations, which means you have to keep multiple copies of the data in separate AWS accounts. By using same-region replication, you can easily replicate objects between Account A and Account B within the same region.

Another reason might be data aggregation. If you're using multiple different AWS accounts, all of which might be using applications to store log data, then you could use same-region replication to replicate that data to a single bucket managed by a single account, allowing for easier management and processing of those logs.

However, as I already mentioned, you might want to replicate between different regions, which is known as cross-region replication. So when might you want to use this option?

Again, compliance regulations can be a big reason for using cross-region replication. You might be required to store your data across a wider geographical area than just a single region. Replicating your objects to a completely different region allows you to meet this obligation.

Also, you might have reason to use cross-region replication to benefit your business and your customers. By replicating data closer to business units and customers in other regions, you can reduce latency for data access.

Now, you also have another feature which can be used with both same-region replication and cross-region replication, this being multi-destination. What multi-destination allows you to do is replicate objects from a single source bucket to multiple destination buckets, either in the same region or in different regions, giving you the ability to store multiple copies of the same data.
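
As a rough sketch of what multi-destination can look like in practice, the snippet below builds one replication rule per destination bucket, using the dictionary shape accepted by the S3 PutBucketReplication API. The helper name build_multi_destination_config, the role ARN, the account ID, and the bucket names are all assumptions made for illustration.

```python
import json


def build_multi_destination_config(role_arn, destination_buckets):
    """Build a replication configuration with one rule per destination bucket.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    rules = []
    for priority, bucket in enumerate(destination_buckets, start=1):
        rules.append({
            "ID": f"replicate-to-{bucket}",
            "Priority": priority,  # give each rule a distinct priority
            "Status": "Enabled",
            "Filter": {},  # empty filter: the rule applies to all objects
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": f"arn:aws:s3:::{bucket}"},
        })
    return {"Role": role_arn, "Rules": rules}


config = build_multi_destination_config(
    "arn:aws:iam::111122223333:role/example-replication-role",  # assumed role
    ["example-destination-a", "example-destination-b"],  # assumed buckets
)
print(json.dumps(config, indent=2))
```

With boto3, a dictionary like this could be passed as the ReplicationConfiguration argument of put_bucket_replication on the source bucket.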

Configuring S3 replication is a simple process that is managed through rules. These rules require you to enter specific information so that S3 knows what is to be replicated and where it should be replicated to.

Before applying these rules, you should be aware that both the source and destination buckets must have versioning enabled. S3 versioning is a bucket feature that allows for multiple versions of the same object to exist. And this is useful to allow you to retrieve previous versions of a file, or recover a file should it be deleted, either accidentally or maliciously.

Additionally, if the source bucket has the S3 Object Lock feature enabled, then the destination must have it enabled too. Now, Object Lock is often used to meet a level of compliance known as WORM, meaning Write Once Read Many. It allows you to offer a level of protection for the objects in your bucket and prevents them from being deleted, either for a set period of time that is defined by you, or indefinitely.

Also I just want to highlight that S3 needs to have permission to perform the replication, and these permissions are granted via a role which is defined or selected during the creation of your replication rules.

The following is an example role allowing S3 to perform the replication on your behalf using the bucket awssecuritycert as the source bucket and ca-bucket-uk as the destination bucket.
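
The policy shown on screen is not reproduced in this transcript. As a minimal sketch, assuming the commonly used set of replication permissions, a permissions policy for these two buckets might look something like the following. The exact policy used in the lecture may differ.

```python
import json

SOURCE_BUCKET = "awssecuritycert"
DESTINATION_BUCKET = "ca-bucket-uk"

# Assumed permission set for the replication role; adjust to your own needs.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Read the replication configuration and list the source bucket
            "Effect": "Allow",
            "Action": ["s3:GetReplicationConfiguration", "s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{SOURCE_BUCKET}",
        },
        {
            # Read object versions, ACLs, and tags from the source bucket
            "Effect": "Allow",
            "Action": [
                "s3:GetObjectVersionForReplication",
                "s3:GetObjectVersionAcl",
                "s3:GetObjectVersionTagging",
            ],
            "Resource": f"arn:aws:s3:::{SOURCE_BUCKET}/*",
        },
        {
            # Write replicas, delete markers, and tags to the destination
            "Effect": "Allow",
            "Action": [
                "s3:ReplicateObject",
                "s3:ReplicateDelete",
                "s3:ReplicateTags",
            ],
            "Resource": f"arn:aws:s3:::{DESTINATION_BUCKET}/*",
        },
    ],
}

print(json.dumps(permissions_policy, indent=2))
```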

Also, the following trust policy must be associated with the role, allowing the S3 principal to assume the role to carry out the actions. For more information relating to IAM, policies and roles, please see our existing course here.
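
The trust policy from the slide is also not included in this transcript. A minimal sketch of a trust policy that lets the S3 service principal assume the role could look like this:

```python
import json

# Trust policy allowing the S3 service to assume the replication role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```

A document like this would typically be supplied as the AssumeRolePolicyDocument when the role is created.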

There are some additional points relating to cross-region replication when using different AWS accounts. So for example, if Account A with a bucket in eu-west-1 needs to replicate its objects to a bucket owned by Account B in us-east-1, then the destination bucket owner must ensure that permissions are in place, via a bucket policy, to grant the source owner access to replicate objects to the bucket.
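
To make the cross-account case a little more concrete, here is a hedged sketch of the kind of bucket policy Account B might attach to its destination bucket. The account ID, role name, and bucket name are hypothetical, and the action list is an assumption based on a typical cross-account replication setup.

```python
import json

# Hypothetical values: replace with the real role in Account A and bucket in Account B
SOURCE_ROLE_ARN = "arn:aws:iam::111122223333:role/example-replication-role"
DESTINATION_BUCKET = "example-account-b-bucket"

destination_bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Let the source account's replication role write replicas
            "Sid": "AllowReplicationOfObjects",
            "Effect": "Allow",
            "Principal": {"AWS": SOURCE_ROLE_ARN},
            "Action": ["s3:ReplicateObject", "s3:ReplicateDelete"],
            "Resource": f"arn:aws:s3:::{DESTINATION_BUCKET}/*",
        },
        {
            # Let the role inspect the bucket and its versioning state
            "Sid": "AllowBucketLevelAccess",
            "Effect": "Allow",
            "Principal": {"AWS": SOURCE_ROLE_ARN},
            "Action": [
                "s3:List*",
                "s3:GetBucketVersioning",
                "s3:PutBucketVersioning",
            ],
            "Resource": f"arn:aws:s3:::{DESTINATION_BUCKET}",
        },
    ],
}

print(json.dumps(destination_bucket_policy, indent=2))
```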

Now, one final point to make is that destination buckets cannot be configured as Requester Pays buckets. For those unfamiliar with this feature, when it is configured, any costs associated with requests and data transfer become the responsibility of the requester instead of the bucket owner. The bucket owner will still, however, pay for the storage costs associated with the objects stored in the bucket.
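
The prerequisites covered so far (versioning, Object Lock, and Requester Pays) can be sketched as a simple pre-flight check. This is only an illustration: the function name and the dictionary keys are hypothetical, and in a real script the values would come from the corresponding S3 API calls.

```python
def replication_preflight(source, destination):
    """Return a list of problems that would block replication.

    Both arguments are plain dicts with hypothetical keys:
    "versioning" ("Enabled", "Suspended" or None), "object_lock" (bool),
    and "payer" ("BucketOwner" or "Requester").
    """
    problems = []
    # Versioning must be enabled on both buckets
    for label, bucket in (("source", source), ("destination", destination)):
        if bucket.get("versioning") != "Enabled":
            problems.append(f"{label} bucket must have versioning enabled")
    # Object Lock on the source requires Object Lock on the destination
    if source.get("object_lock") and not destination.get("object_lock"):
        problems.append("destination bucket must have Object Lock enabled")
    # The destination cannot be a Requester Pays bucket
    if destination.get("payer") == "Requester":
        problems.append("destination bucket cannot be a Requester Pays bucket")
    return problems


source = {"versioning": "Enabled", "object_lock": True, "payer": "BucketOwner"}
destination = {"versioning": "Suspended", "object_lock": False, "payer": "Requester"}

for problem in replication_preflight(source, destination):
    print(problem)
```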

Let's now take a quick look at how to configure a bucket for replication from within the AWS Management Console.

Okay, so I'm in the dashboard of Amazon S3. And the first thing I want to do for this demonstration is to create two buckets. One will be a source bucket and another will be a destination bucket. So this is going to be my source bucket. So I'll just call this "mysourcereplication." And I'll have it in the London region.

I'm going to block all public access. I need to enable versioning 'cause you need to have versioning enabled if we're going to use replication. I'm just gonna accept the default on the Tags. I'm not going to enable encryption. Click on Advanced Settings and I'll just leave Object Lock disabled as well.

So let me just create this bucket. Okay. So we have my source replication bucket here. So let me now create my destination replication bucket. So create bucket again, mydestinationreplication. Let's go for a different region this time. Let's go for Paris. Again, I wanna block all public access.

Again, I need to enable versioning 'cause versioning needs to be enabled on both the source and destination when working with replication. And I'll just accept all the other default settings and then Create Bucket. Okay. So I now have both my source replication bucket and also my destination replication bucket.
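
For anyone who prefers scripting over the console, the bucket setup above could be sketched roughly as follows. The helper name is hypothetical. The snippet only builds the request parameters, keyed by the boto3 S3 client method they would be passed to, so it makes no AWS calls. London and Paris correspond to the eu-west-2 and eu-west-3 regions.

```python
def bucket_setup_requests(bucket_name, region):
    """Return request parameters for a private, versioned bucket.

    Hypothetical helper: each key names the boto3 S3 client method that
    the parameters would be passed to, e.g. s3.create_bucket(**params).
    """
    return {
        "create_bucket": {
            "Bucket": bucket_name,
            "CreateBucketConfiguration": {"LocationConstraint": region},
        },
        "put_public_access_block": {
            "Bucket": bucket_name,
            "PublicAccessBlockConfiguration": {
                "BlockPublicAcls": True,
                "IgnorePublicAcls": True,
                "BlockPublicPolicy": True,
                "RestrictPublicBuckets": True,
            },
        },
        # Versioning is required on both source and destination
        "put_bucket_versioning": {
            "Bucket": bucket_name,
            "VersioningConfiguration": {"Status": "Enabled"},
        },
    }


source_setup = bucket_setup_requests("mysourcereplication", "eu-west-2")  # London
destination_setup = bucket_setup_requests("mydestinationreplication", "eu-west-3")  # Paris

for method, params in source_setup.items():
    print(method, params)
```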

So on my source replication bucket, I need to create the replication rules. And this is under the Management section. So if we scroll down here, we can see replication rules. And at the moment I don't have any rules created. So let me create a replication rule.

So here, I'm just going to call this "MyRule." And we can have this rule enabled or disabled. I'm going to have it enabled for this bucket. And here we have the information about the source bucket and the region that it's in. And I can limit the scope of this rule using one of the filters.

For example, we can add in a prefix here to only replicate any objects with a specified prefix, or I can use this rule and apply it to all objects in the bucket, which is what I'm going to do. Now, for the destination, we can choose a bucket in this account or specify a bucket in another account.

So if I was doing cross-account replication, then we can specify the account ID and bucket name. But as you just saw, I created my destination bucket in this account. So I can just enter the bucket name, which was mydestinationreplication. We can see that it's picked up the destination region of Paris as well, which is where we created it.

Now, here, we need to specify an IAM role, and this will allow S3 to have access to both buckets and also perform the actions required to replicate the data. You can either create your own role and assign the permissions, or you can just simply select Create new role. So if you select this option, S3 will simply create a new role to perform this task.

If you want to replicate objects that are encrypted, then you can do so. And also you can change the storage class as well in the destination. I'm just going to leave that as default. Now, there's a number of additional replication options here that you can just enable via a checkbox.

For example, if you want to monitor the progress of your replication rules, you can select this tick box, or if you wanted to replicate any metadata changes between the source and destination buckets, then you can enable replica modification sync.

For this demonstration, I'm going to leave those as default and then click Save. So now we can see that we have a replication rule here. This is what we called it, and this is the destination bucket. And here are some of the options that we configured within the rule. And up here, you can see the name of the role that's been created by S3 to perform that replication.
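
As a rough equivalent of the rule created in the console, the replication configuration for this demonstration might look like the sketch below. The role ARN and account ID are hypothetical, since S3 generated the role in the demo, and the delete marker setting is an assumed default.

```python
import json

replication_config = {
    # Hypothetical ARN standing in for the role S3 created in the demo
    "Role": "arn:aws:iam::111122223333:role/example-s3-replication-role",
    "Rules": [
        {
            "ID": "MyRule",
            "Priority": 1,
            "Status": "Enabled",
            # Empty filter applies the rule to all objects in the bucket;
            # a filter such as {"Prefix": "logs/"} would limit its scope
            "Filter": {},
            "DeleteMarkerReplication": {"Status": "Disabled"},  # assumed
            "Destination": {"Bucket": "arn:aws:s3:::mydestinationreplication"},
        }
    ],
}

print(json.dumps(replication_config, indent=2))
```

If the printed JSON were saved as replication.json, it could be applied to the source bucket with the AWS CLI using aws s3api put-bucket-replication --bucket mysourcereplication --replication-configuration file://replication.json.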

So let's now go and add some objects to these buckets, to see if the replication works. So if I select mysourcereplication, click on Upload, Add files, and I can just pick a couple of screenshots here. Just some PNG files. We can see that it's gone into this bucket. We can add any additional options if you want. I'm just going to take all the defaults and then click Upload.

So the three files have now been added to this bucket. So because we have the replication rule added to this source bucket, if we now go across to our destination bucket, we should see the objects appear there very soon. So let's take a look. Let's go to our buckets. And then go to the mydestinationreplication bucket. We can see that those three PNG files have already been replicated. So setting up replication rules is very simple. It's very quick, and it's very easy.

Once a bucket has been configured for S3 replication, only the objects added to the bucket from that point will be replicated. So if you're planning on using replication, consider implementing this feature at the time of bucket creation to ensure that it captures all objects going forward.

From a security standpoint, you might have objects that are encrypted. If those objects are encrypted with SSE-S3 or SSE-KMS, then these can be replicated as required as long as you specify that requirement within your replication rules. If using cross-region replication with KMS, then you will need to specify the ARN of the destination Customer Master Key, the CMK, to be used and these are region specific. For more information on both S3 Encryption mechanisms and KMS, please refer to the following courses.

Understanding S3 encryption mechanisms to secure your data: https://cloudacademy.com/course/s3-encryption-mechanisms/

How to use KMS Key Encryption to protect your data: https://cloudacademy.com/course/amazon-web-services-key-management-service-kms/
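
To illustrate the point about KMS keys being region specific, here is a small sketch that adds SSE-KMS settings to a replication rule and checks that the replica key's ARN belongs to the destination region. The helper name, key ARN, and account ID are hypothetical.

```python
def add_kms_replication(rule, replica_kms_key_arn, destination_region):
    """Return a copy of the rule that also replicates SSE-KMS encrypted objects.

    Hypothetical helper: raises ValueError if the key is not in the
    destination region, because KMS keys are region specific.
    """
    # ARN format: arn:partition:service:region:account-id:resource
    key_region = replica_kms_key_arn.split(":")[3]
    if key_region != destination_region:
        raise ValueError(
            f"KMS key is in {key_region}, expected {destination_region}"
        )

    updated = dict(rule)
    # Opt in to replicating objects encrypted with SSE-KMS
    criteria = dict(rule.get("SourceSelectionCriteria", {}))
    criteria["SseKmsEncryptedObjects"] = {"Status": "Enabled"}
    updated["SourceSelectionCriteria"] = criteria
    # Tell S3 which key to use for the replicas in the destination region
    destination = dict(rule["Destination"])
    destination["EncryptionConfiguration"] = {"ReplicaKmsKeyID": replica_kms_key_arn}
    updated["Destination"] = destination
    return updated


base_rule = {
    "ID": "MyRule",
    "Priority": 1,
    "Status": "Enabled",
    "Filter": {},
    "DeleteMarkerReplication": {"Status": "Disabled"},
    "Destination": {"Bucket": "arn:aws:s3:::mydestinationreplication"},
}

kms_rule = add_kms_replication(
    base_rule,
    "arn:aws:kms:eu-west-3:111122223333:key/EXAMPLE-KEY-ID",  # assumed key ARN
    "eu-west-3",  # Paris, the destination region from the demonstration
)
print(kms_rule["Destination"]["EncryptionConfiguration"])
```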

If you have a requirement to replicate your objects' metadata, then you can enable replica modification sync, which will allow replication of your metadata bi-directionally. And this helps to keep in sync all changes to the objects' tagging information, Object Lock retention information, and security controls such as Access Control Lists, giving you a very easy way to create a shared data set between buckets. As we saw in the demonstration, this feature can be configured and enabled from within the replication rule.
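
As a hedged sketch of how a two-way setup might be expressed, the snippet below builds a replication configuration for each bucket, pointing at the other, with replica modification sync switched on in both rules. The helper names, role ARN, and account ID are hypothetical, and using a single role for both directions is an assumption; that role would need permissions on both buckets.

```python
def sync_rule(rule_id, destination_bucket):
    """Build a rule with replica modification sync enabled (hypothetical helper)."""
    return {
        "ID": rule_id,
        "Priority": 1,
        "Status": "Enabled",
        "Filter": {},
        "DeleteMarkerReplication": {"Status": "Disabled"},
        # Replicate metadata changes made to replicas back to the other bucket
        "SourceSelectionCriteria": {"ReplicaModifications": {"Status": "Enabled"}},
        "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
    }


def two_way_configs(role_arn, bucket_a, bucket_b):
    """Return one replication configuration per bucket, each targeting the other."""
    return {
        bucket_a: {
            "Role": role_arn,
            "Rules": [sync_rule(f"{bucket_a}-to-{bucket_b}", bucket_b)],
        },
        bucket_b: {
            "Role": role_arn,
            "Rules": [sync_rule(f"{bucket_b}-to-{bucket_a}", bucket_a)],
        },
    }


configs = two_way_configs(
    "arn:aws:iam::111122223333:role/example-replication-role",  # assumed role
    "mysourcereplication",
    "mydestinationreplication",
)
for bucket, config in configs.items():
    print(bucket, "->", config["Rules"][0]["Destination"]["Bucket"])
```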

Much like replication rules in general, one of the great things about replica modification sync is that data relating to this feature is captured within Amazon CloudWatch using specific metrics when the following replication options are selected within the rules configuration.

About the Author

Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.

To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.

Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.

He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.

In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.

Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.