Deploying a Highly Available Solution for Pizza Time

Managing S3 data

Difficulty: Intermediate
Duration: 3h 11m
Students: 1072

Description

In this group of lectures we run a hands-on deployment of the next iteration of the Pizza Time solution. The Pizza Time business has been a success. It needs to support more customers and wants to expand to meet a global market.

We define our new solution, then walk through a hands-on deployment that extends our scalability, availability and fault tolerance.

Transcript

Hi and welcome to this lecture.

In this lecture we'll talk about managing S3 data, and we'll define a few concepts in these slides. We'll talk about versioning, lifecycle rules, and cross-region replication. Then we will have a demo where I will show you how to enable versioning in a bucket. We will configure a lifecycle rule, and we'll also enable cross-region replication between two buckets.

So versioning is a way of keeping multiple variants of an object in the same bucket. You can't enable versioning for just a particular object. When you enable versioning, it applies to all the objects in the bucket. Versioning enables you to recover objects from accidental deletion, and it also enables you to recover overwritten files.

So let's say that we enable versioning for a bucket and we delete a file. Instead of removing the object, S3 will actually insert a delete marker for the object. We will talk more about it in a moment.

Once you enable versioning in a bucket you can never return it to an unversioned state. You can, though, suspend versioning, which will make S3 stop creating new versions for your files. But enabling versioning changes the structure of the bucket, so you can never return it to an unversioned state.

For example, let's say that we have a bucket with a single file called Picture.jpg. Nothing new in here, so let's move forward. We enable versioning for this bucket and upload a few versions of this file, so now we have two pieces of information for each object: the object name, and a version ID that S3 creates for each version. Every time we upload a new version of this file, S3 creates a new ID for it, and what was the most recent version becomes a previous version of the file. So if we uploaded a new file, we would have a picture in here with a new, randomly generated ID.

And when we delete a file, we are actually not removing the file. S3 inserts a delete marker for the file instead. When we look for this file with the URL of the object, we will find no object, and we won't be able to access it. But if we look for a previous version using its version ID, we'll be able to restore that version, or read it, or do whatever we want with it. The version will still be there. We don't delete it. We only place a delete marker on the file.
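The behaviour described above can be sketched with a small in-memory model. This is only an illustration of version IDs and delete markers under simplifying assumptions, not how S3 is implemented; the class name `ToyVersionedBucket` and its methods are hypothetical.

```python
import uuid
from typing import Dict, List, Optional, Tuple

# Each stored entry is (version_id, body). A body of None is a delete marker.
Entry = Tuple[str, Optional[bytes]]


class ToyVersionedBucket:
    """Hypothetical in-memory model of a versioning-enabled bucket."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[Entry]] = {}

    def put(self, key: str, body: bytes) -> str:
        """Store a new version and return its freshly generated ID."""
        version_id = uuid.uuid4().hex
        self._entries.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key: str) -> str:
        """Add a delete marker instead of removing any stored version."""
        marker_id = uuid.uuid4().hex
        self._entries.setdefault(key, []).append((marker_id, None))
        return marker_id

    def get(self, key: str, version_id: Optional[str] = None) -> bytes:
        """Return the current version, or a specific one when an ID is given."""
        entries = self._entries.get(key, [])
        if version_id is None:
            # A plain lookup fails when the newest entry is a delete marker.
            if not entries or entries[-1][1] is None:
                raise KeyError(key)
            return entries[-1][1]
        for stored_id, body in entries:
            if stored_id == version_id and body is not None:
                return body
        raise KeyError((key, version_id))
```

After `delete("Picture.jpg")`, a plain `get("Picture.jpg")` raises `KeyError`, while `get("Picture.jpg", version_id)` with an earlier ID still returns that version's content.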

Talking now about lifecycle rules. You can use lifecycle rules to simplify the lifecycle management of your objects. You can apply lifecycle rules to the whole bucket or to specific objects or folders. Creating lifecycle rules is a great way to transition objects to another S3 storage class. You can, for example, have your objects in the Standard storage class, and after a while you want to shift these objects to the Reduced Redundancy storage class instead, just to save some money. You can automate these changes using lifecycle rules.

You can also transition objects to Glacier archives. This is great for compliance and also archiving. Maybe you want to have your files available for just a period of time, and later on you want to store these files in a cheaper storage class. You want to store these files in a Glacier archive because you don't think that you'll need them, but you need to keep a copy of them just in case. You can also use lifecycle rules to expire objects.

In a versioning-enabled bucket, what the expire action does is place a delete marker on the current version of the object. So you can say, for example, after 30 days place a delete marker on an object, and this object won't be available anymore in the bucket, but we still have the previous version of the object inside the bucket if you need to restore that version.

You can also delete the previous versions of an object. This also applies to versioning-enabled buckets. Unlike the expire action, this will actually delete a previous version of a file, which is very handy. You can also configure a lifecycle rule to clean up incomplete multipart uploads. We can't see incomplete multipart uploads in our buckets, but AWS still charges us for them. So it's great having an automated way to clean up these incomplete uploads, because sometimes we forget about them and we still have to pay the bill at the end of the month.
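The lifecycle actions discussed above (transition to Glacier, expiration, deletion of previous versions, and multipart clean-up) can be written down as a rule document. The sketch below builds a dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts; the helper names, the rule ID, and the day counts are illustrative assumptions, not values from the lecture.

```python
from typing import Any, Dict, Iterable, Optional


def build_lifecycle_rule(
    rule_id: str,
    prefix: str = "",
    glacier_after_days: Optional[int] = None,
    expire_after_days: Optional[int] = None,
    delete_previous_versions_after_days: Optional[int] = None,
    abort_multipart_after_days: Optional[int] = None,
) -> Dict[str, Any]:
    """Build one lifecycle rule; an empty prefix targets the whole bucket."""
    rule: Dict[str, Any] = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
    }
    if glacier_after_days is not None:
        # Move current versions to the Glacier storage class.
        rule["Transitions"] = [
            {"Days": glacier_after_days, "StorageClass": "GLACIER"}
        ]
    if expire_after_days is not None:
        # In a versioned bucket this places a delete marker.
        rule["Expiration"] = {"Days": expire_after_days}
    if delete_previous_versions_after_days is not None:
        # Permanently removes versions that are no longer current.
        rule["NoncurrentVersionExpiration"] = {
            "NoncurrentDays": delete_previous_versions_after_days
        }
    if abort_multipart_after_days is not None:
        rule["AbortIncompleteMultipartUpload"] = {
            "DaysAfterInitiation": abort_multipart_after_days
        }
    return rule


def build_lifecycle_configuration(rules: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap rules in the top-level document expected by the API."""
    return {"Rules": list(rules)}


# Illustrative values only.
config = build_lifecycle_configuration([
    build_lifecycle_rule(
        "archive-then-expire",
        glacier_after_days=30,
        expire_after_days=365,
        delete_previous_versions_after_days=90,
        abort_multipart_after_days=7,
    )
])
# With boto3 installed and credentials configured, this could be applied with:
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-nice-bucket-1", LifecycleConfiguration=config)
```

Keeping the document construction separate from the API call makes the rule easy to inspect or review before it is applied to a bucket.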

Now talking about cross-region replication. With cross-region replication you can asynchronously copy objects across buckets in different AWS regions. That's kind of self-explanatory. I believe that you already knew what cross-region replication means.

One thing that you need to note is that the source and destination buckets must both be versioning-enabled. You can't have cross-region replication without versioning. And you can replicate objects from a source bucket to only one destination bucket.
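As a rough sketch, a replication setup with one source and one destination can be expressed as the document that boto3's `put_bucket_replication` takes. The helper name, the rule ID, and the role ARN below are hypothetical, and the rule uses the older shape with a top-level `Prefix`; both buckets are assumed to have versioning enabled already.

```python
from typing import Any, Dict, Optional


def build_replication_configuration(
    role_arn: str,
    destination_bucket: str,
    prefix: str = "",
    storage_class: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a single-rule replication document for one destination bucket."""
    destination: Dict[str, Any] = {
        # The destination is referenced by its bucket ARN, not its name.
        "Bucket": f"arn:aws:s3:::{destination_bucket}",
    }
    if storage_class is not None:
        # Optional: store replicas in a different storage class.
        destination["StorageClass"] = storage_class
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "replicate-everything",  # hypothetical rule name
                "Status": "Enabled",
                "Prefix": prefix,  # empty prefix replicates the whole bucket
                "Destination": destination,
            }
        ],
    }


# Hypothetical account ID and role name, for illustration only.
replication = build_replication_configuration(
    role_arn="arn:aws:iam::123456789012:role/crr-role",
    destination_bucket="my-nice-bucket-2",
)
# With boto3 installed and credentials configured, this could be applied with:
# boto3.client("s3").put_bucket_replication(
#     Bucket="my-nice-bucket-1", ReplicationConfiguration=replication)
```

Leaving `storage_class` unset keeps replicas in the same storage class as the source objects.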

Let's now go to the S3 console and have our demo on managing S3 data.

Here at the S3 console, the first thing I want you to do is create a new bucket. We'll call it my-nice-bucket-1, and this bucket will live in the Oregon region. Selecting the bucket and going to Properties, we can click here to enable versioning. It's really simple. We just need to click here and confirm, and versioning is enabled for this bucket.
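The same console steps can be scripted. The sketch below assumes a boto3 S3 client is passed in (the functions only call `create_bucket` and `put_bucket_versioning`, which exist on that client); the function names are hypothetical, and `us-west-2` is the region code for Oregon.

```python
from typing import Any, Dict


def set_versioning(s3_client: Any, bucket: str, enabled: bool = True) -> None:
    """Enable versioning, or suspend it; it cannot be fully switched off."""
    status = "Enabled" if enabled else "Suspended"
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": status},
    )


def create_versioned_bucket(s3_client: Any, bucket: str, region: str) -> None:
    """Create a bucket in the given region and enable versioning on it."""
    kwargs: Dict[str, Any] = {"Bucket": bucket}
    if region != "us-east-1":
        # us-east-1 is the default and must not be passed as a constraint.
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3_client.create_bucket(**kwargs)
    set_versioning(s3_client, bucket, enabled=True)


# Usage, assuming boto3 is installed and credentials are configured:
# import boto3
# s3 = boto3.client("s3", region_name="us-west-2")
# create_versioned_bucket(s3, "my-nice-bucket-1", "us-west-2")
```

Passing the client in as a parameter keeps the functions easy to exercise with a stand-in object before pointing them at a real account.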

Remember that you can't disable versioning, you can only suspend it. Let's now talk about lifecycle rules. To create a new rule you just need to go into the Lifecycle tab and click Add rule. We can add a rule to the whole bucket, or to a particular folder or object. In this case, I will add it to the whole bucket.

And we have the options for this rule in here. In this case, I just want to expire the objects, and I'll say that after a year I'll expire the objects inside this bucket. Click on Review and we can specify a rule name, but I don't really need one. Click on Create and Activate, and now we have our lifecycle rule created. We can disable the rule if we click here, but I will leave it enabled for this demo.

Now talking about cross-region replication. We have already enabled versioning in this bucket and we could enable cross-region replication, but I don't have the bucket that I want to replicate to yet. So I will create another bucket and I'll call it my-nice-bucket-2. This bucket will live in the Singapore region. Click on Create.

Now that we have another bucket, we need to enable versioning on it, because remember we need to have versioning enabled in the source and also in the destination bucket. And now we can go back to the first bucket. Go to Cross-Region Replication, select Enable, and the source will be this bucket. We need to select a destination region, which will be Singapore, and in here we need to select the bucket. We could change the storage class during the replication to the other bucket, but I will use the same storage class as the source.

We need to have an IAM role in order to enable this replication. So I'll just click in here, and I already have a role configured, so we can see the details in here. I can select the cross-region replication role that I had already created and I'll click on Allow.
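The lecture does not show the contents of the IAM role used for replication, so the following is only a sketch of what such a role commonly contains: a trust policy that lets the S3 service assume the role, and a permissions policy covering the source and destination buckets. The function names are hypothetical, and the action list follows the pattern in AWS's replication documentation as an assumption about what this demo's role looks like.

```python
import json
from typing import Any, Dict


def replication_trust_policy() -> Dict[str, Any]:
    """Trust policy allowing the S3 service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def replication_permissions_policy(source: str, destination: str) -> Dict[str, Any]:
    """Permissions to read versions from the source and write replicas."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read the replication settings and list the source bucket.
                "Effect": "Allow",
                "Action": ["s3:GetReplicationConfiguration", "s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{source}"],
            },
            {
                # Read object versions and their ACLs from the source.
                "Effect": "Allow",
                "Action": [
                    "s3:GetObjectVersionForReplication",
                    "s3:GetObjectVersionAcl",
                ],
                "Resource": [f"arn:aws:s3:::{source}/*"],
            },
            {
                # Write replicas and replicated delete markers.
                "Effect": "Allow",
                "Action": ["s3:ReplicateObject", "s3:ReplicateDelete"],
                "Resource": [f"arn:aws:s3:::{destination}/*"],
            },
        ],
    }


if __name__ == "__main__":
    policy = replication_permissions_policy("my-nice-bucket-1", "my-nice-bucket-2")
    print(json.dumps(policy, indent=2))
```

The JSON produced by `json.dumps` is the form in which policy documents are usually handed to IAM.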

Now we have selected our IAM role. We can simply click on Save, and cross-region replication is enabled. We can now open up a terminal, create a new file, and send it to our first bucket. Then we will check the result in the second bucket to see if the replication is really working. So let's do this. I will create a single file. This file will hold some secret information. So now we have our secrets.text and we want to upload this file to our first bucket, so I will do aws s3 mv. I'll specify the file that I want to move and now I need to specify the bucket name, which will be my-nice-bucket-1.

Okay, we have moved our secrets file to the first bucket and we can check the result in here. Now let's go to the second bucket and see if our file is already there.

And the file is here, the replication is working.
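The move-and-check steps from the demo could be scripted roughly as follows. This is a sketch that assumes a boto3 S3 client is passed in (it uses the client's `upload_file` and `head_object` calls); the function names are hypothetical. Deleting the local file after the upload mirrors what `aws s3 mv` does.

```python
import os
from typing import Any, Optional


def move_to_bucket(
    s3_client: Any, local_path: str, bucket: str, key: Optional[str] = None
) -> str:
    """Upload a local file, then remove it locally; return the object key."""
    object_key = key or os.path.basename(local_path)
    s3_client.upload_file(local_path, bucket, object_key)
    os.remove(local_path)
    return object_key


def is_replica(s3_client: Any, bucket: str, key: str) -> bool:
    """Report whether the object in this bucket was created by replication.

    With a real client, head_object raises an error if the object has not
    arrived yet; that case is deliberately not handled in this sketch.
    """
    response = s3_client.head_object(Bucket=bucket, Key=key)
    return response.get("ReplicationStatus") == "REPLICA"


# Usage, assuming boto3 is installed and credentials are configured:
# import boto3
# key = move_to_bucket(boto3.client("s3"), "secrets.text", "my-nice-bucket-1")
# singapore = boto3.client("s3", region_name="ap-southeast-1")
# print(is_replica(singapore, "my-nice-bucket-2", key))
```

Because replication is asynchronous, a check like this may need to be retried for a short while after the upload.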

About the Author

Students: 14607
Labs: 11
Courses: 6

Eric Magalhães has a strong background as a Systems Engineer for both Windows and Linux systems and currently works as a DevOps Consultant for Embratel. Lazy by nature, he is passionate about automation and anything that can make his job painless, so his interest in topics like coding, configuration management, containers, CI/CD and cloud computing went from a hobby to an obsession. He holds multiple AWS certifications and, as a DevOps Consultant, helps clients understand and implement the DevOps culture in their environments. He also plays a key role in the company, developing pieces of automation using tools such as Ansible, Chef, Packer, Jenkins and Docker.