The course is part of this learning path
In this course, we will discuss AWS S3 Lifecycle configurations, a tool to automate the movement and deletion of content through S3 storage classes. If you’re looking for information on how to create Lifecycle configurations, this is the course for you!
Learning Objectives
- What a Lifecycle configuration is
- Why Lifecycle configurations are beneficial
- How to create a Lifecycle configuration
- The limitations of Lifecycle configurations
Intended Audience
This course has been created for those looking to better manage their storage lifecycle and S3 storage costs or those already planning on implementing or using Lifecycle configurations that may need more information on how they work.
Prerequisites
To get the most from this course, you should have a strong understanding of S3, including knowledge of storage classes, multipart uploads, object versioning, object tags, and prefixes. Familiarity with XML and JSON will help you understand some of the practical parts of this course. For more information on the S3 service, check out existing content here: Deep Dive: Amazon S3.
Consider the following scenario: you have a data lake workload in a versioned S3 bucket that grows at a very fast and consistent pace. Tons of overwrites are occurring daily and large objects are being uploaded via multipart upload.
And because bad things often happen to good people - that means, that from the data management perspective, you may have incomplete multipart uploads every so often, and tons of out-of-date non-current versions of your data. I’m sure you can probably imagine how costly it is to run this data swamp - I mean, lake.
So it’s important that you consider every cost tool at your disposal. And one of these tools is to manage the storage lifecycle of your data, by moving your data to lower cost storage classes, or deleting data you no longer need. You can, of course, move and delete your data manually, but that can be difficult to manage at scale. So, there are two ways to automate this process:
The first way is to use the S3 Intelligent-Tiering storage class. You’ll pay a monthly object monitoring and automation charge, and in return S3 Intelligent-Tiering will monitor your object access patterns, and automatically move your objects between three tiers: frequent, infrequent and archival. S3 Intelligent-Tiering is recommended for data access patterns that are unknown or unpredictable, and is meant to give a more “hands-off” approach to managing your data lifecycle.
The second approach is by using Lifecycle configurations. You can use lifecycle configurations to transition data to a lower cost S3 storage class, or to delete data. Lifecycle configurations additionally provide options to clean up incomplete multipart uploads and manage noncurrent versions of your data - which ultimately, helps reduce storage spend.
Using lifecycle configurations is the most cost-effective strategy when your objects and workloads have a defined lifecycle and follow predictable patterns of usage.
For example, a defined access pattern may be that you use S3 for logging and only access your logs frequently for at most, a month. After that month, you may not need real-time access, but due to company data retaining policies, you cannot delete them for a year.
With this information, you can create a solid lifecycle configuration based on this access pattern. You could create an S3 Lifecycle configuration that transitions objects from the S3 standard storage class to the S3 Glacier Flexible Retrieval storage class after 30 days. By simply changing the storage class of your objects, you will begin to see significant cost savings in your overall storage spend. And after 365 days, you can then delete the objects and continue to save on costs.
You may find that a lot of your data follows a similar access pattern: you slowly stop needing real-time access to your data, and can eventually delete the data after a certain period of time passes. Or you may have data that you need to save to meet some compliance or governance regulation that can be moved from S3 Standard to archival storage and left alone for long periods of time. Or perhaps, you have a ton of objects in S3 Standard storage and you want to transition all of those objects into the S3- Intelligent Tiering storage class.
If these patterns sound similar to your use case, then using lifecycle configurations makes sense for your workload.
In summary, Lifecycle configurations are an important cost tool that can enable you to delete or transition old unused versions of your objects, clean up incomplete multipart uploads, transition objects to lower cost storage tiers and delete objects that are no longer needed.
Alana Layton is an experienced technical trainer, technical content developer, and cloud engineer living out of Seattle, Washington. Her career has included teaching about AWS all over the world, creating AWS content that is fun, and working in consulting. She currently holds six AWS certifications. Outside of Cloud Academy, you can find her testing her knowledge in bar trivia, reading, or training for a marathon.