Working with Amazon S3 lifecycle configurations
Using Instance Scheduler to Optimize Resource Cost
Utilizing Managed Services and Serverless Architectures to Minimize Cost
Networking Costs When Building a Hybrid Cloud
The course is part of this learning path
This course covers the core learning objective to meet the requirements of the 'Implementing effective Cost Management solutions in AWS - Level 2' skill
- Apply automated processes to AWS storage tiers tro minise data costs over a given time period
- Evaluate how to control your AWS resource cost by configuing start and stop scheduled with the Instance Scheduler for EC2 instances and RDS databases
- Analyze the most cost-effective connectivity options between AWS and on-premisis environments
- Analyze the various options to minimize total cost of owenership through both AWS managed servies and serverless architectures
Consider the following scenario: you have a data lake workload in a versioned S3 bucket that grows at a very fast and consistent pace. Tons of overwrites are occurring daily and large objects are being uploaded via multipart upload.
And because bad things often happen to good people - that means, that from the data management perspective, you may have incomplete multipart uploads every so often, and tons of out-of-date non-current versions of your data. I’m sure you can probably imagine how costly it is to run this data swamp - I mean, lake.
So it’s important that you consider every cost tool at your disposal. And one of these tools is to manage the storage lifecycle of your data, by moving your data to lower cost storage classes, or deleting data you no longer need. You can, of course, move and delete your data manually, but that can be difficult to manage at scale. So, there are two ways to automate this process:
The first way is to use the S3 Intelligent-Tiering storage class. You’ll pay a monthly object monitoring and automation charge, and in return S3 Intelligent-Tiering will monitor your object access patterns, and automatically move your objects between three tiers: frequent, infrequent and archival. S3 Intelligent-Tiering is recommended for data access patterns that are unknown or unpredictable, and is meant to give a more “hands-off” approach to managing your data lifecycle.
The second approach is by using Lifecycle configurations. You can use lifecycle configurations to transition data to a lower cost S3 storage class, or to delete data. Lifecycle configurations additionally provide options to clean up incomplete multipart uploads and manage noncurrent versions of your data - which ultimately, helps reduce storage spend.
Using lifecycle configurations is the most cost-effective strategy when your objects and workloads have a defined lifecycle and follow predictable patterns of usage.
For example, a defined access pattern may be that you use S3 for logging and only access your logs frequently for at most, a month. After that month, you may not need real-time access, but due to company data retaining policies, you cannot delete them for a year.
With this information, you can create a solid lifecycle configuration based on this access pattern. You could create an S3 Lifecycle configuration that transitions objects from the S3 standard storage class to the S3 Glacier Flexible Retrieval storage class after 30 days. By simply changing the storage class of your objects, you will begin to see significant cost savings in your overall storage spend. And after 365 days, you can then delete the objects and continue to save on costs.
You may find that a lot of your data follows a similar access pattern: you slowly stop needing real-time access to your data, and can eventually delete the data after a certain period of time passes. Or you may have data that you need to save to meet some compliance or governance regulation that can be moved from S3 Standard to archival storage and left alone for long periods of time. Or perhaps, you have a ton of objects in S3 Standard storage and you want to transition all of those objects into the S3- Intelligent Tiering storage class.
If these patterns sound similar to your use case, then using lifecycle configurations makes sense for your workload.
In summary, Lifecycle configurations are an important cost tool that can enable you to delete or transition old unused versions of your objects, clean up incomplete multipart uploads, transition objects to lower cost storage tiers and delete objects that are no longer needed.
Alana Layton is an experienced technical trainer, technical content developer, and cloud engineer living out of Seattle, Washington. Her career has included teaching about AWS all over the world, creating AWS content that is fun, and working in consulting. She currently holds six AWS certifications. Outside of Cloud Academy, you can find her testing her knowledge in bar trivia, reading, or training for a marathon.