Configuring Azure Blob Storage Lifecycle Management
In this course, I explain how you can save money by setting up lifecycle management policies that automatically move your blobs to less expensive access tiers when certain conditions are met.
Learning Objectives
- Create and apply an Azure Blob Storage lifecycle management policy using the Azure Portal
- Apply a lifecycle management policy using the command line
Intended Audience
- Azure administrators, developers, and data engineers
Prerequisites
- Basic knowledge of Azure Blob Storage
Welcome to “Configuring Azure Blob Storage Lifecycle Management”. I’m Guy Hummel. To get the most from this course, you should already have some knowledge of Azure Blob Storage. In this short course, I’ll explain how you can save money by setting up lifecycle management policies to automatically move your blobs to less expensive access tiers when certain conditions are met.
First, let’s review Azure Blob Storage access tiers. The Hot tier is designed for blobs that will be accessed frequently. This is the default tier. It has the highest storage costs but the lowest access costs.
The Cool tier has lower storage costs but higher access costs. That’s why you should only put infrequently accessed blobs in the Cool tier. If you were to put frequently accessed blobs in this tier, then the access costs would very quickly outweigh the lower storage costs. Also, you have to leave data in the Cool tier for at least 30 days. If you delete it or move it to a different tier in less than 30 days, you’ll have to pay an early removal penalty.
The Archive tier has the lowest storage costs and the highest access costs. It’s intended for data that you rarely need to access, such as long-term backups. To avoid an early deletion penalty, you need to leave data in the Archive tier for at least 180 days. The Archive tier is significantly different from the Hot and Cool tiers because you can’t access your data right away when you need it. That’s because it uses offline storage, so you have to wait for your data to be rehydrated before you can access it. This process can take up to 15 hours.
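To make the minimum retention periods concrete, here is a minimal Python sketch that checks how many days short of the minimum a blob would be if you moved or deleted it now. The function and dictionary names are my own for illustration; the 30-day and 180-day figures are the ones quoted above, and the Hot tier is treated as having no minimum.

```python
# Minimum retention periods quoted in the transcript, in days.
# The Hot tier has no minimum, so it maps to 0.
MIN_RETENTION_DAYS = {"Hot": 0, "Cool": 30, "Archive": 180}


def early_deletion_days(tier: str, days_in_tier: int) -> int:
    """Return how many days short of the minimum retention a blob is.

    A result of 0 means the blob can be moved or deleted without an
    early deletion penalty.
    """
    return max(0, MIN_RETENTION_DAYS[tier] - days_in_tier)


print(early_deletion_days("Cool", 10))      # 20
print(early_deletion_days("Archive", 200))  # 0
```

A result above zero is a hint that the move or delete would trigger the penalty for the remaining days.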
Since there’s a big difference in cost between the different tiers, it’s a good idea to set up an automated system to move your data between tiers when the time is right. For example, you might have monthly reports that get accessed frequently in the first month after they’re created, are accessed much less frequently for the next 11 months, and are rarely accessed after that but need to be retained for a total of 10 years for compliance reasons.
In this case, you might want to create a policy that:
- Moves these reports from the Hot tier to the Cool tier after 30 days,
- Moves them from the Cool tier to the Archive tier when they’re 12 months old,
- And deletes them from the Archive tier after they’re 10 years old.
Notice that I said 12 months instead of 11 months for the second one because the blobs would have already been in the Hot tier for 1 month before being moved to the Cool tier, so they would be 12 months old. I’ll show you two different ways to set up these rules: using the Azure portal and using the command line.
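The three rules above can be sketched as a small Python function that says where a report should be at a given age. This is only an illustration of the logic, not how Azure evaluates policies; the names are hypothetical, and I'm assuming 365 days for 12 months and 3,650 days for 10 years.

```python
from typing import Optional

# Thresholds from the example policy, in days since creation.
COOL_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 365
DELETE_AFTER_DAYS = 3650


def expected_state(age_days: int) -> Optional[str]:
    """Return the tier a report should be in, or None once it is deleted."""
    if age_days > DELETE_AFTER_DAYS:
        return None
    if age_days > ARCHIVE_AFTER_DAYS:
        return "Archive"
    if age_days > COOL_AFTER_DAYS:
        return "Cool"
    return "Hot"


for age in (10, 31, 400, 4000):
    print(age, expected_state(age))
```

The checks run from the oldest threshold down, so each age matches only the last action that would have applied to it.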
First, let’s use the portal. I’ve already created a storage account and a blob container. In the menu on the left, select “Lifecycle management”. This is where we create the rules we want. Notice that it says lifecycle management is only available for general-purpose v2 accounts and blob storage accounts. Since most storage accounts are general-purpose v2 accounts these days, you probably won’t need to worry about having the right account type.
Now click “Add a rule”. Let’s call it “LifecycleForReports”. We’ll leave the rule scope as “Apply rule to all blobs in your storage account”. We’ll leave the blob type as just “Block blobs”. Append blobs are used for files that frequently have new data appended to them, which likely wouldn’t be the case for our monthly reports.
You might not be familiar with the blob subtype. If you turn on blob versioning, then a copy of the blob will be saved every time the blob is modified. That way, you could retrieve a previous version of the blob if you needed to. Snapshots are similar except that you create them manually. If you don’t use either versioning or snapshots, then you can just leave “Base blobs” checked.
Okay, now we can click “Next” and define the rule. A rule needs to contain one or more conditions. The first condition we want is to move blobs to Cool storage after 30 days, so let’s change this to “If base blobs were created more than 30 days ago, then move to cool storage”. Then we click “Add conditions” again to add the next one, which is to move blobs from Cool storage to Archive storage when they’re 12 months old. So, we’ll put 365 days since they were created.
Finally, we’ll click “Add conditions” again, and this time, we’ll say that if a blob was created more than 3,650 days ago (which is 10 years if you don’t count leap years), then delete it.
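If the leap-year caveat matters for your retention requirement, the difference is easy to check. This short Python sketch uses a hypothetical creation date of January 1, 2024 and compares ten calendar years with the 3,650 days used in the rule.

```python
from datetime import date

created = date(2024, 1, 1)          # hypothetical creation date
ten_years_later = date(2034, 1, 1)

calendar_days = (ten_years_later - created).days
print(calendar_days)          # 3653
print(calendar_days - 3650)   # 3
```

For this date range, a blob deleted after 3,650 days would be removed three days before its tenth anniversary, so you might pad the number if the full ten years is a hard requirement.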
Now we click “Add” and we’re done. However, this lifecycle policy won’t go into effect immediately. Azure only runs policies once a day, so it can take up to 24 hours before any actions triggered by a new policy will take place.
This example was pretty simple, but there are some other features we can use to customize it. First, we could create a rule that only applies to certain blobs. To do this, we’d need to create a filter. Technically, when we set the blob type to only block blobs, that was a filter, but we can also create a filter that looks at the name of a blob.
For example, suppose we want a lifecycle rule to only apply to blobs that start with the word “report”. In that case, we’d select “Limit blobs with filters”. Now there’s a tab called “Filter set”. There are two types of filters we can apply: blob prefix and blob index match. The second one is more complicated and requires tagging your blobs before it’ll work. Blob prefix is the one we need for our example. We just need to type “report” in this field. And we have to click the Update button. Now the rule will apply to any blobs that start with those letters. (One thing to be aware of: the prefix is matched against the path beginning with the container name, so in practice the value takes the form “container-name/report”.)
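Here is a minimal Python sketch of how a blob prefix filter behaves, assuming the match is case-sensitive and runs against the container name followed by the blob name, which is how I understand the documented behavior. The container and blob names are hypothetical.

```python
from typing import Iterable


def matches_prefix(container: str, blob_name: str, prefixes: Iterable[str]) -> bool:
    """Return True if container/blob_name starts with any of the prefixes."""
    path = f"{container}/{blob_name}"
    return any(path.startswith(prefix) for prefix in prefixes)


# Prefix that includes the (hypothetical) container name "monthly".
print(matches_prefix("monthly", "report-2024-01.pdf", ["monthly/report"]))  # True
# The comparison is case-sensitive.
print(matches_prefix("monthly", "Report-2024-01.pdf", ["monthly/report"]))  # False
# A bare "report" prefix is compared with the container name first.
print(matches_prefix("monthly", "report-2024-01.pdf", ["report"]))          # False
```

Under that assumption, a bare “report” prefix would only catch blobs in containers whose names begin with “report”.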
Here’s another change we can make. Suppose you don’t want a blob to be moved to the Cool tier until a certain number of days after the last time it was accessed rather than since it was created. You might have noticed that this wasn’t an option in the list of possible conditions, so how would we do it? First, we have to check the “Enable access tracking” box. Then, when we go back to the first condition, the list includes “Last accessed”, so we can select it. And click “Update”.
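In the policy JSON, this change shows up as a different condition key on the action. The sketch below builds the two variants as Python dictionaries; the key names are the ones I understand the lifecycle policy schema to use, and the helper function is hypothetical.

```python
import copy
import json

# Action keyed on creation time, as in the first version of the rule.
creation_based = {"baseBlob": {"tierToCool": {"daysAfterCreationGreaterThan": 30}}}


def use_last_access(actions: dict, days: int) -> dict:
    """Return a copy of the actions with tierToCool keyed on last access time."""
    updated = copy.deepcopy(actions)
    updated["baseBlob"]["tierToCool"] = {"daysAfterLastAccessTimeGreaterThan": days}
    return updated


access_based = use_last_access(creation_based, 30)
print(json.dumps(access_based, indent=2))
```

This condition only makes sense once access tracking is enabled on the storage account, which is what the checkbox does.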
This visual interface makes it very easy to create lifecycle rules, but if you had to apply lifecycle policies to lots of different storage accounts, it would probably be faster to create a standard configuration and apply it using the command line. Then you wouldn’t have to keep pointing and clicking in the interface over and over again.
To do this, you need to create a JSON file that contains your rules. But this can be a bit of a daunting task, so there’s a shortcut you can use. Once you’ve created one or more rules in this interface, you can go to the Code View tab, and there’s the JSON code you need.
The rule definition is divided into two sections: the rule actions and the rule filters. For each of the conditions we created, it shows the action to take and the condition required to take that action. Here’s the one for moving blobs to the Cool tier. Here’s the one for moving to the Archive tier, and here’s the one for deleting blobs. In the filters section, it shows the blob type filter and the prefix match filter that we set.
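For reference, here is a Python sketch that rebuilds a policy along the lines of what the Code View shows for this rule. It's a reconstruction from the steps in this walkthrough, not a copy of the portal output, and the field names follow the lifecycle policy schema as I understand it.

```python
import json

# Reconstruction of the policy described in the walkthrough:
# three base-blob actions plus the blob type and prefix filters.
policy = {
    "rules": [
        {
            "enabled": True,
            "name": "LifecycleForReports",
            "type": "Lifecycle",
            "definition": {
                "actions": {
                    "baseBlob": {
                        "tierToCool": {"daysAfterCreationGreaterThan": 30},
                        "tierToArchive": {"daysAfterCreationGreaterThan": 365},
                        "delete": {"daysAfterCreationGreaterThan": 3650},
                    }
                },
                "filters": {
                    "blobTypes": ["blockBlob"],
                    "prefixMatch": ["report"],
                },
            },
        }
    ]
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Writing `policy_json` to a file gives you something you can keep in source control and reuse across storage accounts.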
If we wanted to change anything, we could actually edit it right here and click the Save button, which would apply the updated policy to this storage account. But in most cases, we’d probably want to download the code, modify it, and use the command line to apply it to other storage accounts.
So, we’ll click “Download”. Then, to avoid having to install the Azure command-line utility on your desktop, we can upload the JSON file to the Cloud Shell and run the command there. So, I’ll open the Cloud Shell. Then I’ll select “Upload”. Here’s the file we downloaded. It’s called “policy.json”.
I’m not going to modify anything, but I’ll show you what command you’d use to apply this lifecycle policy to a storage account. I put the command in the transcript below in case you want to try this yourself.
It starts with “az storage account management-policy create”. Then you specify the account name. My storage account is called “camonthlyreports”. Then you type “--policy” and the name of the JSON file. And finally, you tell it the resource group where your storage account resides. I called mine “camonthlyreportsrg”. Now hit Enter. It might not be obvious from this output, but it successfully applied the policy to that storage account.
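Based on that description, the command can be assembled as in the Python sketch below, which only builds and prints the command line rather than running it. The `--account-name` and `--resource-group` flag names and the `@` prefix for reading the policy from a file are my reconstruction of the usual Azure CLI form, so check them against `az storage account management-policy create --help` before relying on them.

```python
import shlex
from typing import List


def build_policy_command(account: str, resource_group: str, policy_file: str) -> List[str]:
    """Build the az command that applies a lifecycle policy file to an account."""
    return [
        "az", "storage", "account", "management-policy", "create",
        "--account-name", account,
        "--policy", f"@{policy_file}",   # "@" reads the JSON from a file (assumed)
        "--resource-group", resource_group,
    ]


command = build_policy_command("camonthlyreports", "camonthlyreportsrg", "policy.json")
print(shlex.join(command))
```

To apply the same policy to many storage accounts, you could call `build_policy_command` in a loop and hand each list to `subprocess.run`.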
And that’s it for Blob Storage lifecycle management. Please give this course a rating, and if you have any questions or comments, please let us know. Thanks!
Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).