Gaurav Mantri's Personal Blog.

Understanding Azure Storage Blob Access Tiers

It has been a really, really long time since I last wrote a blog post. The past year or so has been simply crazy. From acquiring Cerebrata back to building a brand new product (Cerulean) from scratch, things have kept me quite busy.

In this blog post we will talk about blob access tiers. We will talk about what these are and some guidance around when to use them. We will talk about cost implications of storing blobs in each access tier. Lastly we will talk about a tool that I built for estimating the costs of storing blobs in each access tier.

Blob Access Tiers

As we all know, Azure Storage provides reliable and highly available object storage for storing massive amounts of data at a very low price. Blob access tiers are a feature of Azure Storage that lets you store your blobs in different tiers based on how often these blobs are accessed (hence the term “Blob Access Tier”).

Currently there are three access tiers in which you can store the blobs:

  • Hot
  • Cool
  • Archive (Preview)

Now let’s talk about these tiers in somewhat more detail.

“Hot” Access Tier

Typically you would want to put your blobs in “Hot” access tier when these blobs are accessed very frequently. For example, if you have a website and you’re storing the images, style sheets and JavaScript files required by that website in Azure Storage, you would want to put them in this tier as they are accessed very frequently.

For blobs stored in “Hot” access tier, you pay a higher storage price (compared to the storage price in cooler tiers) but a lower transaction price (again compared to the transaction price in cooler tiers) when these blobs are accessed, which makes this tier optimal for frequently accessed blobs.

“Cool” Access Tier

Typically you would want to put your blobs in “Cool” access tier when these blobs are not accessed that frequently, yet you want them to be instantly available when needed. Good candidates for this access tier are blobs holding recent backups (that you may need instantly to recover from) and any content that has not been accessed recently but should be available instantly when it is accessed.

For blobs stored in “Cool” access tier, you pay a relatively lower storage price (compared to storage prices for hot tier) but a higher transaction price (again compared to transaction prices for hot tier) when these blobs are accessed, which makes this tier optimal for blobs that are rarely accessed yet must be instantly available.

“Archive” Access Tier

This is the latest access tier announced by Azure Storage and is currently in preview (at the time of writing of this blog).

Typically you would want to put your blobs in “Archive” access tier when these blobs are rarely accessed (or not accessed at all) and you’re fine with a latency of a few hours in accessing these blobs when needed. Good candidates for putting blobs in this tier are old backups, email archives and other things you would want to keep for compliance reasons.

For blobs stored in “Archive” access tier, you pay a lot less for storage (compared to storage prices for hot tier) but a significantly higher transaction price (again compared to transaction prices for hot tier) when these blobs are accessed, which makes this tier optimal for rarely accessed blobs.

A few more things about this access tier:

  • This access tier is currently in preview only, so its availability is limited.
  • During preview phase it is only available in the East US 2 region and only for Locally Redundant Storage (LRS) Blob Storage accounts. Availability for other regions and other storage account types will be announced later.
  • During preview phase you will need to enable your Azure Subscription to use this feature before you can use it. Please see this blog post from Azure Storage team for instructions on how to enable this feature in your Azure Subscription using either PowerShell or CLI tools: Announcing the public preview of Azure Archive Blob Storage and Blob-Level Tiering.
  • During preview phase the storage pricing for this tier is about 10% of that of hot tier; however, the pricing may go up when this feature becomes generally available. So a 90% reduction in storage cost should not be the sole criterion for storing blobs in this tier.
  • During preview phase it is not possible for you to directly upload a file into this access tier. You must first upload a blob into hot access tier and then perform a “Set Blob Tier” operation on that blob to move it to archive tier. To do this via the .NET SDK, please use SDK version 8.4 or greater.
  • During preview phase it is not possible for you to directly read (i.e. download) a blob in this access tier. You must first convert that blob’s tier to either hot or cool (a process known as rehydration that could take hours) and then read the blob.
  • In fact, during preview phase it is not possible for you to perform any operations on blobs in this access tier other than listing the blobs and fetching properties of such blobs. If you really need to perform other operations (like setting metadata), you must first rehydrate the blob, perform the operation and then change the blob’s tier to archive again.
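
To make the preview rules in the list above a bit more concrete, here is a minimal sketch in plain Python (it does not call Azure at all) that works out which steps are needed before an operation can run on a blob. The function name and the operation labels such as "set_metadata" are hypothetical names made up for illustration; they are not SDK method names.

```python
# Operations the preview allows directly on an archived blob (per the list above).
# The labels are illustrative only, not real SDK calls.
ALLOWED_WHILE_ARCHIVED = {"list", "get_properties"}


def plan_operation(current_tier, operation, back_to_archive=True):
    """Return the ordered steps needed to run `operation` on a blob."""
    if current_tier.lower() != "archive" or operation in ALLOWED_WHILE_ARCHIVED:
        # Hot/cool blobs, or listing/properties on archived blobs, need no extra steps.
        return [operation]
    # Anything else on an archived blob requires rehydration first.
    steps = ["Set Blob Tier -> Hot (rehydration, could take hours)", operation]
    if back_to_archive:
        steps.append("Set Blob Tier -> Archive")
    return steps


if __name__ == "__main__":
    for step in plan_operation("Archive", "set_metadata"):
        print(step)
```

Every extra step in the returned list is a billable transaction, which is worth keeping in mind before archiving blobs you still expect to touch.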

Blob Access Tiers & Cost Implications

Typically when people look at cloud object storage (like Azure Storage), they often consider only the cost of storing the objects there. They forget other costs associated with storing and accessing objects.

When using Azure Storage, typically these are the costs you can incur:

  • Storage Costs: These are the costs of storing objects in Azure Storage. In Azure, you pay for the amount of time objects are stored.
  • Read Transactions Costs: Every time a blob or its properties are read, you incur these charges.
  • Write Transaction Costs: Every time a blob is written or updated, you incur these charges.
  • Data Egress Costs: Whenever blob data is transferred outside of the Azure region it is hosted in, you incur these charges.
  • Data Retrieval Costs: Whenever a blob is read from a cool or archive access tier, you incur these charges.
  • Data Write Costs: Whenever a blob is written to a cool or archive access tier, you incur these charges.

When considering the storage costs, one must consider all of these costs.
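
As a rough illustration of adding up all of these costs, here is a minimal sketch of a monthly cost estimate. Everything in it is an assumption for illustration: the class and field names are made up, transactions are assumed to be priced per 10,000 operations, and the prices in the example are placeholders, not real Azure prices. Please substitute the published prices for your region and redundancy option.

```python
from dataclasses import dataclass


@dataclass
class TierPrices:
    """Prices for one access tier. All values are supplied by you."""
    storage_per_gb_month: float
    read_per_10k: float        # read transactions, per 10,000 (assumed unit)
    write_per_10k: float       # write transactions, per 10,000 (assumed unit)
    retrieval_per_gb: float    # data retrieval (cool/archive)
    data_write_per_gb: float   # data write (cool/archive)


@dataclass
class Usage:
    """Expected activity for one month."""
    stored_gb: float
    reads: int
    writes: int
    read_gb: float
    written_gb: float
    egress_gb: float = 0.0


def monthly_cost(prices, usage, egress_per_gb=0.0):
    """Sum the six cost components listed above for one month."""
    return (
        usage.stored_gb * prices.storage_per_gb_month
        + usage.reads / 10_000 * prices.read_per_10k
        + usage.writes / 10_000 * prices.write_per_10k
        + usage.read_gb * prices.retrieval_per_gb
        + usage.written_gb * prices.data_write_per_gb
        + usage.egress_gb * egress_per_gb
    )


if __name__ == "__main__":
    # Placeholder prices, NOT real Azure prices.
    hot = TierPrices(0.02, 0.004, 0.05, 0.0, 0.0)
    cool = TierPrices(0.01, 0.01, 0.10, 0.01, 0.0025)
    usage = Usage(stored_gb=1000, reads=10_000, writes=0, read_gb=10, written_gb=0)
    print("hot :", monthly_cost(hot, usage))
    print("cool:", monthly_cost(cool, usage))
```

Running the same usage numbers through each tier's prices shows quickly whether the cheaper storage actually outweighs the more expensive transactions and retrieval for your workload.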

Which Access Tier Is Right For Me?

If I were to make a decision tree for this, the very first question I would ask myself is “Is the data stored in blobs accessed frequently?”

If the answer to this question is yes, then it is a no-brainer that “Hot” access tier is right for you. You may pay higher storage costs, but the reduced transaction costs will offset them.

If the answer to this question is no, then the next question I would ask myself is “Whenever I need this data, do I need it to be readily available or am I OK to wait for a few hours to get it?”

If the answer to this question is “I need this data to be readily available”, then “Cool” access tier is right for you.

However if your answer is “I am OK to wait for a few hours to get this data”, then “Archive” access tier may be right for you. Please note the emphasis I have put on “may” above. The reason is that a blob in archive access tier needs to go through a rehydration process, which essentially converts the blob’s access tier from “Archive” to “Hot”. Not only is this time consuming (but you’re OK with that), it is also an expensive operation. Not to mention that once you’re done reading the blob, you will need to move it back from “Hot” tier to “Archive” tier, and you will pay for those transactions as well.
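
The two questions above can be written down as a tiny function. This is only a sketch of the decision tree described in this post; the function name and parameters are made up for illustration.

```python
def choose_tier(accessed_frequently, needs_instant_access):
    """Suggest an access tier from the two questions in the decision tree."""
    if accessed_frequently:
        return "Hot"
    if needs_instant_access:
        return "Cool"
    # "May" be right: still weigh the rehydration time and cost before archiving.
    return "Archive"


if __name__ == "__main__":
    print(choose_tier(accessed_frequently=False, needs_instant_access=True))
```

Treat an "Archive" answer as a candidate rather than a verdict, since the rehydration and re-archiving transactions described above can eat into the storage savings.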

One big question that still remains is how do you determine whether the data stored in blobs is accessed frequently or not? There are a few ways to determine this:

  • Check blob’s last modified date: Though not an accurate measure, you can use a blob’s last modified date to determine whether a blob is frequently accessed or not. This will work for things like backups and log files that you store in Azure Storage. Typically once they are written, they are rarely accessed. However it may fail in scenarios where you’re storing content like website logos, images etc. They are also rarely modified but are still frequently accessed. So you will have to be careful when choosing this criterion.
  • Check storage analytics logs: If you have enabled analytics logs for your storage account, an entry is made in the storage analytics logs every time a blob is accessed. Checking for blobs in these logs can give you an accurate idea of whether or not a blob has been accessed in, say, the last 30/60 days. However this is a very cumbersome process, as these log files are delimited text files and you would need to first download them and then parse them to find the occurrences of the blobs of your interest.
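
To give an idea of what the log parsing in the second option involves, here is a minimal sketch that works on log lines you have already downloaded. It assumes the version 1.0 log format, where fields are separated by semicolons and some fields (like URLs) are wrapped in double quotes. The field positions below are my assumption about that format, so please verify them against the Storage Analytics log format documentation before relying on them. The function name is made up for illustration.

```python
import csv
from datetime import datetime, timezone

# Assumed 0-based field positions in a version 1.0 log entry (verify before use).
START_TIME_FIELD = 1
OPERATION_FIELD = 2
OBJECT_KEY_FIELD = 12


def last_read_times(log_lines, read_operations=("GetBlob",)):
    """Return {requested object key: most recent read time} from log lines."""
    latest = {}
    # csv handles the semicolon delimiter and the double-quoted fields.
    for fields in csv.reader(log_lines, delimiter=";", quotechar='"'):
        if len(fields) <= OBJECT_KEY_FIELD:
            continue  # skip blank or truncated lines
        if fields[OPERATION_FIELD] not in read_operations:
            continue  # only count reads, not writes or listings
        # Keep whole seconds only; the fractional part is not needed here.
        when = datetime.strptime(fields[START_TIME_FIELD][:19], "%Y-%m-%dT%H:%M:%S")
        when = when.replace(tzinfo=timezone.utc)
        key = fields[OBJECT_KEY_FIELD]
        if key not in latest or when > latest[key]:
            latest[key] = when
    return latest
```

Any blob whose key is missing from the result, or whose last read time is older than your cutoff (say 30 or 60 days), would be a candidate for a cooler tier, provided the logs cover the whole period you care about.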

Blob Tier Analysis Tool

At this year’s Microsoft Ignite conference I had the privilege of speaking. One of the sessions I spoke at was about this archive tier, and for that I had prepared a tool called “Blob Tier Analysis Tool”.

This is a command line tool that can analyze the contents of your storage account and tell you about the potential cost savings if you were to move these blobs into a different tier (say from Hot/Cool to Archive). It can also change the blobs’ tier for you. Furthermore, it can analyze the contents of a local folder/file share on your computer and tell you about the potential costs if you were to store these files in Azure Storage.

The source code for this tool is available on GitHub so please download it and give it a try: https://github.com/Azure-Samples/storage-dotnet-blob-tier-analysis-tool.

Summary

That’s it for this post. I hope you have found this post useful. If you find any issues with this post, please let me know and I will fix them ASAP.

Until next time!!!

