Gaurav Mantri's Personal Blog.

Comparing Azure Storage Blob Versions and Snapshots

The Azure Storage team recently announced the availability of blob versions. A blob version represents the state of a blob (its content, properties and metadata) at a particular point in time. Once versioning is enabled, Azure Storage automatically creates a new version of a blob whenever it is modified or deleted. You can read more about blob versioning here.

To be very honest, at first I didn’t really understand why the Azure Storage team would build this feature, considering that a similar feature (Blob Snapshots) already exists and has been around for ages. As I started working on incorporating support for this feature in Cerebrata Cerulean, I read more about both. I realized that while there are a lot of similarities between them, there are some significant differences as well. In this blog post we will talk about the similarities and differences between them.

So let’s start! First we will talk about similarities and then we will talk about differences.

Similarities

Read-only copy

Basically, blob versions and snapshots serve the same purpose: each is a read-only copy of a blob that represents the state of that blob at a particular point in time.

Number of versions/snapshots

There is no upper limit on the number of versions or snapshots a blob can have. However, the Storage team recommends keeping the number of versions/snapshots under 1000, because the higher the number, the higher the latency when listing blobs.

Listing mechanism

The way you list blob versions and snapshots is nearly identical. Both can be listed as part of the List Blobs operation. To list blob versions, you specify “versions” in the “include” parameter. Similarly, to list blob snapshots you specify “snapshots” in the “include” parameter.

To list the versions or snapshots of an individual blob, you still use the same operation with the same set of parameters. However, you also need to specify the name of the blob as the “prefix”.
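To make the listing parameters concrete, here is a minimal sketch that only builds the List Blobs request URL. The account, container and blob names are made up for illustration, and the request would still need authentication (a SAS token or an Authorization header) before Azure Storage would accept it.

```python
from urllib.parse import urlencode


def list_blobs_url(account, container, include, prefix=None):
    """Build a List Blobs URL (unauthenticated; illustration only)."""
    params = {
        "restype": "container",
        "comp": "list",
        # "include" takes a comma-separated list, e.g. "versions" or "snapshots".
        "include": ",".join(include),
    }
    if prefix is not None:
        # Narrow the listing to blobs whose names start with this prefix.
        params["prefix"] = prefix
    query = urlencode(params, safe=",")
    return f"https://{account}.blob.core.windows.net/{container}?{query}"


# Hypothetical names, used only for this example.
print(list_blobs_url("myaccount", "mycontainer", ["versions"]))
print(list_blobs_url("myaccount", "mycontainer", ["snapshots"], prefix="reports/q1.csv"))
```

Keep in mind that a prefix matches every blob whose name starts with that string, so when you want a single blob you may still have to filter the results on the exact name.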

Access tier (hot, cool or archive)

For storage accounts that support access tiers (hot, cool or archive), you can explicitly set the access tier of a blob version or snapshot. For example, it is certainly possible to have a blob in the “hot” access tier while its versions or snapshots are in the “cool” or “archive” access tier. This could result in lower storage costs. However, please be aware of the billing implications when the access tier of the base blob differs from that of its versions/snapshots.

Billing

The way blob versions and blob snapshots are billed is exactly the same.

For blobs where the access tier has not been explicitly set, you’re billed for the unique blocks across the base blob and its versions/snapshots.

For blobs where the access tier has been explicitly set on the base blob and/or its versions/snapshots, you are billed for the entire blob in the new tier.
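The two billing rules are easier to see with numbers. The sketch below is a deliberately simplified model, not Azure’s actual billing engine: it assumes that blocks with the same ID hold the same data, that sizes are given in MiB, and that any object whose tier was explicitly set is charged in full while the remaining objects share their unique blocks.

```python
def billed_mib(objects):
    """Return billable MiB for a base blob plus its versions/snapshots.

    objects: list of (blocks, tier_explicitly_set) tuples, where blocks
    maps a block ID to its size in MiB. Simplified model, illustration only.
    """
    shared_blocks = {}
    total = 0
    for blocks, tier_explicitly_set in objects:
        if tier_explicitly_set:
            # Assumption: an object with an explicitly set tier is charged in full.
            total += sum(blocks.values())
        else:
            # Otherwise each unique block is counted once across all objects.
            shared_blocks.update(blocks)
    return total + sum(shared_blocks.values())


base = {"a": 4, "b": 4, "c": 4}      # current blob: three 4 MiB blocks
version = {"a": 4, "b": 4, "d": 4}   # older version: shares blocks a and b

print(billed_mib([(base, False), (version, False)]))  # 16: blocks a, b, c, d counted once
print(billed_mib([(base, False), (version, True)]))   # 24: 12 for the base + 12 for the re-tiered version
```

In this made-up example, moving the version to another tier raises the billable size from 16 MiB to 24 MiB, because the blocks it shared with the base blob are no longer counted only once.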

You can read more about the billing of blob version/snapshot here.

Differences

Storage account kind

One important difference between blob versions and snapshots is that not all kinds of storage accounts support blob versioning, whereas snapshots are supported on all kinds of storage accounts. At the time of writing this blog post, only general purpose v2, block blob and blob storage accounts support versioning.

What this means is that if you’re looking to create read-only copies of blobs for data protection, you can use the blob versioning feature only on the supported storage account kinds mentioned above. For other storage account kinds, the only option available to you is to use blob snapshots.

Feature activation required

The blob versioning feature is not automatically enabled on the supported storage accounts; you must enable it manually. The blob snapshot feature, however, is available by default and you don’t have to do anything special to make use of it.

You can enable or disable the blob versioning feature anytime you want. No such capability is available for the blob snapshot feature. In other words, you can’t disable blob snapshots.

Automatic v/s manual

Once the blob versioning feature is enabled on a storage account, a new version of a blob is created automatically whenever the blob is modified or deleted. You don’t have to do anything. The Azure Storage service takes care of that for you.

Creating a new blob snapshot, however, is a manual process: you have to write code that takes a snapshot of a blob.
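For a sense of what “manual” means here, a snapshot is taken with an explicit Snapshot Blob request against the blob. The sketch below only builds the HTTP method and URL for that request. The account, container and blob names are hypothetical, and authentication is left out entirely.

```python
from urllib.parse import quote


def snapshot_request(account, container, blob_name):
    """Return the (method, url) pair of a Snapshot Blob request.

    Unauthenticated sketch: a real call also needs a SAS token or an
    Authorization header plus the usual x-ms-* request headers.
    """
    url = (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{quote(blob_name)}?comp=snapshot"
    )
    return "PUT", url


# Hypothetical names, used only for this example.
method, url = snapshot_request("myaccount", "mycontainer", "reports/q1.csv")
print(method, url)
```

Your application has to issue a call like this every time it wants a point-in-time copy; nothing happens unless it does.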

This could be an important factor in choosing between blob versions and blob snapshots. Automatic blob versioning gives you convenience, whereas manual blob snapshots give you control.

Blob deletes

This, I believe, is the most significant difference and could have a big cost implication if not thought about properly.

Deleting a blob does not delete its versions automatically! You will have to delete blob versions separately. So in a way, blob versions live independently of the base blob.

Microsoft recommends that you use Blob Lifecycle Management to define a policy to automatically delete blob versions after a certain number of days. You can also use Cerebrata Cerulean to manually delete blob versions when deleting one or more blobs.
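As a rough idea of what such a policy looks like, the sketch below builds a lifecycle management policy document with a single rule that deletes blob versions some number of days after they were created. The rule name and the 90-day threshold are arbitrary choices of mine, and the resulting JSON would still have to be applied to the storage account through the portal, CLI or an SDK.

```python
import json


def version_cleanup_policy(days, rule_name="delete-old-versions"):
    """Build a lifecycle policy that deletes blob versions after `days` days.

    rule_name is an arbitrary label chosen for this illustration.
    """
    return {
        "rules": [
            {
                "enabled": True,
                "name": rule_name,
                "type": "Lifecycle",
                "definition": {
                    "actions": {
                        # The "version" action applies to previous versions only.
                        "version": {
                            "delete": {"daysAfterCreationGreaterThan": days}
                        }
                    },
                    "filters": {"blobTypes": ["blockBlob"]},
                },
            }
        ]
    }


print(json.dumps(version_cleanup_policy(90), indent=2))
```

You could add a “prefixMatch” filter to limit the rule to certain containers or folders if you don’t want it to apply account-wide.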

Snapshots, on the other hand, are tied to the base blob and are deleted when the base blob is deleted. In fact, you can’t delete a blob that has snapshots without deleting its snapshots as well.
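The difference in delete behavior can be sketched with a tiny in-memory model. To be clear, this is a toy class I made up to mirror the rules described above; it is not the Azure SDK, and it ignores details such as soft delete.

```python
class ToyContainer:
    """In-memory model of the delete rules above (not the Azure SDK)."""

    def __init__(self, versioning_enabled):
        self.versioning_enabled = versioning_enabled
        self.blobs = {}      # blob name -> current content
        self.versions = {}   # blob name -> list of previous contents
        self.snapshots = {}  # blob name -> list of snapshot contents

    def upload(self, name, content):
        # With versioning on, overwriting keeps the old state as a version.
        if self.versioning_enabled and name in self.blobs:
            self.versions.setdefault(name, []).append(self.blobs[name])
        self.blobs[name] = content

    def snapshot(self, name):
        # Snapshots are only created when the caller asks for one.
        self.snapshots.setdefault(name, []).append(self.blobs[name])

    def delete(self, name, delete_snapshots=False):
        if self.snapshots.get(name) and not delete_snapshots:
            raise RuntimeError("blob has snapshots; they must be deleted with it")
        self.snapshots.pop(name, None)
        content = self.blobs.pop(name)
        if self.versioning_enabled:
            # The deleted state survives as a version until removed separately.
            self.versions.setdefault(name, []).append(content)


versioned = ToyContainer(versioning_enabled=True)
versioned.upload("a.txt", "v1")
versioned.upload("a.txt", "v2")
versioned.delete("a.txt")
print(versioned.versions["a.txt"])  # ['v1', 'v2'] are still stored (and billed)

plain = ToyContainer(versioning_enabled=False)
plain.upload("b.txt", "v1")
plain.snapshot("b.txt")
plain.delete("b.txt", delete_snapshots=True)
print(plain.snapshots)  # {} because the snapshots went away with the blob
```

In the versioned container the data outlives the blob, which is exactly why forgotten versions can quietly add to your bill.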

Choosing between blob versions and snapshots

Even though Microsoft recommends using blob versions over snapshots, there are some factors that we must consider when choosing between the two. Some of them are:

Convenience v/s control

Blob versioning gives you convenience: all you have to do is enable the feature, and from that point onwards the Azure Storage service takes care of creating read-only copies of your blobs. However, once the feature is enabled, versions are created for all blobs in that storage account, which you may or may not want.

On the other hand, blob snapshots give you control over which blobs get a read-only copy. However, because taking a blob snapshot is a manual process (i.e. you have to write code to do so), it is error prone and developer-dependent. Furthermore, if someone updates the blob outside of the application code that is responsible for creating snapshots (through the Azure Portal, for example), no snapshot is taken for that change.

Storage account support

As mentioned above, not all kinds of storage accounts support blob versioning, so if your storage account does not support it, you are restricted to blob snapshots. For example, if you’re using a general purpose v1 or classic storage account, you have no option other than the blob snapshot feature.

Summary

That’s it for this post! I hope you have found this post useful. If you find any issues with the post, please let me know and I will get it fixed ASAP.

Happy Coding!

