
New Changes to Windows Azure Storage – A Perfect Thanksgiving Gift

Yesterday, the Windows Azure Storage team announced a number of enhancements to the core service. These enhancements have been long awaited, and with the way they have been implemented, all I can say is that they were worth the wait.

In this blog post, we will go over these changes. There are so many changes that if I wanted to go into detail on each and every one, I would end up writing this post for days. So I will try to be brief here. Then in subsequent posts, I will go over each of these enhancements in great detail with code samples and stuff.

Windows Azure Storage Team has written an excellent blog post describing these changes which you can read here: http://blogs.msdn.com/b/windowsazurestorage/archive/2013/11/27/windows-azure-storage-release-introducing-cors-json-minute-metrics-and-more.aspx.

Now let’s talk about the changes.

CORS Support

This has been one of the most anticipated changes in Windows Azure Storage. Support for CORS has been available with other cloud storage providers for quite some time, and finally it’s here in Windows Azure as well. Essentially, CORS allows you to interact with Windows Azure Storage directly from browsers. For example, if you wanted to upload blobs into Windows Azure Storage through a browser-based application, prior to CORS you would have to either upload the file to your web server first and then push it into blob storage from there, or host the upload code in blob storage itself (https://gauravmantri.com/2013/02/16/uploading-large-files-in-windows-azure-blob-storage-using-shared-access-signature-html-and-javascript). Now you don’t need to do that. Once CORS is enabled, you can simply upload the files into Blob Storage directly from your browser.

The fun doesn’t stop there :). If we take Amazon for example, CORS is only enabled for S3 (which is the equivalent of blob storage). With Windows Azure, CORS is supported not only for Blob Storage but also for Table Storage and Windows Azure Queues. So now you have the power of managing your Tables and Queues directly from a browser-based application.

Let’s briefly talk about how you would go about utilizing this great feature. Based on my understanding, here’s what you would need to do:

  1. By default CORS is not enabled on a storage account. You would need to first enable it by specifying things like the origin (i.e. the URL from where you will be making requests to storage), allowed verbs (like PUT, POST etc.) and so on. You can enable CORS either by using the REST API or by using the latest version of the Storage Client library (more on the Storage Client library towards the end of the post).
  2. Once CORS is enabled, you are good to go on the server side. Now on to the client side.
  3. Now when your application tries to perform a request (e.g. putting a blob), the browser (or user agent) first sends a request to the storage service to ensure CORS is enabled before the actual operation. This is referred to as a “Pre Flight” request in the CORS documentation. The browser includes a number of things in this “OPTIONS” request, like the request headers, the HTTP method and the request origin. Windows Azure Storage service will validate this request against the CORS rules set in Step 1. You don’t have to make this request yourself; it is done by the browser automatically.
  4. If the “Pre Flight” request doesn’t pass the rules, the service will return a 403 error. If the rules are validated, the service will return a 200 OK status code along with a number of response headers. One of the important response headers is “Access-Control-Max-Age”, which basically tells you the number of seconds for which the browser doesn’t have to make this “Pre Flight” request again. Think of it as an authorization token validation period. Once this period has elapsed and you still need to do some work, the browser will need to make another “Pre Flight” request.
  5. Once the “Pre Flight” request is successful, the browser automatically sends the actual request to storage and that operation is performed.
You can read more about CORS support in Windows Azure Storage here: http://msdn.microsoft.com/en-us/library/windowsazure/dn535601.aspx.
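
To give you an idea of Step 1, here’s a minimal sketch of enabling CORS on the blob service using the .Net Storage Client library (3.0.0.0). The origin, verbs and max age below are just placeholder values you would replace with your own:

```csharp
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Shared.Protocol;

class CorsSetup
{
    static void EnableCorsOnBlobService(CloudStorageAccount account)
    {
        var blobClient = account.CreateCloudBlobClient();

        // Fetch the current service properties so we don't clobber other settings.
        ServiceProperties serviceProperties = blobClient.GetServiceProperties();

        // Add a rule: which origins may call, which verbs they may use and how long
        // (in seconds) the browser may cache the "Pre Flight" response.
        serviceProperties.Cors.CorsRules.Add(new CorsRule
        {
            AllowedOrigins = new List<string> { "http://www.example.com" }, // placeholder origin
            AllowedMethods = CorsHttpMethods.Put | CorsHttpMethods.Get,
            AllowedHeaders = new List<string> { "*" },
            ExposedHeaders = new List<string> { "*" },
            MaxAgeInSeconds = 3600 // example value for Access-Control-Max-Age
        });

        blobClient.SetServiceProperties(serviceProperties);
    }
}
```

You would do the same thing (get the service properties, add a CorsRule, set them back) on the table and queue clients if you want CORS on Table Storage and Queues as well.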

JSON Support

Yet another important and much awaited enhancement. With the latest release, JSON is now supported on Windows Azure Tables. You can send data in JSON format and receive data back from storage in JSON format. Prior to this, the only way to send/receive data from Windows Azure Table Storage was through the bulky and extremely heavy AtomPub XML format. To me, there are many advantages of using JSON over XML:

  • The amount of data which gets sent over the wire is reduced considerably, so your application will work much faster.
  • Not only that, table storage suddenly becomes somewhat cheaper as well: even though you don’t pay for data ingress, you do pay for data egress (assuming the data goes out of Windows Azure Storage), and since your data egress is now considerably smaller, you save money on bandwidth.
  • It opens up a number of possibilities as far as applications are concerned. JSON has become the de-facto standard for data interchange in modern applications. Combine JSON support with CORS and Shared Access Signatures, and you should now be able to interact with table storage directly from a browser-based application.

You can read more about JSON support in Windows Azure Table Storage here: http://msdn.microsoft.com/en-us/library/windowsazure/dn535600.aspx.
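
To give you a feel for it, here’s a small sketch using the .Net Storage Client library. As far as I can tell, the payload format is controlled through the PayloadFormat property on the table client; “Customers” below is just a placeholder table name:

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

class JsonPayloadSample
{
    static void QueryWithJson(CloudStorageAccount account)
    {
        CloudTableClient tableClient = account.CreateCloudTableClient();

        // Ask the service to talk JSON instead of AtomPub. JsonNoMetadata is the
        // leanest of the JSON flavours (the others being Json and JsonFullMetadata).
        tableClient.PayloadFormat = TablePayloadFormat.JsonNoMetadata;

        CloudTable table = tableClient.GetTableReference("Customers"); // placeholder table name
        var query = new TableQuery<DynamicTableEntity>().Take(10);

        // The query itself looks exactly the same; only the wire format changes.
        foreach (DynamicTableEntity entity in table.ExecuteQuery(query))
        {
            System.Console.WriteLine(entity.RowKey);
        }
    }
}
```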

Improved Storage Analytics

Storage analytics, as you may already know, gives you insight into what exactly is going on with your storage requests at the storage service level. Prior to this release, the metrics were aggregated on an hourly basis. What that means is that you would have to wait for at least an hour to figure out what exactly was going on at the storage level. With the latest release, on top of these hourly aggregates the data is also aggregated at the minute level. What this means is that you can now monitor the storage service in almost real time and identify any issues much faster.
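
For what it’s worth, here’s a rough sketch of how I believe you would turn on minute metrics for the blob service with the .Net Storage Client library; the retention period of 7 days is just an example value:

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Shared.Protocol;

class MinuteMetricsSetup
{
    static void EnableMinuteMetrics(CloudStorageAccount account)
    {
        var blobClient = account.CreateCloudBlobClient();
        ServiceProperties serviceProperties = blobClient.GetServiceProperties();

        // Hourly metrics still exist; MinuteMetrics is the new per-minute aggregation.
        serviceProperties.MinuteMetrics.MetricsLevel = MetricsLevel.ServiceAndApi;
        serviceProperties.MinuteMetrics.RetentionDays = 7; // example retention period
        serviceProperties.MinuteMetrics.Version = "1.0";

        blobClient.SetServiceProperties(serviceProperties);
    }
}
```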

Content-Disposition Header for Blobs

While it was made public during the last //Build conference that support for CORS and JSON was coming soon, this was one feature which kind of surprised me (in a nice way of course :)).

Assume a scenario where you want your users to download files from your storage account, but you want to give those files a user-friendly name. Furthermore, you want your users to be prompted to save the file instead of having it displayed in the browser itself (say, a PDF file opening up automatically in the browser). To accomplish this, earlier you would need to first fetch the file from your blob storage onto your server and then write the data of that file into the response stream after setting the “Content-Disposition” header. In fact, I spent a good part of last week implementing exactly this solution. If only I had known that this feature was coming in storage itself :).

Now you don’t need to do that. What you can do is specify a content-disposition property on the blob and set it to “attachment; filename=yourdesiredfilename”, and when your user tries to access that blob through a browser, they will be presented with a file download prompt.
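
Here’s a quick sketch of what that looks like with the .Net Storage Client library; “mycontainer”, “report.pdf” and the friendly file name are of course placeholders:

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class ContentDispositionSample
{
    static void SetFriendlyDownloadName(CloudStorageAccount account)
    {
        var blobClient = account.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference("mycontainer"); // placeholder
        CloudBlockBlob blob = container.GetBlockBlobReference("report.pdf");            // placeholder

        // Fetch the existing properties, set content-disposition and save it back.
        blob.FetchAttributes();
        blob.Properties.ContentDisposition = "attachment; filename=YourDesiredFileName.pdf";
        blob.SetProperties();
    }
}
```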

Now you may ask, what if I have an image file which I want to show inline and also offer as a downloadable item? A very valid requirement. Well, the smart folks in the storage team have already thought about that :). Not only can you set content-disposition as a blob property, you can also override this property in a SAS URL (more on that in a bit).

Overriding Commonly Used Headers in SAS

This is another cool feature introduced in the latest release. As you know, a blob supports standard headers like cache-control, content-type, content-encoding etc., which get saved as blob properties. You could change them, but once they were changed, the changes were permanent. For example, let’s say you have a text file with content-type set to “text/plain”. Now what you want to do is change the content type of this file to, say, “application/octet-stream” for some of the users. Earlier, if you changed the content type property to “application/octet-stream”, the change applied to all users and not just the selected users, which is not what you wanted in the first place.

With the new version, the storage service allows you to provide new header values when you’re creating a SAS URL for that file. So when you’re creating a SAS URL, you can specify the content-type to be “application/octet-stream” and set the content-disposition to “attachment; filename=myrandomtextfilename”, and when the user uses this SAS URL, they will be prompted to save the file instead of having it displayed inline in the browser. Do keep in mind that the content-type of the blob in storage is still “text/plain”.
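
Here’s a sketch of what creating such a SAS URL might look like with the .Net Storage Client library; the expiry time and file names are just example values:

```csharp
using System;
using Microsoft.WindowsAzure.Storage.Blob;

class SasHeaderOverrideSample
{
    static string GetDownloadUrl(CloudBlockBlob blob)
    {
        var policy = new SharedAccessBlobPolicy
        {
            Permissions = SharedAccessBlobPermissions.Read,
            SharedAccessExpiryTime = DateTime.UtcNow.AddHours(1) // example expiry
        };

        // These values override the blob's stored properties only for requests made
        // with this SAS URL; the blob itself keeps its original content-type.
        var headers = new SharedAccessBlobHeaders
        {
            ContentType = "application/octet-stream",
            ContentDisposition = "attachment; filename=myrandomtextfilename.txt"
        };

        string sasToken = blob.GetSharedAccessSignature(policy, headers);
        return blob.Uri.AbsoluteUri + sasToken;
    }
}
```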

Ability to Delete Uncommitted Blobs

Some time back I wrote a blog post about dealing with an error situation where, because of messed-up block ids, you simply can’t upload a blob (https://gauravmantri.com/2013/05/18/windows-azure-blob-storage-dealing-with-the-specified-blob-or-block-content-is-invalid-error/). At that time I wished for the ability to purge uncommitted blobs. Well, guess what, my wish came true :). With the latest release of the storage service, you can indeed purge an uncommitted blob.
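
From the client’s point of view, my understanding is that nothing special is needed; you simply issue a delete against the blob that has the uncommitted blocks (something that simply didn’t work earlier). A tiny sketch, assuming you already have a reference to the container:

```csharp
using Microsoft.WindowsAzure.Storage.Blob;

class PurgeUncommittedBlobSample
{
    static void PurgeUncommittedBlob(CloudBlobContainer container, string blobName)
    {
        CloudBlockBlob blob = container.GetBlockBlobReference(blobName);

        // With the 2013-08-15 service version this also removes a blob that only has
        // uncommitted blocks; note the call throws if no such blob exists at all.
        blob.Delete();
    }
}
```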

Support for Multiple Conditional Headers

As you may already know, with Windows Azure Storage you can perform certain operations by specifying certain pre-conditions, for example delete a blob only if it has not been modified in the last 10 days. However, you didn’t have the flexibility of specifying multiple conditional headers. With the latest release, you now have that option, at least for the “Get Blob” and “Get Blob Properties” operations.

You can read more about multiple conditional headers here: http://msdn.microsoft.com/en-us/library/windowsazure/dd179371.aspx
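
Here’s a rough sketch of how you could combine two conditions (an ETag match plus an If-Modified-Since check) on a “Get Blob” call using the .Net Storage Client library’s AccessCondition; the ETag value is assumed to be one you captured earlier:

```csharp
using System;
using System.IO;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class MultipleConditionalHeadersSample
{
    static void ConditionalDownload(CloudBlockBlob blob, string knownETag)
    {
        // Start from an If-Match condition and also add an If-Modified-Since one,
        // so both conditions are sent with the "Get Blob" request.
        AccessCondition condition = AccessCondition.GenerateIfMatchCondition(knownETag);
        condition.IfModifiedSinceTime = DateTime.UtcNow.AddDays(-10);

        using (var stream = new MemoryStream())
        {
            blob.DownloadToStream(stream, condition);
        }
    }
}
```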

Support for ODATA Prefer Header

Now this is an interesting enhancement :). Not sure if you have noticed, but when you create an entity in a table, the Table Storage Service echoes that data back to you in the response. Now, earlier we talked about the bulkiness of the XML request payload, so not only am I sending this data to the table service (because I have to, duh!!!) but I’m also getting the same data back. Not only do I pay for the storage transaction, I also pay for the data that is sent back to me. Not to mention I slow down my application a bit. Furthermore, in all likelihood I am not really interested in seeing that data sent back to me in response to my request.

Earlier I didn’t have any control over this behavior, but now I do. I can now specify as part of my request whether or not I wish to see the data I sent echoed back in the response body. Though this feature is only available for the “Create Table” and “Insert Entity” operations today, I think it’s quite a significant improvement which will go a long way.
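
In the .Net Storage Client library this surfaces (as far as I can tell) as an echoContent flag on TableOperation.Insert; here’s a quick sketch, with “Customers” again being a placeholder table name:

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

class NoEchoInsertSample
{
    static void InsertWithoutEcho(CloudStorageAccount account)
    {
        CloudTableClient tableClient = account.CreateCloudTableClient();
        CloudTable table = tableClient.GetTableReference("Customers"); // placeholder table name

        var entity = new DynamicTableEntity("partition1", "row1");
        entity.Properties["Name"] = new EntityProperty("Gaurav");

        // echoContent: false asks the service not to send the entity back in the
        // response body, which saves bandwidth when you don't need it.
        table.Execute(TableOperation.Insert(entity, echoContent: false));
    }
}
```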

More Changes

There are many more changes (and my fingers really hurt typing all this :)), so I would encourage you to check out the release notes here: http://msdn.microsoft.com/en-us/library/windowsazure/dd894041.aspx.

How to Use These Features

Before I end this post, let’s take a moment to talk briefly about how you can make use of these awesome features. Well, there are two ways by which you can do that:

  1. Use the REST API: You can consume the REST API directly, as these features are available in the core API. The link to the REST API documentation is here: http://msdn.microsoft.com/en-us/library/windowsazure/dd179355.aspx.
  2. Use the Storage Client Library: When the storage team released these changes at the REST API level, they also released a new version of the .Net Storage Client library (3.0.0.0) which has full fidelity with the REST API. If you want, you can download the .Net Storage Client Library through Nuget. One word of caution though: if you use this library, your code will not work in the storage emulator. Essentially, the storage emulator is still wired to use an older version of the REST API (2012-02-12) while the newer version is 2013-08-15. Furthermore, for the table storage service, the value for the “DataServiceVersion” and “MaxDataServiceVersion” request headers should be “3.0;NetFx” whereas the older version required “2.0;NetFx”. Needless to say, I learnt the lesson the hard way 🙂. However, we had to migrate to the latest version as the features introduced in this release were quite important for the product we are building at Cynapta. We actually upgraded from version 2.0.6.1 of the storage client library, and apart from the development storage issue, we didn’t encounter any issues whatsoever. If you are comfortable working against cloud storage all the time, I think it makes sense to go for the upgrade.

Summary

Though I said I would be brief, this turned out to be a rather big post :). Honestly, I couldn’t help it. There is so much good stuff in this release. I hope you have found this post useful. I just went through the documentation for a few hours and wrote this blog post, so there may be some inaccuracies here. If you do find any, please let me know and I will fix them ASAP.

Now onto writing some code which will actually consume these awesome features.

A Happy Thanksgiving to you!

So long!!!
