Gaurav Mantri's Personal Blog.

Uploading Large Files in Windows Azure Blob Storage Using Shared Access Signature, HTML, and JavaScript

In my last post I talked about Shared Access Signature feature of Windows Azure Storage. You can read that post here: http://gauravmantri.com/2013/02/13/revisiting-windows-azure-shared-access-signature/. In this post we’ll put that to some practical use :).

One thing I wanted to accomplish recently is the ability to upload very large files into Windows Azure Blob Storage from a web application. General approach is to read the file through your web application using “File” HTML control and upload that entire file to some server side code which would then upload the file in blob storage. This approach would work best for smaller files but would fail terribly when it comes to moderately to very large files as the file upload control would upload the entire file to the server (for bigger files, this would cause timeouts depending on your Internet connection) and then that file resides in the server memory before any action can be taken on that file (again for bigger files, this would cause performance issues assuming you have thousands of users uploading thousands of large files).

In this blog post, we’ll talk about how we can accomplish that using File API feature in HTML5.

Please be warned that I’m no JavaScript/HTML/CSS expert :). About 4 – 5 years ago I used to do a lot of JavaScript development and built some really insane applications using JavaScript (like mail merge kind of features, office automation, native device interfacing all using JS) but that was then. For last 4 – 5 years, I have been extensively doing desktop application development. So most of the JS code you’ll see below is copied from various sites and StackOverflow :).

Objectives

Following were some of the objectives I had in mind:

  1. The interface should be web based (possibly pure HTML).
  2. I don’t have to share my storage account credentials with the end users.
  3. If possible, there should not be any server side code (ASP.Net/PHP etc.) i.e. it should be pure HTML and the communication should be between user’s web browser and storage account.
  4. If possible, the solution should upload the file in chunks so that bigger files can be uploaded without reading them in completely in a single go.

Now let’s see how we can meet our objectives.

Solution

Here’re some of the things I did to achieve the objectives listed above.

Chunking File

The first challenge I ran into is how to upload the file in chunks. All the examples I saw were uploading an entire file however this is not what I wanted. Finally I ran into this example on HTML5 Rocks where they talked about the File API: http://www.html5rocks.com/en/tutorials/file/dndfiles/. Basically what caught my interest there is the “slicing” feature available in HTML 5’s File interface. In short, what this slice feature does is that it reads a portion of a file asynchronously and returns that data. This is exactly I was looking at. Once I found that, I knew 75% of my work was done! Everything else was a breeze :)

Securing Storage Account Credentials

Next thing on my list was securing storage account credentials and this was rather painless as I knew exactly what I had to do – Shared Access Signature. SAS provided me with a secured URI using which users will be able to upload files in my storage account without me giving them access to storage account credentials. This was covered extensively in my post about the same: Revisiting Windows Azure Shared Access Signature.

Direct Communication Between Client Application and Windows Azure Storage

Next thing was to facilitate direct communication between the client application and storage. As we know Windows Azure Storage is built on REST so that means I can simply use AJAX functionality to communicate with REST API. One important thing to understand is that Windows Azure Storage still does not support Cross-Origin Resource Sharing (CORS) at the time of writing this blog. What that means is that your web application and blob storage must be in the same domain. The solution to this problem is to host your HTML application in a public blob container in the same storage account where you want your users to upload the files. I’ve been told that CORS support is coming soon in Windows Azure Storage and once that happens then you need not host this application in that storage account but till then you would need to live with this limitation.

The Code

Now let’s look at the code.

HTML Interface

Since I was trying to hack my way through the code, I kept the interface rather simple. Here’s how my HTML code looks like:

<body>
    <form>
        <div style="margin-left: 20px;">
            <h1>File Uploader</h1>
            <p>
                <strong>SAS URI</strong>: 
                <br/>
                <span class="input-control text">
                    <input type="text" id="sasUrl" style="width: 50%"
                           value=""/>
                </span>
            </p>
            <p>
                <strong>File To Upload</strong>: 
                <br/>
                <span class="input-control text">
                    <input type="file" id="file" name="file" style="width: 50%"/>
                </span>
            </p>
            <div id="output">
                
                <strong>File Properties:</strong>
                <br/>
                <p>
                    Name: <span id="fileName"></span>
                </p>
                <p>
                    File Size: <span id="fileSize"></span> bytes.
                </p>
                <p>
                    File Type: <span id="fileType"></span>
                </p>
                <p>
                    <input type="button" value="Upload File" onclick="uploadFileInBlocks()"/>
                </p>
                <p>
                    <strong>Progress</strong>: <span id="fileUploadProgress">0.00 %</span>
                </p>
            </div>
        </div>
        <div>
        </div>
    </form>
</body>

image

All it has a textbox for a user to enter SAS URI and HTML File control. Once the user selects a file, I display the file properties and the “Upload” button to start uploading the file.

Reading File Properties

To display file properties, I made use of “onchange” event of File element. The event gave me a list of files. Since I was uploading just one file, I picked up the first file from that list and got its name (blob would have that name), size (blob’s size and determining chunk size) and type (for setting blob’s content type property).

//Bind the change event.
$("#file").bind('change', handleFileSelect);

        
        //Read the file and find out how many blocks we would need to split it.
        function handleFileSelect(e) {
            var files = e.target.files;
            selectedFile = files[0];
            $("#fileName").text(selectedFile.name);
            $("#fileSize").text(selectedFile.size);
            $("#fileType").text(selectedFile.type);
        }


Chunking

In my application I made an assumption that I will split the file in chunks of 256 KB size. Once I found the file’s size, I just found out the total number of chunks.

        
        //Read the file and find out how many blocks we would need to split it.
        function handleFileSelect(e) {
            maxBlockSize = 256 * 1024;
            currentFilePointer = 0;
            totalBytesRemaining = 0;
            var files = e.target.files;
            selectedFile = files[0];
            $("#output").show();
            $("#fileName").text(selectedFile.name);
            $("#fileSize").text(selectedFile.size);
            $("#fileType").text(selectedFile.type);
            var fileSize = selectedFile.size;
            if (fileSize < maxBlockSize) {
                maxBlockSize = fileSize;
                console.log("max block size = " + maxBlockSize);
            }
            totalBytesRemaining = fileSize;
            if (fileSize % maxBlockSize == 0) {
                numberOfBlocks = fileSize / maxBlockSize;
            } else {
                numberOfBlocks = parseInt(fileSize / maxBlockSize, 10) + 1;
            }
            console.log("total blocks = " + numberOfBlocks);
        }

Endpoint for File Uploading

The SAS URI actually represented a URI for blob container. Since I had to create an endpoint for uploading file, I split the URI (path and query) and appended the file name to the path and then re-appended the query to the end.

            var baseUrl = $("#sasUrl").val();
            var indexOfQueryStart = baseUrl.indexOf("?");
            submitUri = baseUrl.substring(0, indexOfQueryStart) + '/' + selectedFile.name + baseUrl.substring(indexOfQueryStart);
            console.log(submitUri);

Reading Chunk

This is where File API’s chunk function would come in picture. What happens in the code is that when the user clicks the upload button, I read a chunk of that file asynchronously and get a byte array. That byte array will be uploaded.


        var reader = new FileReader();
        var fileContent = selectedFile.slice(currentFilePointer, currentFilePointer + maxBlockSize);
        reader.readAsArrayBuffer(fileContent);

Uploading Chunk

Since I wanted to implement uploading in chunk, as soon as a chunk is read from the file, I create a Put Block request based on Put Block REST API specification using jQuery’s AJAX function and pass that chunk as data. Once this request is successfully completed, I read the next chunk and repeat the process till the time all chunks are processed.

        reader.onloadend = function (evt) {
            if (evt.target.readyState == FileReader.DONE) { // DONE == 2
                var uri = submitUri + '&comp=block&blockid=' + blockIds[blockIds.length - 1];
                var requestData = new Uint8Array(evt.target.result);
                $.ajax({
                    url: uri,
                    type: "PUT",
                    data: requestData,
                    processData: false,
                    beforeSend: function(xhr) {
                        xhr.setRequestHeader('x-ms-blob-type', 'BlockBlob');
                        xhr.setRequestHeader('Content-Length', requestData.length);
                    },
                    success: function (data, status) {
                        console.log(data);
                        console.log(status);
                        bytesUploaded += requestData.length;
                        var percentComplete = ((parseFloat(bytesUploaded) / parseFloat(selectedFile.size)) * 100).toFixed(2);
                        $("#fileUploadProgress").text(percentComplete + " %");
                        uploadFileInBlocks();
                    },
                    error: function(xhr, desc, err) {
                        console.log(desc);
                        console.log(err);
                    }
                });
            }
        };

Committing Blob

Last step in this process is to commit the blob in blob storage. For this I create a Put Block List request based on Put Block List REST API specification and process that request again using jQuery’s AJAX function and pass the block list as data. This completed the process.

        function commitBlockList() {
            var uri = submitUri + '&comp=blocklist';
            console.log(uri);
            var requestBody = '<?xml version="1.0" encoding="utf-8"?><BlockList>';
            for (var i = 0; i < blockIds.length; i++) {
                requestBody += '<Latest>' + blockIds[i] + '</Latest>';
            }
            requestBody += '</BlockList>';
            console.log(requestBody);
            $.ajax({
                url: uri,
                type: "PUT",
                data: requestBody,
                beforeSend: function (xhr) {
                    xhr.setRequestHeader('x-ms-blob-content-type', selectedFile.type);
                    xhr.setRequestHeader('Content-Length', requestBody.length);
                },
                success: function (data, status) {
                    console.log(data);
                    console.log(status);
                },
                error: function (xhr, desc, err) {
                    console.log(desc);
                    console.log(err);
                }
            });

Complete Code

Here’s the complete code. For CSS, I actually used Metro UI CSShttp://metroui.org.ua/. If you’re planning on building web applications and want to give them Windows 8 applications style look and feel, do give it a try. It’s pretty awesome + it’s open source. Really no reason for you to not give it a try!

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>File Uploader</title>
    <script src="js/jquery-1.7.1.js"></script>
    <link rel="stylesheet" href="css/modern.css"/>
    <script>
        var maxBlockSize = 256 * 1024;//Each file will be split in 256 KB.
        var numberOfBlocks = 1;
        var selectedFile = null;
        var currentFilePointer = 0;
        var totalBytesRemaining = 0;
        var blockIds = new Array();
        var blockIdPrefix = "block-";
        var submitUri = null;
        var bytesUploaded = 0;
        
        $(document).ready(function () {
            $("#output").hide();
            $("#file").bind('change', handleFileSelect);
            if (window.File && window.FileReader && window.FileList && window.Blob) {
                // Great success! All the File APIs are supported.
            } else {
                alert('The File APIs are not fully supported in this browser.');
            }
        });
        
        //Read the file and find out how many blocks we would need to split it.
        function handleFileSelect(e) {
            maxBlockSize = 256 * 1024;
            currentFilePointer = 0;
            totalBytesRemaining = 0;
            var files = e.target.files;
            selectedFile = files[0];
            $("#output").show();
            $("#fileName").text(selectedFile.name);
            $("#fileSize").text(selectedFile.size);
            $("#fileType").text(selectedFile.type);
            var fileSize = selectedFile.size;
            if (fileSize < maxBlockSize) {
                maxBlockSize = fileSize;
                console.log("max block size = " + maxBlockSize);
            }
            totalBytesRemaining = fileSize;
            if (fileSize % maxBlockSize == 0) {
                numberOfBlocks = fileSize / maxBlockSize;
            } else {
                numberOfBlocks = parseInt(fileSize / maxBlockSize, 10) + 1;
            }
            console.log("total blocks = " + numberOfBlocks);
            var baseUrl = $("#sasUrl").val();
            var indexOfQueryStart = baseUrl.indexOf("?");
            submitUri = baseUrl.substring(0, indexOfQueryStart) + '/' + selectedFile.name + baseUrl.substring(indexOfQueryStart);
            console.log(submitUri);
        }

        var reader = new FileReader();

        reader.onloadend = function (evt) {
            if (evt.target.readyState == FileReader.DONE) { // DONE == 2
                var uri = submitUri + '&comp=block&blockid=' + blockIds[blockIds.length - 1];
                var requestData = new Uint8Array(evt.target.result);
                $.ajax({
                    url: uri,
                    type: "PUT",
                    data: requestData,
                    processData: false,
                    beforeSend: function(xhr) {
                        xhr.setRequestHeader('x-ms-blob-type', 'BlockBlob');
                        xhr.setRequestHeader('Content-Length', requestData.length);
                    },
                    success: function (data, status) {
                        console.log(data);
                        console.log(status);
                        bytesUploaded += requestData.length;
                        var percentComplete = ((parseFloat(bytesUploaded) / parseFloat(selectedFile.size)) * 100).toFixed(2);
                        $("#fileUploadProgress").text(percentComplete + " %");
                        uploadFileInBlocks();
                    },
                    error: function(xhr, desc, err) {
                        console.log(desc);
                        console.log(err);
                    }
                });
            }
        };

        function uploadFileInBlocks() {
            if (totalBytesRemaining > 0) {
                console.log("current file pointer = " + currentFilePointer + " bytes read = " + maxBlockSize);
                var fileContent = selectedFile.slice(currentFilePointer, currentFilePointer + maxBlockSize);
                var blockId = blockIdPrefix + pad(blockIds.length, 6);
                console.log("block id = " + blockId);
                blockIds.push(btoa(blockId));
                reader.readAsArrayBuffer(fileContent);
                currentFilePointer += maxBlockSize;
                totalBytesRemaining -= maxBlockSize;
                if (totalBytesRemaining < maxBlockSize) {
                    maxBlockSize = totalBytesRemaining;
                }
            } else {
                commitBlockList();
            }
        }
        
        function commitBlockList() {
            var uri = submitUri + '&comp=blocklist';
            console.log(uri);
            var requestBody = '<?xml version="1.0" encoding="utf-8"?><BlockList>';
            for (var i = 0; i < blockIds.length; i++) {
                requestBody += '<Latest>' + blockIds[i] + '</Latest>';
            }
            requestBody += '</BlockList>';
            console.log(requestBody);
            $.ajax({
                url: uri,
                type: "PUT",
                data: requestBody,
                beforeSend: function (xhr) {
                    xhr.setRequestHeader('x-ms-blob-content-type', selectedFile.type);
                    xhr.setRequestHeader('Content-Length', requestBody.length);
                },
                success: function (data, status) {
                    console.log(data);
                    console.log(status);
                },
                error: function (xhr, desc, err) {
                    console.log(desc);
                    console.log(err);
                }
            });

        }
        function pad(number, length) {
            var str = '' + number;
            while (str.length < length) {
                str = '0' + str;
            }
            return str;
        }
    </script>
</head>
<body>
    <form>
        <div style="margin-left: 20px;">
            <h1>File Uploader</h1>
            <p>
                <strong>SAS URI</strong>: 
                <br/>
                <span class="input-control text">
                    <input type="text" id="sasUrl" style="width: 50%"
                           value=""/>
                </span>
            </p>
            <p>
                <strong>File To Upload</strong>: 
                <br/>
                <span class="input-control text">
                    <input type="file" id="file" name="file" style="width: 50%"/>
                </span>
            </p>
            <div id="output">
                
                <strong>File Properties:</strong>
                <br/>
                <p>
                    Name: <span id="fileName"></span>
                </p>
                <p>
                    File Size: <span id="fileSize"></span> bytes.
                </p>
                <p>
                    File Type: <span id="fileType"></span>
                </p>
                <p>
                    <input type="button" value="Upload File" onclick="uploadFileInBlocks()"/>
                </p>
                <p>
                    <strong>Progress</strong>: <span id="fileUploadProgress">0.00 %</span>
                </p>
            </div>
        </div>
        <div>
        </div>
    </form>
</body>
</html>

Some Caveats

This makes use of HTML5 File API and while all new browsers support that, same can’t be said about older browsers. If your users would be accessing an application like this using older browsers, you would need to think about alternative approaches. You could either make use of SWF File Uploader or could write an application using Silverlight. Steve Marx wrote a blog post about uploading files using Shared Access Signature and Silverlight which you can read here: http://blog.smarx.com/posts/uploading-windows-azure-blobs-from-silverlight-part-1-shared-access-signatures.

I found the code working in IE 10, Google Chrome (version 24.0.1312.57 m) on my Windows 8 machine. I got error when I tried to run the code in FireFox (version 18.0.2) and Safari (version 5.1.7) browsers so obviously one would need to keep the browser incompatibility in mind.

Enhancements

I hacked this code in about 4 hours or so and obviously my knowledge is somewhat limited when it comes to JavaScript and CSS so a lot can be improved on that front :). However some other features I could think of are:

Generate SAS on demand: You could possibly have a server side component which would generate SAS URI on demand instead of having a user enter that manually.

Multiple file uploads: This application can certainly be extended to upload multiple files. A user would select multiple files (or may be even a folder) and have the application upload multiple files.

Drag/drop support: This application can certainly be extended to support drag/drop scenario where users could drag files from their desktop and drop them to upload.

Do upload in Web Worker: This is another improvement that can be done where uploads are done through web worker capability in HTML5.

Parallel uploads: Currently the code uploads one chunk at a time. A modification could be to upload multiple chunks simultaneously.

Transient error handling: Since Windows Azure Storage is a remote shared resource, you may encounter transient errors. You could modify the application to handle these transient errors. For more details on transient errors, please see this blog post of mine: http://gauravmantri.com/2013/01/11/some-best-practices-for-building-windows-azure-cloud-applications/.

Summary

So that’s it for this post! As you saw, it is quite easy to implement a very simple HTML/JS based application for getting data into Windows Azure Blob Storage. Obviously there’re some limitations and there’s cross-browser compatibility issues one would need to consider but once those are sorted out, it opens up a lot of exciting opportunities. I hope you’ve found this post useful. As always, if you find any issues with the post please let me know and I’ll fix it ASAP.

Happy Coding!


[This is the latest product I'm working on]

Comments

  1. Natalia says:

    Thanks, very interesting sample and idea (even without CORS support from Storage side). I deployed the page, scripts to my storage (the same where files will be uploaded) but I’ve got InvalidXmlDocument (XML specified is not syntactically valid) exception in final stage (commit: comp=blocklist) but the request seems to be ok. Maybe I did something wrong.

  2. Very nice!

    I did something similar (without the chunking) a little while back: http://coderead.wordpress.com/2012/11/21/uploading-files-directly-to-blob-storage-from-the-browser/

    I wonder if there is a way of creating this as a jQuery plugin or something, to make it really easy to add to a site?

  3. Sumit Kathuria says:

    Thanks for sharing this article and would also like to ask about the approaches for doing parallel chunk upload. As a javascript is based on single threaded model, is the webworkers are only possible way to upload parallel chunks? Would that be good solution, if i slice the file in to multiple pieces and upload each of them using webworkers.

  4. I get the SAS URI from the blob storage service, but when the file is being sent, I get the following error message – “SCRIPT7002: XMLHttpRequest: Network Error 0×80070005, Access is denied.” Is this due to some setting on the Azure server or is it some other problem? Any help is greatly appreciated.

  5. Why do we need SAS Url just to upload a file?

    • Since the container is set to private permissions and the file is going straight to the server via a public link, you need the SAS to allow for write permissions.

      My application is actually calculating the SAS locally with an MVC controller action and sending it back out to the page to be used, but isn’t uploading it to the local server, just saving it directly to azure storage.

  6. I know that this topic is quite old, but I need to upload big file to Azure Blob Storage now, so I try to post my question anyway! ;)

    I followed this tutorial step by step and it’s great! I have a Mobile Service in Windows Azure which creates a SAS for me and, when I run the web app locally, it works perfectly.

    Then I uploaded the web app into my web server and the file uploader component, which relies on this tutorial, does not work anymore. The problem is that every browser blocks the requests due to CORS.

    I followed another of your tutorials: I downloaded, built and run “Azure CORS rule manager”, and I set up a rule that states:
    Allowed origins: *
    Allowed methods: Get, Put
    Allowed headers: *
    Exposed headers: x-ms-*
    Max age (seconds): 3600

    …but my requests are still blocked because of CORS. :(

    Am I missing something?
    I don’t know how to solve this problem..

    • Hi Rik,

      Your settings look OK to me. In fact I tried uploading some files using the settings you’re using and SAS and I was able to upload the file. Can you share more details about your project? We can take it offline if you like. Please email me details at gaurav at Cynapta.com. Thanks.

  7. How to prevent hackers to upload large files to my Azure storage using SAS?
    My iOS app needs to upload image to my Azure storage but if a hacker use Fiddler to capture my request, they can upload anything they want, correct?
    Is there anyway to protect my Azure storage?

    Thanks!

    • One thing you could do is to issue SAS URLs with relatively shorter timespans. So if you think normally it would take 1 minute to upload a file, then create a SAS Token which is valid for 1 minute only. Get a new token for each file a user wants to upload. Also make use of Access Policy when issuing SAS tokens. With access tokens it is easier to revoke SAS URLs. You just delete that access policy if you think something’s fishy is going on. Hope this helps.

      • Thanks!
        If I have 1 million users(hopefully…), is it ok to create and delete SAS or policy frequently?
        If I have to do this, the app needs to make another call to get SAS first and make uploading call which increasing one network round trip.
        I guess it’s ok, I will try it but is there any better way?

        Thanks.

  8. this really saved my ass

Trackbacks

  1. [...] Uploading Large Files in Windows Azure Blob Storage Using Shared Access Signature, HTML, and JavaScr… by @gmantri (posted Feb. [...]

  2. [...] Here is excellent sample for uploading files directly to Azure Blob. Sample will not include server side code of generating of SAS and it cannot be used yet as it is. Azure BLOB does not yet support CORS so whole script needs to be saved to same BLOB also. But (soon ?) when this will be possible it will be the ultimate way of uploading huge files to server. [...]

  3. [...] first and then push it into blob storage from there or host the upload code in blob storage itself (http://gauravmantri.com/2013/02/16/uploading-large-files-in-windows-azure-blob-storage-using-shared-…). Now you don’t need to do that. Once the CORS is enabled, you can simply upload the files [...]

  4. […] Upload large file to blob storage via Javascript […]

Speak Your Mind

*