Gaurav Mantri's Personal Blog.

Windows Azure Blob Storage – Dealing With “The specified blob or block content is invalid” Error

If you’re uploading blobs by splitting them into blocks and you get the error “The specified blob or block content is invalid”, then this post is for you.

Short Version

If you’re uploading blobs by splitting them into blocks and you get the above-mentioned error, make sure that all the block ids for that blob are of the same length. If they’re of different lengths, you’ll get this error. This includes uncommitted blocks left over from an earlier, cancelled upload of the same blob.

Long Version

Now for the longer version of this post. A few days back I was working with the storage client library, specifically on uploading blobs in chunks, and with one particular blob I kept getting the error “The specified blob or block content is invalid”. I tried numerous combinations, even resorting to calling the REST API directly, but to no avail. It happened with just that one blob. Furthermore, if I uploaded the same blob without splitting it into blocks, all was well. I was at my wits’ end. I tried searching the Internet for this error but could not find a conclusive answer to my problem.

After much trial and error, I was able to simulate the same problem on other blobs as well. Here’s how you can recreate it:

  1. Start uploading the blob by splitting it into blocks. For the block id, use a 7-character string, e.g. intValue.ToString("d7"). This gives block ids “0000001”, “0000002”, …, “0000010”, and so on.
  2. After one or two blocks are uploaded, cancel the operation.
  3. Now re-upload the blob by splitting it into blocks. This time, however, use a 6-character string for the block id, e.g. intValue.ToString("d6").
  4. You’ll get the error as soon as you try to upload the 1st block.
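The mismatch in the steps above can be seen without touching storage at all. The following is a minimal sketch, not code from the original upload routine: the class name BlockIdDemo and the helper MakeBlockId are made up for illustration. It pads a block index to a given width and Base64-encodes it, which is the form a block id takes when it is sent to the service.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; the names are not part of any library.
public static class BlockIdDemo
{
    // Zero-pads the block index to `width` digits and Base64-encodes the result.
    public static string MakeBlockId(int index, int width)
    {
        string raw = index.ToString("d" + width);   // e.g. "0000001" for width 7
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(raw));
    }
}
```

Calling BlockIdDemo.MakeBlockId(1, 7) yields a 12-character id, while BlockIdDemo.MakeBlockId(1, 6) yields an 8-character id. If blocks with the first kind of id are still sitting uncommitted for a blob, sending a block with the second kind is what triggers the error.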

Possible Solutions

Now that we know the root cause of this problem, let’s look at some possible solutions.

Wait it out

One possible solution is to wait it out. I know it’s lame, but it’s still a possible solution. We know that the Windows Azure Blob Storage Service keeps all uncommitted blocks for 7 days, and if those blocks are not committed within 7 days, the storage service purges them.

I wish storage service provided some mechanism to purge uncommitted blocks programmatically.

Commit uncommitted blocks

You could commit the blocks which are in the uncommitted state so that at least you get a blob (which would not be the blob you wanted to upload in the first place). You can then delete that blob and re-upload the original by specifying block ids which are all of the same length. To fetch the list of uncommitted blocks using the REST API directly, perform the “Get Block List” operation and pass “blocklisttype=uncommitted” as one of the query string parameters. If you’re using the storage client library (assuming version 2.x of the .Net storage client library), you can do something like the code below:

        using System;
        using System.Collections.Generic;
        using System.Linq;
        using System.Net;
        using System.Text;
        using Microsoft.WindowsAzure.Storage.Blob;
        using Microsoft.WindowsAzure.Storage.Blob.Protocol;

        // Returns the decoded ids of all uncommitted blocks of the given blob.
        private static List<string> GetUncommittedBlockIds(CloudBlockBlob blob)
        {
            // Short-lived, read-only SAS token so the raw request needs no Authorization header.
            var sasToken = blob.GetSharedAccessSignature(new SharedAccessBlobPolicy()
            {
                SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(5),
                Permissions = SharedAccessBlobPermissions.Read,
            });
            var blobUri = new Uri(string.Format("{0}{1}", blob.Uri, sasToken));
            var uncommittedBlockIds = new List<string>();
            // Build a "Get Block List" request that asks only for uncommitted blocks.
            var request = BlobHttpWebRequestFactory.GetBlockList(blobUri, null, null, BlockListingFilter.Uncommitted, null, null);
            using (var resp = (HttpWebResponse)request.GetResponse())
            using (var stream = resp.GetResponseStream())
            {
                var getBlockListResponse = new GetBlockListResponse(stream);
                foreach (var block in getBlockListResponse.Blocks.Where(b => !b.Committed))
                {
                    // Block names come back Base64 encoded; decode them to the original id.
                    uncommittedBlockIds.Add(Encoding.UTF8.GetString(Convert.FromBase64String(block.Name)));
                }
            }
            return uncommittedBlockIds;
        }

A few things to keep in mind here:

Fetch uncommitted blocks to see block id length

You could fetch the list of uncommitted blocks just to find out the length of the block id used. You could then use that block id length for your new upload session and do the upload. Please see the code snippet above to find this information.
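Once you have the decoded ids of the uncommitted blocks, working out the length to reuse takes only a few lines. The sketch below is an assumption-laden illustration: the class BlockIdLength and both of its methods are hypothetical names, and it assumes your ids are zero-padded integers as in the examples in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; the names are not part of any library.
public static class BlockIdLength
{
    // Returns the length shared by all given (decoded) block ids,
    // or null when there are no uncommitted blocks at all.
    public static int? GetCommonLength(IEnumerable<string> blockIds)
    {
        List<int> lengths = blockIds.Select(id => id.Length).Distinct().ToList();
        if (lengths.Count == 0) return null;
        if (lengths.Count > 1)
            throw new InvalidOperationException("Block ids have mixed lengths.");
        return lengths[0];
    }

    // Builds a new zero-padded id with exactly the given length.
    public static string MakeMatchingBlockId(int index, int length)
    {
        string id = index.ToString("d" + length);
        if (id.Length != length)
            throw new ArgumentOutOfRangeException("index", "Index does not fit in the given id length.");
        return id;
    }
}
```

For example, if the uncommitted ids are “0000001” and “0000002”, GetCommonLength returns 7 and MakeMatchingBlockId(3, 7) returns “0000003”. Remember that the id still has to be Base64 encoded before it is passed to the service.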

Upload another blob with same name without splitting it into blocks

You could also upload another blob with the same name without splitting it into blocks. It could very well be a zero byte blob. That way your uncommitted block list will be wiped clean. Then you could delete that dummy blob and re-upload the actual blob.

A Few Words About Blocks

Since we’re talking about blocks, I thought it might be useful to mention a few points about them:

  • Blocks and block related operations are only applicable for “Block Blobs”. Duh!! You’ll get an error if you’re trying to do these operations on a “Page Blob”.
  • For uploading large blobs, it is recommended that you split your blob into blocks. In fact if your blob size is more than 64 MB, then you have to split it into blocks.
  • The minimum size of a block is 1 byte and the maximum size of a block is 4 MB. It is recommended that you choose a block size based on your internet connectivity and the number of parallel threads you want to use to upload these blocks.
  • A blob can be split into a maximum of 50000 blocks. It’s important to remember this limit, because otherwise the service reminds you of it when you try to upload the 50001st block.
  • The length of all the block ids must be the same. So if you’re using an integer value to denote the block id, make sure that you pad that integer value with “0” so that every id has the same length. You could do something like intValue.ToString("d6").
  • When passing the block id as a parameter, it must be Base64 encoded.
  • While the order in which the blocks are uploaded is not important, the order is important when you commit the block list, because that’s when the blob is constructed by the service. For example, let’s say you’re uploading a blob by splitting it into 5 blocks (with ids “000001”, “000002”, “000003”, “000004”, and “000005”). You could upload these blocks in any order (say 000004, 000001, 000003, 000005, 000002); however, when you commit the block list, ensure that the block ids are passed in the proper order, i.e. 000001, 000002, 000003, 000004, 000005.
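A few of the points above (the 4 MB block size, the 50000 block limit, equal-length Base64 ids and the commit order) can be put together in a small planning helper. This is only a sketch under the limits quoted in this post; the class BlockPlan and its members are hypothetical names, and the 6-digit id width is simply the one used in the examples here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; the names are not part of any library.
public static class BlockPlan
{
    public const int MaxBlocks = 50000;               // block count limit quoted in this post
    public const int MaxBlockSize = 4 * 1024 * 1024;  // 4 MB block size limit quoted in this post

    // Number of blocks needed for a blob of the given size, validated against the limits.
    public static int CountBlocks(long blobSize, int blockSize)
    {
        if (blobSize < 1) throw new ArgumentOutOfRangeException("blobSize");
        if (blockSize < 1 || blockSize > MaxBlockSize) throw new ArgumentOutOfRangeException("blockSize");
        long count = (blobSize + blockSize - 1) / blockSize;   // round up
        if (count > MaxBlocks)
            throw new InvalidOperationException("Too many blocks; choose a larger block size.");
        return (int)count;
    }

    // Turns block indexes (in whatever order they were uploaded) into the ordered,
    // Base64-encoded id list to pass when committing the block list.
    public static List<string> BuildCommitList(IEnumerable<int> uploadedIndexes)
    {
        return uploadedIndexes
            .OrderBy(i => i)
            .Select(i => Convert.ToBase64String(Encoding.UTF8.GetBytes(i.ToString("d6"))))
            .ToList();
    }
}
```

With these assumptions, a 10 MB blob cut into 4 MB blocks needs 3 blocks, and BuildCommitList(new[] { 4, 1, 3, 5, 2 }) returns the ids for 000001 through 000005 in that order, regardless of the order in which the blocks were uploaded.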

Summary

That’s it for this post. I hope you’ve found this information useful. I spent a considerable amount of time trying to fix this problem, so I hope it will help some folks out. As always, if you find any issues with the post, please let me know and I’ll fix them ASAP.

Happy Coding!
