Gaurav Mantri's Personal Blog.

How To Copy an Object from Amazon S3 to Windows Azure Blob Storage using “Copy Blob”

Yesterday I wrote a blog post summarizing the updates made to Windows Azure Storage. You can read that blog post here: https://gauravmantri.com/2012/06/13/updates-to-windows-azure-storagesummary-of-new-features/.

One of the significant improvements is to the “Copy Blob” function. While preparing my blog post, I referred to the excellent blog post by the Windows Azure Storage team about this functionality. You can read that blog post here: http://blogs.msdn.com/b/windowsazurestorage/archive/2012/06/12/introducing-asynchronous-cross-account-copy-blob.aspx. One thing that caught my eye in that post was a statement that Copy Blob can copy blobs from outside of Windows Azure as long as they are accessible either publicly or via a pre-signed URL. They need not be in Windows Azure.

THIS IS HUGE!!!

This got me thinking. I mean, now I have an option of moving my files easily into Windows Azure Blob Storage, where most of the work of copying will be done by Windows Azure. So I thought, why not write a simple application which will try to copy a file (object) from Amazon S3 to Windows Azure Blob Storage? I was able to achieve this in a matter of hours (hours because I didn’t have an account with Amazon S3 until yesterday, and my knowledge of the Storage Client library is somewhat limited). Anyway, the point I am trying to make is: it actually works, and it’s really simple to do.

Why Do It?

I can think of a few use cases where you would want to copy stuff from another cloud provider to Windows Azure Blob Storage. Some of them are:

Windows Azure Blob Storage is Really Compelling

With all the new enhancements and cost reductions that were announced, it is a really compelling alternative to other cloud storage providers. That could be one of the factors for you to switch from your existing cloud storage provider to Windows Azure Storage. Or you may have always wanted to switch to Windows Azure but were worried about how to move your stored data. This new functionality makes that super easy.

Windows Azure Blob Storage as Backup for Existing Cloud Storage

You can now easily use Windows Azure Blob Storage as a backup for your existing cloud storage. This was possible earlier as well, but it was a rather cumbersome process and you had to write a lot of code to achieve it.

With the latest enhancements to Windows Azure Storage, it’s really simple. You don’t have to write a single line of code to perform the actual byte-level copying from your source to Windows Azure Blob Storage. The Windows Azure Storage platform does all the heavy lifting for you. You just tell it the source and the target, fire off the copy, and it takes care of the rest. Things can’t get better than this.

Show Me The Code!!!

OK, enough talking! Let’s see some code. I ended up building a simple console application (source code below). But before I show you the code, there are a few more things I want to talk about.

Prerequisites

There are a few things you need to do before you can actually run the code below:

  1. Create a new storage account: Please note that this functionality will only work if your storage account was created after 7th June 2012. If you already have a storage account created after this date, you can use it; otherwise, head over to the new and improved Windows Azure Portal and create a new storage account. Have your account name and key handy.
  2. Get the latest storage client library: At the time of writing, the officially released SDK version was 1.7. Unfortunately this functionality is not available in that version. You need version 1.7.1, which you can get from GitHub. Get the source code and compile it.
  3. Make sure the source object/blob is accessible: Based on the blog post above, Copy Blob can copy a blob from outside of Windows Azure as long as it is readable by URL. Either the object allows public READ access, or you have a pre-signed object URL for it (similar to a Shared Access Signature URL), which grants temporary read access to the object. You could use Amazon’s SDK to create these URLs on the fly. Please refer to http://docs.amazonwebservices.com/AmazonS3/latest/dev/ShareObjectPreSignedURLDotNetSDK.html for instructions on how to do so using Amazon’s SDK for .Net.

The Code

As I said, the code is really simple and is shown below:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

namespace CopyAmazonObjectToBlobStorage
{
    class Program
    {
        private static string azureStorageAccountName = "<Your Windows Azure Storage Account Name>";
        private static string azureStorageAccountKey = "<Your Windows Azure Storage Account Key>";
        private static string azureBlobContainerName = "<Windows Azure Blob Container Name>";
        private static string amazonObjectUrl = "<URL of an Object stored in Amazon S3>";
        private static string azureBlobName = "<Name of the Blob in Windows Azure Storage>";
        static void Main(string[] args)
        {
            CloudStorageAccount csa = new CloudStorageAccount(new StorageCredentialsAccountAndKey(azureStorageAccountName, azureStorageAccountKey), true);
            CloudBlobClient blobClient = csa.CreateCloudBlobClient();
            var blobContainer = blobClient.GetContainerReference(azureBlobContainerName);
            Console.WriteLine("Trying to create the blob container....");
            blobContainer.CreateIfNotExist();
            Console.WriteLine("Blob container created....");
            var blockBlob = blobContainer.GetBlockBlobReference(azureBlobName);
            Console.WriteLine("Created a reference for block blob in Windows Azure....");
            Console.WriteLine("Blob Uri: " + blockBlob.Uri.AbsoluteUri);
            Console.WriteLine("Now trying to initiate copy....");
            blockBlob.StartCopyFromBlob(new Uri(amazonObjectUrl), null, null, null);
            Console.WriteLine("Copy started....");
            Console.WriteLine("Now tracking blob's copy progress....");
            bool continueLoop = true;
            while (continueLoop)
            {
                Console.WriteLine("");
                Console.WriteLine("Fetching lists of blobs in Azure blob container....");
                var blobsList = blobContainer.ListBlobs(true, BlobListingDetails.Copy);
                foreach (var blob in blobsList)
                {
                    var tempBlockBlob = (CloudBlob) blob;

                    // Only the blob we asked Windows Azure to create is of interest here.
                    if (tempBlockBlob.Name == azureBlobName)
                    {
                        var copyStatus = tempBlockBlob.CopyState;
                        if (copyStatus != null)
                        {
                            Console.WriteLine("Status of blob copy...." + copyStatus.Status);
                            Console.WriteLine("Total bytes to copy...." + copyStatus.TotalBytes);
                            Console.WriteLine("Total bytes copied...." + copyStatus.BytesCopied);
                            if (copyStatus.TotalBytes > 0)
                            {
                                // Multiply by 100.0 first so the division is not truncated to an integer.
                                var percentComplete = copyStatus.BytesCopied * 100.0 / copyStatus.TotalBytes;
                                Console.WriteLine("Percent complete...." + percentComplete);
                            }
                            if (copyStatus.Status != CopyStatus.Pending)
                            {
                                continueLoop = false;
                            }
                        }
                    }
                }
                Console.WriteLine("");
                Console.WriteLine("==============================================");
                System.Threading.Thread.Sleep(1000);
            }

            Console.WriteLine("Process completed....");
            Console.WriteLine("Press any key to terminate the program....");
            Console.ReadLine();
        }
    }
}

The code is pretty straightforward. I specify the Windows Azure Storage account credentials and the URL of the source blob (which happens to be in Amazon S3). Then I create a blob container in Windows Azure Storage and instruct Windows Azure Storage to start copying the blob from the source URL. Once the copy request is sent, all the application does is poll the status of the copy operation once a second until it is no longer pending. As you can see, you don’t write a single line of code to copy bytes from source to target. It’s all taken care of by Windows Azure.
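One detail worth getting right when reporting progress: a percentage needs floating-point division, because dividing one integer byte count by another truncates to zero until the copy finishes. Here is a minimal sketch of that calculation. The class and method names (CopyProgress, PercentComplete, Describe) are made up for illustration and are not part of the Storage Client library, and I am assuming the two byte counts arrive as (possibly missing) 64-bit integers.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the Storage Client library.
static class CopyProgress
{
    // Returns the percentage copied, or null when progress is not known yet
    // (a count is missing, or the total is zero and would divide by zero).
    public static double? PercentComplete(long? bytesCopied, long? totalBytes)
    {
        if (bytesCopied == null || totalBytes == null || totalBytes.Value <= 0)
        {
            return null;
        }

        // Multiply by 100.0 first so the division happens in floating point.
        return bytesCopied.Value * 100.0 / totalBytes.Value;
    }

    // Formats the progress as a short message for console output.
    public static string Describe(long? bytesCopied, long? totalBytes)
    {
        double? percent = PercentComplete(bytesCopied, totalBytes);
        return percent.HasValue
            ? string.Format(CultureInfo.InvariantCulture, "{0:F1}% copied", percent.Value)
            : "progress not available yet";
    }
}
```

In the tracking loop you could then write something like Console.WriteLine(CopyProgress.Describe(copyStatus.BytesCopied, copyStatus.TotalBytes)), assuming those two properties hold the byte counts.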

Here’s how the object looked in Amazon S3:

[Screenshot: the object in Amazon S3]

After copying the object, I could see it in Windows Azure Blob Storage using Cloud Storage Studio:

[Screenshot: the copied blob in Cloud Storage Studio]

Summary

As you saw, Windows Azure has made it really easy for you to bring stuff into Windows Azure. If you’re considering moving from another cloud platform to Windows Azure and are somewhat concerned about onboarding friction, this is one step towards reducing that friction.

I’ve created a simple example in which I copied just a single file from Amazon S3 to Windows Azure Blob Storage. We can extend this functionality to copy all objects from a bucket in Amazon S3 to a Windows Azure Blob Container. If nobody else beats me to it, I will write a sample application which does just that.
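To give a rough idea of what that extension might involve, here is a sketch of just the bookkeeping part: turning a list of object keys into (source URL, blob name) pairs that could each be handed to a copy request. Everything named here (S3CopyPlan, SourceUrl, Build) is hypothetical. I am also assuming that the objects are publicly readable and that the bucket name works as a host name in Amazon’s virtual-hosted-style URLs; private objects would need pre-signed URLs instead. Listing the bucket and starting the copies are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for planning a bucket-to-container copy.
static class S3CopyPlan
{
    // Builds the virtual-hosted-style URL of a publicly readable S3 object.
    public static Uri SourceUrl(string bucketName, string objectKey)
    {
        // Escape each path segment separately so "/" keeps separating segments.
        string escapedKey = string.Join(
            "/",
            objectKey.Split('/').Select(segment => Uri.EscapeDataString(segment)));

        return new Uri("https://" + bucketName + ".s3.amazonaws.com/" + escapedKey);
    }

    // Pairs every source URL with the blob name to use in Windows Azure.
    // The object key is reused unchanged as the blob name.
    public static List<KeyValuePair<Uri, string>> Build(
        string bucketName, IEnumerable<string> objectKeys)
    {
        return objectKeys
            .Select(key => new KeyValuePair<Uri, string>(SourceUrl(bucketName, key), key))
            .ToList();
    }
}
```

Each pair could then be fed to the same steps as in the single-file example above: get a block blob reference for the blob name and start a copy from the source URL.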

I hope you have found this information useful. If you think I have made errors or provided incorrect information, please feel free to correct me by leaving a comment. I’ll fix those issues ASAP.

Stay tuned for more Windows Azure related posts. So Long!!!

