For a change, this is a technical post that is not about Windows Azure :). For the service we are building at Cynapta, we needed to upload large files through a purely browser-based interface. For that we used the Query String authentication mechanism available in Amazon S3, which provides time-limited access to resources in Amazon S3 (in other words, a Pre Signed URL). In this blog post, we will talk about how we accomplished this using C#.
How Large Files are Uploaded in Amazon S3
Before we talk about using Query String authentication in Amazon S3, let’s take a moment and talk about how large files are uploaded in Amazon S3 and then we will focus on the issue at hand.
In order to upload large files, Amazon S3 provides a mechanism called “Multi Part Upload”. Using this mechanism, you essentially chunk a large file into smaller pieces (called “Parts” in Amazon S3 terminology) and upload these chunks. Once all parts are uploaded, you tell Amazon S3 to join these parts together and create the desired object.
To do a “Multi Part Upload”, one would go through the following steps:
Initiate Multipart Upload
This is where you basically tell Amazon S3 that you will be uploading a large file. When Amazon S3 receives this request, it sends back an “Upload Id” which you have to use in subsequent requests. To learn more about this process, please click here: http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadInitiate.html.
Upload Part
This is where you split the large file into parts and upload these parts. A few things to remember are:
- Each part must be at least 5 MB in size, with the exception of the last part.
- Each part is assigned a sequential part number (starting from 1), and a file can be split into at most ten thousand (10,000) parts.
To learn more about this process, please click here: http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html.
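To make these two limits concrete, here is a minimal sketch of how one might pick a part size for a given file size. `PartPlanner` and its members are hypothetical names I made up for illustration, not part of any SDK; the only facts taken from the rules above are the 5 MB minimum and the 10,000 part maximum.

```csharp
using System;

public static class PartPlanner
{
    public const long MinPartSize = 5L * 1024 * 1024; // 5 MB minimum (only the last part may be smaller)
    public const int MaxParts = 10000;                // part numbers run from 1 to 10,000

    // Smallest part size (never below 5 MB) that keeps the part count within the limit.
    public static long ChoosePartSize(long fileSize)
    {
        if (fileSize <= 0) throw new ArgumentOutOfRangeException("fileSize");
        long needed = (fileSize + MaxParts - 1) / MaxParts; // ceiling division
        return Math.Max(MinPartSize, needed);
    }

    // Number of parts a file of the given size splits into.
    public static int CountParts(long fileSize, long partSize)
    {
        return (int)((fileSize + partSize - 1) / partSize);
    }
}
```

For a 12 MB file this gives 5 MB parts and three parts in total (5 MB, 5 MB and a final 2 MB part). For very large files the part size grows so that the count stays at or below 10,000.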
Complete Multipart Upload
Once all parts are uploaded, using this step you basically tell Amazon S3 to join all the parts together to create the object. To learn more about this process, please click here: http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadComplete.html.
Challenge
For the rest of the service, we used the AWS SDK for .NET and, more or less, it worked really well for us. For normal uploads where we didn’t have to split the file into parts, the SDK worked great because it provides a function to create a Pre Signed URL. However, we noticed that there was no support for generating Pre Signed URLs for a Multi Part Upload, and unfortunately I was not able to find any .NET (C#) code that does so.
So we ended up consuming Amazon S3 REST API in C# code to create Pre Signed URL.
Code
Let’s get into the code right away. I created a console application and wrote some helper functions that call the Amazon S3 REST API. For the purpose of demonstration, let’s assume that we want to upload a file called “verylargefile.exe” to a bucket called “gaurav-test-bucket”, which is hosted in the “us-west-2” region. Based on this, my object URL will be “https://gaurav-test-bucket.s3-us-west-2.amazonaws.com/verylargefile.exe”. Since we are uploading using Pre Signed URLs, we will assume that each URL is valid for 12 hours from the time we start the process.
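As a small illustration of how that object URL is put together, here is a sketch that builds it from the bucket, region and object key. `ObjectUrl.Build` is a hypothetical helper of mine, and the host format is simply the one used in this post; other endpoint styles exist and are not handled here.

```csharp
using System;

public static class ObjectUrl
{
    // Builds a URL of the form used in this post:
    // https://{bucket}.s3-{region}.amazonaws.com/{key}
    public static Uri Build(string bucket, string region, string key)
    {
        // Escape each path segment of the key separately so that '/' separators survive.
        var segments = Array.ConvertAll(key.Split('/'), s => Uri.EscapeDataString(s));
        var escapedKey = string.Join("/", segments);
        return new Uri(string.Format("https://{0}.s3-{1}.amazonaws.com/{2}", bucket, region, escapedKey));
    }
}
```

Calling `ObjectUrl.Build("gaurav-test-bucket", "us-west-2", "verylargefile.exe")` yields the object URL quoted above.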
Creating Authorization Header
First thing that we need to do is write code to create authorization header. To learn more about how to create authorization header, please click here: http://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html. If you go through the documentation, you will realize that in order to create authorization header, there are a few things you would need to do:
- Create “CanonicalizedResource” Element String
- Create “CanonicalizedAmzHeaders” Element String
- Create “StringToSign” String
- Create Signature
Now we’ll see the code for each of these steps:
Create “CanonicalizedResource” Element String
To create CanonicalizedResource Element String, you would need the request URI. Here’s the code to create that:
    // Sub-resources that must be part of CanonicalizedResource when present in the query string.
    private static string[] subResourcesToConsider = new string[]
    {
        "acl", "lifecycle", "location", "logging", "notification", "partNumber",
        "policy", "requestPayment", "torrent", "uploadId", "uploads", "versionId",
        "versioning", "versions", "website",
    };

    // Response header overrides that must also be included when present.
    private static string[] overrideResponseHeadersToConsider = new string[]
    {
        "response-content-type", "response-content-language", "response-expires",
        "response-cache-control", "response-content-disposition", "response-content-encoding"
    };

    private static string GetCanonicalizedResourceString(Uri requestUri)
    {
        // For a host such as {bucket}.s3-us-west-2.amazonaws.com, everything before
        // the last three host labels is treated as the bucket name.
        var host = requestUri.DnsSafeHost;
        var hostElementsArray = host.Split('.');
        var bucketName = "";
        if (hostElementsArray.Length > 3)
        {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < hostElementsArray.Length - 3; i++)
            {
                sb.AppendFormat("{0}.", hostElementsArray[i]);
            }
            bucketName = sb.ToString();
            if (bucketName.Length > 0)
            {
                if (bucketName.EndsWith("."))
                {
                    bucketName = bucketName.Substring(0, bucketName.Length - 1);
                }
                bucketName = string.Format("/{0}", bucketName);
            }
        }

        var subResourcesList = subResourcesToConsider.ToList();
        var overrideResponseHeadersList = overrideResponseHeadersToConsider.ToList();

        // The canonicalized resource starts with "/{bucket}" followed by the object path.
        StringBuilder canonicalizedResourceStringBuilder = new StringBuilder();
        canonicalizedResourceStringBuilder.Append(bucketName);
        canonicalizedResourceStringBuilder.Append(requestUri.AbsolutePath);

        // Collect the query parameters that take part in the signature, sorted by name.
        NameValueCollection queryVariables = HttpUtility.ParseQueryString(requestUri.Query);
        SortedDictionary<string, string> queryVariablesToConsider = new SortedDictionary<string, string>();
        SortedDictionary<string, string> overrideResponseHeaders = new SortedDictionary<string, string>();
        if (queryVariables != null && queryVariables.Count > 0)
        {
            var numQueryItems = queryVariables.Count;
            for (int i = 0; i < numQueryItems; i++)
            {
                var key = queryVariables.GetKey(i);
                var value = queryVariables[key];
                if (subResourcesList.Contains(key))
                {
                    if (queryVariablesToConsider.ContainsKey(key))
                    {
                        var val = queryVariablesToConsider[key];
                        queryVariablesToConsider[key] = string.Format("{0},{1}", value, val);
                    }
                    else
                    {
                        queryVariablesToConsider.Add(key, value);
                    }
                }
                if (overrideResponseHeadersList.Contains(key))
                {
                    overrideResponseHeaders.Add(key, HttpUtility.UrlDecode(value));
                }
            }
        }

        if (queryVariablesToConsider.Count > 0 || overrideResponseHeaders.Count > 0)
        {
            StringBuilder queryStringInCanonicalizedResourceString = new StringBuilder();
            queryStringInCanonicalizedResourceString.Append("?");
            for (int i = 0; i < queryVariablesToConsider.Count; i++)
            {
                var key = queryVariablesToConsider.Keys.ElementAt(i);
                var value = queryVariablesToConsider.Values.ElementAt(i);
                if (!string.IsNullOrWhiteSpace(value))
                {
                    queryStringInCanonicalizedResourceString.AppendFormat("{0}={1}&", key, value);
                }
                else
                {
                    // Sub-resources without a value (for example "uploads") are added by name only.
                    queryStringInCanonicalizedResourceString.AppendFormat("{0}&", key);
                }
            }
            for (int i = 0; i < overrideResponseHeaders.Count; i++)
            {
                var key = overrideResponseHeaders.Keys.ElementAt(i);
                var value = overrideResponseHeaders.Values.ElementAt(i);
                queryStringInCanonicalizedResourceString.AppendFormat("{0}={1}&", key, value);
            }
            var str = queryStringInCanonicalizedResourceString.ToString();
            if (str.EndsWith("&"))
            {
                str = str.Substring(0, str.Length - 1);
            }
            canonicalizedResourceStringBuilder.Append(str);
        }
        return canonicalizedResourceStringBuilder.ToString();
    }
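To show what the canonicalized resource ends up looking like, here is a much smaller stand-alone sketch of the same idea. It is a simplification under my own assumptions: `CanonicalResourceDemo` is a hypothetical name, the bucket is passed in instead of being parsed from the host, and only sub-resources are handled (no response header overrides). The upload id `abc123` is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CanonicalResourceDemo
{
    static readonly HashSet<string> SubResources = new HashSet<string>(StringComparer.Ordinal)
    {
        "acl", "lifecycle", "location", "logging", "notification", "partNumber", "policy",
        "requestPayment", "torrent", "uploadId", "uploads", "versionId", "versioning",
        "versions", "website"
    };

    // Returns "/{bucket}{path}" plus the sorted sub-resources found in the query string.
    public static string Build(string bucket, Uri requestUri)
    {
        var pairs = requestUri.Query.TrimStart('?')
            .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(p => p.Split(new[] { '=' }, 2))
            .Where(kv => SubResources.Contains(kv[0]))   // drops Expires, Signature and so on
            .OrderBy(kv => kv[0], StringComparer.Ordinal)
            .Select(kv => kv.Length == 2 && kv[1].Length > 0
                ? kv[0] + "=" + Uri.UnescapeDataString(kv[1])
                : kv[0])                                  // value-less sub-resource, e.g. "uploads"
            .ToList();

        var resource = "/" + bucket + requestUri.AbsolutePath;
        return pairs.Count == 0 ? resource : resource + "?" + string.Join("&", pairs);
    }
}
```

For the URL `https://gaurav-test-bucket.s3-us-west-2.amazonaws.com/verylargefile.exe?uploadId=abc123&partNumber=1&Expires=1` this produces `/gaurav-test-bucket/verylargefile.exe?partNumber=1&uploadId=abc123`: the sub-resources are sorted by name and `Expires` is left out.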
Create “CanonicalizedAmzHeaders” Element String
To create the CanonicalizedAmzHeaders Element String, you need all the headers that will be included in the request. Based on the documentation, though, only the headers starting with “x-amz-” are considered in this function. Here’s the code to create that:
    private static string GetCanonicalizedAmzHeadersString(NameValueCollection requestHeaders)
    {
        var canonicalizedAmzHeadersString = string.Empty;
        if (requestHeaders != null && requestHeaders.Count > 0)
        {
            StringBuilder sb = new StringBuilder();
            // Header names are lower-cased and sorted; repeated headers are merged with commas.
            SortedDictionary<string, string> sortedRequestHeaders = new SortedDictionary<string, string>();
            var requestHeadersCount = requestHeaders.Count;
            for (int i = 0; i < requestHeadersCount; i++)
            {
                var key = requestHeaders.Keys.Get(i);
                var value = requestHeaders[key].Trim();
                key = key.ToLowerInvariant();
                if (key.StartsWith("x-amz-", StringComparison.InvariantCultureIgnoreCase))
                {
                    if (sortedRequestHeaders.ContainsKey(key))
                    {
                        var val = sortedRequestHeaders[key];
                        sortedRequestHeaders[key] = string.Format("{0},{1}", val, value);
                    }
                    else
                    {
                        sortedRequestHeaders.Add(key, value);
                    }
                }
            }
            if (sortedRequestHeaders.Count > 0)
            {
                // Each header becomes one "name:value" line terminated by a newline.
                foreach (var item in sortedRequestHeaders)
                {
                    sb.AppendFormat("{0}:{1}\n", item.Key, item.Value);
                }
                canonicalizedAmzHeadersString = sb.ToString();
            }
        }
        return canonicalizedAmzHeadersString;
    }
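Here is a compact stand-alone sketch of the same canonicalization, mainly to show what the output looks like. `AmzHeadersDemo` is a hypothetical name and the header values used with it are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class AmzHeadersDemo
{
    // Lower-cases header names, keeps only x-amz-* headers, sorts them by name
    // and emits one "name:value\n" line per header (repeated headers are comma-joined).
    public static string Canonicalize(IEnumerable<KeyValuePair<string, string>> headers)
    {
        var groups = headers
            .Select(h => new KeyValuePair<string, string>(h.Key.ToLowerInvariant(), h.Value.Trim()))
            .Where(h => h.Key.StartsWith("x-amz-", StringComparison.Ordinal))
            .GroupBy(h => h.Key)
            .OrderBy(g => g.Key, StringComparer.Ordinal);

        var sb = new StringBuilder();
        foreach (var g in groups)
        {
            sb.Append(g.Key).Append(':').Append(string.Join(",", g.Select(h => h.Value))).Append('\n');
        }
        return sb.ToString();
    }
}
```

Given the headers `X-Amz-Meta-Author: gaurav`, `Content-Type: text/plain` and `x-amz-acl: private`, the result is `x-amz-acl:private\nx-amz-meta-author:gaurav\n`. The `Content-Type` header is ignored here because it has its own slot in the string to sign.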
Create “StringToSign” String
We need to implement two methods for creating the “StringToSign” string: one for the regular authorization header and one for the Pre Signed URL. The difference between the two is that in the former we pass the current date/time (in UTC), while in the latter we pass the time at which the Pre Signed URL expires, expressed as the number of seconds since Jan 1st 1970 (UTC).
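As a quick illustration of the “seconds since Jan 1st 1970” value, here is a minimal sketch of the conversion. `Expiry.ToEpochSeconds` is a hypothetical helper name, and it assumes the date passed in is already in UTC.

```csharp
using System;

public static class Expiry
{
    static readonly DateTime Epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    // Whole seconds between Jan 1st 1970 (UTC) and the given UTC date/time.
    public static long ToEpochSeconds(DateTime utc)
    {
        return (long)(utc - Epoch).TotalSeconds;
    }
}
```

For example, midnight on Jan 1st 2013 (UTC) maps to 1356998400, so a URL created at that moment with a 12 hour lifetime would carry an expiry of 1357041600.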
StringToSign for regular authorization header
Here’s the code to create StringToSign for regular authorization header:
    private static string GetStringToSign(Uri requestUri, string httpVerb, string contentMD5, string contentType, DateTime date, NameValueCollection requestHeaders)
    {
        var canonicalizedResourceString = GetCanonicalizedResourceString(requestUri);
        var canonicalizedAmzHeadersString = GetCanonicalizedAmzHeadersString(requestHeaders);
        var dateInStringFormat = date.ToString("R");
        if (requestHeaders != null && requestHeaders.AllKeys.Contains("x-amz-date"))
        {
            dateInStringFormat = string.Empty;
        }
        var stringToSign = string.Format("{0}\n{1}\n{2}\n{3}\n{4}{5}", httpVerb, contentMD5, contentType, dateInStringFormat, canonicalizedAmzHeadersString, canonicalizedResourceString);
        return stringToSign;
    }
StringToSign for Pre Signed URL authorization header
Here’s the code to create StringToSign for Pre Signed URL authorization header:
    private static string GetStringToSign(Uri requestUri, string httpVerb, string contentMD5, string contentType, long secondsSince1stJan1970, NameValueCollection requestHeaders)
    {
        var canonicalizedResourceString = GetCanonicalizedResourceString(requestUri);
        var canonicalizedAmzHeadersString = GetCanonicalizedAmzHeadersString(requestHeaders);
        var stringToSign = string.Format("{0}\n{1}\n{2}\n{3}\n{4}{5}", httpVerb, contentMD5, contentType, secondsSince1stJan1970, canonicalizedAmzHeadersString, canonicalizedResourceString);
        return stringToSign;
    }
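To see the layout of the resulting string, here is a small stand-alone sketch that assembles it from already canonicalized pieces. `StringToSignDemo` is a hypothetical name, and the upload id and expiry value in the example are made up.

```csharp
using System;
using System.Globalization;

public static class StringToSignDemo
{
    // Layout: VERB \n Content-MD5 \n Content-Type \n Expires \n {x-amz headers}{resource}
    public static string ForPreSignedUrl(string httpVerb, string contentMD5, string contentType,
        long expires, string canonicalizedAmzHeaders, string canonicalizedResource)
    {
        return string.Join("\n",
                   httpVerb,
                   contentMD5,
                   contentType,
                   expires.ToString(CultureInfo.InvariantCulture))
               + "\n" + canonicalizedAmzHeaders + canonicalizedResource;
    }
}
```

For a part upload with no Content-MD5, no Content-Type and no x-amz headers, the string to sign is `PUT\n\n\n1357041600\n/gaurav-test-bucket/verylargefile.exe?partNumber=1&uploadId=abc123`. The empty lines matter: each missing element still keeps its newline.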
Create Signature
Last step in this process is to create signature. Here’s the code to do the same:
    private static string CreateSignature(string secretKey, string stringToSign)
    {
        byte[] dataToSign = Encoding.UTF8.GetBytes(stringToSign);
        using (HMACSHA1 hmacsha1 = new HMACSHA1(Encoding.UTF8.GetBytes(secretKey)))
        {
            return Convert.ToBase64String(hmacsha1.ComputeHash(dataToSign));
        }
    }
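Putting the signature to use, here is a minimal sketch of how a signature is computed and then attached to a URL as query string parameters. `PreSignedUrlDemo` and its method names are hypothetical; the parameter names `AWSAccessKeyId`, `Signature` and `Expires` are the ones used by the upload code in this post, and the access key and URL in the example are placeholders.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PreSignedUrlDemo
{
    // Base64-encoded HMAC-SHA1 of the string to sign, keyed with the secret key.
    public static string Sign(string secretKey, string stringToSign)
    {
        using (var hmac = new HMACSHA1(Encoding.UTF8.GetBytes(secretKey)))
        {
            return Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign)));
        }
    }

    // Appends the authentication parameters to a URL that may already carry a query string.
    public static string Append(string url, string accessKey, string signature, long expires)
    {
        var separator = url.Contains("?") ? "&" : "?";
        // The signature is Base64, so '+', '/' and '=' must be escaped in the query string.
        return string.Format("{0}{1}AWSAccessKeyId={2}&Signature={3}&Expires={4}",
            url, separator, Uri.EscapeDataString(accessKey), Uri.EscapeDataString(signature), expires);
    }
}
```

An HMAC-SHA1 digest is 20 bytes, so the Base64 signature is always 28 characters long. Forgetting to escape it is an easy way to end up with a signature mismatch error whenever the signature happens to contain a `+`.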
Once the signature is created, most of the heavy lifting is done :). Now we can focus on the multipart upload itself.
Initiate Multipart Upload
As mentioned above, this step returns an upload id. To call it, we append “uploads=” as the query string to the object URL and send a POST request. Here’s the code to do so:
    private static string InitiateMultipartUpload(string accessKey, string secretKey, Uri requestUri, DateTime requestDate, string contentType, NameValueCollection requestHeaders)
    {
        var uploadId = string.Empty;
        var uploadIdRequestUrl = new Uri(string.Format("{0}?uploads=", requestUri.AbsoluteUri));

        // Only the x-amz-* headers are forwarded with the request.
        var uploadIdRequestUrlRequestHeaders = new NameValueCollection();
        if (requestHeaders != null)
        {
            for (int i = 0; i < requestHeaders.Count; i++)
            {
                var key = requestHeaders.Keys[i];
                var value = requestHeaders[key];
                if (key.StartsWith("x-amz-", StringComparison.InvariantCultureIgnoreCase))
                {
                    uploadIdRequestUrlRequestHeaders.Add(key, value);
                }
            }
        }

        var stringToSign = GetStringToSign(uploadIdRequestUrl, "POST", string.Empty, contentType, requestDate, requestHeaders);
        var signatureForUploadId = CreateSignature(secretKey, stringToSign);
        uploadIdRequestUrlRequestHeaders.Add("Authorization", string.Format("AWS {0}:{1}", accessKey, signatureForUploadId));

        var request = (HttpWebRequest)WebRequest.Create(uploadIdRequestUrl);
        request.Method = "POST";
        request.ContentLength = 0;
        request.Date = requestDate;
        request.ContentType = contentType;
        request.Headers.Add(uploadIdRequestUrlRequestHeaders);
        using (var resp = (HttpWebResponse)request.GetResponse())
        {
            using (var s = new StreamReader(resp.GetResponseStream()))
            {
                var response = s.ReadToEnd();
                XElement xe = XElement.Parse(response);
                // The upload id is returned in the <UploadId> element of the response body.
                // Matching on the local name avoids hard-coding the XML namespace.
                uploadId = xe.Elements().First(e => e.Name.LocalName == "UploadId").Value;
            }
        }
        return uploadId;
    }
A few things to consider here:
- If you want to set the content type and other properties of the object, this is the step where you do that. You won’t be able to set the content type later, while the parts are being uploaded.
- The same goes for custom metadata attributes.
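Since the whole point of this step is to get the upload id back, here is a small stand-alone sketch of pulling it out of the response body. `InitiateResponseParser` is a hypothetical name, and the sample XML in the usage note is a shortened response with a made-up upload id.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class InitiateResponseParser
{
    // Reads the value of the <UploadId> element from the response XML.
    public static string ExtractUploadId(string responseXml)
    {
        var root = XElement.Parse(responseXml);
        // Match on the local name so the XML namespace does not have to be hard-coded.
        var element = root.DescendantsAndSelf().FirstOrDefault(e => e.Name.LocalName == "UploadId");
        if (element == null)
        {
            throw new InvalidOperationException("UploadId element not found in the response.");
        }
        return element.Value;
    }
}
```

Passing in `<InitiateMultipartUploadResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>gaurav-test-bucket</Bucket><Key>verylargefile.exe</Key><UploadId>abc123</UploadId></InitiateMultipartUploadResult>` returns `abc123`.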
Upload Part
Once we get the upload id, next step would be to upload parts. For the purpose of our demonstration, we will split the file in 5 MB parts. Here’s the code to upload all parts.
    private static Dictionary<int, string> UploadParts(string accessKey, string secretKey, Uri requestUri, string uploadId, string filePath, DateTime expiryDate)
    {
        Dictionary<int, string> partNumberETags = new Dictionary<int, string>();

        // Expiry of the Pre Signed URLs, as seconds since Jan 1st 1970 (UTC).
        DateTime Jan1st1970 = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        TimeSpan ts = new TimeSpan(expiryDate.Ticks - Jan1st1970.Ticks);
        var expiry = Convert.ToInt64(ts.TotalSeconds);

        // Note: this reads the whole file into memory, which is fine for a demo only.
        var fileContents = File.ReadAllBytes(filePath);
        int fiveMB = 5 * 1024 * 1024;
        int partNumber = 1;
        var startPosition = 0;
        var bytesToBeUploaded = fileContents.Length;
        do
        {
            var bytesToUpload = Math.Min(fiveMB, bytesToBeUploaded);

            // Both uploadId and partNumber are part of the signed resource.
            var partUploadUrl = new Uri(string.Format("{0}?uploadId={1}&partNumber={2}", requestUri.AbsoluteUri, HttpUtility.UrlEncode(uploadId), partNumber));
            var partUploadSignature = CreateSignature(secretKey, GetStringToSign(partUploadUrl, "PUT", string.Empty, string.Empty, expiry, null));
            var partUploadPreSignedUrl = new Uri(string.Format("{0}?uploadId={1}&partNumber={2}&AWSAccessKeyId={3}&Signature={4}&Expires={5}", requestUri.AbsoluteUri, HttpUtility.UrlEncode(uploadId), partNumber, accessKey, HttpUtility.UrlEncode(partUploadSignature), expiry));

            var request = (HttpWebRequest)WebRequest.Create(partUploadPreSignedUrl);
            request.Method = "PUT";
            request.Timeout = 1000 * 600;
            request.ContentLength = bytesToUpload;
            using (var stream = request.GetRequestStream())
            {
                stream.Write(fileContents, startPosition, bytesToUpload);
            }
            using (var resp = (HttpWebResponse)request.GetResponse())
            {
                // The ETag of each part is needed later to complete the upload.
                partNumberETags.Add(partNumber, resp.Headers["ETag"]);
            }

            bytesToBeUploaded = bytesToBeUploaded - bytesToUpload;
            // Advance the offset by the bytes just sent; assigning bytesToUpload here
            // would re-send the second chunk for every part after the second one.
            startPosition = startPosition + bytesToUpload;
            partNumber = partNumber + 1;
        }
        while (bytesToBeUploaded > 0);
        return partNumberETags;
    }
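The bookkeeping in that loop is easy to get wrong, because the read offset has to advance cumulatively from part to part. Here is a small sketch that isolates just that arithmetic. `PartRanges.Split` is a hypothetical helper; it only computes part numbers, offsets and lengths and does not touch the file or the network.

```csharp
using System;
using System.Collections.Generic;

public static class PartRanges
{
    // Yields (part number, offset, length) for each part of a file.
    public static IEnumerable<Tuple<int, long, int>> Split(long fileLength, int partSize)
    {
        long offset = 0;
        int partNumber = 1;
        while (offset < fileLength)
        {
            // The last part may be shorter than the part size.
            int length = (int)Math.Min(partSize, fileLength - offset);
            yield return Tuple.Create(partNumber, offset, length);
            offset += length; // cumulative: move past everything sent so far
            partNumber++;
        }
    }
}
```

For a 12 MB file and 5 MB parts this yields offsets 0, 5242880 and 10485760, with the last part being 2097152 bytes long. Working with offsets like this would also make it possible to read each part from a `FileStream` instead of loading the whole file into memory.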
Complete Multipart Upload
Once all the parts are uploaded, you need to finish the process by performing the complete multipart upload step. Here’s the code to do that.
    private static void FinishMultipartUpload(string accessKey, string secretKey, Uri requestUri, string uploadId, Dictionary<int, string> partNumberETags, DateTime expiryDate)
    {
        // Expiry of the Pre Signed URL, as seconds since Jan 1st 1970 (UTC).
        DateTime Jan1st1970 = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        TimeSpan ts = new TimeSpan(expiryDate.Ticks - Jan1st1970.Ticks);
        var expiry = Convert.ToInt64(ts.TotalSeconds);

        // The upload id is URL-encoded here in the same way as in the final URL below.
        var finishOrCancelMultipartUploadUri = new Uri(string.Format("{0}?uploadId={1}", requestUri.AbsoluteUri, HttpUtility.UrlEncode(uploadId)));
        var signatureForFinishMultipartUpload = CreateSignature(secretKey, GetStringToSign(finishOrCancelMultipartUploadUri, "POST", string.Empty, "text/plain", expiry, null));
        var finishMultipartUploadUrl = new Uri(string.Format("{0}?uploadId={1}&AWSAccessKeyId={2}&Signature={3}&Expires={4}", requestUri.AbsoluteUri, HttpUtility.UrlEncode(uploadId), accessKey, HttpUtility.UrlEncode(signatureForFinishMultipartUpload), expiry));

        // The request body lists every part number with its ETag, in ascending part order.
        StringBuilder payload = new StringBuilder();
        payload.Append("<?xml version=\"1.0\" encoding=\"utf-8\"?><CompleteMultipartUpload>");
        foreach (var item in partNumberETags.OrderBy(p => p.Key))
        {
            payload.AppendFormat("<Part><PartNumber>{0}</PartNumber><ETag>{1}</ETag></Part>", item.Key, item.Value);
        }
        payload.Append("</CompleteMultipartUpload>");
        var requestPayload = Encoding.UTF8.GetBytes(payload.ToString());

        var request = (HttpWebRequest)WebRequest.Create(finishMultipartUploadUrl);
        request.Method = "POST";
        request.ContentType = "text/plain";
        request.ContentLength = requestPayload.Length;
        using (var stream = request.GetRequestStream())
        {
            stream.Write(requestPayload, 0, requestPayload.Length);
        }
        using (var resp = (HttpWebResponse)request.GetResponse())
        {
        }
    }
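As an alternative to concatenating the request body by hand, here is a sketch that builds the same XML with LINQ to XML. This is not what the code above does; it is just one way I could imagine doing it. `CompletePayload.Build` is a hypothetical helper, and the ETag values in the example are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CompletePayload
{
    // Builds the <CompleteMultipartUpload> body with parts in ascending part number order.
    public static string Build(IDictionary<int, string> partNumberETags)
    {
        var root = new XElement("CompleteMultipartUpload",
            partNumberETags
                .OrderBy(p => p.Key)
                .Select(p => new XElement("Part",
                    new XElement("PartNumber", p.Key),
                    new XElement("ETag", p.Value))));
        // DisableFormatting keeps the output on a single line without indentation.
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

The two things this buys are that the parts are sorted explicitly rather than relying on the dictionary's enumeration order, and that XElement takes care of escaping any special characters in the values.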
Complete Code
Here’s the complete code:
    using System;
    using System.Collections.Generic;
    using System.Collections.Specialized;
    using System.Globalization;
    using System.IO;
    using System.Linq;
    using System.Net;
    using System.Security.Cryptography;
    using System.Text;
    using System.Threading.Tasks;
    using System.Web;
    using System.Xml.Linq;

    namespace AmazonS3RestWrapper
    {
        class Program
        {
            private static string[] subResourcesToConsider = new string[]
            {
                "acl", "lifecycle", "location", "logging", "notification", "partNumber",
                "policy", "requestPayment", "torrent", "uploadId", "uploads", "versionId",
                "versioning", "versions", "website",
            };

            private static string[] overrideResponseHeadersToConsider = new string[]
            {
                "response-content-type", "response-content-language", "response-expires",
                "response-cache-control", "response-content-disposition", "response-content-encoding"
            };

            static void Main(string[] args)
            {
                var accessKey = "access key";
                var secretKey = "secret key";
                var requestUri = new Uri("https://gaurav-test-bucket.s3-us-west-2.amazonaws.com/verylargefile.exe");
                var filePath = @"D:\verylargefile.exe";
                var expiryDate = DateTime.UtcNow.AddHours(12);
                var uploadId = InitiateMultipartUpload(accessKey, secretKey, requestUri, DateTime.UtcNow, "application/x-msdownload", null);
                var partNumberETags = UploadParts(accessKey, secretKey, requestUri, uploadId, filePath, expiryDate);
                FinishMultipartUpload(accessKey, secretKey, requestUri, uploadId, partNumberETags, expiryDate);
                Console.WriteLine("File uploaded successfully. Press any key to terminate the application.");
                Console.ReadLine();
            }

            private static string GetStringToSign(Uri requestUri, string httpVerb, string contentMD5, string contentType, DateTime date, NameValueCollection requestHeaders)
            {
                var canonicalizedResourceString = GetCanonicalizedResourceString(requestUri);
                var canonicalizedAmzHeadersString = GetCanonicalizedAmzHeadersString(requestHeaders);
                var dateInStringFormat = date.ToString("R");
                if (requestHeaders != null && requestHeaders.AllKeys.Contains("x-amz-date"))
                {
                    dateInStringFormat = string.Empty;
                }
                var stringToSign = string.Format("{0}\n{1}\n{2}\n{3}\n{4}{5}", httpVerb, contentMD5, contentType, dateInStringFormat, canonicalizedAmzHeadersString, canonicalizedResourceString);
                return stringToSign;
            }

            private static string GetStringToSign(Uri requestUri, string httpVerb, string contentMD5, string contentType, long secondsSince1stJan1970, NameValueCollection requestHeaders)
            {
                var canonicalizedResourceString = GetCanonicalizedResourceString(requestUri);
                var canonicalizedAmzHeadersString = GetCanonicalizedAmzHeadersString(requestHeaders);
                var stringToSign = string.Format("{0}\n{1}\n{2}\n{3}\n{4}{5}", httpVerb, contentMD5, contentType, secondsSince1stJan1970, canonicalizedAmzHeadersString, canonicalizedResourceString);
                return stringToSign;
            }

            private static string GetCanonicalizedResourceString(Uri requestUri)
            {
                // Everything before the last three host labels is treated as the bucket name.
                var host = requestUri.DnsSafeHost;
                var hostElementsArray = host.Split('.');
                var bucketName = "";
                if (hostElementsArray.Length > 3)
                {
                    StringBuilder sb = new StringBuilder();
                    for (int i = 0; i < hostElementsArray.Length - 3; i++)
                    {
                        sb.AppendFormat("{0}.", hostElementsArray[i]);
                    }
                    bucketName = sb.ToString();
                    if (bucketName.Length > 0)
                    {
                        if (bucketName.EndsWith("."))
                        {
                            bucketName = bucketName.Substring(0, bucketName.Length - 1);
                        }
                        bucketName = string.Format("/{0}", bucketName);
                    }
                }

                var subResourcesList = subResourcesToConsider.ToList();
                var overrideResponseHeadersList = overrideResponseHeadersToConsider.ToList();
                StringBuilder canonicalizedResourceStringBuilder = new StringBuilder();
                canonicalizedResourceStringBuilder.Append(bucketName);
                canonicalizedResourceStringBuilder.Append(requestUri.AbsolutePath);

                NameValueCollection queryVariables = HttpUtility.ParseQueryString(requestUri.Query);
                SortedDictionary<string, string> queryVariablesToConsider = new SortedDictionary<string, string>();
                SortedDictionary<string, string> overrideResponseHeaders = new SortedDictionary<string, string>();
                if (queryVariables != null && queryVariables.Count > 0)
                {
                    var numQueryItems = queryVariables.Count;
                    for (int i = 0; i < numQueryItems; i++)
                    {
                        var key = queryVariables.GetKey(i);
                        var value = queryVariables[key];
                        if (subResourcesList.Contains(key))
                        {
                            if (queryVariablesToConsider.ContainsKey(key))
                            {
                                var val = queryVariablesToConsider[key];
                                queryVariablesToConsider[key] = string.Format("{0},{1}", value, val);
                            }
                            else
                            {
                                queryVariablesToConsider.Add(key, value);
                            }
                        }
                        if (overrideResponseHeadersList.Contains(key))
                        {
                            overrideResponseHeaders.Add(key, HttpUtility.UrlDecode(value));
                        }
                    }
                }

                if (queryVariablesToConsider.Count > 0 || overrideResponseHeaders.Count > 0)
                {
                    StringBuilder queryStringInCanonicalizedResourceString = new StringBuilder();
                    queryStringInCanonicalizedResourceString.Append("?");
                    for (int i = 0; i < queryVariablesToConsider.Count; i++)
                    {
                        var key = queryVariablesToConsider.Keys.ElementAt(i);
                        var value = queryVariablesToConsider.Values.ElementAt(i);
                        if (!string.IsNullOrWhiteSpace(value))
                        {
                            queryStringInCanonicalizedResourceString.AppendFormat("{0}={1}&", key, value);
                        }
                        else
                        {
                            queryStringInCanonicalizedResourceString.AppendFormat("{0}&", key);
                        }
                    }
                    for (int i = 0; i < overrideResponseHeaders.Count; i++)
                    {
                        var key = overrideResponseHeaders.Keys.ElementAt(i);
                        var value = overrideResponseHeaders.Values.ElementAt(i);
                        queryStringInCanonicalizedResourceString.AppendFormat("{0}={1}&", key, value);
                    }
                    var str = queryStringInCanonicalizedResourceString.ToString();
                    if (str.EndsWith("&"))
                    {
                        str = str.Substring(0, str.Length - 1);
                    }
                    canonicalizedResourceStringBuilder.Append(str);
                }
                return canonicalizedResourceStringBuilder.ToString();
            }

            private static string GetCanonicalizedAmzHeadersString(NameValueCollection requestHeaders)
            {
                var canonicalizedAmzHeadersString = string.Empty;
                if (requestHeaders != null && requestHeaders.Count > 0)
                {
                    StringBuilder sb = new StringBuilder();
                    SortedDictionary<string, string> sortedRequestHeaders = new SortedDictionary<string, string>();
                    var requestHeadersCount = requestHeaders.Count;
                    for (int i = 0; i < requestHeadersCount; i++)
                    {
                        var key = requestHeaders.Keys.Get(i);
                        var value = requestHeaders[key].Trim();
                        key = key.ToLowerInvariant();
                        if (key.StartsWith("x-amz-", StringComparison.InvariantCultureIgnoreCase))
                        {
                            if (sortedRequestHeaders.ContainsKey(key))
                            {
                                var val = sortedRequestHeaders[key];
                                sortedRequestHeaders[key] = string.Format("{0},{1}", val, value);
                            }
                            else
                            {
                                sortedRequestHeaders.Add(key, value);
                            }
                        }
                    }
                    if (sortedRequestHeaders.Count > 0)
                    {
                        foreach (var item in sortedRequestHeaders)
                        {
                            sb.AppendFormat("{0}:{1}\n", item.Key, item.Value);
                        }
                        canonicalizedAmzHeadersString = sb.ToString();
                    }
                }
                return canonicalizedAmzHeadersString;
            }

            private static string CreateSignature(string secretKey, string stringToSign)
            {
                byte[] dataToSign = Encoding.UTF8.GetBytes(stringToSign);
                using (HMACSHA1 hmacsha1 = new HMACSHA1(Encoding.UTF8.GetBytes(secretKey)))
                {
                    return Convert.ToBase64String(hmacsha1.ComputeHash(dataToSign));
                }
            }

            private static string InitiateMultipartUpload(string accessKey, string secretKey, Uri requestUri, DateTime requestDate, string contentType, NameValueCollection requestHeaders)
            {
                var uploadId = string.Empty;
                var uploadIdRequestUrl = new Uri(string.Format("{0}?uploads=", requestUri.AbsoluteUri));
                var uploadIdRequestUrlRequestHeaders = new NameValueCollection();
                if (requestHeaders != null)
                {
                    for (int i = 0; i < requestHeaders.Count; i++)
                    {
                        var key = requestHeaders.Keys[i];
                        var value = requestHeaders[key];
                        if (key.StartsWith("x-amz-", StringComparison.InvariantCultureIgnoreCase))
                        {
                            uploadIdRequestUrlRequestHeaders.Add(key, value);
                        }
                    }
                }
                var stringToSign = GetStringToSign(uploadIdRequestUrl, "POST", string.Empty, contentType, requestDate, requestHeaders);
                var signatureForUploadId = CreateSignature(secretKey, stringToSign);
                uploadIdRequestUrlRequestHeaders.Add("Authorization", string.Format("AWS {0}:{1}", accessKey, signatureForUploadId));
                var request = (HttpWebRequest)WebRequest.Create(uploadIdRequestUrl);
                request.Method = "POST";
                request.ContentLength = 0;
                request.Date = requestDate;
                request.ContentType = contentType;
                request.Headers.Add(uploadIdRequestUrlRequestHeaders);
                using (var resp = (HttpWebResponse)request.GetResponse())
                {
                    using (var s = new StreamReader(resp.GetResponseStream()))
                    {
                        var response = s.ReadToEnd();
                        XElement xe = XElement.Parse(response);
                        // The upload id is returned in the <UploadId> element of the response body.
                        uploadId = xe.Elements().First(e => e.Name.LocalName == "UploadId").Value;
                    }
                }
                return uploadId;
            }

            private static Dictionary<int, string> UploadParts(string accessKey, string secretKey, Uri requestUri, string uploadId, string filePath, DateTime expiryDate)
            {
                Dictionary<int, string> partNumberETags = new Dictionary<int, string>();
                DateTime Jan1st1970 = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
                TimeSpan ts = new TimeSpan(expiryDate.Ticks - Jan1st1970.Ticks);
                var expiry = Convert.ToInt64(ts.TotalSeconds);
                var fileContents = File.ReadAllBytes(filePath);
                int fiveMB = 5 * 1024 * 1024;
                int partNumber = 1;
                var startPosition = 0;
                var bytesToBeUploaded = fileContents.Length;
                do
                {
                    var bytesToUpload = Math.Min(fiveMB, bytesToBeUploaded);
                    var partUploadUrl = new Uri(string.Format("{0}?uploadId={1}&partNumber={2}", requestUri.AbsoluteUri, HttpUtility.UrlEncode(uploadId), partNumber));
                    var partUploadSignature = CreateSignature(secretKey, GetStringToSign(partUploadUrl, "PUT", string.Empty, string.Empty, expiry, null));
                    var partUploadPreSignedUrl = new Uri(string.Format("{0}?uploadId={1}&partNumber={2}&AWSAccessKeyId={3}&Signature={4}&Expires={5}", requestUri.AbsoluteUri, HttpUtility.UrlEncode(uploadId), partNumber, accessKey, HttpUtility.UrlEncode(partUploadSignature), expiry));
                    var request = (HttpWebRequest)WebRequest.Create(partUploadPreSignedUrl);
                    request.Method = "PUT";
                    request.Timeout = 1000 * 600;
                    request.ContentLength = bytesToUpload;
                    using (var stream = request.GetRequestStream())
                    {
                        stream.Write(fileContents, startPosition, bytesToUpload);
                    }
                    using (var resp = (HttpWebResponse)request.GetResponse())
                    {
                        partNumberETags.Add(partNumber, resp.Headers["ETag"]);
                    }
                    bytesToBeUploaded = bytesToBeUploaded - bytesToUpload;
                    // Advance the offset by the bytes just sent.
                    startPosition = startPosition + bytesToUpload;
                    partNumber = partNumber + 1;
                }
                while (bytesToBeUploaded > 0);
                return partNumberETags;
            }

            private static void FinishMultipartUpload(string accessKey, string secretKey, Uri requestUri, string uploadId, Dictionary<int, string> partNumberETags, DateTime expiryDate)
            {
                DateTime Jan1st1970 = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
                TimeSpan ts = new TimeSpan(expiryDate.Ticks - Jan1st1970.Ticks);
                var expiry = Convert.ToInt64(ts.TotalSeconds);
                var finishOrCancelMultipartUploadUri = new Uri(string.Format("{0}?uploadId={1}", requestUri.AbsoluteUri, HttpUtility.UrlEncode(uploadId)));
                var signatureForFinishMultipartUpload = CreateSignature(secretKey, GetStringToSign(finishOrCancelMultipartUploadUri, "POST", string.Empty, "text/plain", expiry, null));
                var finishMultipartUploadUrl = new Uri(string.Format("{0}?uploadId={1}&AWSAccessKeyId={2}&Signature={3}&Expires={4}", requestUri.AbsoluteUri, HttpUtility.UrlEncode(uploadId), accessKey, HttpUtility.UrlEncode(signatureForFinishMultipartUpload), expiry));
                StringBuilder payload = new StringBuilder();
                payload.Append("<?xml version=\"1.0\" encoding=\"utf-8\"?><CompleteMultipartUpload>");
                // Parts are listed in ascending part number order.
                foreach (var item in partNumberETags.OrderBy(p => p.Key))
                {
                    payload.AppendFormat("<Part><PartNumber>{0}</PartNumber><ETag>{1}</ETag></Part>", item.Key, item.Value);
                }
                payload.Append("</CompleteMultipartUpload>");
                var requestPayload = Encoding.UTF8.GetBytes(payload.ToString());
                var request = (HttpWebRequest)WebRequest.Create(finishMultipartUploadUrl);
                request.Method = "POST";
                request.ContentType = "text/plain";
                request.ContentLength = requestPayload.Length;
                using (var stream = request.GetRequestStream())
                {
                    stream.Write(requestPayload, 0, requestPayload.Length);
                }
                using (var resp = (HttpWebResponse)request.GetResponse())
                {
                }
            }
        }
    }
Some Thoughts
Having worked with both Windows Azure (extensively, I might add) and Amazon S3 (I’m just starting with it), I find uploading large files rather easier in Windows Azure. Here are my reasons:
- Windows Azure lets you split the file into chunks of any size, whereas with Amazon S3 each part must be at least 5 MB. IMHO, such a large minimum part size could lead to more timeout exceptions. Considering the Internet speed here in India, it was quite painful to wait for a 5 MB part to upload (Yeah, it’s that bad :)).
- With Windows Azure, you just create an upload Shared Access Signature on the blob container and you can upload any file there. With Amazon S3, however, creating a Pre Signed URL on the bucket won’t work: your Pre Signed URL must include the object key. Things get more complicated if you’re uploading the file in parts. In that case, both the Upload Id and the Part Number have to be included when you create the signature for the Pre Signed URL.
Summary
That’s it for this post. I hope you have found it useful. As I’m still learning Amazon S3, it is quite possible that I have made some mistakes. If that’s the case, please let me know and I’ll fix them ASAP.
Happy Coding!!!