
Effective way of fetching diagnostics data from Windows Azure Diagnostics Table (Hint: Use PartitionKey)

As we all know, Windows Azure Diagnostics data is stored in Windows Azure Storage. Depending on the kind of diagnostics data being collected, it is stored either in Table Storage (WAD* tables, e.g. the WADLogsTable table for trace log data) or in Blob Storage (wad-* blob containers, e.g. wad-crash-dumps for crash dumps).

In this blog post, we will talk about one of the most common mistakes I have come across when fetching diagnostics data. For the sake of explanation, let’s take WADLogsTable. Here is the structure of WADLogsTable with some sample data:

Attribute Name   Attribute Type   Sample Data
PartitionKey     String           0634012319400000000
RowKey           String           43b7f32f389648639f16b55c6dcd7c4b___WorkerRole…
Timestamp        Date/Time        2010-02-08T13:20:48.2487780Z
EventTickCount   Int64 (Long)     634012319404982640
DeploymentId     String           43b7f32f389648639f16b55c6dcd7c4b
Role             String           WorkerRole1
RoleInstance     String           WorkerRole1_IN_0
Level            Int32 (Int)      5
EventId          Int32 (Int)      0
Pid              Int32 (Int)      1652
Tid              Int32 (Int)      2128
Message          String           Worker Role is working.: Working

Now, if we want to fetch the data for, say, the last 5 minutes, our natural instinct would be to query based on Timestamp, because this attribute’s value specifies the date/time in UTC when the entity (or row) was created.

THIS IS BAD! IN FACT, THIS IS REAL BAD!!!

Here are the reasons you should not do it (ordered from least to most significant):

  • Firstly, Timestamp tells you when the entity was created in your storage account, not when the log entry was captured by the diagnostics engine running in your VM. So if you’re transferring data every 15 minutes from your VM to your storage account and you query based on Timestamp, you will not get proper results. A better alternative would be EventTickCount, but you should not query on that either.
  • More importantly, Azure Tables only support indexing on PartitionKey and RowKey; all other attributes are not indexed. This means that if you query on any attribute other than PartitionKey or RowKey, the Azure Table Storage service will do a full table scan, going from one partition to the next until all matching values are retrieved. Depending on the size of your table, your query may return in a few seconds or it may take hours. I have seen a few instances where folks queried on the Timestamp attribute and complained that the Azure Table Service is extremely slow and takes a long time to return diagnostics data for the last 15 minutes (a sketch of such a query appears right after this list).
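
Just to make the anti-pattern concrete, here is a minimal sketch of the kind of Timestamp-based query described above. It uses the same v1.x Storage Client library and the same user-defined TraceLogsEntity class as the code shown later in this post; it is here purely as an illustration of what not to do:

// ANTI-PATTERN (for illustration only): Timestamp is not indexed,
// so this query forces a full table scan across every partition.
// Assumes serviceContext is a TableServiceContext and TraceLogsEntity
// derives from TableServiceEntity, as shown later in this post.
var slowQuery = (from row in serviceContext.CreateQuery<TraceLogsEntity>("WADLogsTable")
                 where row.Timestamp >= DateTime.UtcNow.AddMinutes(-5.0)
                 select row).AsTableServiceQuery();
// On a large table this can take hours instead of seconds.
foreach (var entity in slowQuery.Execute())
{
    Console.WriteLine(entity.Message);
}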

But I want the data for last 5 minutes?

Here’s where the smart guys from Microsoft come into the picture. Kudos to the team who designed the way diagnostics data is stored. Let’s take a look at the sample data above, especially the PartitionKey value (0634012319400000000), and compare it with the EventTickCount value (634012319404982640). If you look closely, they look very similar. If you look more closely, both of them represent tick counts. If I execute the following statements, the first Console.WriteLine will write “2010-02-08 13:19:00.000000” while the second one will write “2010-02-08 13:19:00.498264”.

   
DateTime dateTimeFromPartitionKeyValue = new DateTime(0634012319400000000);
//Writes 2010-02-08 13:19:00.000000
Console.WriteLine(dateTimeFromPartitionKeyValue.ToString("yyyy-MM-dd HH:mm:ss.ffffff"));
DateTime dateTimeFromEventTickCountValue = new DateTime(634012319404982640);
//Writes 2010-02-08 13:19:00.498264
Console.WriteLine(dateTimeFromEventTickCountValue.ToString("yyyy-MM-dd HH:mm:ss.ffffff"));

What this tells us is that the PartitionKey value effectively represents the date/time at which the event was logged. It has a precision of one minute, i.e. all log data collected within the same minute shares the same PartitionKey.
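
If it helps to see that mapping in code, here is a minimal sketch; the helper name ToWadPartitionKey is mine and not part of any library, it simply floors a UTC date/time to the minute and prepends a “0”:

// Hypothetical helper: converts a UTC date/time into the WADLogsTable PartitionKey format,
// i.e. the tick count floored to the minute, prefixed with "0".
static string ToWadPartitionKey(DateTime utcTime)
{
    DateTime flooredToMinute = new DateTime(utcTime.Year, utcTime.Month, utcTime.Day,
                                            utcTime.Hour, utcTime.Minute, 0, DateTimeKind.Utc);
    return "0" + flooredToMinute.Ticks;
}

// Writes 0634012319400000000 for 2010-02-08 13:19:00 UTC, matching the sample PartitionKey above
Console.WriteLine(ToWadPartitionKey(new DateTime(2010, 2, 8, 13, 19, 0, DateTimeKind.Utc)));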

So, coming back to the question of how you would fetch the diagnostics data for the last 5 minutes: if you’re using the Storage Client library, here is what your code should look like:

   
// Read the diagnostics connection string from the role configuration setting
// (RoleEnvironment lives in Microsoft.WindowsAzure.ServiceRuntime; passing the setting name directly to Parse would fail)
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(RoleEnvironment.GetConfigurationSettingValue("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"));
CloudTableClient cloudTableClient = storageAccount.CreateCloudTableClient();
TableServiceContext serviceContext = cloudTableClient.GetDataServiceContext();
IQueryable<TraceLogsEntity> traceLogsTable = serviceContext.CreateQuery<TraceLogsEntity>("WADLogsTable");
// PartitionKey is "0" + tick count, so "0" + the ticks for 5 minutes ago is the lower bound for an indexed range query
string partitionKeyFloor = "0" + DateTime.UtcNow.AddMinutes(-5.0).Ticks;
var selection = from row in traceLogsTable where row.PartitionKey.CompareTo(partitionKeyFloor) >= 0 select row;
CloudTableQuery<TraceLogsEntity> query = selection.AsTableServiceQuery<TraceLogsEntity>();
IEnumerable<TraceLogsEntity> result = query.Execute();
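
Note that TraceLogsEntity is not something the library gives you; you define it yourself. A minimal sketch, with property names taken from the table structure shown earlier in this post, could look like this:

// Minimal sketch of an entity class for WADLogsTable; deriving from TableServiceEntity
// (Microsoft.WindowsAzure.StorageClient) gives you PartitionKey, RowKey and Timestamp for free.
public class TraceLogsEntity : TableServiceEntity
{
    public long EventTickCount { get; set; }
    public string DeploymentId { get; set; }
    public string Role { get; set; }
    public string RoleInstance { get; set; }
    public int Level { get; set; }
    public int EventId { get; set; }
    public int Pid { get; set; }
    public int Tid { get; set; }
    public string Message { get; set; }
}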

If you’re making use of REST API to fetch this data, here is what you would need to do:

  1. Get the date/time ticks in UTC for 5 minutes ago using DateTime.UtcNow.AddMinutes(-5).Ticks (let’s say it returns 634012319404982640).
  2. Prepend a "0" to it, so the value becomes 0634012319404982640.
  3. Use the following query ($filter) criteria: PartitionKey ge '0634012319404982640'. (A short code sketch of these steps follows below.)
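
Here is a minimal sketch of those steps in C#; it only builds the request URI ("myaccount" is a placeholder for your storage account name), and it assumes you already have code to sign and send Table service requests, which is not shown:

// Build the $filter for "last 5 minutes" against WADLogsTable
string partitionKeyFloor = "0" + DateTime.UtcNow.AddMinutes(-5).Ticks;
string filter = Uri.EscapeDataString("PartitionKey ge '" + partitionKeyFloor + "'");
string requestUri = "https://myaccount.table.core.windows.net/WADLogsTable()?$filter=" + filter;
// The GET request against requestUri must still carry the usual headers
// (x-ms-version, x-ms-date) and a SharedKey/SharedKeyLite Authorization header.
Console.WriteLine(requestUri);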

I hope this helps.

If you have any comments or suggestions, feel free to leave a comment.

