Gaurav Mantri's Personal Blog.

Azure Search Service – Some Documented/Undocumented Business Rules

As you may already know, for the past two weeks we have been implementing support for Azure Search Service in Cloud Portam. We released a newer version yesterday (more on this here). Currently there is no SDK available for this service, so we ended up writing our own implementation against the Azure Search Service REST API. During the course of development, we learned a lot about the service and discovered a number of business rules. If you’re writing code to consume this REST API, you may find this blog post useful, as it will (hopefully) save you roundtrips to the server only to find out that you passed incorrect data.

Terminology

Let’s take a moment and talk about some of the terms used in Azure Search. Please note that this is my understanding and my interpretation may be wrong (if that’s the case, please let me know and I will update the post).

Index

I like to think of an Index as a “Search Catalog”. If you’re coming from the relational database world, think of an index as a table in a relational database. Essentially, it is a container for the data you want to search, along with some additional properties that influence the search results.

Index Fields

Index Fields define the schema for an index. Again, if you’re coming from the relational database world, think of index fields as columns in a table. A column in a table is of a certain data type and has certain attributes (like Primary Key, Nullable etc.); similarly, an index field is of a certain data type and has certain attributes (e.g. Key, Searchable etc.).

Index Scoring Profile

This is where things get interesting :) [read: I really didn’t understand it completely :)]. To me, a scoring profile is a way to influence search results and show them in a different order. For example, let’s say you have an e-commerce website that uses Azure Search, and a user searches for “laptop”. Now assume you wish to get rid of your old inventory first. What you could do is define a scoring profile in such a way that old inventory shows up towards the top of your search results.

Index Scoring Profile Weights

This is part of Index Scoring Profile and is used to assign different weights to different searchable fields in your index.

Index Scoring Profile Functions

This is again part of Index Scoring Profile and is used to alter search ranking for items based on values in applicable fields.

Rules

Now that we’ve covered some basic terminology, let’s look at some of the rules we discovered. Please note that at the time of writing, the service version is 2014-07-31-Preview and the rules I mention below are for this version. Please check the MSDN documentation for the latest service version.

Index Fields

Some of the rules we discovered as applicable to index fields:

Data Type / Properties Matrix

The following table shows each index field data type and the properties applicable to it.

Data Type               Key  Searchable  Filterable  Sortable  Facetable  Suggestions  Retrievable
Edm.String              Yes  Yes         Yes         Yes       Yes        Yes          Yes
Collection(Edm.String)  No   Yes         Yes         No        Yes        Yes          Yes
Edm.Int32               No   No          Yes         Yes       Yes        No           Yes
Edm.Double              No   No          Yes         Yes       Yes        No           Yes
Edm.Boolean             No   No          Yes         Yes       Yes        No           Yes
Edm.DateTimeOffset      No   No          Yes         Yes       Yes        No           Yes
Edm.GeographyPoint      No   No          Yes         Yes       No         No           Yes

The way to read this matrix is to ask: can an index field of data type XYZ have property ABC? For example, can an index field of the “Edm.DateTimeOffset” data type be “Searchable”? The answer is “No”. Any attempt to set this property to true will result in a Bad Request error from the server.
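If you want to check this on the client before calling the service, the matrix can be written down as a lookup table. Here is a minimal sketch in Python; the names ALLOWED_PROPERTIES and disallowed_properties, and the lowercase property labels, are my own illustration and not part of the REST API.

```python
# Properties each data type may enable, transcribed from the matrix above.
# The lowercase labels are illustrative, not the service's own identifiers.
COMMON = {"filterable", "sortable", "facetable", "retrievable"}
ALLOWED_PROPERTIES = {
    "Edm.String": {"key", "searchable", "filterable", "sortable",
                   "facetable", "suggestions", "retrievable"},
    "Collection(Edm.String)": {"searchable", "filterable", "facetable",
                               "suggestions", "retrievable"},
    "Edm.Int32": COMMON,
    "Edm.Double": COMMON,
    "Edm.Boolean": COMMON,
    "Edm.DateTimeOffset": COMMON,
    "Edm.GeographyPoint": {"filterable", "sortable", "retrievable"},
}


def disallowed_properties(data_type, enabled):
    """Return the enabled properties that the matrix does not allow."""
    return sorted(set(enabled) - ALLOWED_PROPERTIES[data_type])


print(disallowed_properties("Edm.DateTimeOffset", ["searchable", "filterable"]))
# prints ['searchable']
```

An empty list means the field definition is consistent with the matrix for the 2014-07-31-Preview version.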

Key Property

Think of the “Key” property as the “Primary Key” for the index. We discovered the following rules for the key property:

  • An index must have a key field; in other words, a primary key is required.
  • There can be only one key field per index; in other words, an index can’t have a composite primary key.
  • The key field must be of the “Edm.String” data type. No other data types are supported for the key field at the time of writing.
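The three key rules above are easy to check in one pass over the field list. The sketch below assumes each field is a plain dictionary with "name", "type" and an optional "key" flag; that shape and the function name key_field_errors are assumptions made for illustration.

```python
def key_field_errors(fields):
    """Check the key rules: exactly one key field, and it must be Edm.String.

    `fields` is a list of dicts such as
    {"name": "id", "type": "Edm.String", "key": True} (illustrative shape).
    """
    keys = [f for f in fields if f.get("key")]
    errors = []
    if not keys:
        errors.append("index must have a key field")
    elif len(keys) > 1:
        errors.append("index can have only one key field")
    for field in keys:
        if field["type"] != "Edm.String":
            errors.append(f"key field '{field['name']}' must be Edm.String")
    return errors
```

For example, an index whose only key field is of type Edm.Int32 would come back with a single error about the data type.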

Some Other Things

  • Currently, once an index is created, you can’t edit or delete any of the existing index fields. You can, however, add new fields through the update operation.
  • From what I have been told, currently only “String” collections are supported, through the “Collection(Edm.String)” data type.
  • Currently there’s no limit on the number of fields an index can have. However, to future-proof your application in case the search service announces a limit, it is better to add only the fields that are really required for searching.
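The first rule (existing fields can’t be edited or deleted, but new ones can be added) can be checked before sending an update by comparing the current and proposed field sets. This is a rough sketch only; it assumes fields are kept in dictionaries keyed by field name, and update_errors is a made-up helper name.

```python
def update_errors(existing, proposed):
    """Compare two {field_name: definition} dicts for an index update.

    Existing fields must be carried over unchanged; extra fields in
    `proposed` are treated as additions and are allowed.
    """
    errors = []
    for name, definition in existing.items():
        if name not in proposed:
            errors.append(f"{name}: existing fields cannot be deleted")
        elif proposed[name] != definition:
            errors.append(f"{name}: existing fields cannot be edited")
    return errors
```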

 

Index Scoring Profile Weights

These are the things we discovered for scoring profile weights:

  • Weights can only be defined on “Searchable” fields.
  • Weight value must be a positive number.
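Both weight rules can be expressed in a few lines. The following is a sketch under the assumption that you already know which of your fields are searchable; the function name and argument shapes are hypothetical.

```python
def weight_errors(weights, searchable_fields):
    """Validate scoring profile weights.

    `weights` maps field name to weight; `searchable_fields` is the set of
    field names marked Searchable in the index (both shapes illustrative).
    """
    errors = []
    for name, weight in weights.items():
        if name not in searchable_fields:
            errors.append(f"{name}: weights need a searchable field")
        if not weight > 0:
            errors.append(f"{name}: weight must be a positive number")
    return errors
```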

Index Scoring Profile Functions

This is where we had most fun :). These are the things we discovered for scoring profile functions:

  • Scoring profile functions can only be defined on “Filterable” fields.
  • For each scoring profile function, you have to define a boost value, which should be a positive number other than one (1).
  • There are three types of functions supported currently – Distance, Freshness and Magnitude.
  • Magnitude type functions can only be defined on fields of “Edm.Int32” and “Edm.Double” data type.
  • Freshness type functions can only be defined on fields of “Edm.DateTimeOffset” data type.
  • Distance type functions can only be defined on fields of “Edm.GeographyPoint” data type.
  • For a Freshness type function, you must define a boosting duration, which represents a time span. You need to specify it in “P[nD][T[nH][nM][nS]]” format, e.g. P1DT12H32M50.345S.
  • For a Distance type function, you must define a boosting distance, which should be a positive number greater than zero (0). Please note that the boosting distance is in kilometers (so if you’re used to miles, just multiply that number by roughly 1.6 to get kilometers :)).
  • For Magnitude type function, you must define numerical values for boosting range start and end. In our tests, we could not find any limitations other than numerical on the values you put in for boosting range start and end.
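To make the function rules concrete, here is a small Python sketch that checks the type/data type pairing and the boost value, builds a boosting duration string in the format above, and converts miles to kilometers. All names here (function_errors, freshness_duration, miles_to_km) are my own, and the duration builder assumes whole-number components with milliseconds appended to the seconds part.

```python
# Data types each function type may target, as listed in the rules above.
FUNCTION_FIELD_TYPES = {
    "magnitude": {"Edm.Int32", "Edm.Double"},
    "freshness": {"Edm.DateTimeOffset"},
    "distance": {"Edm.GeographyPoint"},
}


def function_errors(function_type, field_type, filterable, boost):
    """Return the rule violations for one scoring profile function."""
    allowed = FUNCTION_FIELD_TYPES.get(function_type)
    if allowed is None:
        return [f"unknown function type: {function_type}"]
    errors = []
    if field_type not in allowed:
        errors.append(f"{function_type} cannot target {field_type}")
    if not filterable:
        errors.append("field must be filterable")
    if boost <= 0 or boost == 1:
        errors.append("boost must be positive and not equal to 1")
    return errors


def freshness_duration(days=0, hours=0, minutes=0, seconds=0, milliseconds=0):
    """Build a P[nD][T[nH][nM][nS]] string from whole-number components."""
    time_part = ""
    if hours:
        time_part += f"{hours}H"
    if minutes:
        time_part += f"{minutes}M"
    if seconds or milliseconds:
        # Milliseconds become the fractional part of the seconds value.
        sec = f"{seconds}.{milliseconds:03d}" if milliseconds else f"{seconds}"
        time_part += f"{sec}S"
    result = "P" + (f"{days}D" if days else "")
    if time_part:
        result += "T" + time_part
    if result == "P":
        raise ValueError("duration needs at least one non-zero component")
    return result


def miles_to_km(miles):
    """Convert miles to kilometers (1 mile = 1.609344 km)."""
    return miles * 1.609344


print(freshness_duration(days=1, hours=12, minutes=32, seconds=50, milliseconds=345))
# prints P1DT12H32M50.345S
```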

Some Regular Expressions

Since we wanted to enforce these rules on the client side itself, we ended up writing some regular expressions, which I am including below. Please note that I am no RegEx Ninja :) and most of what I put below is copied from Stack Overflow and modified to suit our needs. If you find any issues with these or have a better way of doing it, please let me know.

Index Field Name

/(^(?!azureSearch))(^[a-zA-Z]([a-zA-Z0-9_]*)$)/
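As a quick illustration of what this pattern accepts, here it is applied in Python. I am assuming Python’s re module treats this particular pattern the same way the JavaScript engine does; the sample field names are made up.

```python
import re

# Field name pattern copied from above: must start with a letter, may contain
# letters, digits and underscores, and must not start with "azureSearch".
FIELD_NAME = re.compile(r"(^(?!azureSearch))(^[a-zA-Z]([a-zA-Z0-9_]*)$)")


def is_valid_field_name(name):
    """Return True when the name matches the field name pattern."""
    return FIELD_NAME.search(name) is not None


for candidate in ["hotelName", "hotel_name_2", "azureSearchScore", "2ndField", "my-field"]:
    print(candidate, is_valid_field_name(candidate))
```

The first two names pass; the other three fail because of the reserved prefix, the leading digit, and the hyphen respectively.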

Index Scoring Profile Weight

/^\+?(\d*[1-9]\d*\.?|\d*\.\d*[1-9]\d*)$/

CORS Max Age (in Seconds)

/^[1-9][0-9]*$/

Index Scoring Profile Function Boost

/^\+?(\d*[1-9]\d*\.?|\d*\.\d*[1-9]\d*)$/
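The weight and boost patterns above are identical, and two behaviours are worth knowing if you reuse them. The pattern accepts “1”, so the rule that a boost must not be one needs a separate check, and it rejects values with only zeros after the decimal point, such as “2.0”. The sketch below shows both, along with a numeric check as one possible alternative; is_valid_boost is my own helper, not something we shipped.

```python
import math
import re

# Pattern copied from above (shared by weight and boost).
POSITIVE_NUMBER = re.compile(r"^\+?(\d*[1-9]\d*\.?|\d*\.\d*[1-9]\d*)$")


def matches_pattern(text):
    """Return True when the text matches the positive-number pattern."""
    return POSITIVE_NUMBER.search(text) is not None


def is_valid_boost(text):
    """Numeric alternative: finite, greater than zero, and not equal to one."""
    try:
        value = float(text)
    except ValueError:
        return False
    return math.isfinite(value) and value > 0 and value != 1


for sample in ["2", "0.5", "1", "2.0", "0", "-3"]:
    print(sample, matches_pattern(sample), is_valid_boost(sample))
```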

Index Scoring Profile Distance Function – Reference Point Parameter

/^[a-zA-Z][a-zA-Z0-9]{1,15}$/

Index Scoring Profile Distance Function – Boosting Distance

/^((\d*\.)?)\d+$/

Index Scoring Profile Freshness Function – Number of Days

/^\d{0,3}$/

Index Scoring Profile Freshness Function – Number of Hours

/(^[1-9]$)|(^[01][0-9]$)|(^2[0-3]$)/

Index Scoring Profile Freshness Function – Number of Minutes/Seconds

/(^[1-9]$)|(^[0-5][0-9]$)/

Index Scoring Profile Freshness Function – Number of Milliseconds

/(^[1-9]$)|(^[0-5][0-9]$)/

Index Scoring Profile Magnitude Function – Boosting Range Start/End

/^([-+]?(\d*\.)?)\d+$/
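If you want to try the remaining patterns quickly, they can be collected in one lookup. This is only a sketch: the dictionary keys are labels I made up, and I am assuming Python’s re module behaves like the JavaScript engine for these patterns. A few edge cases show up when you run it: the boosting distance pattern accepts “0” even though the rule asks for a value greater than zero, the days pattern accepts an empty string, and the hours pattern accepts “00” but not “0”.

```python
import re

# Patterns copied from the sections above; the keys are illustrative labels.
PATTERNS = {
    "cors_max_age": r"^[1-9][0-9]*$",
    "reference_point_parameter": r"^[a-zA-Z][a-zA-Z0-9]{1,15}$",
    "boosting_distance": r"^((\d*\.)?)\d+$",
    "freshness_days": r"^\d{0,3}$",
    "freshness_hours": r"(^[1-9]$)|(^[01][0-9]$)|(^2[0-3]$)",
    "freshness_minutes_seconds": r"(^[1-9]$)|(^[0-5][0-9]$)",
    "magnitude_range": r"^([-+]?(\d*\.)?)\d+$",
}


def is_valid(kind, text):
    """Return True when the text matches the pattern registered for kind."""
    return re.search(PATTERNS[kind], text) is not None


print(is_valid("freshness_hours", "23"), is_valid("freshness_hours", "24"))
# prints True False
```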

Summary

That’s it for this post. As you start building search applications using Azure Search Service, I hope you will find this post useful … Or … you could simply use Cloud Portam, where we have already taken care of these rules (How’s that for a sales pitch :)).

Jokes apart, I do hope that you find this post useful. As always, if you find any issues with the post, please let me know ASAP so that I can fix them.

Happy Searching with Azure Search Service and Happy Coding!!!
