Gaurav Mantri's Personal Blog.

Azure Cosmos DB and Node SDK – Part III: Working with Documents

In the previous posts in this series, we saw how you can work with databases and containers in Cosmos DB using the Node SDK.

In this post, we are going to see how you can work with documents contained inside a container using Node SDK. We will see how you can perform CRUD operations on documents and talk about querying the documents.

Before We Begin

There are certain things we need to do before we can get started:

  • Please ensure that you’ve followed all the steps in the “Before We Begin” section of the previous post.
  • Please create a database in your account. Again, you can refer to the code for creating a database in the previous post. For this post, I created a database called “mydatabase”.
  • Please create a container in this database. For this post, we’re going to store information about users in this container. Please ensure that the container created is a “Partitioned” container with “/address/state” as the partition key (we’re going to logically partition our users based on the state in which they live). The container can be created with minimum (default) throughput (400 RU/s). For this post, I created a container called “users” inside the “mydatabase” database.

Once the database and container are created, create a new file called “document-samples.js” and add following lines of code there:

const {Promise} = require('bluebird');
const {CosmosClient} = require('@azure/cosmos');

const accountEndpoint = 'https://account-name.documents.azure.com:443/';
const accountKey = 'yM0g3KnPANPpBgKLi34OMz1UZ7Png2pjQrs209IrrQkyhtqZKmALludel1nizEOqeJMm1gavLb0dS0gAoMw3Pw==';
const databaseId = 'mydatabase';
const containerId = 'users';

/**
** Method to get client connection object.
**/
const getClient = () => {
   return new CosmosClient({
     endpoint: accountEndpoint,
     auth: {
       masterKey: accountKey
     }
   });
 };

Please make sure to use the values for your Cosmos DB account credentials. Also, for this post please ensure that your Cosmos DB account is targeting SQL API as the SDK is for SQL API only.

We’re now all set to move forward!

Oh, and one more thing. Because we will be including code for both async/await and promises, we will just prefix the method name with the approach we’re using like we did in the previous posts. For example, “createDocumentAsync” and “createDocumentPromise” for async/await and promise respectively.

Create Document

First, let’s see how we can create a document in a container. Essentially a document is a JSON object. Let’s assume we want to create following document in the container:

const doc = {
  "id": "0000000001",
  "firstName": "John",
  "lastName": "Smith",
  "address": {
    "street": "123 Main Road",
    "city": "Columbia",
    "state": "MD",
    "zipCode": "21045"
  },
  "phone": "XXX-XXX-XXXX",
  "email": "john.smith@something.com",
  "ssn": "XXX-XXX-XXXX"
};

What we’re trying to do is create a document with “id” as “0000000001” and partition key as “MD”.

Using Async/Await

Here’s the code to create a document using async/await:

const createDocumentAsync = async (doc) => {
  const client = getClient();
  const database = client.database(databaseId);
  const container = database.container(containerId);
  const result = await container.items.create(doc);
  return result;
};

First thing we’re doing here is getting an instance of CosmosClient. After that we’re getting an instance of Database class using database() method of client. Then we’re getting an instance of Container class using container() method of database. Finally we’re calling create() method on items property of the container and passing the document that we want to create.

The output of this method is an object that has following key members:

  • body: This contains the system properties of the document like _rid, _self, _etag, _ts etc. along with custom properties we have defined in the document like firstName, lastName etc.
  • headers: This contains the response headers.
  • item: This is an instance of the Item class. You will need to use this object if you want to perform any operation on the document, like reading its properties or deleting it.
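As a rough illustration of how the body member might be used, here is a small sketch. Please note that summarizeCreateResponse is a hypothetical helper of mine (it is not part of the SDK), and it assumes the response shape described above. It also relies on the _ts system property holding the last modified time in seconds since the Unix epoch.

```javascript
// Hypothetical helper: summarizes the response returned by container.items.create().
// Assumes the response has the `body` member described above.
const summarizeCreateResponse = (response) => {
  const {body} = response;
  return {
    id: body.id,
    etag: body._etag,
    // _ts is expressed in seconds, while JavaScript dates use milliseconds.
    lastModified: new Date(body._ts * 1000)
  };
};

// Usage with a hand-written response object (values are made up for illustration).
const sampleResponse = {
  body: {id: '0000000001', firstName: 'John', _etag: '"00000000-0000"', _ts: 1546300800}
};
const summary = summarizeCreateResponse(sampleResponse);
console.log(summary.id, summary.lastModified.toISOString());
```

With a real response you would pass the object returned by createDocumentAsync instead of the hand-written sample.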

Using Promise

And here’s the code to do so if you were to use promise:

const createDocumentPromise = (doc) => {
  return new Promise((resolve, reject) => {
    const client = getClient();
    const database = client.database(databaseId);
    const container = database.container(containerId);
    container.items.create(doc)
    .then((result) => {
      resolve(result);
    })
    .catch((error) => {
      reject(error);
    });
  });
};

Few Things About “id” Field/Attribute

In the document definition above you may have noticed that we defined an “id” field/attribute. A few things I want to bring to your attention:

  • Unique value in a partition: This uniquely identifies a document and is required. However, it uniquely identifies a document inside a logical partition. What that means is that inside a logical partition, no two documents can have the same “id” value. However, two documents in different partitions can have the same “id” value.
  • Similar to “RowKey” in Azure Table: If you’ve used Azure Table Storage in the past, this is similar to the “RowKey” property of an entity, which uniquely identifies an entity in a partition.
  • Immutable: This value is immutable i.e. once a document is assigned an id, the value of the id attribute can’t be changed. You must delete an existing document and create a new document to change the id.

Auto “id” Generation

As mentioned above, each document must have this “id” attribute. Node SDK has this interesting feature where you can have an id auto generated for you. To do so, you just have to make use of the disableAutomaticIdGeneration property of the RequestOptions and pass false for it. Here’s the code snippet to do so:

  const result = container.items.create(doc, {
    disableAutomaticIdGeneration: false
  });
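If you would rather control id generation yourself, one possible alternative is to assign the id before calling create(). The sketch below is just that, a sketch: withGeneratedId is a hypothetical helper (not something the SDK provides), and it assumes Node 14.17 or later, where the built-in crypto module exposes randomUUID().

```javascript
const {randomUUID} = require('crypto');

// Hypothetical helper: returns the document unchanged when it already has an id,
// otherwise returns a copy of it with a freshly generated id.
const withGeneratedId = (doc) => {
  if (typeof doc.id === 'string' && doc.id.length > 0) {
    return doc;
  }
  return {...doc, id: randomUUID()};
};

const withoutId = withGeneratedId({firstName: 'John', lastName: 'Smith'});
const withId = withGeneratedId({id: '0000000001', firstName: 'John'});
console.log(withoutId.id); // a random UUID string
console.log(withId.id);    // 0000000001
```

The advantage of this approach is that you know the id before the document is saved, which can be handy for logging.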

Creating Document Without PartitionKey

What happens if you forgot to specify the partition key in the document and try to save it? For example, consider the document below:

const doc = {
  "id": "0000000001",
  "firstName": "John",
  "lastName": "Smith",
  "address": {
    "street": "123 Main Road",
    "city": "Columbia",
    "zipCode": "21045"
  },
  "phone": "XXX-XXX-XXXX",
  "email": "john.smith@something.com",
  "ssn": "XXX-XXX-XXXX"
};

Now you would imagine that you’ll get an error while saving this document. Unfortunately (or fortunately), you won’t get any error and the document will be created successfully. Neither the REST API nor the SDK will prevent you from performing such operation (Cerulean will, BTW :)).

Considering our container is a partitioned container and each document must go in one or the other logical partition, the question is: which partition does this particular document go in?

Well to answer your question, Cosmos DB engine puts such documents where no partition key value is specified in a special partition and the key for that partition is {}. You will need to use this partition key value when fetching such documents.
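If you would like to catch this situation in your own code before the document is saved, a small guard along these lines could help. Both functions below are hypothetical helpers (they are not part of the SDK), and the path “/address/state” used in the usage line assumes that the partition key points at the state property of the sample document.

```javascript
// Hypothetical helper: reads the value at a partition key path such as
// "/address/state" from a document. Returns undefined when any segment is missing.
const getPartitionKeyValue = (doc, partitionKeyPath) => {
  const segments = partitionKeyPath.split('/').filter((s) => s.length > 0);
  let current = doc;
  for (const segment of segments) {
    if (current === null || typeof current !== 'object' || !(segment in current)) {
      return undefined;
    }
    current = current[segment];
  }
  return current;
};

// Hypothetical guard: throws before the document is ever sent to Cosmos DB.
const assertHasPartitionKey = (doc, partitionKeyPath) => {
  if (getPartitionKeyValue(doc, partitionKeyPath) === undefined) {
    throw new Error(`Document "${doc.id}" has no value at ${partitionKeyPath}`);
  }
  return doc;
};

console.log(getPartitionKeyValue({id: '1', address: {state: 'MD'}}, '/address/state')); // MD
```

You could call assertHasPartitionKey(doc, '/address/state') just before creating the document.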

Update Document

Next, let’s see how you can update a document. For updating a document in a partitioned container, you will need two things:

  1. “id” of the document: You will need the “id” of the document that you wish to update.
  2. Updated “body” of the document: This will be a JSON object with custom properties that you wish to update.

Using Async/Await

Here’s the code to update a document using async/await.

const updateDocumentAsync = async (id, documentBody) => {
  const client = getClient();
  const database = client.database(databaseId);
  const container = database.container(containerId);
  const documentToUpdate = container.item(id);
  const result = await documentToUpdate.replace(documentBody);
  return result;
};

First thing we’re doing here is getting an instance of CosmosClient. After that we’re getting an instance of Database class using database() method of client. Then we’re getting an instance of Container class using container() method of database. Then we’re creating an instance of Item class using item() method. Finally we’re calling replace() method and passing the modified document body.

Using Promise

And here’s the code to do so if you were to use promise:

const updateDocumentPromise = (id, documentBody) => {
  return new Promise((resolve, reject) => {
    const client = getClient();
    const database = client.database(databaseId);
    const container = database.container(containerId);
    const documentToUpdate = container.item(id);
    documentToUpdate.replace(documentBody)
    .then((result) => {
      resolve(result);
    })
    .catch((error) => {
      reject(error);
    });
  });
};

A Few More Things

There are a few more things I want to mention about updating a document.

  • Update operation will fail if the document inside the partition doesn’t exist.
  • You can’t change the value of “id” attribute of the document with this operation. As mentioned above, you will need to first delete the document and create a new document with different “id”.
  • You can’t change the value of “partition key” attribute of the document with this operation. You will need to first delete the document and create a new document with different “partition key”.
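To make the last two points a bit more concrete, here is one possible way to “move” a document to a new id or partition key. Please treat it as a sketch under a few assumptions: moveDocumentAsync is a hypothetical helper, the container is passed in as a parameter, and only the items.create() and item().delete() calls used in this post are relied upon. I have also deliberately created the new copy before deleting the old one (the reverse of the order mentioned above) so that a failed create does not lose the original. That works because the new document has a different “id”/partition key combination. The fake container exists only so that the sample runs without a Cosmos DB account.

```javascript
// Hypothetical helper: "moves" a document to a new id and/or partition key.
const moveDocumentAsync = async (container, oldId, oldPartitionKey, newDoc) => {
  // Create the new copy first so a failure here leaves the original intact.
  const created = await container.items.create(newDoc);
  // Only then remove the old document from its original partition.
  await container.item(oldId).delete({partitionKey: oldPartitionKey});
  return created;
};

// Stand-in for a real container that simply records the calls it receives.
const calls = [];
const fakeContainer = {
  items: {
    create: async (doc) => {
      calls.push(`create ${doc.id} ${doc.address.state}`);
      return {body: doc};
    }
  },
  item: (id) => ({
    delete: async (options) => {
      calls.push(`delete ${id} ${options.partitionKey}`);
      return {};
    }
  })
};

const movedDoc = {id: '0000000001', firstName: 'John', address: {state: 'VA'}};
const movePromise = moveDocumentAsync(fakeContainer, '0000000001', 'MD', movedDoc);
movePromise.then(() => console.log(calls.join(', ')));
```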

Create or Update Document

Create document operation will create the document if it doesn’t exist (and will fail if it exists) while update document will update the document if it exists (and will fail if it doesn’t exist).

How about handling a situation where you want a document to be created if it doesn’t exist but updated if it exists? One clunky way to deal with it is to try to create the document, catch the exception (Conflict error) and then try to update it. Ugly, right?

This is where Upsert (short form for Update or Insert) comes into the picture. This will update the document if it exists; otherwise a new document will be created.

Using Async/Await

Here’s the code to upsert a document using async/await.

const upsertDocumentAsync = async (doc) => {
  const client = getClient();
  const database = client.database(databaseId);
  const container = database.container(containerId);
  const result = await container.items.upsert(doc);
  return result;
};

The code above is very much like that for creating a document. Instead of calling create() method, we’re simply calling upsert() method. It’s that simple!

Using Promise

And here’s the code to do so if you were to use promise:

const upsertDocumentPromise = (doc) => {
  return new Promise((resolve, reject) => {
    const client = getClient();
    const database = client.database(databaseId);
    const container = database.container(containerId);
    container.items.upsert(doc)
    .then((result) => {
      resolve(result);
    })
    .catch((error) => {
      reject(error);
    });
  });
};

One More Thing

One important thing to understand is that you can’t use this functionality to update the value of either the “id” or the “partition key” attribute. Because a document is uniquely identified by the combination of the values of its “id” and “partition key” attributes, changing either value will always result in the creation of a new document.

In order to change any of these values, you will need to first delete the document and then create a new document.
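The behaviour described above can be pictured with a toy in-memory model. To be clear, this is not the SDK and not how Cosmos DB is implemented. It is only a hypothetical stand-in that keys documents by the combination of partition key value (address.state here, as an assumption) and “id”.

```javascript
// Toy in-memory model of a container, for illustration only (not the SDK):
// documents are keyed by the combination of partition key value and id.
const createToyContainer = () => {
  const store = new Map();
  const keyOf = (doc) => JSON.stringify([doc.address.state, doc.id]);
  return {
    upsert: (doc) => {
      const key = keyOf(doc);
      const existed = store.has(key);
      store.set(key, doc);
      return existed ? 'replaced' : 'created';
    },
    count: () => store.size
  };
};

const toy = createToyContainer();
const first = toy.upsert({id: '1', firstName: 'John', address: {state: 'MD'}});
const second = toy.upsert({id: '1', firstName: 'Johnny', address: {state: 'MD'}});
// Same id but a different partition key value: this is a brand new document.
const third = toy.upsert({id: '1', firstName: 'Johnny', address: {state: 'VA'}});
console.log(first, second, third, toy.count()); // created replaced created 2
```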

Get Document Properties

Now let’s see how we can read properties of a document. To get the properties of a document, you will need two things:

  • “id” of the document: You will need the “id” of the document that you wish to read.
  • “Partition key” of the document: You will need the value of the “partition key” attribute of the document.

Using Async/Await

Here’s the code to get a document’s properties using async/await.

const getDocumentPropertiesAsync = async (id, partitionKey) => {
  const client = getClient();
  const database = client.database(databaseId);
  const container = database.container(containerId);
  const document = container.item(id);
  const requestOptions = {
    partitionKey: partitionKey
  };
  const result = await document.read(requestOptions);
  return result;
};

First thing we’re doing here is getting an instance of CosmosClient. After that we’re getting an instance of Database class using database() method of client. Then we’re getting an instance of Container class using container() method of database. Then we’re creating an instance of Item class using item() method. Finally we’re calling read() method and passing the partition key value in the request options.

Using Promise

And here’s the code to do so if you were to use promise:

const getDocumentPropertiesPromise = (id, partitionKey) => {
  return new Promise((resolve, reject) => {
    const client = getClient();
    const database = client.database(databaseId);
    const container = database.container(containerId);
    const document = container.item(id);
    const requestOptions = {
      partitionKey: partitionKey
    };
    document.read(requestOptions)
    .then((result) => {
      resolve(result);
    })
    .catch((error) => {
      reject(error);
    });
  });
};

Delete Document

Now let’s see how we can delete a document from a container. To delete a document, you will need two things (same as reading the properties of a document):

  • “id” of the document: You will need the “id” of the document that you wish to delete.
  • “Partition key” of the document: You will need the value of the “partition key” attribute of the document.

Using Async/Await

Here’s the code to delete a document using async/await.

const deleteDocumentAsync = async (id, partitionKey) => {
  const client = getClient();
  const database = client.database(databaseId);
  const container = database.container(containerId);
  const document = container.item(id);
  const requestOptions = {
    partitionKey: partitionKey
  };
  const result = await document.delete(requestOptions);
  return result;
};

If you notice, the code is very similar to that of reading the properties of a document with just one difference. Here we’re calling the delete() method to delete a document.

Using Promise

And here’s the code to do so if you were to use promise:

const deleteDocumentPromise = (id, partitionKey) => {
  return new Promise((resolve, reject) => {
    const client = getClient();
    const database = client.database(databaseId);
    const container = database.container(containerId);
    const document = container.item(id);
    const requestOptions = {
      partitionKey: partitionKey
    };
    document.delete(requestOptions)
    .then((result) => {
      resolve(result);
    })
    .catch((error) => {
      reject(error);
    });
  });
};

Query Documents

Lastly we come to the most important part – how to query the documents that are already there in your container.

As you may already know, Cosmos DB provides SQL like querying to query the documents stored in a container in accounts targeting SQL API (hence the name, I guess).

We’re not going to cover how to write these queries. You can find plenty of examples of that online. What we will do instead is provide code for executing queries and then look at some use cases.

Using Async/Await

Here’s the code to query documents using async/await.

const queryDocumentsAsync = async (query, feedOptions) => {
  const client = getClient();
  const database = client.database(databaseId);
  const container = database.container(containerId);
  const result = await container.items.query(query, feedOptions).executeNext();
  return result;
};

First thing we’re doing here is getting an instance of CosmosClient. After that we’re getting an instance of Database class using database() method of client. Then we’re getting an instance of Container class using container() method of database. Finally we’re calling query() method on items property of the container, passing the query and feed options, and then calling executeNext() to fetch the next batch of results.

The method expects two parameters: query and feedOptions.

Query parameter can either be a simple SQL query like:

const query = "Select * from Root r Where r.address.state = 'MD'";

Or it could be an instance of SqlQuerySpec in case you want to write parameterized queries like:

  const querySpec = {
    query: "Select * from Root r Where r.address.state = @state",
    parameters: [
      {name: '@state', value: 'MD'}
    ]
  };
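Parameterized queries are especially useful when the values come from user input, because the values never become part of the query text itself. If you find yourself writing many of these, a tiny helper like the one below could build the object for you. Please note that buildQuerySpec is a hypothetical helper of mine. It only produces the query/parameters shape shown above.

```javascript
// Hypothetical helper: turns {state: 'MD'} into the
// {query, parameters: [{name: '@state', value: 'MD'}]} shape shown above.
const buildQuerySpec = (query, values) => ({
  query,
  parameters: Object.keys(values).map((name) => ({
    name: `@${name}`,
    value: values[name]
  }))
});

const spec = buildQuerySpec(
  'Select * from Root r Where r.lastName = @lastName and r.address.state = @state',
  {lastName: 'Smith', state: 'MD'}
);
console.log(JSON.stringify(spec.parameters));
```

The resulting object can be passed as the query parameter in the same way as the hand-written querySpec.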

feedOptions parameter is optional and is an instance of type FeedOptions. We will learn more about this in a little bit.

result object will contain an array of matching documents and the response headers. There are many response headers returned but two of them I would like to mention specifically:

  • x-ms-request-charge: This response header will tell you about the request units consumed by the query you executed. This would be a great indicator of how good or bad your query is.
  • x-ms-continuation: This response header will tell you if there are more documents available matching your query. We will discuss this in more detail below.
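Here is a small sketch of how these two headers might be read from the query response. describeQueryResponse is a hypothetical helper, and it assumes the response has the result and headers members described above. Header values can arrive as strings, so the request charge is converted to a number.

```javascript
// Hypothetical helper: extracts cost and paging information from a query response.
// Assumes the `result` and `headers` members described above.
const describeQueryResponse = (response) => {
  const headers = response.headers || {};
  const continuation = headers['x-ms-continuation'];
  return {
    count: response.result.length,
    requestCharge: Number(headers['x-ms-request-charge']),
    // Only the presence of the token matters, never its content.
    hasMore: Boolean(continuation),
    continuation
  };
};

// Usage with hand-written responses (values are made up for illustration).
const firstPage = describeQueryResponse({
  result: [{id: '1'}, {id: '2'}],
  headers: {'x-ms-request-charge': '2.83', 'x-ms-continuation': 'opaque-token'}
});
const lastPage = describeQueryResponse({
  result: [{id: '3'}],
  headers: {'x-ms-request-charge': '2.31'}
});
console.log(firstPage.hasMore, lastPage.hasMore);
```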

Using Promise

And here’s the code to do so if you were to use promise:

const queryDocumentsPromise = (query, feedOptions) => {
  return new Promise((resolve, reject) => {
    const client = getClient();
    const database = client.database(databaseId);
    const container = database.container(containerId);
    container.items.query(query, feedOptions).executeNext()
    .then((result) => {
      resolve(result);
    })
    .catch((error) => {
      reject(error);
    });
  });
};

Now let’s take some specific use cases and see how you will use the query functionality in those cases.

Querying Specific Partition

Let’s say we want to find all users with last name “Smith” in the state of “MD” (our partition key). There are two ways to go about it:

Include PartitionKey value in the query

In this scenario, we will include the partition key value in our query. For example, our query would look like:

const query = "Select * from Root r Where r.lastName = 'Smith' and r.address.state = 'MD'";

Specify PartitionKey value as one of the feed options

In this scenario, we will make use of the partitionKey property of the feed options. This is how our code would look:

const query = "Select * from Root r Where r.lastName = 'Smith'";
const feedOptions = {
  partitionKey: 'MD'
};
const result = await queryDocumentsAsync(query, feedOptions);

Querying across Partitions

Now let’s say we want to find all the users in our system with last name “Smith”. This can be accomplished by making use of the enableCrossPartitionQuery property of feed options and setting its value to true. This is how our code would look:

const query = "Select * from Root r Where r.lastName = 'Smith'"
const feedOptions = {
  enableCrossPartitionQuery: true
};
const result = await queryDocumentsAsync(query, feedOptions);

Please note that the operation above can be slower and consume more request units, as the query has to be sent to every partition rather than just one. It is always advisable to have a query target a specific partition where possible.

Limiting the number of records returned

Let’s say you want to limit the number of records returned in a single execution of a query. This can be accomplished by making use of the maxItemCount property of feed options and setting its value to the desired number of records. This is how our code would look:

const query = "Select * from Root r Where r.lastName = 'Smith'"
const feedOptions = {
  enableCrossPartitionQuery: true,
  maxItemCount: 100 // Return at most 100 records per execution
};
const result = await queryDocumentsAsync(query, feedOptions);

Listing all documents in a container

Let’s say you want to list all documents in a container. Further let’s assume that you want to fetch a maximum of 100 documents per request.

As mentioned above, whenever a query is executed and there are more records available matching that query, the response headers include the x-ms-continuation header. It is the responsibility of the calling application to include this header’s value in the next execution to get the next set of records.

For example, take a look at the code below:

let continuationToken = undefined;
const documents = [];
do {
  const feedOptions = {
    enableCrossPartitionQuery: true,
    maxItemCount: 100,
    continuation: continuationToken
  };
  const queryResult = await queryDocumentsAsync('Select * from Root r', feedOptions);
  documents.push(...queryResult.result);
  continuationToken = queryResult.headers['x-ms-continuation'];
} while (continuationToken !== undefined);

What we’re doing in the code above is instructing Cosmos DB engine to return us a maximum of 100 records. Whenever the query results are returned, we’re checking if we got x-ms-continuation header back. If we get this header, then we’re executing the query again but including the value of this response header in feed options as value for continuation parameter. If we don’t get back this header, that means we have fetched all the documents matching our query and we break out of our loop.

Please note that this continuation token must be treated as opaque. What that means is that you should not try to interpret the value of this token or build any logic around it.

The logic should be simple – Presence of x-ms-continuation header in response means there are more documents available and absence of this header in response means no more documents are available. It’s that simple!
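If you need this logic in more than one place, the loop can be wrapped in a reusable function. The sketch below is one way of doing it, under a few assumptions: fetchAllDocumentsAsync and fakeFetchPage are hypothetical names of mine, and the page function is expected to resolve to an object with result and headers members, just like the query response described earlier. The fake page function only exists so that the sample runs without a Cosmos DB account. Its token is a plain offset, which the loop itself never looks at.

```javascript
// Hypothetical helper: keeps requesting pages until no continuation token comes back.
// `fetchPage(continuationToken)` must resolve to an object with `result` (an array of
// documents) and `headers`.
const fetchAllDocumentsAsync = async (fetchPage) => {
  const documents = [];
  let continuationToken;
  do {
    const page = await fetchPage(continuationToken);
    documents.push(...page.result);
    // Treat the token as opaque: only pass it back, never inspect it.
    continuationToken = page.headers['x-ms-continuation'];
  } while (continuationToken);
  return documents;
};

// Stand-in for Cosmos DB that serves three documents in pages of two.
const allDocs = [{id: '1'}, {id: '2'}, {id: '3'}];
const fakeFetchPage = async (token) => {
  const start = token === undefined ? 0 : Number(token);
  const end = start + 2;
  return {
    result: allDocs.slice(start, end),
    headers: end < allDocs.length ? {'x-ms-continuation': String(end)} : {}
  };
};

const allDocsPromise = fetchAllDocumentsAsync(fakeFetchPage);
allDocsPromise.then((docs) => console.log(docs.length)); // 3
```

Against a real container, the page function could simply call queryDocumentsAsync with enableCrossPartitionQuery, maxItemCount and continuation set in the feed options.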

Wrapping Up

That’s it for this post! In the next series of posts we will do the same thing with document attachments and other things, so stay tuned for that.

If you find any issues with the code samples or any other information in this post, please let me know and I will fix them at the earliest.

Happy Coding!

