1. Home
  2. Training Library
  3. Microsoft Azure
  4. Courses
  5. Introduction to Azure DocumentDB

Stored Procedures


Course Intro
Getting the Most from DocumentDB
Use Cases
3m 12s
5m 20s
Start course
1h 36m

It's been common, if inconsistently applied, knowledge for many years that relational databases are a less-than-ideal fit for some types of software problems. Indeed, entire categories of software development tooling, such as object-relational mappers (ORMs), exists to bridge the gap between highly normalized relational data and in-memory, object-oriented representations. In practice, ORMs can create as much complexity as they alleviate, so developers began looking at the relational database itself as ripe for potential disruption.

Thus came the rise of NoSQL and databases that eschew the traditional rows/columns/tables/foreign keys metaphor for other choices like JSON document stores, graph databases that represent data and relationships as nodes with connecting edges, key/value stores that act as a glorified hashtable, and others. The wide range of options meant you could choose the right tool for your particular needs, instead of trying to squeeze a relational database square peg into your application's round hole. Solutions like MongoDB, Cassandra, Redis, and Neo4j rose to prominence and became de facto industry standards for developers interested in leveraging the power and flexibility NoSQL.

While NoSQL was a boon to software developer productivity, the initial product offerings did little to alleviate the administrative burden of managing your database. Server provisioning, backups, data security at-rest and in-transit... all these challenges (and many more) remained as developers adopted NoSQL in greater numbers. Fortunately for them and all of us, the rise of the cloud and managed database service offerings like Azure DocumentDB brought us the best of both worlds: fast, flexible, infinitely-scalable NoSQL with most of the administrative headaches assumed by a dedicated team of experts from Microsoft. You focus on your data and your application, and rely on a 99.99% SLA for the rest!

In this "Introduction to Azure DocumentDB" course, you’ll learn how to use Azure DocumentDB (DocDB) in your applications. You'll create DocDB accounts, databases, and collections. You'll perform ad-hoc and application-based queries, and see how features like stored procedures and MongoDB protocol support can help you. You'll also learn about ideal DocDB use cases and the pricing model. By the end of this course, you’ll have a solid foundation to continue exploring NoSQL and DocumentDB.

An Introduction to Azure DocumentDB: What You'll Learn

Lecture What you'll learn
Intro What to expect from this course
DocumentDB Overview A high-level overview of the DocumentDB feature set
Overview of Managing DocumentDB A discussion of DocumentDB features for managing resources, data, scalability, configuration, and so on
Creating an Account Creating a top-level DocDB account in the Azure portal
Creating a Collection Creating and configuring a DocDB collection in the Azure portal
Importing Data Discussion and demonstration of moving data into a DocDB collection
Overview of Developing with DocumentDB A discussion of DocumentDB features from a development point of view
SQL Queries How to author queries in the Azure portal
Programming with DocumentDB Reading and writing data in code, using the .NET SDK
Stored Procedures Authoring DocDB stored procedures and executing them using the DocDB REST API
MongoDB Protocol Support Configuring and using DocDB's MongoDB protocol support
Use Cases A brief discussion of scenarios well-suited for DocDB use
Pricing A review of the DocDB pricing model, and discussion of cost estimation and Total Cost of Ownership
Ecosystem Integration A short review of DocDB integration with other Azure services
Summary Course wrap up

If you have thoughts or suggestions for this course, please contact Cloud Academy at support@cloudacademy.com.


Welcome back. Let's look a little bit now at Stored Procedures in DocDB and see what it looks like to both create them and call them. Now, you can create Stored Procedures from any of the languages or platforms that have SDK Support or any of the languages that can call the Management API, the REST API for DocDB. So you can create Stored Procedures that way if you like. I'm going to show you what it looks like in the portal.

So let me navigate down to Script Explorer from my top level DocDB account. And I'm going to click on a new collection that I've created called sales. And you can see that I have a Stored Procedure already called record sale. So I'm gonna click on this and show you what this looks like. The first thing that you'll note is that this is by no means not the most elegant editing experience that you're ever going to find, so I wouldn't necessarily recommend that you spend a ton of time authoring lots of Stored Procedures, you know very complex or very many Stored Procedures in this user interface. But certainly this is good for kind of small, kind of test scenarios where you're just trying to try a few things out. And there's also a bit of a testing experience in this as well, so if I zoom this back up, yeah, you can see that I can add inputs here and I can see results, so I can, there's a save and execute button up here that I could use if I wanted to. But we're not gonna do that, we'll just author this Stored Procedure here. I'll show you what this looks like and then we'll call it from elsewhere.

So, my record sale Stored Procedure is meant to do exactly what it sounds like. This collection is just modeling a simple kind of sales process, or the results of a sales process, where each document records a sales amount and also specifies a region, meaning north, south, east or west, where this sale occurred. So this is a partition collection and so I'm using that region property on all of the documents in the collection to do partitioning. And this is kind of like a, you know, kind of an easy way to hopefully provide a bit of kind of balanced partitioning and keep the size of my partitions relatively equal. Certainly there is a lot of guidance online for coming up with a good partition key, a partitioning strategy for your particular application scenario, and I'd encourage you to read that as you get more and more, as you dig further and further into DocDB. But just know that this, in my case, this sales collection does have one, it is partitioned by region.

So basically what the idea of this Stored Procedure is is that each time I want to record a sale, I pass in a document which corresponds to that sale, and I'm going to create the document, meaning I'm going to add it to my collection. But then I want to do a little bit of extra work as well. And so the pattern that I'm going to show you here, this is actually a fairly common pattern, if you're unfamiliar with NoSQL databases, you may be unfamiliar with this pattern. But recall that in part of our discussion earlier in the course, we discussed the notion that collections don't have to, all the documents in a collection don't have to have the same schema. In fact in some cases, it's quite useful to have multiple documents or different documents with different schemas in the same collection. So this is one of those cases, or this pattern is an example of one of those cases. So the idea is that most of the documents, in fact almost all of the documents in this collection will just be individual, kind of discreet sales documents, each representing a single sale. But what I'm also interested in doing is performing some pre-aggregation of sales information to kind of make it easier for me to go back and query aggregated information across all of my sales, at least by, all of my sales by each partition. The idea is I could, if I wanted to, if I wanted to get say, the total amount of sales that I've had in a region or the total number of sales, that sort of thing, I could obviously perform an Ad Hoc Query on a just-in-time basis and kind of compute that information. But you know, I have to scan across all of the documents, even with the fast indexing of DocumentDB, that's still a relatively expensive operation and again, remember that we're paying a per request unit cost for every query that we run in DocDB. So the idea is that even if we can do it quickly, it still may be relatively expensive and so the more we can do kind of ahead of time to pre-aggregate some of this information, then, in the long run, this will probably benefit us from a cost standpoint.

So in this case what I'm doing is I'm storing these individual sales documents in my collection and for each partition, I'm also maintaining what I call a stats document. And so this is just sort of like a roll up document, a single document per partition in the collection, one for north, south, east and west that just maintains some additional information like the total sales in that region, the total sales count and just the total dollar amount, as well as the maximum sale, that is the biggest one that I've had so far in that region. So what happens is my document comes into my Stored Procedure, I call create document. This underscore syntax is kind of a shorthand syntax. Basically what we're using here is the JavaScript or the Nodejs.dk. That's the SDK or the syntax, if you will, that you use to author a Stored Procedure in DocDB. Again, this is all JavaScript that we're looking at here. And so instead of having to kind of manually create a connection object and populate all of its properties and that sort of thing, we have this shorthand syntax that we can use, because obviously we're already in DocumentDB at this point. This is running within the scope of the server itself, so there's no, you know there's no reason for us to have to go create a connection. DocumentDB can essentially provide us an ambient connection if you will. And we just reference that using this underscore, underscore syntax. So I call create document, which will actually insert this document into my collection, and then as a result of that, I check to see if there was an error, if there was then of course I'm going to throw that. If there wasn't, then what I want to do is I want to call the filter operation and I want to find that statistics document, that single one that I maintained per region. So each of those documents has a special property called stats that equals true, so I look for that, this is just a filter clause. And I'm also looking for the region of the stats document that matches the region of the incoming document that I just inserted. So again, if I don't find that or if I have some other error, then I'll throw that error, otherwise I get a collection back. My collection should only have one item, so I just get the first one. And then I'm updating things like the sales count, the sales total and then I'm also updating the max sale. So if the incoming total, if the total on the incoming document is greater than whatever other max I've currently seen thus far, then I'll replace the current max with the one that's in the incoming document. Very simple, very straightforward. Obviously this isn't a terribly real-world situation. I'm trying to balance kind of the complexity of the example with just the need to kind of motivate the point here of how Stored Procedures work. But I think you get the idea.

So ultimately, I've updated my stats and now I need to call replace document to, again, replace the old document, the old statistics document with this new one that has the updated information. So I call underscore, underscore replace document. And again, if I have an error then I'll throw that back to the client, otherwise I'm good, and I can stop. So some of you may be looking at this thinking well Josh, earlier in the discussion, we've talked a little bit about DocumentDB support for triggers, and this sounds like maybe something that a trigger could perform as well. And that's actually true, so I'm just showing you one example or one way that this could be done. You could certainly perform this kind of behavior using something else as well. So you don't necessarily take this as gospel, we just take this as one possible means to achieve this, this particular pattern. Okay, so I've created this Stored Procedure already, it's already in my collection and ready to go.

So now let's switch gears and we'll invoke the Stored Procedure. Now this can be done a couple of different ways, I could certainly do it from the user interface or I could also do it from any of the SDKs like .NET and Node.js or Python, et cetera. I can call a Stored Procedure from any of those places. But I can also call a Stored Procedure directly using the REST API. DocumentDB has a REST API for all of its management and execution operations and so I can use that REST API to invoke this Stored Procedure. So that's what I'm going to do here. So to do that, I'm gonna switch over to a tool called Postman. Some of you may be familiar with Postman, it's a tool basically for building REST API calls and then executing them against a particular URL endpoint. So I've configured Postman to issue a request to my Stored Procedure. So if I zoom in a little closer, you can see that I have, I'm making a POST call here and I have a, this is the URL that goes to my Stored Procedure. So you can see here's the URL to, the base URL to my DocDB account, and then I have to navigate further into that and I say DB of default, the column, or the sorry, the collection that I want is sales and under the sprocs, kind of sub-node here, I want record sale, which is the name of my Stored Procedure. Let me also talk for a moment about some of the header values that are needed for issuing a REST call to DocDB as well. Since that can be fairly specific and there's really good documentation for how to put this together online. In fact I'll show you the URL here in a moment for where you can read more about it. But just real briefly let me highlight some of this. So let me pull this over here. So of course you need a content type to issue almost any REST call. The content type is slightly different for queries in DocDB. Instead of application JSON, you use application query plus JSON. You need a couple of Microsoft specific headers, specifically one for the date and UTC time as well as the DocDB version that you're targeting. So this is a special version number that you'll need to use. You also have to specify the partition key that you want to talk to in DocDB. Again, because I'm using a partition collection, then I need to specify which partition my Stored Procedure execution will operate against. A Stored Procedure can't operate against all partitions at the same time. We have to designate which partition we're going to operate against. The cache control header is useful just to ensure that this on the server side nothing is cached and I'm always getting the latest values, it's just a recommended best practice. And this authorization header, the details of how to put this together again, you can check out the URL that's on the screen now for more information about how to put this together. There's some fairly standard .NET or Node code, JavaScript code that you can use to build this header value. So again, check out the URL to figure out how that works.

So the last thing I'll show you is just the body of my request. And if I zoom in, you can see that this is a JSON body, it's an array of parameters or array of data and it takes a single object. This is the document that I'm going to insert. It has a total of 200 and the region is south, so that's what I'm going to insert into my collection. So if I click send here ... Then you can see I don't get anything back in the body. If I zoom in you can see there's nothing, nothing comes back, but I did get an HTTP 200, which means that my request succeeded. So, if we go back to the portal, and I will click on document explorer. Navigate to my sales collection. And we can see that the most recent document is south and yeah, so here's our document here. DocDB's added an ID for us since we didn't add one ourselves. We're in the south region and we have a total of 200. So that's my document, that's good. So now let's check to see, did the stats document for the south actually, was that actually updated with our aggregated statistics. So if I look here I can see that this is the other document with a region of south, so this should be our statistics document. So if I click on this, then yeah, sure enough, we can see this is stats equals true, max sale is 200, sales total is 200, sales count is one. And we're in the south region. So if I go back and run this maybe one more time, we could say, make this 1300 ... And run this again. You can see I have a 200, so let's go back and refresh our stats document. There we go. We can see sales total is 1500, we've had two sales in the south region, and our max sale is 1300, which is the one that we just added just a moment ago.

About the Author
Josh Lane
Azure Researcher and Trainer

Josh Lane is a Microsoft Azure MVP and Azure Trainer and Researcher at Cloud Academy. He’s spent almost twenty years architecting and building enterprise software for companies around the world, in industries as diverse as financial services, insurance, energy, education, and telecom. He loves the challenges that come with designing, building, and running software at scale. Away from the keyboard you'll find him crashing his mountain bike, drumming quasi-rythmically, spending time outdoors with his wife and daughters, or drinking good beer with good friends.