Google Cloud Datastore

The course is part of this learning path

Google Cloud Datastore
Difficulty: Intermediate
Duration: 19m
Students: 662
Ratings: 3.7/5
Description

This brief course provides an introduction to two Google NoSQL database offerings: Firestore and Datastore. We'll start by exploring the basic functionality of Cloud Firestore and how it's used with App Engine, as well as how it compares to Datastore. After that, we'll dive deeper into Cloud Datastore, covering queries, indexes, entity groups, and transactions.

Learning Objectives

  • Understand the purpose of Cloud Firestore
  • Understand the relationship between Cloud Datastore and Cloud Firestore
  • Learn when to choose Datastore Mode in Firestore and explain how to use Datastore with App Engine
  • Learn about queries, indexes, entity groups, and transactions in Datastore

Intended Audience

  • Database administrators
  • Google Cloud architects
  • Anyone looking to learn more about Firestore and Datastore

Prerequisites

To get the most out of this course, you should have prior experience working with databases and with Google Cloud Platform in general.

Transcript

In this lesson, we'll dive deeper into Cloud Datastore. We'll cover queries, indexes, entity groups, and transactions.

Let's start with queries. A query can specify a kind, zero or more filters, and zero or more sort orders. We can filter on properties, keys, and ancestors. Filters are pretty simple. Here's an example in Python, outside the context of an actual application. We define the q variable as a query for the Person kind, and then we filter it on the person's name being equal to John. We can add a sort order by calling the order method: we say order and specify the name property, and if we add a hyphen in front of it, the order is descending. We can query on ancestors as well, with an ancestor query that specifies ancestor = and then the key.
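The query shape described above (a kind, optional filters, optional sort orders with a leading hyphen for descending) can be modeled in plain Python. This is a toy in-memory sketch of the semantics, not the Datastore client API; `run_query`, `ENTITIES`, and the `age` property are hypothetical names introduced for illustration.

```python
# Toy model of a Datastore-style query: kind + equality filters + sort orders.
# This is NOT the Datastore API; all names and data here are hypothetical.

ENTITIES = [
    {"kind": "Person", "name": "John", "age": 41},
    {"kind": "Person", "name": "Alice", "age": 29},
    {"kind": "Person", "name": "John", "age": 23},
    {"kind": "Image", "name": "John"},
]


def run_query(entities, kind, filters=(), orders=()):
    """Return entities of `kind` that match every filter, sorted by `orders`.

    filters: iterable of (property, value) equality pairs.
    orders:  iterable of property names; a leading "-" means descending.
    """
    results = [
        e for e in entities
        if e["kind"] == kind and all(e.get(p) == v for p, v in filters)
    ]
    # Sort by the last order first; Python's sort is stable, so the first
    # order in the list ends up as the primary sort key.
    for order in reversed(list(orders)):
        descending = order.startswith("-")
        prop = order.lstrip("-")
        results.sort(key=lambda e: e[prop], reverse=descending)
    return results


johns = run_query(ENTITIES, "Person", filters=[("name", "John")], orders=["-age"])
ages = [e["age"] for e in johns]
print(ages)  # [41, 23]
```

Omitting both filters and orders returns every entity of the kind, which matches the "zero or more" wording above.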

Let's check out an example from our actual application. If we look at the images.py file, we can see that we're using a class method called for category to fetch all of the images for a given category. It uses an ancestor key as a query filter, which allows us to get all of the images that belong to the category that was passed in. If we break this code down, it translates into something like this: we get the key based on the urlsafe key, and then we use it to query all of the images that have that category as an ancestor. We sort by the created on date, descending, and then we take the first 20 results, which are the 20 most recent images. It's a fairly simple API to use, but it's very powerful.
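The steps just described (ancestor filter, descending sort on the creation date, limit of 20) can be sketched without Datastore at all. The model below assumes keys are tuples of (kind, id) path elements, so a child key starts with its parent's path; the function name, the sample data, and the `created_on` field name are hypothetical and are not taken from the application's images.py.

```python
from datetime import datetime, timedelta

# Keys are modeled as tuples of (kind, id) path elements; a child key starts
# with its parent's path. All names and data here are hypothetical.
CATS = (("Category", "cats"),)
DOGS = (("Category", "dogs"),)
START = datetime(2020, 1, 1)

IMAGES = [
    {"key": CATS + (("Image", i),), "created_on": START + timedelta(days=i)}
    for i in range(25)
] + [
    {"key": DOGS + (("Image", 100 + i),), "created_on": START + timedelta(days=i)}
    for i in range(3)
]


def images_for_category(images, category_key, limit=20):
    """Ancestor filter, then newest first, then at most `limit` results."""
    depth = len(category_key)
    matching = [img for img in images if img["key"][:depth] == category_key]
    matching.sort(key=lambda img: img["created_on"], reverse=True)
    return matching[:limit]


recent_cats = images_for_category(IMAGES, CATS)
print(len(recent_cats), recent_cats[0]["key"][-1])  # 20 ('Image', 24)
```

Because the sort is descending, cutting the list at 20 keeps the newest images and drops the oldest five of the 25 in the category.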

With a traditional relational database, we use indexes to improve performance. Due to the design of Datastore, every query we run uses one or more indexes. With Datastore, there are basically two types of indexes: single property indexes and composite indexes. Single property indexes are created for us automatically, which means each individual property is indexed, allowing us to query it. Now, these indexes take up space, which means there's a cost attached to them, so we also have the ability in our code to say indexed = false. This allows us to skip indexing properties that we won't ever be querying, which is going to save us money.

Okay, there are some limits on the queries we can run with single property indexes. We can use equality filters on one or more properties, which is a merge join. Something like first name = Bob and last name = James works because, even though we're querying on two properties, we're using equality filters, so it merges the results. We can use inequality filters on one property, such as first name >= the letter B and first name < the letter C. And only one sort order can be defined on a single property query. Now, if we want to query on multiple properties beyond these limits, we can create a composite index. We can create it manually using the defined YAML syntax, or we can run our queries on the development server, which is going to generate an index.yaml file, or a datastore-indexes.xml file for Java.
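To make these two cases concrete, here is a simplified model of single property indexes as sorted lists of (value, key) entries. An inequality filter becomes one contiguous range scan, and equality filters on two properties are combined by intersecting the keys found in each index. The set intersection stands in for the merge join and is an assumption of this sketch, not a description of Datastore's actual algorithm; all names and data are hypothetical.

```python
import bisect

PEOPLE = {
    1: {"first_name": "Bob", "last_name": "James"},
    2: {"first_name": "Bob", "last_name": "Smith"},
    3: {"first_name": "Carol", "last_name": "James"},
    4: {"first_name": "Alice", "last_name": "James"},
    5: {"first_name": "Bea", "last_name": "Jones"},
}


def build_index(entities, prop):
    """Single property index: sorted (value, key) entries, one per entity."""
    return sorted((entity[prop], key) for key, entity in entities.items())


def range_scan(index, low, high):
    """Keys whose value v satisfies low <= v < high, via two binary searches."""
    start = bisect.bisect_left(index, (low,))
    end = bisect.bisect_left(index, (high,))
    return [key for _, key in index[start:end]]


def equality_scan(index, value):
    """Set of keys whose indexed value equals `value`."""
    start = bisect.bisect_left(index, (value,))
    keys = set()
    for v, key in index[start:]:
        if v != value:
            break
        keys.add(key)
    return keys


first_name_index = build_index(PEOPLE, "first_name")
last_name_index = build_index(PEOPLE, "last_name")

# Inequality filters on one property: first_name >= "B" and first_name < "C".
b_names = range_scan(first_name_index, "B", "C")

# Equality filters on two properties, combined by intersecting the results.
bob_james = equality_scan(first_name_index, "Bob") & equality_scan(last_name_index, "James")

print(b_names, bob_james)  # [5, 1, 2] {1}
```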

Let's check out our index.yaml file. Right here at the top it says auto-generated and that's because when we run any code that runs a query against the development version of Datastore, it builds the index.yaml for us. That way when we deploy, App Engine knows what indexes it needs to build. Here's an example of what a composite index might look like. We have a last name and a first name and they're both ascending.
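One way to picture a composite index on last name and first name, both ascending, is as a single sorted list of (last_name, first_name, key) rows. The sketch below is a plain-Python model under that assumption, with hypothetical names and data; it shows that all rows for one last name sit next to each other and are already ordered by first name.

```python
import bisect

PEOPLE = {
    1: {"first_name": "Bob", "last_name": "James"},
    2: {"first_name": "Bob", "last_name": "Smith"},
    3: {"first_name": "Carol", "last_name": "James"},
    4: {"first_name": "Alice", "last_name": "James"},
}


def build_composite_index(entities, props):
    """Sorted rows of (prop values..., key), ascending on every property."""
    return sorted(
        tuple(entity[p] for p in props) + (key,)
        for key, entity in entities.items()
    )


def scan_prefix(index, first_value):
    """All rows whose first column equals `first_value`, in index order."""
    start = bisect.bisect_left(index, (first_value,))
    rows = []
    for row in index[start:]:
        if row[0] != first_value:
            break
        rows.append(row)
    return rows


index = build_composite_index(PEOPLE, ("last_name", "first_name"))
james_rows = scan_prefix(index, "James")
first_names = [row[1] for row in james_rows]
print(first_names)  # ['Alice', 'Bob', 'Carol']
```

A query for last name = James sorted by first name can therefore be answered with one contiguous scan and no extra sorting step.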

For multi-valued properties, like our tags, it looks similar, except that an index entry gets created for every value of the property. We can query multi-valued properties if at least one value matches the filters. We saw that when we checked out the tags on our images in a previous lesson. It's considered best practice not to index very long strings. Instead, we should be using the Search API, which gives us Google-like search capabilities. Also, we should clean up old indexes using the appcfg vacuum_indexes command, and if we have properties that shouldn't be indexed, maybe something like a very long string as we just mentioned, we can flag them as not indexed with indexed = false.
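The point about multi-valued properties can be illustrated with a small model: each value of a list property gets its own index entry, so an entity with three tags appears three times in the index and matches a filter if any one of its values matches. This is a plain-Python sketch with hypothetical names and data, not Datastore code.

```python
IMAGES = {
    "img1": {"tags": ["cat", "funny"]},
    "img2": {"tags": ["dog"]},
    "img3": {"tags": ["cat", "sleepy", "funny"]},
}


def build_multi_value_index(entities, prop):
    """One (value, key) index entry per value of the list property."""
    return sorted(
        (value, key)
        for key, entity in entities.items()
        for value in entity[prop]
    )


def keys_with_value(index, value):
    """Keys of entities that have at least one value equal to `value`."""
    return sorted({key for v, key in index if v == value})


tag_index = build_multi_value_index(IMAGES, "tags")
funny_images = keys_with_value(tag_index, "funny")
print(len(tag_index), funny_images)  # 6 ['img1', 'img3']
```

Three entities produce six index entries here, which is also why indexing properties with many values costs more storage.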

We've talked about how Cloud Datastore is fast and efficient for querying, but why is that? It's because indexes shift the cost of querying up front, to when the index is created. With very large data sets, it can take a while for the indexes to build initially, but once an index is built, querying it is very fast. Let's talk about consistency with Datastore. We've talked about eventual and strong consistency a few times.

The difference is basically that with strong consistency, the data we read is the last data that was written, and with eventual consistency, the data we read may not be the last data written. Eventual consistency is fine when the data isn't critical, such as a blog post. We'd use strong consistency when it's vital to see the latest updates, such as the price of a product in our catalog. If we need strong consistency, we have a few options. We can use an ancestor query, we can fetch an entity using the get method on a Key, or we can use a transaction. We've talked about the first two throughout our discussion on Datastore. However, we haven't talked about transactions, so let's dive into that a bit more.
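The difference between the two read paths can be sketched with a deliberately simplified model. It assumes that a lookup by key reads primary storage directly, while a non-ancestor query reads an index that is brought up to date some time after the write. The `ToyStore` class and its methods are hypothetical and only illustrate the idea; they do not describe Datastore's internals.

```python
class ToyStore:
    """Entities are updated immediately; the query index catches up later."""

    def __init__(self):
        self.entities = {}        # primary storage, keyed by entity key
        self.index_snapshot = {}  # what non-ancestor queries can see

    def put(self, key, entity):
        self.entities[key] = dict(entity)

    def get(self, key):
        """Lookup by key reads primary storage: always the last write."""
        return self.entities[key]

    def query(self, prop, value):
        """Non-ancestor query reads the index, which may lag behind writes."""
        return sorted(
            k for k, e in self.index_snapshot.items() if e.get(prop) == value
        )

    def catch_up(self):
        """Simulate the delayed index update finishing."""
        self.index_snapshot = {k: dict(e) for k, e in self.entities.items()}


store = ToyStore()
store.put("product-1", {"price": 10})
store.catch_up()
store.put("product-1", {"price": 12})  # new write; index not yet updated

by_key = store.get("product-1")["price"]  # strong: sees the latest write
stale = store.query("price", 10)          # eventual: still matches the old price
fresh_before = store.query("price", 12)   # eventual: new price not visible yet
store.catch_up()
fresh_after = store.query("price", 12)    # visible once the index catches up

print(by_key, stale, fresh_before, fresh_after)
```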

We can use transactions to gain strong consistency. Let's say that we want to update a property, which in this example is the number of tickets available for a conference. If the transaction executes successfully, any future queries will see this data. The ndb library makes it easy to perform transactions with the transactional decorator, along with the other methods that we can find in the API documentation. Now, sometimes we're going to need to work with entities that are not part of the entity group we're using.

Let's say we have two bank account entities that are not part of the same entity group, and we want to transfer funds from one to the other. We want strong consistency here, since eventual consistency could result in withdrawing more money than is actually available, or in being unable to withdraw money that should be available. For something like this, we can use cross-group transactions. We still use the same transactional decorator, but we set the xg parameter to true.
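The all-or-nothing behavior that makes a transaction suitable for the bank transfer can be sketched with a toy decorator. It runs the function against a copy of the data and publishes the changes only if the function finishes without raising. This `transactional` is a hypothetical stand-in written for this illustration; it is not the ndb decorator, and it does not model entity groups, the xg parameter, retries, or concurrency.

```python
import copy
import functools

ACCOUNTS = {"alice": {"balance": 100}, "bob": {"balance": 20}}


def transactional(func):
    """Toy decorator: apply func to a copy and commit only if it succeeds."""
    @functools.wraps(func)
    def wrapper(store, *args, **kwargs):
        working_copy = copy.deepcopy(store)
        result = func(working_copy, *args, **kwargs)  # may raise; store untouched
        store.clear()
        store.update(working_copy)  # "commit": publish all changes together
        return result
    return wrapper


@transactional
def transfer(accounts, source, target, amount):
    if accounts[source]["balance"] < amount:
        raise ValueError("insufficient funds")
    accounts[source]["balance"] -= amount
    accounts[target]["balance"] += amount  # KeyError if the target is missing


transfer(ACCOUNTS, "alice", "bob", 30)

try:
    # The debit succeeds on the copy, then the credit fails: nothing is committed.
    transfer(ACCOUNTS, "alice", "carol", 10)
except KeyError:
    print("transfer failed; no partial write was committed")

print(ACCOUNTS)  # {'alice': {'balance': 70}, 'bob': {'balance': 50}}
```

The second transfer shows the important case: the debit has already happened inside the function when the credit fails, yet the stored balances are unchanged.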

There are some best practices for transactions. First, because entity groups can only be written to once per second, we need to consider the design of our entity groups in advance. Next, an entity group's relationships are immutable, so if we need to make a change to the relationship, we need to delete the entities and recreate them with the new relationships. Also, we have a 60-second time-out on transactions. This is intended to reduce the chances that an entity is edited in another transaction during that same time. Finally, inside of transactions, the only type of query we can run is an ancestor query. So, we may need to fetch data outside of the transaction and pass it off to the code that's going to be running that transaction.

All right. Let's summarize what we've covered in this course. A query can specify a kind, and then zero or more filters and zero or more sort orders.

We can filter on properties, keys, and ancestors. With Datastore, there are basically two types of indexes. We have single property indexes and composite indexes.

Datastore supports strong and eventual consistency. To achieve strong consistency, we can use ancestor queries, call the get method of a Key, or use transactions. Transactions can also be cross-group, which allows us to support strong consistency across disparate entity groups.

Thanks for taking the time to watch this course. For Cloud Academy, I’m Ben Lambert. Thanks for watching.

 

About the Author
Students: 96259
Labs: 28
Courses: 46
Learning Paths: 54

Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.