Creating Secondary Indexes
Difficulty: Intermediate
Duration: 1h 32m
Students: 20918
Ratings: 4.6/5
Description

Please note this course is outdated and has been replaced with the following courses:

 

This course provides an introduction to working with Amazon DynamoDB, a fully-managed NoSQL database service provided by Amazon Web Services. We begin with a description of DynamoDB and compare it to other database platforms. The course continues by walking you through designing tables, and reading and writing data, which is somewhat different than other databases you may be familiar with. We conclude with more advanced topics including secondary indexes and how DynamoDB handles very large tables.

Course Objectives

You will gain the following skills by completing this course:

  • How to create DynamoDB tables.
  • How to read and write data.
  • How to use queries and scans.
  • How to create and query secondary indexes.
  • How to work with large tables. 

Intended Audience

You should take this course if you have:

  • An understanding of basic AWS technical fundamentals.
  • Awareness of basic database concepts, such as tables, rows, indexes, and queries.
  • A basic understanding of computer programming. The course includes some programming examples in Python.

Prerequisites 

See the Intended Audience section.

This Course Includes

  • Expert-guided lectures about Amazon DynamoDB.
  • 1 hour and 31 minutes of high-definition video. 
  • Expert-level instruction from an industry veteran. 

What You'll Learn

Video Lecture | What You'll Learn
DynamoDB Basics | A basic and foundational overview of DynamoDB.
Creating DynamoDB Tables | How to create DynamoDB tables and understand key concepts.
Reading and Writing Data | How to use the AWS Console and API to read and write data.
Queries and Scans | How to use queries and scans with the AWS Console and API.
Secondary Indexes | How to work with Secondary Indexes.
Working with Large Tables | How to use partitioning in large tables.

If you have thoughts or suggestions for this course, please contact Cloud Academy at support@cloudacademy.com.

Transcript

This video will demonstrate how to create secondary indexes using the AWS web console.

As we showed in the last video, it might be helpful to have a global secondary index on the orders table so that we can query for the most recent orders for a particular customer. To support that type of query, we'll need our index to use customer ID as the partition key and order date as the sort key. Let's head back to the console.
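To make the target query concrete, here is a minimal sketch of the request it implies. The table name `orders`, the index name `CustomerIdIndex`, and the attribute names `customer_id` and `order_date` are assumptions for illustration (DynamoDB table and index names cannot contain spaces, so the spoken names need some such spelling). The sketch only builds the parameters; the actual boto3 call is shown in a comment because it needs AWS credentials.

```python
def recent_orders_query(customer_id, limit=10):
    """Build keyword arguments for a DynamoDB Query against the index.

    All table, index, and attribute names here are hypothetical.
    """
    return {
        "TableName": "orders",
        "IndexName": "CustomerIdIndex",
        # The partition key of the index must be matched with equality.
        "KeyConditionExpression": "customer_id = :cid",
        "ExpressionAttributeValues": {":cid": {"N": str(customer_id)}},
        # False returns items in descending sort-key order, so the
        # latest order_date values come first.
        "ScanIndexForward": False,
        "Limit": limit,
    }


params = recent_orders_query(42, limit=5)

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   response = boto3.client("dynamodb").query(**params)
```

Because the order date is stored as a string, "most recent first" only works if the strings sort chronologically, for example ISO 8601 timestamps.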

Let's click Tables in the sidebar, and then click on our orders table. To modify indexes, we'll go to the Indexes tab here. You'll see that there are no indexes listed yet. Let's click the Create index button to start building our index. The index's partition key will be customer ID, and that's a number. Let's click Add sort key and then specify order date for the second key. Since DynamoDB doesn't have a date/time data type, we've been using strings for the order date, so let's leave it like that. At this point, the console has suggested a name for our index. I don't really like this autogenerated name, so let's edit it and change it to just "Customer ID index." That will be simpler to understand when we write our queries.

We can also choose which attributes get projected into the index. The default is that all attributes will be placed into the index, so that if we query our index, the result set will include all the attributes from the table. We could change this to Keys only, which would include only the table's keys and the index's keys, or to Include, which would let us name specific attributes to include in the index. But let's leave it at the default and project all attributes into the index.
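The three projection choices correspond to the `Projection` structure in the DynamoDB API, where the types are spelled `ALL`, `KEYS_ONLY`, and `INCLUDE`. The helper below is a hypothetical convenience function, not part of any AWS library, and the attribute names in the usage lines are made up.

```python
def projection(kind, attributes=None):
    """Return a DynamoDB Projection structure for a secondary index.

    `kind` is one of "ALL", "KEYS_ONLY", or "INCLUDE". Only "INCLUDE"
    takes a list of non-key attribute names to copy into the index.
    """
    if kind in ("ALL", "KEYS_ONLY"):
        if attributes:
            raise ValueError(f"{kind} does not take an attribute list")
        return {"ProjectionType": kind}
    if kind == "INCLUDE":
        if not attributes:
            raise ValueError("INCLUDE needs at least one attribute name")
        return {"ProjectionType": "INCLUDE", "NonKeyAttributes": list(attributes)}
    raise ValueError(f"unknown projection type: {kind}")


everything = projection("ALL")        # the console default
keys_only = projection("KEYS_ONLY")   # table keys and index keys only
selected = projection("INCLUDE", ["status", "total"])  # hypothetical attributes
```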

The other setting that we might want to adjust is the provisioned capacity for this index. Remember that global secondary indexes have their own provisioned capacity, separate from the table's. We need to make sure that we've provisioned enough capacity to keep up with the reads and writes that we're doing in the index. If we're projecting all the table's attributes into the index, then every write to the table will also need to update those attributes in the index. That means that we should provision the same number of write capacity units for both the table and the index. But we only need as much read capacity as we intend to use reading from the index. Sometimes you might read the index more often than you read the main table, so you'd want to provision more read capacity than you would for the table itself. In this case, let's assume that it's going to get about the same read traffic as the main table does. We can always adjust this provisioned capacity later.
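The console steps so far correspond roughly to a single `UpdateTable` request. The sketch below builds that request, following the rule of thumb above by giving the index the same write capacity as the table. The names `orders`, `CustomerIdIndex`, `customer_id`, and `order_date`, and the capacity figures, are assumptions for illustration.

```python
def create_gsi_params(table_write_units, index_read_units):
    """Build UpdateTable arguments that add a global secondary index.

    The index projects all attributes, so its write capacity is set to
    match the table's write capacity. All names are hypothetical.
    """
    return {
        "TableName": "orders",
        # Every attribute used in an index key schema must be declared.
        "AttributeDefinitions": [
            {"AttributeName": "customer_id", "AttributeType": "N"},
            {"AttributeName": "order_date", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": "CustomerIdIndex",
                    "KeySchema": [
                        {"AttributeName": "customer_id", "KeyType": "HASH"},
                        {"AttributeName": "order_date", "KeyType": "RANGE"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": index_read_units,
                        "WriteCapacityUnits": table_write_units,
                    },
                }
            }
        ],
    }


gsi_request = create_gsi_params(table_write_units=5, index_read_units=5)

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   boto3.client("dynamodb").update_table(**gsi_request)
```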

Once the settings are correct, we can click create index and AWS will create the index and backfill it with all of the data in our table. Now you can see the index has the status of creating. That means that it's backfilling all of the data that's already in the table. This process is transparent to the user. When it's done, the index will become active and you'll actually get an email saying that the backfill is complete. From this screen, we could also delete a global secondary index if we wanted to, by selecting the index and clicking delete index at the top. So that's how you work with global secondary indexes.
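The same status can be read programmatically from a `DescribeTable` response, which lists each global secondary index with an `IndexStatus` field. The sketch below works on a hand-written sample response rather than a live call. The sample data and the names `orders` and `CustomerIdIndex` are assumptions, and `delete_gsi_params` is a hypothetical helper showing the request shape for deleting an index.

```python
def index_status(describe_table_response, index_name):
    """Return the IndexStatus of a global secondary index, or None."""
    table = describe_table_response.get("Table", {})
    for index in table.get("GlobalSecondaryIndexes", []):
        if index["IndexName"] == index_name:
            return index["IndexStatus"]
    return None


def delete_gsi_params(table_name, index_name):
    """Build UpdateTable arguments that delete a global secondary index."""
    return {
        "TableName": table_name,
        "GlobalSecondaryIndexUpdates": [{"Delete": {"IndexName": index_name}}],
    }


# Hand-written stand-in for boto3.client("dynamodb").describe_table(...)
sample_response = {
    "Table": {
        "TableName": "orders",
        "GlobalSecondaryIndexes": [
            {"IndexName": "CustomerIdIndex", "IndexStatus": "CREATING", "Backfilling": True}
        ],
    }
}

status = index_status(sample_response, "CustomerIdIndex")
delete_request = delete_gsi_params("orders", "CustomerIdIndex")
```

A script that needs the index would typically repeat the `DescribeTable` call until the status changes from `CREATING` to `ACTIVE`.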

Now I'd like to show how to add a local secondary index but remember that you can't add or make changes to local secondary indexes on existing tables. If you want any local secondary indexes on a table you need to decide that up front, before you create the table. Since we haven't really put much data in these tables, I've deleted the order line items table from my AWS account. Now we can recreate that table with a local secondary index on the status attribute and also with a global secondary index on product ID. These are the same indexes that we discussed in the last video.

So let's get started by clicking Create table. Just like before, the table name is "Order Line Items." The partition key is order ID, and it's a number. And this is a compound key with a sort key, line number, which is also a number. In order to add indexes, we'll need to uncheck Use default settings. Then a table of secondary indexes will appear.

Let's click Add index. We'll start by adding a global secondary index on product ID so that we can query for all of the line items that match a specific product. Sometimes we want to know which of those haven't been shipped yet, so we'll use status as the sort key. Let's rename the index "Product ID Index" and then click Add index. Nothing happens yet, but this index will be added when we create the table.

The second index is going to be a local secondary index on the status of each line item within an order. This is the one that will let us query for all the unshipped items in a single order. So let's click Add index again. Because it's a local secondary index, the partition key has to match the table's partition key, which is order ID and which is a number. Then let's click Add sort key and enter status for the sort key for this index. This time, the console recognizes that it's possible to create this one as a local secondary index, so it allows us to check the box Create as local secondary index. Let's click that now. And finally, let's rename this one so that it's a little bit easier to write our queries. Now we can click Add index.

You can see in the table of secondary indexes that our two indexes are now visible. The first one will be a global secondary index, or GSI, and the second is a local secondary index, or LSI. Below that table are the provisioned capacity settings for the table and for the global secondary index. Let's leave them alone for now. We can scroll to the bottom and click Create. Just like before, this will create the table for us, but this time it will have two secondary indexes.
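For comparison, the table and both indexes can be described in one `CreateTable` request. This is a sketch under several assumptions: the table, index, and attribute names are hypothetical spellings of the spoken ones, product ID is assumed to be a number and status a string, and the capacity values are placeholders. Note that the local secondary index shares the table's partition key and has no capacity settings of its own.

```python
# Placeholder capacity, reused for the table and the global index.
THROUGHPUT = {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}

create_table_request = {
    "TableName": "order_line_items",
    "AttributeDefinitions": [
        {"AttributeName": "order_id", "AttributeType": "N"},
        {"AttributeName": "line_number", "AttributeType": "N"},
        {"AttributeName": "product_id", "AttributeType": "N"},  # type assumed
        {"AttributeName": "status", "AttributeType": "S"},      # type assumed
    ],
    # Compound primary key: partition key plus sort key.
    "KeySchema": [
        {"AttributeName": "order_id", "KeyType": "HASH"},
        {"AttributeName": "line_number", "KeyType": "RANGE"},
    ],
    "ProvisionedThroughput": dict(THROUGHPUT),
    # GSI: any partition key, with its own provisioned capacity.
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "ProductIdIndex",
            "KeySchema": [
                {"AttributeName": "product_id", "KeyType": "HASH"},
                {"AttributeName": "status", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
            "ProvisionedThroughput": dict(THROUGHPUT),
        }
    ],
    # LSI: same partition key as the table, different sort key,
    # and no ProvisionedThroughput of its own.
    "LocalSecondaryIndexes": [
        {
            "IndexName": "OrderStatusIndex",
            "KeySchema": [
                {"AttributeName": "order_id", "KeyType": "HASH"},
                {"AttributeName": "status", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
}

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   boto3.client("dynamodb").create_table(**create_table_request)
```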

About the Author

Ryan is the Storage Operations Manager at Slack, a messaging app for teams. He leads the technical operations for Slack's database and search technologies, which use Amazon Web Services for global reach.

Prior to Slack, Ryan led technical operations at Pinterest, one of the fastest-growing social networks in recent memory, and at Runscope, a debugging and testing service for APIs.

Ryan has spoken about patterns for modern application design at conferences including Amazon Web Services re:Invent and O'Reilly Fluent. He has also been a mentor for companies participating in the 500 Startups incubator.