Queries and Scans with the API
Difficulty: Intermediate
Duration: 1h 32m
Students: 20918
Ratings: 4.6/5
Description

Please note this course is outdated and has been replaced with the following courses:

 

This course provides an introduction to working with Amazon DynamoDB, a fully managed NoSQL database service provided by Amazon Web Services. We begin with a description of DynamoDB and compare it to other database platforms. The course continues by walking you through designing tables and reading and writing data, which works somewhat differently from other databases you may be familiar with. We conclude with more advanced topics, including secondary indexes and how DynamoDB handles very large tables.

Course Objectives

You will gain the following skills by completing this course:

  • How to create DynamoDB tables.
  • How to read and write data.
  • How to use queries and scans.
  • How to create and query secondary indexes.
  • How to work with large tables. 

Intended Audience

You should take this course if you have:

  • An understanding of basic AWS technical fundamentals.
  • Awareness of basic database concepts, such as tables, rows, indexes, and queries.
  • A basic understanding of computer programming. The course includes some programming examples in Python.

Prerequisites 

See the Intended Audience section.

This Course Includes

  • Expert-guided lectures about Amazon DynamoDB.
  • 1 hour and 31 minutes of high-definition video. 
  • Expert-level instruction from an industry veteran. 

What You'll Learn

Video Lecture: What You'll Learn
  • DynamoDB Basics: A basic and foundational overview of DynamoDB.
  • Creating DynamoDB Tables: How to create DynamoDB tables and understand key concepts.
  • Reading and Writing Data: How to use the AWS Console and API to read and write data.
  • Queries and Scans: How to use queries and scans with the AWS Console and API.
  • Secondary Indexes: How to work with Secondary Indexes.
  • Working with Large Tables: How to use partitioning in large tables.

If you have thoughts or suggestions for this course, please contact Cloud Academy at support@cloudacademy.com.

Transcript

In this video, we'll discuss how to perform queries and scans using the DynamoDB API. Let's go back into our Python console and get ready to write some code. For this exercise, we'll continue to use the order line items table. We need some boilerplate to start, and then we'll recreate the table object that lets us interact with that table.
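The boilerplate itself is not captured in the transcript. A minimal sketch of what it might look like with boto3, the AWS SDK for Python, follows. The table name `order_line_items` is an assumption, and the optional `resource` argument is a hypothetical addition that exists only so the helper can be tried without AWS credentials.

```python
def get_table(table_name="order_line_items", resource=None):
    """Return a Table object for the given DynamoDB table.

    The default table name is an assumed spelling; substitute your own.
    """
    if resource is None:
        # Imported lazily so the helper also works with a stand-in resource.
        import boto3

        # Picks up credentials and region from your AWS configuration.
        resource = boto3.resource("dynamodb")
    return resource.Table(table_name)


# Typical use in the Python console:
# table = get_table()
```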

In our first query, let's retrieve all the line items for a specific order. This is similar to what we did in the web console, but now we're doing it in code. The query API has many parameters, but the only ones that are required are the table name and the key condition expression, which is how we specify exactly which records should be included in the query result. The Python SDK will automatically add the table name because we're working with a table object. All we need to specify is the key condition expression: the order ID partition key should equal 672102. Let's run that query now.
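The query described above might look roughly like the following sketch. The attribute name `order_id` is an assumption, since the transcript does not spell out the key's name, and the value passed in must match the type stored in the table. boto3 also offers condition builders (`Key`, `Attr`) in `boto3.dynamodb.conditions`; this sketch uses plain expression strings instead.

```python
def query_order(table, order_id):
    """Fetch every line item whose partition key equals order_id."""
    return table.query(
        # "#oid" and ":oid" are placeholders resolved by the two maps below.
        KeyConditionExpression="#oid = :oid",
        ExpressionAttributeNames={"#oid": "order_id"},  # assumed attribute name
        ExpressionAttributeValues={":oid": order_id},
    )


# With a real table object:
# response = query_order(table, 672102)
# response["Items"] holds the matching line items.
```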

This time we get a result with nine items. The query response also includes a count, which is the number of records returned after any filter is applied, and a scanned count, which is the number of records that were read while evaluating the query. In this case, we're not using any filters, so these both show nine items. Now let's add a filter, only looking for unshipped items, and see how that changes the result. This is what the API call looks like when we add a filter expression. Now let's run it.

We can see the two items that haven't shipped. This time the count is two because there were two records returned by the query. But scanned count is still nine because DynamoDB still had to load and process all nine records that matched the partition key. The filter only affected which records were sent back from the API.
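A sketch of the filtered query, under the assumption that shipping status is stored as a boolean attribute named `is_shipped` (the transcript does not show the real attribute). The function returns the count and scanned count alongside the items so the two can be compared.

```python
def query_unshipped(table, order_id):
    """Query one order, returning only the line items not yet shipped."""
    response = table.query(
        KeyConditionExpression="#oid = :oid",
        # Applied after the items are read, so it trims the result
        # but not the amount of data DynamoDB has to process.
        FilterExpression="#shipped = :shipped",
        ExpressionAttributeNames={
            "#oid": "order_id",       # assumed partition key name
            "#shipped": "is_shipped",  # assumed status attribute
        },
        ExpressionAttributeValues={":oid": order_id, ":shipped": False},
    )
    return response["Items"], response["Count"], response["ScannedCount"]


# items, count, scanned = query_unshipped(table, 672102)
```

The filter is applied only after the matching records have been read, which is why the scanned count does not shrink along with the count.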

There are quite a few optional parameters for the query API. Some of them aren't too difficult to figure out, like selecting whether to sort results in forward or reverse order. The rest are for more advanced features that we won't cover in this series.
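For the sort-order option mentioned above, the Query API parameter is `ScanIndexForward`. Results are ordered by sort key, ascending by default. The sketch below reverses that; the `order_id` attribute name is again an assumption.

```python
def query_order_descending(table, order_id):
    """Fetch an order's line items in descending sort-key order."""
    return table.query(
        KeyConditionExpression="#oid = :oid",
        ExpressionAttributeNames={"#oid": "order_id"},  # assumed attribute name
        ExpressionAttributeValues={":oid": order_id},
        # True (the default) returns ascending order; False reverses it.
        ScanIndexForward=False,
    )


# response = query_order_descending(table, 672102)
```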

Now let's show an example of a full table scan. If we want to scan an entire table, we can call the scan API with no parameters. Again, our table object is automatically sending the table name parameter. And when we run that, we'll see a very lengthy response. Now let's scan the full table again and add a filter to find all records shipped on a particular date. This isn't something that we'd want to do in production code, because it's probably going to be slow and expensive, but it's fine for this example.
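The unfiltered scan could be sketched as below. Note that a single scan call returns at most 1 MB of data, so on a larger table this retrieves only the first page of results.

```python
def scan_first_page(table):
    """Scan with no parameters; the Table object supplies the table name."""
    response = table.scan()
    # Without a filter, every item read is also returned.
    return response["Items"], response["ScannedCount"]


# items, scanned = scan_first_page(table)
```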

With the filter expression in place, we should only get a few results back. Let's try it out. Our table is small, so this returns pretty quickly. You can see that it returns two items, both of which were shipped on that date. But the scanned count is 722, which means it had to scan through the entire table to find these two items. On large tables, that's going to add up quickly, so you may not want to do this often, but you see that it is pretty easy to code.
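A sketch of the filtered scan, assuming the ship date is stored in an attribute named `ship_date` (a hypothetical name). It also follows `LastEvaluatedKey` so that tables larger than one response page are scanned completely, and it totals the scanned count across pages.

```python
def scan_shipped_on(table, ship_date):
    """Scan the whole table, keeping only items shipped on ship_date."""
    kwargs = {
        "FilterExpression": "#sd = :sd",
        "ExpressionAttributeNames": {"#sd": "ship_date"},  # assumed attribute
        "ExpressionAttributeValues": {":sd": ship_date},
    }
    items, scanned = [], 0
    while True:
        response = table.scan(**kwargs)
        items.extend(response["Items"])
        scanned += response["ScannedCount"]
        # DynamoDB includes LastEvaluatedKey only when more pages remain.
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            return items, scanned
        kwargs["ExclusiveStartKey"] = last_key


# items, scanned = scan_shipped_on(table, "<date in your stored format>")
```

Every page is still read in full, which is the cost the transcript warns about; the filter only reduces what is sent back.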

About the Author

Ryan is the Storage Operations Manager at Slack, a messaging app for teams. He leads the technical operations for Slack's database and search technologies, which use Amazon Web Services for global reach.

Prior to Slack, Ryan led technical operations at Pinterest, one of the fastest-growing social networks in recent memory, and at Runscope, a debugging and testing service for APIs.

Ryan has spoken about patterns for modern application design at conferences including Amazon Web Services re:Invent and O'Reilly Fluent. He has also been a mentor for companies participating in the 500 Startups incubator.