This course provides an introduction to working with Amazon DynamoDB, a fully-managed NoSQL database service provided by Amazon Web Services. We begin with a description of DynamoDB and compare it to other database platforms. The course continues by walking you through designing tables, and reading and writing data, which is somewhat different than other databases you may be familiar with. We conclude with more advanced topics including secondary indexes and how DynamoDB handles very large tables.
You will gain the following skills by completing this course:
- How to create DynamoDB tables.
- How to read and write data.
- How to use queries and scans.
- How to create and query secondary indexes.
- How to work with large tables.
You should take this course if you have:
- An understanding of basic AWS technical fundamentals.
- Awareness of basic database concepts, such as tables, rows, indexes, and queries.
- A basic understanding of computer programming. The course includes some programming examples in Python.
See the Intended Audience section.
This Course Includes
- Expert-guided lectures about Amazon DynamoDB.
- 1 hour and 31 minutes of high-definition video.
- Expert-level instruction from an industry veteran.
What You'll Learn
| Video Lecture | What You'll Learn |
|---|---|
| DynamoDB Basics | A basic and foundational overview of DynamoDB. |
| Creating DynamoDB Tables | How to create DynamoDB tables and understand key concepts. |
| Reading and Writing Data | How to use the AWS Console and API to read and write data. |
| Queries and Scans | How to use queries and scans with the AWS Console and API. |
| Secondary Indexes | How to work with Secondary Indexes. |
| Working with Large Tables | How to use partitioning in large tables. |
If you have thoughts or suggestions for this course, please contact Cloud Academy at firstname.lastname@example.org.
About the Author
Ryan is the Storage Operations Manager at Slack, a messaging app for teams. He leads the technical operations for Slack's database and search technologies, which use Amazon Web Services for global reach.
Prior to Slack, Ryan led technical operations at Pinterest, one of the fastest-growing social networks in recent memory, and at Runscope, a debugging and testing service for APIs.
Ryan has spoken about patterns for modern application design at conferences including Amazon Web Services re:Invent and O'Reilly Fluent. He has also been a mentor for companies participating in the 500 Startups incubator.
Let's get started by discussing in more detail what DynamoDB is and some of its advantages and disadvantages.
Amazon DynamoDB is a fully managed NoSQL database service. By "fully managed," we mean that the DynamoDB service is run entirely by the team at Amazon Web Services. There's no database administration required on your end, no servers to manage, no levers to tune, and nothing to back up. All of this is handled for you by AWS. All you have to do is set up your tables and configure the level of provisioned throughput that each table should have. Provisioned throughput refers to the level of read and write capacity that you want AWS to reserve for your table. You are charged for the total amount of throughput that you configure for your tables, plus the total amount of storage space used by your data.
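To make provisioned throughput concrete, here is a minimal sketch of a table definition that reserves read and write capacity. The table name "Users", the key name "user_id", and the helper names are assumptions made for illustration. `create_table` assumes that `client` is a `boto3.client("dynamodb")` object.

```python
def provisioned_table_definition(table_name, key_name, read_units, write_units):
    """Build the keyword arguments for a CreateTable request."""
    return {
        "TableName": table_name,
        # A simple primary key: one string attribute used as the partition (hash) key.
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": key_name, "AttributeType": "S"}],
        # The read and write capacity that AWS reserves for this table.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


def create_table(client, definition):
    """Send the request; `client` is assumed to be boto3.client("dynamodb")."""
    return client.create_table(**definition)


# Hypothetical example: reserve 5 read units and 5 write units.
users_definition = provisioned_table_definition("Users", "user_id", 5, 5)
```

With real AWS credentials, calling `create_table(boto3.client("dynamodb"), users_definition)` would submit the request. Nothing in the block contacts AWS by itself.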
DynamoDB is a NoSQL database, which means that it doesn't use the common structured query language, or SQL. It's not a relational database. Instead, it falls into a category of databases known as key-value stores. A key-value store is a collection of items, or records. You can look up data by the primary key for each item, or through the use of indexes.
DynamoDB tables are considered schemaless, because there's no strict design schema that every record must conform to. As long as each item has an appropriate primary key, the item can contain varying sets of attributes. The records in a table do not need to have the same attributes or even the same number of attributes. This can be very convenient for rapid application development. If you want to add a new column to your table, you don't need to alter the table. Just start including the new field as an attribute when you insert new records. Likewise, you never need to adjust the data type for a column. DynamoDB generally doesn't care about data types for individual attributes.
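As a rough illustration of what schemaless means in practice, the sketch below writes three items that share only the primary key. The key name "user_id" and the other attribute names are hypothetical. `put_items` assumes that `table` is a boto3 Table resource, for example `boto3.resource("dynamodb").Table("Users")`.

```python
# Hypothetical items for a table whose primary key is "user_id".
ITEMS = [
    {"user_id": "u1", "name": "Ana"},
    {"user_id": "u2", "name": "Ben", "email": "ben@example.com"},
    # New attributes appear here without altering the table first.
    {"user_id": "u3", "plan": "pro", "logins": 42},
]


def missing_key(items, key_name):
    """Return the items that lack the primary key attribute."""
    return [item for item in items if key_name not in item]


def put_items(table, items):
    """Write each item; `table` is assumed to be a boto3 Table resource."""
    for item in items:
        table.put_item(Item=item)
```

The only attribute every item must carry is the primary key, which is what `missing_key` checks before anything is written.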
DynamoDB is offered as a service, available from inside the AWS network or over the internet. DynamoDB uses Amazon Web Services' standard features for identity and access management. You can interact with DynamoDB using the AWS web console, but more often you'll write application code that connects to DynamoDB through its application programming interface, or API.
DynamoDB has several advantages. First, it's fully managed by Amazon Web Services. You don't have to worry about backups or redundancy, although you're welcome to set up these kinds of safeguards using some more advanced DynamoDB features. As just described, DynamoDB tables are schemaless, so you don't have to define the exact data model in advance. The data model can change as your application's needs change.
DynamoDB is designed to be highly available. Your data is automatically replicated across three different availability zones within a geographic region. In the case of an outage or an incident affecting an entire hosting facility, DynamoDB transparently routes around the affected availability zone.
DynamoDB is also designed to be fast. Reads and writes take just a few milliseconds to complete, and DynamoDB will be fast no matter how large your tables grow. Unlike a relational database, which can slow down as a table gets large, DynamoDB performance stays consistent even with tables that are many terabytes in size. You don't have to do anything to handle this, except adjust the provisioned throughput levels to make sure you've reserved enough read and write capacity for your transaction volume.
But there are also some downsides to using DynamoDB. As I just mentioned, your data is automatically replicated. Three copies are stored in three different availability zones. That replication usually happens quickly, in milliseconds, but sometimes it can take longer. This is known as eventual consistency. This happens transparently, and many operations will make sure that they're always working on the latest copy of your data. But there are certain kinds of queries and table scans that may return an older version of the data rather than the most recent copy. You need to be aware of how this works, and you may need to adjust certain queries to require strong consistency.
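Requiring strong consistency is done per request. The sketch below uses the `ConsistentRead` parameter of the GetItem operation. The helper names are hypothetical, and `table` is assumed to be a boto3 Table resource.

```python
def read_request(key, strongly_consistent=False):
    """Build keyword arguments for a GetItem call.

    ConsistentRead=False allows an eventually consistent read, which may
    return a slightly older copy of the item.
    """
    return {"Key": key, "ConsistentRead": strongly_consistent}


def get_latest(table, key):
    """Strongly consistent read; `table` is assumed to be a boto3 Table resource."""
    response = table.get_item(**read_request(key, strongly_consistent=True))
    return response.get("Item")  # None when no item has this key
```

Strongly consistent reads use more read capacity than eventually consistent ones, so it is usually worth requesting them only where stale data would cause a problem.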
DynamoDB's queries aren't as flexible as what you can do with SQL. If you're used to writing advanced queries with joins, groupings, and summaries, you won't be able to do that with DynamoDB. You'll have to do more of the computation in your application code. This is done for performance reasons, to ensure that every query finishes quickly and that complicated queries can't hog the resources on a database server.
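As an example of moving computation into application code, the sketch below does the work of a SQL `GROUP BY` with `SUM` over items that have already been fetched. The `orders` data and the attribute names are made up for illustration. In a real application, the list would come from a query or scan.

```python
from collections import defaultdict


def total_by(items, group_key, value_key):
    """Sum `value_key` for each distinct value of `group_key`."""
    totals = defaultdict(int)
    for item in items:
        totals[item[group_key]] += item[value_key]
    return dict(totals)


# Hypothetical items, shaped like the results of a query or scan.
orders = [
    {"customer": "ana", "amount": 30},
    {"customer": "ben", "amount": 12},
    {"customer": "ana", "amount": 8},
]

print(total_by(orders, "customer", "amount"))  # {'ana': 38, 'ben': 12}
```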
DynamoDB doesn't offer the wide range of data types that many relational databases do. DynamoDB has only a few native data types: strings (text), numbers, the Boolean values "True" and "False", and binary data. If you work with other data types, like dates, you'll need to represent those as strings or numbers in order to store them in DynamoDB.
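Here is one possible way to represent a date as either a string or a number, using only the Python standard library. The function names are hypothetical. Which representation fits best depends on how you plan to query the attribute.

```python
from datetime import datetime, timezone


def to_iso_string(moment):
    """Represent a datetime as an ISO 8601 string in UTC.

    Strings in this one format sort in chronological order.
    """
    return moment.astimezone(timezone.utc).isoformat()


def to_epoch_number(moment):
    """Represent a datetime as whole seconds since 1970-01-01 UTC."""
    return int(moment.timestamp())


def from_iso_string(text):
    """Turn a stored ISO 8601 string back into a datetime."""
    return datetime.fromisoformat(text)


created = datetime(2020, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
print(to_iso_string(created))    # 2020-01-02T03:04:05+00:00
print(to_epoch_number(created))  # 1577934245
```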
DynamoDB also has some strict limitations in the way you're allowed to work with it. Two important limitations are the maximum record size of 400 kilobytes and the limit of 10 indexes per table. There are other limitations that can be adjusted by contacting AWS Customer Support, like the maximum number of tables in an AWS account.
Finally, although DynamoDB performance can scale up as your needs grow, your performance is limited to the amount of read and write throughput that you've provisioned for each table. If you expect a spike in database use, you will need to provision more throughput in advance, or database requests will fail with a ProvisionedThroughputExceededException. Fortunately, you can adjust throughput at any time, and it only takes a couple of minutes to take effect. Still, this means that you'll need to monitor the throughput being used on each table, or you'll risk running out of throughput if your usage grows.
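One common way to cope with the occasional ProvisionedThroughputExceededException is to retry the request after a short, growing delay. The sketch below is only an illustration: `with_backoff`, `is_throttled`, and the default delays are assumptions. It detects throttling by inspecting the `response` attribute that botocore's `ClientError` carries, so it runs without importing botocore.

```python
import time

THROTTLE_CODE = "ProvisionedThroughputExceededException"


def is_throttled(error):
    """True if `error` looks like a botocore ClientError for a throttled request."""
    response = getattr(error, "response", None) or {}
    return response.get("Error", {}).get("Code") == THROTTLE_CODE


def with_backoff(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call `operation()`, retrying with exponential backoff when throttled."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as error:
            # Re-raise anything that is not throttling, and the final failure.
            if not is_throttled(error) or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.1s, 0.2s, 0.4s, ...
```

A call might look like `with_backoff(lambda: table.put_item(Item=item))`, where `table` is assumed to be a boto3 Table resource. Retrying only smooths over short bursts. If requests are throttled regularly, the table needs more provisioned throughput.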