Modeling a DynamoDB Table
3h 3m

This course provides detail on the AWS Database services relevant to the AWS Certified Developer - Associate exam. This includes Amazon RDS, Aurora, DynamoDB, MemoryDB for Redis, and ElastiCache.


Learning Objectives

  • Obtain a solid understanding of the following Amazon database services: Amazon RDS, Aurora, DynamoDB, MemoryDB for Redis, and ElastiCache
  • Create an Amazon RDS database
  • Create a DynamoDB database
  • Create an ElastiCache cluster

It is important to consider your data structure and expected access patterns before creating a DynamoDB table; doing so will save you time and effort once implementation begins. Consider the data set on display. It represents a music library where each song is represented as an item. The first point to clarify is that a SQL database stores data in rows and columns, while DynamoDB uses items and attributes as the equivalent. DynamoDB schemas are dynamic: unlike in traditional databases, each item does not have to contain a value for every attribute. Data in DynamoDB tables is queried by focusing on collections of items. It is also common to import data in JSON format, as shown.
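As a minimal sketch of that schemaless property, here are two hypothetical song items (the attribute values are made up, not from the course's data set). Items in the same table can carry different sets of attributes:

```python
# Two items from a hypothetical music library. Unlike SQL rows,
# DynamoDB items in the same table need not share the same attributes.
song_a = {
    "SongNumber": 1001,
    "SongName": "Example Song",
    "Artist": "Example Artist",
    "Album": "First Album",
    "Genre": "Rock",
    "Year": 1999,
}
song_b = {
    "SongNumber": 1002,
    "SongName": "Another Song",
    "Artist": "Example Artist",
    # No Album, Genre, or Year: missing attributes are simply absent,
    # not stored as NULL columns.
}

# Attributes present only on song_a:
print(sorted(set(song_a) - set(song_b)))  # ['Album', 'Genre', 'Year']
```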

In addition to a table name, DynamoDB requires a primary key, which must be defined up front and must be unique for each item in the table. In this example, a good candidate for the primary key is the Song Number, as it is expected to be unique for each song in the data set; all other attributes could exist in duplicate. As we put this data set into a DynamoDB structure, we can call the table Music and name each attribute according to its usage: song number, name of the song, artist, album, genre, and year. That said, searching this table will most likely happen in terms of artist, album, or genre; the Song Number and the Year are unlikely attributes to search by. We can rearrange the data set to define a composite Primary Key, where a Partition Key and a Sort Key together create a unique combination.
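The uniqueness requirement can be sketched with a quick check over some illustrative sample rows (the values below are hypothetical, not the course's actual data):

```python
# Hypothetical sample of the music data set; all values are illustrative.
songs = [
    {"SongNumber": 1, "Artist": "Artist A", "Album": "Album X", "Year": 1999},
    {"SongNumber": 2, "Artist": "Artist A", "Album": "Album X", "Year": 2001},
    {"SongNumber": 3, "Artist": "Artist B", "Album": "Album Y", "Year": 2001},
]

# The simple key candidate: Song Number alone is unique per item.
assert len({s["SongNumber"] for s in songs}) == len(songs)

# A composite key of (Artist, SongNumber) is therefore also unique,
# while matching the expected access pattern (search by Artist).
composite = {(s["Artist"], s["SongNumber"]) for s in songs}
assert len(composite) == len(songs)
```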

Now, we can model the composite key to consist of the Artist as the Partition Key and the Song Number as the Sort Key. This combination remains unique for all items in the table. If needed, we could pair any of the attributes with the Song Number to obtain a unique composite Primary Key, but logically it makes sense to have the Artist as the Partition Key and the Song Number as the Sort Key: this matches a common access pattern for a music library, where we search by Artist. As long as the combination is unique, we can model it according to how we want to access the data. Secondary indexes allow for alternate ways to query the table. There are two types of secondary indexes: the Local Secondary Index and the Global Secondary Index. Global Secondary Indexes are more common and flexible in that you can create them at any time, and they permit you to define an alternate Partition Key and Sort Key if needed. In this example, we can define a Global Secondary Index with the Album attribute as the Partition Key, keeping the Song Number as the Sort Key.
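This model can be sketched as a table definition shaped like the parameters that boto3's `create_table` accepts. The table name comes from the example; the index name "AlbumIndex", the attribute types, and the on-demand billing mode are assumptions for illustration:

```python
# Sketch of the Music table definition, shaped like boto3
# dynamodb.create_table(**table_params). "AlbumIndex" and the
# attribute types (S = string, N = number) are illustrative choices.
table_params = {
    "TableName": "Music",
    "KeySchema": [
        {"AttributeName": "Artist", "KeyType": "HASH"},       # partition key
        {"AttributeName": "SongNumber", "KeyType": "RANGE"},  # sort key
    ],
    "AttributeDefinitions": [
        {"AttributeName": "Artist", "AttributeType": "S"},
        {"AttributeName": "SongNumber", "AttributeType": "N"},
        {"AttributeName": "Album", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "AlbumIndex",
            "KeySchema": [
                {"AttributeName": "Album", "KeyType": "HASH"},
                {"AttributeName": "SongNumber", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    # On-demand billing avoids specifying provisioned throughput here.
    "BillingMode": "PAY_PER_REQUEST",
}
```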

This gives us the ability to query the table by Artist using the original Primary Key and by Album using a Global Secondary Index. Local Secondary Indexes impose some restrictions and are less frequently used. Let's put them side by side. Global Secondary Indexes can be created at any time during the life cycle of the table, and they allow for alternate Partition and Sort Keys, giving you the flexibility to query based on additional attributes. Local Secondary Indexes must be created along with the table and provide only an alternate Sort Key: a Local Secondary Index must use the base table's existing Partition Key. So, there's a bit of a tradeoff between Global Secondary Indexes and Local Secondary Indexes.

For example, a Local Secondary Index allows for strongly consistent reads and uses the base table's provisioned throughput. Global Secondary Indexes only allow for eventually consistent reads and require their own throughput definition. Depending on your access patterns, you will choose one over the other. In general, Global Secondary Indexes are more practical; Local Secondary Indexes are used if you always require strongly consistent reads from your operations. Also, note that the quota per table is 20 Global Secondary Indexes, compared to only 5 Local Secondary Indexes. Let's get back to our Music example. In our modeled data set, note that we cannot use a Local Secondary Index.
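The consistency difference shows up directly in how a Query request may be written. Below is a hedged sketch of two Query parameter sets (table and index names are the illustrative ones used above): `ConsistentRead` can be set against the base table or a Local Secondary Index, but not against a Global Secondary Index.

```python
# Query against the base table key: strongly consistent reads allowed.
strongly_consistent_query = {
    "TableName": "Music",
    "KeyConditionExpression": "Artist = :a",
    "ExpressionAttributeValues": {":a": {"S": "Example Artist"}},
    "ConsistentRead": True,  # valid for the base table (or an LSI)
}

# Query against a global secondary index: eventually consistent only.
gsi_query = {
    "TableName": "Music",
    "IndexName": "AlbumIndex",  # illustrative GSI name
    "KeyConditionExpression": "Album = :al",
    "ExpressionAttributeValues": {":al": {"S": "First Album"}},
    # Adding "ConsistentRead": True here would be rejected by DynamoDB,
    # because GSIs support eventually consistent reads only.
}
```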

The choice of Artist as the Partition Key means a Local Secondary Index on Album would not be useful, because the combination of Artist and Album is not unique enough to satisfy the Primary Key requirements: for a given Artist and Album, there will surely be multiple songs sharing both attributes. So, we conclude our table modeling with a Partition Key of Artist, a Sort Key of Song Number, and a Global Secondary Index with a Partition Key of Album and a Sort Key of Song Number. We can also model a second Global Secondary Index that uses Genre as the Partition Key, again keeping the Song Number as the Sort Key. These steps give us all three access patterns: we can search by Artist, by Album, and by Genre. Note that these are basic examples, and a data set can be modeled in multiple ways. For production and more complex systems, it is a good idea to document entity relationship diagrams and explicitly define data access patterns.
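The final model can be summarized as a small lookup from access pattern to the key schema that serves it. The index names ("AlbumIndex", "GenreIndex") are illustrative; `None` means the query runs against the base table's own key:

```python
# Summary of the final Music table model: three access patterns,
# served by the base table key or one of two global secondary indexes.
# Index names are illustrative, not prescribed by the course.
access_patterns = {
    "by Artist": {"index": None, "partition_key": "Artist", "sort_key": "SongNumber"},
    "by Album": {"index": "AlbumIndex", "partition_key": "Album", "sort_key": "SongNumber"},
    "by Genre": {"index": "GenreIndex", "partition_key": "Genre", "sort_key": "SongNumber"},
}

# Every access pattern keeps Song Number as the sort key.
assert all(p["sort_key"] == "SongNumber" for p in access_patterns.values())
```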


About the Author

Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.

To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.

Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.

He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.

In January 2016, Stuart was awarded 'Expert of the Year Award 2015' by Experts Exchange for his knowledge sharing on cloud services within the community.

Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.