Modeling a DynamoDB Table

In this course, we present how to create DynamoDB tables, including local and global secondary indexes.

Learning Objectives

  • Creating DynamoDB tables using the AWS Console
  • Creating local and global secondary indexes

Intended Audience

  • Architects and developers looking to understand how to create DynamoDB Tables using the different modalities provided by AWS
  • Those studying for the AWS Solutions Architect Associate and Developer Associate certifications


Prerequisites

  • Meet the requirements for the Cloud Practitioner certification or equivalent experience
  • Understand the fundamentals of DynamoDB as presented in the DynamoDB Basics course
  • It will help if you follow up with the Reading and Writing data in DynamoDB course

It is important that you consider your data structure and expected access patterns before creating a DynamoDB table; this will save you time and effort once implementation begins. Consider the data set on display. It represents a music library where each song is represented as an item. The first detail to clarify is that a SQL database stores data in rows and columns, while DynamoDB uses items and attributes as the equivalent. DynamoDB schemas are dynamic: each item does not have to contain data for every attribute, as is required in traditional databases. Data in DynamoDB tables is queried by focusing on collections of items. It is also common to import data in JSON format, as shown.
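As a sketch of the item model described above, two songs from the library might look like this in JSON-style form (the attribute names and values here are illustrative assumptions, not taken from the actual data set). Note that the second item omits several attributes, which DynamoDB allows:

```python
# Illustrative items for the Music example; names and values are assumptions.
song_1 = {
    "SongNumber": 101,
    "SongName": "Example Song",
    "Artist": "Example Artist",
    "Album": "Example Album",
    "Genre": "Rock",
    "Year": 1999,
}

# Unlike a SQL row, an item can simply omit attributes it does not need.
song_2 = {
    "SongNumber": 102,
    "SongName": "Another Song",
    "Artist": "Example Artist",
    # No Album, Genre, or Year attributes on this item.
}
```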

In addition to a table name, DynamoDB requires a primary key, which must be defined up front and must be unique for each item in the table. In this example, a good candidate for the primary key is the Song Number, as it is expected to be unique for each song in the data set; all other attributes can potentially appear in duplicate. As we put this data set into a DynamoDB structure, we can call the table Music and name each attribute according to its usage: Song Number, Song Name, Artist, Album, Genre, and Year. With that said, searching this table will most likely happen in terms of Artist, Album, or Genre; the Song Number and the Year are unlikely to be attributes we search by. We can rearrange the data set to define a composite primary key, where a Partition Key and a Sort Key together create a unique combination.

Now, we can model the composite key to consist of the Artist as the Partition Key and the Song Number as the Sort Key. This combination remains unique for all items in the table. If needed, any of the attributes could be combined with the Song Number to obtain a unique composite primary key, but logically it makes sense to use the Artist as the Partition Key and the Song Number as the Sort Key: this matches a common access pattern for a music library, where we search by Artist. As long as the combination is unique, we can model it according to how we want to access the data. Secondary indexes allow for alternate ways to query the table, and there are two types: Local Secondary Indexes and Global Secondary Indexes. Global Secondary Indexes are more common and more flexible, in that you can create them at any time and they let you define an alternate Partition Key and Sort Key if needed. In this example, we can define a Global Secondary Index with the Album attribute as the Partition Key, keeping the Song Number as the Sort Key.
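The composite primary key described above can be sketched as a boto3-style create_table request. This is a minimal sketch, not the course's own code: the attribute types and the on-demand billing mode are assumptions, and the dict would be passed to a boto3 client's create_table call:

```python
# Sketch of a DynamoDB create_table request for the Music table.
# Composite primary key: Artist (partition) + SongNumber (sort).
table_spec = {
    "TableName": "Music",
    "KeySchema": [
        {"AttributeName": "Artist", "KeyType": "HASH"},       # Partition Key
        {"AttributeName": "SongNumber", "KeyType": "RANGE"},  # Sort Key
    ],
    # Only key attributes are declared up front; all other attributes
    # (SongName, Album, Genre, Year) remain schemaless.
    "AttributeDefinitions": [
        {"AttributeName": "Artist", "AttributeType": "S"},
        {"AttributeName": "SongNumber", "AttributeType": "N"},
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumed; provisioned mode also works
}

# With AWS credentials configured, this would be submitted as:
#   boto3.client("dynamodb").create_table(**table_spec)
```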

This will give us the ability to query the table by Artist using the original primary key, and by Album using the Global Secondary Index. Local Secondary Indexes impose some restrictions and are less frequently used. Let's put them side by side. Global Secondary Indexes can be created at any time during the life cycle of the table, and they allow for alternate Partition and Sort Keys, giving you the flexibility to query based on additional attributes. Local Secondary Indexes must be created along with the table and provide only an alternate Sort Key: a Local Secondary Index must use the base table's existing Partition Key. So, there is a tradeoff between Global Secondary Indexes and Local Secondary Indexes.
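That side-by-side difference shows up directly in the request shapes. As a hedged sketch (the LSI name below is an illustrative assumption), a Local Secondary Index can only appear inside the original create_table request and must reuse the table's Partition Key, while a Global Secondary Index can define a new Partition Key and can also be added later via update_table:

```python
# LSI: declared only at table creation; reuses the base table's
# Partition Key (Artist) and supplies an alternate Sort Key.
local_secondary_index = {
    "IndexName": "ArtistAlbumIndex",  # illustrative name
    "KeySchema": [
        {"AttributeName": "Artist", "KeyType": "HASH"},  # same as base table
        {"AttributeName": "Album", "KeyType": "RANGE"},  # alternate sort key
    ],
    "Projection": {"ProjectionType": "ALL"},
}

# GSI: may define a completely different Partition Key, and can be
# created at any time (e.g. via update_table's GlobalSecondaryIndexUpdates).
global_secondary_index = {
    "IndexName": "AlbumIndex",
    "KeySchema": [
        {"AttributeName": "Album", "KeyType": "HASH"},
        {"AttributeName": "SongNumber", "KeyType": "RANGE"},
    ],
    "Projection": {"ProjectionType": "ALL"},
}
```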

For example, a Local Secondary Index allows for strongly consistent reads and shares the base table's provisioned throughput. Global Secondary Indexes only allow eventually consistent reads and require their own throughput settings. Depending on your access patterns, you will choose one over the other. In general, Global Secondary Indexes are more practical, and Local Secondary Indexes are used when your operations always require strongly consistent reads. Also, please note that the default quota per table is 20 Global Secondary Indexes, compared to only 5 Local Secondary Indexes. Let's get back to our Music example. In our modeled data set, please note that we cannot make good use of a Local Secondary Index.
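The consistency difference above shows up in query parameters. As a sketch (index names are the illustrative ones used in this example), ConsistentRead=True is only valid against the base table or a Local Secondary Index; DynamoDB rejects strongly consistent reads against a Global Secondary Index:

```python
# Query against an LSI: strongly consistent reads are permitted.
lsi_query = {
    "TableName": "Music",
    "IndexName": "ArtistAlbumIndex",  # illustrative LSI name
    "KeyConditionExpression": "Artist = :a",
    "ExpressionAttributeValues": {":a": {"S": "Example Artist"}},
    "ConsistentRead": True,  # allowed: LSI shares the table's partition
}

# Query against a GSI: ConsistentRead must be omitted or False, since
# GSIs only support eventually consistent reads.
gsi_query = {
    "TableName": "Music",
    "IndexName": "AlbumIndex",
    "KeyConditionExpression": "Album = :al",
    "ExpressionAttributeValues": {":al": {"S": "Example Album"}},
}
```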

The choice of Artist as the Partition Key means a Local Secondary Index on Album would not help us here: a Local Secondary Index keeps the base table's Partition Key, so querying it still requires specifying an Artist, and it cannot give us a table-wide search by Album. So, we conclude our table modeling with a Partition Key of Artist, a Sort Key of Song Number, and a Global Secondary Index with a Partition Key of Album and a Sort Key of Song Number. We can also model a second Global Secondary Index that uses Genre as the Partition Key, again keeping the Song Number as the Sort Key. These steps give us all three access patterns: we can search by Artist, by Album, and by Genre. Also, please note these are basic examples, and a data set can be modeled in multiple ways. For production and more complex systems, it is a good idea to document entity relationship diagrams and explicitly define data access patterns.
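Putting the whole model together, the final table can be sketched as one boto3-style create_table request. The index names and billing mode are assumptions for illustration; the key structure follows the model concluded above:

```python
# Final Music table model: Artist/SongNumber primary key plus two GSIs,
# covering the three access patterns (by Artist, by Album, by Genre).
music_table = {
    "TableName": "Music",
    "KeySchema": [
        {"AttributeName": "Artist", "KeyType": "HASH"},
        {"AttributeName": "SongNumber", "KeyType": "RANGE"},
    ],
    # Every attribute used in any key schema must be declared here.
    "AttributeDefinitions": [
        {"AttributeName": "Artist", "AttributeType": "S"},
        {"AttributeName": "SongNumber", "AttributeType": "N"},
        {"AttributeName": "Album", "AttributeType": "S"},
        {"AttributeName": "Genre", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "AlbumIndex",  # illustrative name
            "KeySchema": [
                {"AttributeName": "Album", "KeyType": "HASH"},
                {"AttributeName": "SongNumber", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        },
        {
            "IndexName": "GenreIndex",  # illustrative name
            "KeySchema": [
                {"AttributeName": "Genre", "KeyType": "HASH"},
                {"AttributeName": "SongNumber", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        },
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumed billing mode
}
```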


About the Author
Jorge Negrón
AWS Content Architect
Learning Paths

Experienced in the architecture and delivery of cloud-based solutions, the development and delivery of technical training, and in defining requirements and use cases and validating architectures for results. Excellent leadership, communication, and presentation skills with attention to detail. Hands-on administration and development experience, with the ability to mentor and train on current and emerging technologies (Cloud, ML, IoT, Microservices, Big Data & Analytics).