Choosing a non-relational database on AWS - Part 1

This section of the AWS Certified Solutions Architect - Professional learning path introduces the AWS database services relevant to the SAP-C02 exam. It covers the service options available and how to select and apply AWS database services to meet the design scenarios that appear on the exam.

Learning Objectives

  • Understand the various database services that can be used when building cloud solutions on AWS
  • Learn how to build databases using Amazon RDS, DynamoDB, Redshift, DocumentDB, Keyspaces, and QLDB
  • Learn how to create ElastiCache and Neptune clusters
  • Understand which AWS database service to choose based on your requirements
  • Discover how to use automation to deploy databases in AWS
  • Learn about data lakes and how to build a data lake in AWS

If you’ve asked yourself the question:

“Do I need a relational database?” 

There’s a chance you landed on the answer of “maybe not”. This could happen for a few reasons: 

  1. Perhaps you need flexible data storage that enables your database structure to change over time. 

  2. Or maybe your expected growth and scale are beyond what a relational database can achieve with vertical scaling. 

  3. Or your access patterns make it hard for a relational database to serve your needs: maybe your data is more easily queried or traversed in a document, graph, or ledger database. 

If this is the case, then you’re in luck, as most AWS-managed databases are nonrelational. They range from general-purpose databases like key-value, wide-column, and document systems, to more specialized solutions, such as time-series, graph, and ledger databases. So which one do you use? For now, I’ll split them into two categories: general-purpose and specialized databases. 

In this lecture, I’ll mainly focus on the general-purpose NoSQL database services. The big three options here are Amazon DynamoDB, Amazon DocumentDB, and Amazon Keyspaces (for Apache Cassandra). 

The first question to differentiate between the three is: 

“Do you need support for transactions?”  

This goes back to ACID principles. If you need transactions, you can choose between either DocumentDB or DynamoDB. Cassandra itself is not a fully ACID-compliant database, and while it does have support for lightweight transactions, we’ll consider it out of scope for this question. 

Now there are a lot of differentiators between DynamoDB and DocumentDB, so you have to ask the following question:

“What do you care about the most? Low latency at a massive scale? Or flexible data modeling with evolving access patterns?” 

If you need transactions, and you need a database that prioritizes flexibility with MongoDB compatibility, then DocumentDB is the best choice. It’s a database that offers maximum data modeling flexibility, so you aren’t locked into the access patterns you’re using today. You can continue to make changes as you design your application over time, and the database evolves with you. 

DocumentDB is optimized for storing data in a JSON format. This is important in terms of your access patterns. If you require nested JSON capabilities and the ability to query on nested documents, then DocumentDB is the right choice. You see this often used in scenarios such as user profiles, product information, and big data use cases. When you query DocumentDB, you use the MongoDB query language. 
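
To make the point about querying nested documents a bit more concrete, here is a minimal sketch in plain Python of how a MongoDB-style dot-notation filter selects on a nested field. The `get_path` and `matches` helpers and the sample profile documents are hypothetical and only imitate simple equality matching; a real query engine also handles operators, arrays, and indexes.

```python
from typing import Any

_MISSING = object()  # sentinel for "path not present in the document"


def get_path(doc: dict, dotted: str) -> Any:
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current: Any = doc
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches(doc: dict, query: dict) -> bool:
    """Toy equality-only matcher for filters like {'address.city': 'Seattle'}."""
    return all(get_path(doc, path) == expected for path, expected in query.items())


# Hypothetical user-profile documents (illustrative data only).
profiles = [
    {"_id": 1, "name": "Ana", "address": {"city": "Seattle", "zip": "98101"}},
    {"_id": 2, "name": "Ben", "address": {"city": "Austin", "zip": "73301"}},
    {"_id": 3, "name": "Cy"},  # no address at all
]

query = {"address.city": "Seattle"}
seattle_ids = [p["_id"] for p in profiles if matches(p, query)]
print(seattle_ids)
```

With a MongoDB driver such as pymongo, the same filter dictionary would typically be passed to `collection.find(query)` rather than evaluated by hand.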

It’s best suited to read-heavy workloads that can grow up to 64 TB of data. One of the big wins is that it supports rich data types that DynamoDB does not, such as timestamps and regular expressions. 

Now if MongoDB compatibility isn’t necessary, and you need to prioritize scale, even if that means a little less flexibility with data modeling, DynamoDB would be the better option. This database scales horizontally, and there’s effectively no upper limit to how big a table can be in DynamoDB. And the idea is that no matter how big it gets, it can still provide single-digit millisecond latency. 

But what you gain in scale, you lose a bit in flexibility, as it’s generally best suited for workloads that have well-defined access and query patterns. However, this is the most mature NoSQL database service in AWS, so it offers a rich feature set and tight integration with other AWS services. It supports both nested data types like sets, lists, and maps and basic data types such as numbers, Booleans, and strings.  
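
As a rough illustration of those data types, the sketch below converts ordinary Python values into the typed JSON that DynamoDB's low-level API uses (`S`, `N`, `BOOL`, `L`, `M`, `SS`, `NS`). The `to_attribute_value` helper and the sample product item are hypothetical and deliberately simplified; they skip binary and null values, for example.

```python
from typing import Any


def to_attribute_value(value: Any) -> dict:
    """Convert a Python value to DynamoDB's typed JSON (simplified sketch)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)) and value:
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}  # string set
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}  # number set
    raise TypeError(f"unsupported value: {value!r}")


# Hypothetical product item mixing basic and nested types.
item = {
    "sku": "A-100",
    "price": 25,
    "in_stock": True,
    "tags": {"outdoor", "sale"},
    "sizes": ["S", "M"],
    "dims": {"w": 10, "h": 4},
}
encoded = {name: to_attribute_value(v) for name, v in item.items()}
for name, attribute in encoded.items():
    print(name, attribute)
```

In practice you rarely write this yourself: boto3 includes a `TypeSerializer` in `boto3.dynamodb.types`, and its resource-level API performs the conversion for you.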

When you query DynamoDB, you access data using the key-value format. You can retrieve a single item by its primary key (GetItem), multiple items that share a partition key (Query), or all items within a given table (Scan). 
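
Those three ways of reading data can be sketched as the request parameters you would hand to boto3's low-level DynamoDB client. This is only an illustration under assumptions: the `Orders` table, the `pk` and `sk` attribute names, and the key values are hypothetical, and the code just builds the dictionaries without calling AWS.

```python
def get_item_request(table: str, pk: str, sk: str) -> dict:
    """Single item: the full primary key must be supplied."""
    return {"TableName": table, "Key": {"pk": {"S": pk}, "sk": {"S": sk}}}


def query_request(table: str, pk: str) -> dict:
    """Multiple items: everything stored under one partition key."""
    return {
        "TableName": table,
        "KeyConditionExpression": "pk = :pk",
        "ExpressionAttributeValues": {":pk": {"S": pk}},
    }


def scan_request(table: str) -> dict:
    """All items: reads the whole table, so it is the most expensive path."""
    return {"TableName": table}


# Assumed table with a partition key "pk" and a sort key "sk".
requests = {
    "get_item": get_item_request("Orders", "CUSTOMER#42", "ORDER#2024-001"),
    "query": query_request("Orders", "CUSTOMER#42"),
    "scan": scan_request("Orders"),
}
for operation, kwargs in requests.items():
    print(operation, kwargs)

# With boto3 installed and credentials configured, each would be sent like:
#   import boto3
#   client = boto3.client("dynamodb")
#   response = client.query(**requests["query"])
```

This is also why defined access patterns matter so much here: the first two paths depend on knowing the key up front, while the scan touches every item in the table.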

Now the next question on the list is: “What if I don’t need transactions?” 

If you don’t need ACID compliance, then you can use DocumentDB, DynamoDB, or Amazon Keyspaces (for Apache Cassandra). The biggest factor in deciding when to use Keyspaces is whether you need Cassandra compatibility. So if you’ve already built an application with Cassandra and are looking for a fully managed database that supports the engine, or if you are very comfortable with Cassandra Query Language (CQL), then this is a great option. Like DynamoDB, it supports both nested and basic data types. 
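
The questions covered so far can be condensed into a small decision helper. This is just one way to encode the lecture's reasoning, not an official selection tool; the function name, its parameters, and the two `priority` values are made up for illustration.

```python
def choose_database(
    needs_transactions: bool,
    priority: str,
    needs_cassandra_compat: bool = False,
) -> str:
    """Pick a general-purpose NoSQL service following the lecture's questions.

    priority is either "flexibility" (flexible modeling, MongoDB compatibility)
    or "scale" (low latency at massive scale).
    """
    if priority not in ("flexibility", "scale"):
        raise ValueError("priority must be 'flexibility' or 'scale'")

    # Keyspaces is only considered when transactions are not required,
    # mirroring how the lecture scopes the transactions question.
    if needs_cassandra_compat and not needs_transactions:
        return "Amazon Keyspaces"

    if priority == "flexibility":
        return "Amazon DocumentDB"
    return "Amazon DynamoDB"


print(choose_database(needs_transactions=True, priority="flexibility"))
print(choose_database(needs_transactions=True, priority="scale"))
print(choose_database(needs_transactions=False, priority="scale",
                      needs_cassandra_compat=True))
```

Real decisions usually weigh more than these inputs (cost, existing skills, integrations), so treat the function as a memory aid for the exam-style questions rather than a rule.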

If you don’t need transactions, developer preference and familiarity can guide your choice of engine. However, if you need rich data type support and flexible data modeling, DocumentDB is best suited for that. If you prioritize having a highly scalable database overall, then DynamoDB takes the win. That’s it for this one - see you next time! 

About the Author

Danny has over 20 years of IT experience as a software developer, cloud engineer, and technical trainer. After attending a conference on cloud computing in 2009, he knew he wanted to build his career around what was still a very new, emerging technology at the time — and share this transformational knowledge with others. He has spoken to IT professional audiences at local, regional, and national user groups and conferences. He has delivered in-person classroom and virtual training, interactive webinars, and authored video training courses covering many different technologies, including Amazon Web Services. He currently has six active AWS certifications, including certifications at the Professional and Specialty level.