Which database service should I use?
If you’ve asked yourself the question:
“Do I need a relational database?”
There’s a chance you landed on the answer of “maybe not”. This could happen for a few reasons:
- Perhaps you need flexible data storage that enables your database structure to change over time.
- Or maybe your expected growth and scale are beyond what a relational database can achieve with vertical scaling.
- Or your access patterns make it hard for a relational database to serve your needs: maybe your data is more easily queried or traversed in a document, graph, or ledger database.
If this is the case, then you’re in luck, as most AWS managed databases are nonrelational. They range from general purpose databases like key-value, wide column, and document systems, to more specialized solutions, such as time-series, graph, and ledger databases. So which one do you use? For now, I’ll split them into two categories: general purpose and specialized databases.
In this lecture, I’ll mainly focus on the general purpose NoSQL database services. The big three options here are Amazon DynamoDB, Amazon DocumentDB, and Amazon Keyspaces (for Apache Cassandra).
The first question to differentiate between the three is:
“Do you need support for transactions?”
This goes back to ACID principles. If you need transactions, you can choose between either DocumentDB or DynamoDB. Cassandra itself is not a fully ACID-compliant database, and while it does have support for lightweight transactions, we’ll consider it out of scope for this question.
Now there are a lot of differentiators between DynamoDB and DocumentDB, so you have to ask the following question:
“What do you care about the most? Low latency at a massive scale? Or flexible data modeling with evolving access patterns?”
If you need transactions, and you need a database that prioritizes flexibility with MongoDB compatibility, then DocumentDB is the best choice. It’s a database that offers maximum data modeling flexibility, so you don’t have to be locked into the access patterns you’re using today. You can continue to make changes as you design your application over time and easily evolve with the database.
DocumentDB is optimized for storing data in a JSON format. This is important in terms of your access patterns. If you require nested JSON capabilities and the ability to query on nested documents, then DocumentDB is the right choice. You see this often used in scenarios such as user profiles, product information, and big data use cases. When you query DocumentDB, you use the MongoDB query language.
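To make the point about querying nested documents a little more concrete, here is a minimal sketch in Python. The `profile` document and its field names are made up for illustration, and `get_path` and `matches` are hypothetical helpers that only mimic, locally, how a dot-notation filter reaches into a nested field. They are not how DocumentDB evaluates queries. With a MongoDB-compatible driver such as pymongo, a filter like `query_filter` would typically be passed to `collection.find(...)`.

```python
# Hypothetical user-profile document; field names are illustrative only.
profile = {
    "user_id": "u-100",
    "name": "Sam",
    "address": {"city": "Seattle", "country": "US"},
    "interests": ["hiking", "trivia"],
}

# MongoDB-style filter that uses dot notation to reach a nested field.
query_filter = {"address.city": "Seattle"}


def get_path(document, dotted_path):
    """Walk a dotted path such as 'address.city' through nested dicts."""
    value = document
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def matches(document, flt):
    """Equality-only matcher: a local stand-in for the database's query engine."""
    return all(get_path(document, path) == expected for path, expected in flt.items())


print(matches(profile, query_filter))
```

The useful part is the shape of the filter: the nested field is addressed directly, without flattening the document first.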
DocumentDB is best suited for read-heavy workloads that can grow up to 64 TB of data. One of the big wins is that it supports rich data types that DynamoDB does not, such as timestamps and regular expressions.
Now if MongoDB compatibility isn’t necessary, and you need to prioritize scale, even if that means a little less flexibility with data modeling, DynamoDB would be the better option. This database scales virtually without limit: there’s effectively no upper bound on how big a table can be in DynamoDB. And the idea is that no matter how big it gets, it can still provide single-digit millisecond latency.
But what you gain in scale, you lose a bit in flexibility, as it’s generally best suited for workloads that have more defined access and query patterns. However, this is the most mature NoSQL database service in AWS, so it offers a rich feature set and tight integration with other AWS services. It supports both nested data types like sets, lists, and maps and basic data types, such as numbers, Booleans, and strings.
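As a rough illustration of those data types, the sketch below converts a Python dictionary into the low-level attribute-value shape that DynamoDB uses, where each value is tagged with a type descriptor such as `S`, `N`, `BOOL`, `SS`, `L`, or `M`. The function `to_attribute_value` and the sample item are my own assumptions for the example. boto3 ships its own serializer in `boto3.dynamodb.types`, which is what you would normally use.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Convert a Python value to DynamoDB's low-level AttributeValue shape."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, Decimal)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (set, frozenset)):
        items = list(value)
        if not items:
            raise ValueError("DynamoDB sets cannot be empty")
        if all(isinstance(i, str) for i in items):
            return {"SS": sorted(items)}  # sorted only for stable output
        if all(isinstance(i, (int, Decimal)) and not isinstance(i, bool) for i in items):
            return {"NS": [str(i) for i in sorted(items)]}
        raise TypeError("sets must contain only strings or only numbers")
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Hypothetical item mixing basic and nested types.
item = {
    "pk": "user#100",
    "age": 34,
    "active": True,
    "tags": {"admin", "beta"},
    "scores": [1, 2],
    "address": {"city": "Seattle"},
}
marshalled = {name: to_attribute_value(value) for name, value in item.items()}
print(marshalled["address"])
```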
When you query DynamoDB, you access data by key. You can choose to retrieve a single item by its primary key, multiple items that share a partition key, or all items within a given table.
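Those three access styles map onto the GetItem, Query, and Scan operations. The sketch below only builds the request parameters, so it runs without AWS credentials. The table name `Orders` and the key names `customer_id` (partition key) and `order_id` (sort key) are assumptions for the example. With boto3, each dictionary would typically be passed to the matching low-level client call, for example `boto3.client("dynamodb").query(**params)`.

```python
TABLE = "Orders"  # hypothetical table name


def get_item_params(customer_id, order_id):
    """Parameters for GetItem: exactly one item, addressed by its full primary key."""
    return {
        "TableName": TABLE,
        "Key": {
            "customer_id": {"S": customer_id},  # partition key (assumed name)
            "order_id": {"S": order_id},        # sort key (assumed name)
        },
    }


def query_params(customer_id):
    """Parameters for Query: every item that shares one partition key."""
    return {
        "TableName": TABLE,
        "KeyConditionExpression": "customer_id = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
    }


def scan_params():
    """Parameters for Scan: reads the whole table, so it is the most expensive option."""
    return {"TableName": TABLE}


print(query_params("c-42"))
```

In practice, Query and GetItem are the patterns to design for. A Scan touches every item, which is one reason DynamoDB favors workloads with well-defined access patterns.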
Now the next question on the list is: “What if I don’t need transactions?”
If you don’t need ACID compliance, then you can use DocumentDB, DynamoDB, or Amazon Keyspaces (for Apache Cassandra). The biggest factor in deciding when to use Keyspaces is whether you need Cassandra compatibility. So if you’ve already built an application with Cassandra and are looking for a fully managed database that supports the engine, or if you are very comfortable with Cassandra Query Language (CQL), then this is a great option. Like DynamoDB, it supports both nested and basic data types.
If you don’t need transactions, developer preference and familiarity can guide your choice of engine. However, if you need rich data type support and flexible data modeling, DocumentDB is best suited for that. If you prioritize having a highly scalable database above all, then DynamoDB takes the win.
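The questions in this lecture can be summed up as a small decision function. This is only a sketch of the flow described here, not an official AWS selection tool, and the function name, parameter names, and the two `priority` values are assumptions I chose for the example.

```python
def choose_service(needs_transactions, needs_cassandra_compat=False,
                   needs_mongodb_compat=False, priority="scale"):
    """Pick a general purpose NoSQL service following the lecture's questions.

    priority is either "scale" or "flexibility".
    """
    if priority not in ("scale", "flexibility"):
        raise ValueError("priority must be 'scale' or 'flexibility'")

    # Keyspaces is only considered when transactions are not required,
    # and Cassandra compatibility is the deciding factor.
    if not needs_transactions and needs_cassandra_compat:
        return "Amazon Keyspaces (for Apache Cassandra)"

    # MongoDB compatibility or flexible data modeling points to DocumentDB.
    if needs_mongodb_compat or priority == "flexibility":
        return "Amazon DocumentDB"

    # Otherwise, scale and low latency point to DynamoDB.
    return "Amazon DynamoDB"


print(choose_service(needs_transactions=True, priority="scale"))
print(choose_service(needs_transactions=False, needs_cassandra_compat=True))
```

When neither compatibility requirement applies and transactions aren’t needed, the function falls back on `priority`, which is where developer preference and familiarity would come into play in a real decision.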
This course covers the core learning objectives to meet the requirements of the 'Designing Database solutions in AWS - Level 2' skill:
- Evaluate an appropriate AWS database based on specific design requirements
- Analyze when caching is required to improve the performance of an AWS database
- Evaluate an appropriate AWS database scaling strategy to meet both expected and unexpected traffic demands
Alana Layton is an experienced technical trainer, technical content developer, and cloud engineer living out of Seattle, Washington. Her career has included teaching about AWS all over the world, creating AWS content that is fun, and working in consulting. She currently holds six AWS certifications. Outside of Cloud Academy, you can find her testing her knowledge in bar trivia, reading, or training for a marathon.