Amazon DynamoDB: 10 Things You Should Know

Amazon DynamoDB is a managed NoSQL service with strong consistency and predictable performance that shields users from the complexities of manual setup.

Whether or not you’ve actually used a NoSQL data store yourself, it’s probably a good idea to make sure you fully understand the key design differences between NoSQL (including Amazon DynamoDB) and the more traditional relational database (or “SQL”) systems like MySQL.

First of all, NoSQL does not stand for “Not SQL”, but “Not Only SQL”. The two are not opposites, but complementary. NoSQL designs can deliver faster data operations for the access patterns they were built for and can seem more intuitive, while not necessarily adhering to the ACID (atomicity, consistency, isolation, and durability) properties of a relational database.

There are many well-known NoSQL databases available, including MongoDB, Cassandra, HBase, Redis, Amazon DynamoDB, and Riak. Each of these was built for a specific range of uses and will therefore offer different features. We could group these databases into the following categories: columnar (Cassandra, HBase), key-value store (DynamoDB, Riak), document-store (MongoDB, CouchDB), and graph (Neo4j, OrientDB).

In this post, I’m going to focus on Amazon DynamoDB, the giant of the NoSQL world. I believe it’s become a giant because Amazon built it for its own operations. Considering how much was at stake financially, anything less than complete reliability would simply not be tolerated. Software created in such a demanding environment and with the use of AWS-scale resources is bound to be epic. The result? Fantastic reliability and durability, with blazing fast service.

Like any other AWS product, Amazon DynamoDB was designed for failure (i.e., it has self-recovery and resilience built in). That makes DynamoDB a highly available, scalable, and distributed data store. Here are ten key features that helped make Amazon DynamoDB into a giant.

1. Amazon DynamoDB is a managed, NoSQL database service

With a managed service, users only interact with the running application itself. You don’t need to worry about things like server health, storage, and network connectivity. With Amazon DynamoDB, AWS provisions and runs the infrastructure for you. Some of DynamoDB’s critical managed infrastructure features include:

  • Automatic data replication across three Availability Zones in a single region.
  • Highly scalable read-write I/O running on IOPS-optimized solid state drives.
  • A provisioned-throughput model where read and write units can be adjusted at any time based on actual application usage.
  • Data backed up to S3.
  • Integrated with other AWS services like Elastic MapReduce (EMR), Data Pipeline, and Kinesis.
  • Pay-per-use model – you never pay for hardware or services you’re not actually using.
  • Security and access control can be applied using Amazon’s IAM service.
  • Great enterprise-ready features such as a robust SLA, monitoring tools, and private VPN functionality.

2. Amazon DynamoDB has Predictable Performance

AWS claims that DynamoDB will deliver highly predictable performance. Considering Amazon’s reputation for service delivery, we tend to take them at their word on this one. You can control the read behavior you get by choosing between Strong Consistency (Read-after-Write) and Eventual Consistency. Similarly, if you want to increase or decrease the Read/Write throughput, you can do it through simple API calls. Amazon DynamoDB also offers what it calls burst capacity: it “banks” up to five minutes of unused provisioned capacity, which, like the funds in an emergency bank account, you can use during short bursts of activity.
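The “banking” behavior can be pictured with a small token-bucket model. The sketch below is only an illustration under simplifying assumptions: the function name `simulate_burst`, the one-second time step, and the bookkeeping are hypothetical and do not describe how DynamoDB is implemented internally. Only the five-minute (300-second) window comes from the paragraph above.

```python
def simulate_burst(provisioned, demand_per_second, retention_seconds=300):
    """Return the number of throttled units for each one-second step.

    Simplified model: capacity left unused in a second is banked, and the
    bank never holds more than `retention_seconds` worth of provisioned units.
    """
    bank_limit = provisioned * retention_seconds
    bank = 0
    throttled = []
    for demand in demand_per_second:
        available = provisioned + bank      # this second's units plus savings
        served = min(demand, available)
        throttled.append(demand - served)   # anything beyond that is rejected
        bank = min(bank_limit, available - served)
    return throttled


# Ten idle seconds at 10 units/sec bank 100 units, which absorb a short spike.
idle_then_spike = [0] * 10 + [60, 60, 60]
print(simulate_burst(10, idle_then_spike)[-3:])  # [0, 0, 50]
```

In this toy run the first two seconds of the spike are served from the banked units, and only the third second is throttled once the bank is empty.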

3. Amazon DynamoDB is designed for massive scalability

Because it is an AWS product, you can assume that Amazon DynamoDB will be extremely scalable. With its automatic partitioning model, DynamoDB spreads the data across more partitions as data volumes grow, which also raises the available throughput. This requires no intervention from the user.

4. Amazon DynamoDB data types

DynamoDB supports the following data types:

  • Scalar – Number, String, Binary, Boolean, and Null.
  • Multi-valued – String Set, Number Set, and Binary Set.
  • Document – List and Map.

Scalar types are generally well understood. We’ll focus instead on multi-valued and document types. Multi-valued types are sets, which means that the values in this data type are unique. For a months attribute you can choose a String Set with the names of all twelve months – each of which is, of course, unique.

Similarly, document types are meant for representing complex data structures in the form of Lists and Maps. See this example:

{
   Id = 100
   ProductName = "K3 Note"
   Description = "5.5 inch screen, 4G LTE, octa-core processor, 2GB RAM and 16 GB ROM"
   MobileType = "Touch"
   Brand = "Lenovo"
   Price = 100
   Color = [ "White", "Black" ]
   ProductCategory = "Mobile"
}
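In API requests, DynamoDB tags every attribute value with a type descriptor (`S`, `N`, `BOOL`, `SS`, `L`, `M`, and so on). The following is a minimal, hand-rolled sketch of that mapping for plain Python values. The helper name `to_attribute_value` is hypothetical, and the AWS SDKs ship their own serializers that you would normally use instead.

```python
def to_attribute_value(value):
    """Convert a plain Python value into DynamoDB's typed attribute form."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):              # must come before the number check
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}             # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}                  # left as raw bytes in this sketch
    if isinstance(value, (set, frozenset)):
        members = list(value)
        if not members:
            raise ValueError("DynamoDB sets cannot be empty")
        if all(isinstance(m, str) for m in members):
            return {"SS": sorted(members)}
        if all(isinstance(m, bytes) for m in members):
            return {"BS": sorted(members)}
        if all(isinstance(m, (int, float)) and not isinstance(m, bool)
               for m in members):
            return {"NS": [str(m) for m in sorted(members)]}
        raise TypeError("set members must all be strings, numbers, or bytes")
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Part of the product item from the example above.
item = {
    "Id": 100,
    "ProductName": "K3 Note",
    "Price": 100,
    "Color": ["White", "Black"],   # a List, as written in the example
}
typed = {name: to_attribute_value(v) for name, v in item.items()}
print(typed["Color"])  # {'L': [{'S': 'White'}, {'S': 'Black'}]}
```

Passing a Python `set` instead of a list for `Color` would produce a String Set (`SS`), which enforces the uniqueness described earlier.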

5. Amazon DynamoDB’s Data Model

DynamoDB uses three basic data model units: Tables, Items, and Attributes. Tables are collections of Items, and Items are collections of Attributes.

Attributes are basic units of information, like key-value pairs. Tables are like tables in relational databases, except that in DynamoDB, tables do not have fixed schemas associated with them. Items are like rows in an RDBMS table, except that DynamoDB requires a Primary Key. The Primary Key in DynamoDB must be unique so that it can find the exact item in the table. DynamoDB supports two kinds of Primary Keys:

  • Hash Type Primary Key: If a single attribute uniquely identifies an item, it can serve as the Primary Key. DynamoDB builds a hash index on that attribute, and no two items may share the same value. A Hash Key is mandatory in a DynamoDB table.
  • Hash and Range Type Primary Key: This type of Primary Key combines a hash key and a range key: DynamoDB builds a hashed index on the hash key attribute and a sorted range index on the range key attribute. The pair must be unique, and this type of primary key enables DynamoDB’s rich query capabilities.
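As a rough illustration, the two key types translate into the `KeySchema` and `AttributeDefinitions` parameters of a CreateTable request, where the key types are named `HASH` and `RANGE`. The helper `build_key_params` and the attribute names (`CustomerId`, `OrderDate`) are assumptions made for this sketch. It only assembles the request parameters and does not call AWS.

```python
def build_key_params(hash_attr, hash_type="S", range_attr=None, range_type="S"):
    """Assemble KeySchema / AttributeDefinitions for a CreateTable request.

    Key attributes must be scalar: "S" (String), "N" (Number), or "B" (Binary).
    """
    for attr_type in (hash_type, range_type):
        if attr_type not in ("S", "N", "B"):
            raise ValueError("key attributes must be String, Number, or Binary")

    key_schema = [{"AttributeName": hash_attr, "KeyType": "HASH"}]
    definitions = [{"AttributeName": hash_attr, "AttributeType": hash_type}]
    if range_attr is not None:
        key_schema.append({"AttributeName": range_attr, "KeyType": "RANGE"})
        definitions.append(
            {"AttributeName": range_attr, "AttributeType": range_type}
        )
    return {"KeySchema": key_schema, "AttributeDefinitions": definitions}


# Hash-only key: CustomerId alone identifies an item.
hash_only = build_key_params("CustomerId")

# Hash and range key: items are unique per (CustomerId, OrderDate) pair.
hash_and_range = build_key_params("CustomerId", range_attr="OrderDate")
print([k["KeyType"] for k in hash_and_range["KeySchema"]])  # ['HASH', 'RANGE']
```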

6. Amazon DynamoDB indexes

There are two types of indexes in DynamoDB: a Local Secondary Index (LSI) and a Global Secondary Index (GSI). An LSI shares the table’s hash key and requires a range key, while a GSI can have either a hash key or a hash+range key. GSIs span multiple partitions and are stored separately from the base table. At the time of writing, DynamoDB supported up to five GSIs per table. When creating a GSI, choose its hash key carefully, because that key is used for partitioning.

Which is the right index type to use? Here are two considerations: LSIs limit the size of an item collection (all items that share one hash key value) to 10 GB, and GSIs offer only eventual consistency.
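The difference between the two index types shows up in how they are declared. The sketch below builds entries for the `LocalSecondaryIndexes` and `GlobalSecondaryIndexes` parameters of a CreateTable request as plain dictionaries. The helper names, index names, and attribute names are hypothetical, and the throughput numbers are arbitrary placeholders.

```python
def local_index(name, table_hash_attr, range_attr):
    """An LSI reuses the table's hash key and adds its own range key."""
    return {
        "IndexName": name,
        "KeySchema": [
            {"AttributeName": table_hash_attr, "KeyType": "HASH"},
            {"AttributeName": range_attr, "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }


def global_index(name, hash_attr, range_attr=None, rcus=5, wcus=5):
    """A GSI picks its own hash key; the range key is optional."""
    key_schema = [{"AttributeName": hash_attr, "KeyType": "HASH"}]
    if range_attr is not None:
        key_schema.append({"AttributeName": range_attr, "KeyType": "RANGE"})
    return {
        "IndexName": name,
        "KeySchema": key_schema,
        "Projection": {"ProjectionType": "KEYS_ONLY"},
        # A GSI on a provisioned table carries its own throughput settings.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": rcus,
            "WriteCapacityUnits": wcus,
        },
    }


# Table keyed on (CustomerId, OrderDate), with one index of each kind.
by_status = local_index("ByStatus", "CustomerId", "OrderStatus")
by_product = global_index("ByProduct", "ProductId")
print(len(by_status["KeySchema"]), len(by_product["KeySchema"]))  # 2 1
```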

7. Amazon DynamoDB partitions

In DynamoDB, data is partitioned automatically by its hash key. That’s why you need to choose the hash key carefully when you’re implementing a GSI. The number of partitions depends upon two things: table size and throughput.
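To see why the choice of hash key matters, consider a toy model of hash partitioning. This is only a sketch: DynamoDB’s internal hash function is not described in this post, so MD5, the function name `partition_for`, and the fixed partition count are stand-ins chosen for illustration.

```python
import hashlib
from collections import Counter


def partition_for(hash_key, partition_count):
    """Map a hash key value to a partition number (toy model)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# Many distinct key values spread the items over all partitions...
many_keys = Counter(partition_for(f"user-{i}", 3) for i in range(3000))
print(sorted(many_keys.values()))

# ...while a single hot key value sends every item to the same partition.
one_hot_key = Counter(partition_for("2020-02-21", 3) for _ in range(3000))
print(sorted(one_hot_key.values()))  # [3000]
```

A key with many distinct, evenly accessed values lets every partition share the load, whereas a low-cardinality key concentrates traffic on one partition.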

[Figure: Amazon DynamoDB partitions]

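DynamoDB calculates the number of partitions for a table. Although this is transparent to users, you should understand the logic behind it:

# of partitions by throughput = (RCUs / 3000) + (WCUs / 1000)

# of partitions by size = table size in GB / 10

# of partitions in total = max(# of partitions by throughput, # of partitions by size), rounded up

(Note: one Read Capacity Unit (RCU) covers one strongly consistent read per second of an item up to 4 KB. One Write Capacity Unit (WCU) covers one write per second of an item up to 1 KB.)

According to this formula, if we have a table size of 16 GB and we have 6000 RCUs and 1000 WCUs, then: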

# of partitions by throughput: 6000/3000+1000/1000 = 3

# of partitions by size: 16/10 = 1.6

So, the # of partitions in total: max(1.6, 3) = 3

Therefore, we will require three partitions. The RCUs and WCUs will be uniformly distributed across the partitions. Here, each partition will get 6000/3 = 2000 RCUs and 1000/3 ≈ 333 WCUs, and will hold 16/3 ≈ 5.3 GB of data.
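The same arithmetic can be written as a short function. This is a sketch of the calculation used in the worked example above (3000 RCUs, 1000 WCUs, and 10 GB per partition), with the result rounded up to a whole partition. The function names are hypothetical, and the constants are the ones from this post rather than values read from any API.

```python
import math


def estimate_partitions(table_size_gb, rcus, wcus):
    """Estimate the partition count from table size and provisioned throughput."""
    by_throughput = rcus / 3000 + wcus / 1000
    by_size = table_size_gb / 10
    return max(1, math.ceil(max(by_throughput, by_size)))


def per_partition_share(table_size_gb, rcus, wcus):
    """Split throughput and data evenly across the estimated partitions."""
    n = estimate_partitions(table_size_gb, rcus, wcus)
    return {
        "partitions": n,
        "rcus": rcus / n,
        "wcus": wcus / n,
        "size_gb": table_size_gb / n,
    }


share = per_partition_share(16, 6000, 1000)
print(share["partitions"], share["rcus"], round(share["wcus"]), round(share["size_gb"], 1))
# 3 2000.0 333 5.3
```

Note that the per-partition figures are what matter for hot keys: a single partition in this example can serve only its own share of the table’s throughput.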

8. Amazon DynamoDB streams

DynamoDB streams are like transactional logs for a table. According to the DynamoDB Developer’s Guide:

A DynamoDB stream is an ordered flow of information about changes to items in an Amazon DynamoDB table. When you enable a stream on a table, DynamoDB captures information about every modification to data items in the table.

Streams are applied only to tables, and each stream record appears exactly once in a stream. AWS maintains separate endpoints for DynamoDB and DynamoDB streams. There are all kinds of scenarios where streams can be useful, such as a messaging application where a message or picture posted to a group must be reflected in the message boxes of all the group members, or sending welcome messages to new customers when they sign up for your service.
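The welcome-message scenario can be sketched as a function that scans a batch of stream records. The record layout used here (`Records`, `eventName`, and `dynamodb.NewImage` with typed attribute values) follows the format in which stream records are delivered to an AWS Lambda function. The function name, the `Name` and `Email` attributes, and the sample data are hypothetical, and `NewImage` is only present when the stream is configured to include new item images.

```python
def welcome_messages(event):
    """Build a welcome message for every newly inserted customer item."""
    messages = []
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":   # skip MODIFY and REMOVE
            continue
        new_image = record.get("dynamodb", {}).get("NewImage", {})
        name = new_image.get("Name", {}).get("S", "customer")
        email = new_image.get("Email", {}).get("S")
        if email:
            messages.append((email, f"Welcome, {name}!"))
    return messages


sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"CustomerId": {"S": "c-1"}},
                "NewImage": {
                    "CustomerId": {"S": "c-1"},
                    "Name": {"S": "Ada"},
                    "Email": {"S": "ada@example.com"},
                },
            },
        },
        {
            "eventName": "MODIFY",
            "dynamodb": {"Keys": {"CustomerId": {"S": "c-0"}}},
        },
    ]
}
print(welcome_messages(sample_event))  # [('ada@example.com', 'Welcome, Ada!')]
```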

9. Amazon DynamoDB integration with Amazon EMR and Redshift

NoSQL and Big Data technologies are often discussed together because they share a distributed, horizontally scalable architecture, and both aim to process high volumes of structured and semi-structured data. In a typical scenario, Elastic MapReduce (EMR) performs complex analysis on datasets stored in DynamoDB. Users will often also use Amazon Redshift for data warehousing, where BI tasks are carried out on data loaded from DynamoDB tables into Redshift.

10. Amazon DynamoDB JavaScript Web Shell

AWS has introduced a web-based user interface known as the DynamoDB JavaScript Shell for local development. You can download the tool here.

Steps:

  • Download and extract the appropriate file
  • Run the following command:
java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar

[Screenshot: command output]

  • Access the console in a browser with the URL: http://localhost:8000/shell

The web page will look like this:

[Screenshot: DynamoDB JavaScript Shell interface]

  • Click on the button to get some sample commands

For example, the createTable API will run:

[Screenshot: createTable API call]

  • After running this, listTables will show you:

[Screenshot: listTables output]

This is a great tool to perform syntax checking before actually going to production.

With DynamoDB, Amazon has done a great job providing a NoSQL service with strong consistency and predictable performance, while saving users from the complexities of a distributed system. One proof of their success is the many systems (like Riak) that chose to build on the Dynamo design from which DynamoDB descends. With a strong ecosystem, Amazon DynamoDB is something to consider when you are building your next Internet-scale application.

Ready to try it for yourself? Why not use Cloud Academy’s AWS DynamoDB hands-on lab?

Written by

Chandan Patra

Cloud Computing and Big Data professional with 10 years of experience in pre-sales, architecture, design, build and troubleshooting with best engineering practices. Specialities: Cloud Computing - AWS, DevOps(Chef), Hadoop Ecosystem, Storm & Kafka, ELK Stack, NoSQL, Java, Spring, Hibernate, Web Service

