This is a guest post from 47Line Technologies.

In Part 2, we looked at Design Considerations of NoSQL and introduced the concept of eventual consistency. In this article, we will introduce the concepts and techniques used while architecting a NoSQL system.

The core distributed systems techniques employed in DynamoDB are – partitioning, replication, versioning, membership, failure handling and scaling. Phew! Did you really think that the internals will be simple? :-)

DynamoDB Internals

The following table summarizes the list of techniques used in DynamoDB:

Problem Technique Advantage
Partioning Consistent Hashing Incremental Scalability
High Availability for Writes Vector Clocks with reconciliation during reads Version size is decoupled from update rates
Handling temporary failures Sloppy Quorum and Hinted Handoff Provides high availability & durability guarantee when some replicas are not available
Recovery from permanent failures Anti-entropy using Merkle trees Synchronizes divergent replicas in the background
Membership & Failure Detection Gossip-based membership protocol & failure detection Preserves symmetry and avoids having a centralized registry for storing membership and node liveness information

I know I have covered a lot of lingo and buzz words. If you really need to take a deep breath, now is the time! Fear not, in subsequent articles we will deep-dive into each of the above mentioned techniques. Remember, the devil is in the details!

System Interface

DynamoDB exposes two interfaces: get and put. The get(key) operation locates the object associated with the key and returns it along with a context. The put(key, context, object) operation uses key to determine the associated replicas and writes the object in those replicas. The context information is invisible to the user and contains metadata such as version. DynamoDB applies an MD5 hash on the key to generate a 128-bit identifier, which is used to determine the replicas that are responsible for serving the key.

Article authored by Vijay Olety