This is a guest post from 47Line Technologies.
In our previous post ‘DynamoDB: An Inside Look Into NoSQL’, we looked at the design considerations of NoSQL and introduced the concept of eventual consistency. In this article, we will introduce the concepts and techniques used when architecting a NoSQL system.
The core distributed systems techniques employed in DynamoDB are partitioning, replication, versioning, membership, failure handling, and scaling. Phew! Did you really think the internals would be simple? 🙂
The following table summarizes the techniques used in DynamoDB, the problem each one addresses, and its advantage:
|Problem|Technique|Advantage|
|---|---|---|
|Partitioning|Consistent Hashing|Incremental Scalability|
|High Availability for Writes|Vector Clocks with reconciliation during reads|Version size is decoupled from update rates|
|Handling temporary failures|Sloppy Quorum and Hinted Handoff|Provides high availability and durability guarantees when some replicas are not available|
|Recovery from permanent failures|Anti-entropy using Merkle trees|Synchronizes divergent replicas in the background|
|Membership & Failure Detection|Gossip-based membership protocol & failure detection|Preserves symmetry and avoids having a centralized registry for storing membership and node liveness information|
I know I have covered a lot of lingo and buzzwords. If you need to take a deep breath, now is the time! Fear not, in subsequent articles we will deep-dive into each of the techniques mentioned above. Remember, the devil is in the details!
DynamoDB exposes two interfaces:
- The get(key) operation locates the object associated with the key and returns it along with a context.
- The put(key, context, object) operation uses the key to determine the associated replicas and writes the object to those replicas.

The context information is opaque to the user and contains metadata such as the version. DynamoDB applies an MD5 hash to the key to generate a 128-bit identifier, which is used to determine the replicas that are responsible for serving the key.
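To make the interface and the hashing step a little more concrete, here is a minimal Python sketch. It is an illustration under several assumptions, not DynamoDB's actual implementation: the names `key_to_id`, `replicas_for`, and `TinyStore` are hypothetical, the nodes are placed on a ring by hashing made-up node names (borrowing the consistent hashing idea from the table above), and the context carries a plain counter as a stand-in for real version metadata.

```python
import hashlib
from bisect import bisect_right


def key_to_id(key):
    """Hash a key with MD5 and return the digest as a 128-bit integer."""
    digest = hashlib.md5(key.encode("utf-8")).digest()  # 16 bytes = 128 bits
    return int.from_bytes(digest, "big")


def replicas_for(key, nodes, n=3):
    """Pick up to n distinct nodes by walking clockwise from the key's position.

    Assumption: each node's ring position is the MD5 hash of its name.
    """
    ring = sorted((key_to_id(name), name) for name in nodes)
    positions = [pos for pos, _ in ring]
    start = bisect_right(positions, key_to_id(key))  # first node after the key
    count = min(n, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


class TinyStore:
    """In-memory stand-in for the get/put interface (hypothetical, single process)."""

    def __init__(self, nodes, n=3):
        self.nodes = nodes
        self.n = n
        self.data = {name: {} for name in nodes}  # one dict per simulated node

    def put(self, key, context, obj):
        # The context is opaque to the caller; here it only holds a counter.
        version = (context or {}).get("version", 0) + 1
        for name in replicas_for(key, self.nodes, self.n):
            self.data[name][key] = (obj, {"version": version})

    def get(self, key):
        # Read from the first responsible replica; returns (object, context).
        first = replicas_for(key, self.nodes, self.n)[0]
        return self.data[first][key]


# Usage: read the context from get() and pass it back on the next put().
store = TinyStore(["node-a", "node-b", "node-c", "node-d"])
store.put("cart:42", None, ["book"])
obj, ctx = store.get("cart:42")
store.put("cart:42", ctx, obj + ["pen"])
```

The caller never inspects the context; it simply hands back whatever the last get returned, which is what lets the store track versions behind the scenes.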