RDS vs. EC2
Amazon RDS Costs
Data Lakes in AWS
The course is part of this learning path
This section of the Solution Architect Associate learning path introduces you to the AWS database services relevant to the SAA-C03 exam. We then understand the service options available and learn how to select and apply AWS database services to meet specific design scenarios relevant to the Solution Architect Associate exam.
Want more? Try a lab playground or do a Lab Challenge!
- Understand the various database services that can be used when building cloud solutions on AWS
- Learn how to build databases using Amazon RDS, DynamoDB, Redshift, DocumentDB, Keyspaces, and QLDB
- Learn how to create Elasticache and Neptune clusters
- Understand AWS database costs
- Learn about data lakes and how to build a data lake in AWS
Before we get into DAX itself, I just want to take a minute to cover a 10,000-foot overview of DynamoDB.
Amazon DynamoDB is a fully managed NoSQL database service. There's no database administration required on your end, no servers to manage and nothing to back up. All of this is handled for you by AWS. All you have to do is set up your tables and configure the level of provisioned throughput that each table should have.
As DynamoDB is designed to be highly available your data is automatically replicated across multiple availability zones within a geographic region. In the case of an outage or an incident affecting an entire hosting facility, DynamoDB transparently routes around the affected availability zone.
DynamoDB is designed to be fast. Read and writes take just a few milliseconds to complete, and DynamoDB will be fast no matter how large your tables grow. Unlike a relational database, which can slow down as the table gets large, DynamoDB performance is constant and stays consistent even with tables that are many terabytes in size. You don't have to do anything to handle this, except adjusting the provisioned throughput levels to make sure you've reserved enough read and write capacity for your transaction volume.
Despite its many benefits, there can be some downsides to using DynamoDB too. As I mentioned, your data is automatically replicated to different AZs, and that replication usually happens quickly, in milliseconds, but sometimes it can take longer. This is known as eventual consistency. This happens transparently and many operations will make sure that they're always working on the latest copy of your data. But there are certain kinds of queries and table scans that may return older versions of data before the most recent copy.
This can be a problem in certain circumstances, and sometimes milliseconds might not be fast enough for your use case. You might have a requirement where you need microsecond response times in read-heavy workloads, and this is where DynamoDB Accelerator (DAX) comes in to play. By combining DynamoDB with DAX, you end up with a NoSQL database solution offering extreme performance.
So what is DAX? DAX is an in-memory cache delivering a significant performance enhancement, up to 10 times as fast as the default DynamoDB settings, allowing response times to decrease from milliseconds to microseconds. It is a fully managed feature offered by AWS and as a result, is also highly available.
Dax is also highly scalable making it capable of handling millions of requests per second without any requirement for you to modify any logic to your applications or solutions as its fully compliant with all DynamoDB API calls, and so it seamlessly fits into your existing architecture without any redesign and effort from your developer teams.
Your DAX deployment can start with a multi-node cluster, containing a minimum of 3 nodes, which you can quickly and easily modify and expand, reaching a maximum of 10 nodes, with 1 primary and 9 read replicas. This provides the ability to handle millions of requests per second.
Another benefit of using DAX is that it can also enable you to reduce your provisioned read capacity within DynamoDB. This is because of the fact that data is cached by DAX and so reduces the impact and amount of read requests on your DB tables, instead, these will be served by DAX from the in-memory cache. As we know, reducing the provisioned requirements on your DynamoDB database, will also reduce your overall costs.
From a security perspective, DAX also supports encryption at rest, this ensures that any cached data is encrypted using the 256-bit Advanced Encryption Standard algorithm with the integration of the Key Management Service (KMS) to manage the encryption keys.
If you are new to KMS, then you can find more on this service from our existing course found here.
DAX is a separate entity to DynamoDB and so architecturally it sits outside of DynamoDB itself and is instead placed within your VPC, whereas DynamoDB sits outside of your VPC and is accessed via an endpoint.
During the creation of your DAX Cluster, you will be asked to select a subnet group, this is simply a grouping of subnets across different availability zones. This is to allow DAX to deploy a node in each of the subnets of the subnet group, with one of those nodes being the primary and the remaining nodes will act as read replicas.
To allow your EC2 instances to interact with DAX you will need to install a DAX Client on those EC2 instances. This client then intercepts with and directs all DynamoDB API calls made from your client to your new DAX cluster endpoint, where the incoming request is then load balanced and distributed across all of the nodes in the cluster.
To allow your EC2 instances to communicate with your DAX Cluster you must ensure that the security group associated with your DAX Cluster is open to TCP port 8111 on the inbound rule set.
If a request received by DAX from your client is a read request, such as a
Scan, then the DAX cluster will try and process the request if it has the data cached. If DAX does not have the request in its cache (a cache miss) then the request will be sent to DynamoDB for the results to be returned to the client. These results will also then be stored by DAX within its cache and distributed to the remaining read replicas in the DAX cluster.
With regards to any write requested made by the client, the data is first written to DynamoDB before it is written to the cache of the DAX cluster.
One final point I want to make is that DAX does not process any requests relating to table operations and management, for example, if you wanted to create, update or delete tables. In this instance, the request is passed through directly to DynamoDB.
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.