DynamoDB High Availability
Backup and Restore
Point in Time Recovery
This course explores Amazon Web Services' DynamoDB and teaches you how to architect DynamoDB setups—with an emphasis on high availability—ensuring that your internet-scale applications are always available. The course begins by looking at the various options offered by DynamoDB, before moving on to on-demand backup and restore, and rounding off by looking at point-in-time recovery. In each section, there is a real-world demonstration from the AWS platform which walks you through the topics covered.
If you have any feedback, queries, or suggestions relating to this course, please contact us at email@example.com.
- Understand how to provision and configure DynamoDB in a manner that ensures it is highly available and able to serve all read and write requests to it.
This course has been created for those who are responsible for architecting DynamoDB setups.
To get the most from this course you should be familiar with basic NoSQL concepts, and DynamoDB concepts such as Tables, Items, and Attributes. Consider watching our dedicated “Working with Amazon DynamoDB” course and/or reviewing the “10 Things You Should Know” about DynamoDB blog post before taking this course.
The following GitHub repository is referenced within this course:
Let's take a quick look at a demo that shows how easy it is to set up and use DynamoDB Global Tables.
In this example I’ll use the AWS CLI to perform the following sequence:
- Create a new table named “cloudacademy-courses” in the us-west-2 region. This table will contain courses provided by CloudAcademy.
- Populate the “cloudacademy-courses” table with sample course data.
- Convert the “cloudacademy-courses” table into a Global Table, deploying a replica table in the alternate region ap-southeast-2.
- Confirm both tables are in an ACTIVE status.
- Add new course data into the us-west-2 region hosted “cloudacademy-courses” table.
- Watch and observe the time taken to propagate data into the replicated table in the ap-southeast-2 region.
Ok, let's begin. I’ll start by browsing to the CloudAcademy DynamoDB GlobalTables GitHub repo. As you can see, the readme for this repo contains all of the instructions for this demo. We’ll simply copy and paste each of the steps from this readme as we proceed. We also need to git clone this repo to give us access to the data files used to populate the global table that we are about to set up. Therefore I’ll simply copy the git clone URL, jump into my local terminal, and perform a git clone using the URL just copied.
Navigating into the new “dynamodb-globaltables” directory, I’ll use the tree command to examine and display its contents. Here we can see the 2 data files, batch.course.data1.json and batch.course.data2.json, which we will later use to populate our global table.
Ok, next I’ll copy the Step 1 instruction and paste it into the terminal. Step 1 simply creates a new DynamoDB table named “cloudacademy-courses” and locates it in the us-west-2 region. The billing mode is set to “PAY_PER_REQUEST” - this is a requirement for later converting this table into a global table.
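Based on the table described above, the Step 1 command likely looks something like the following. The key schema here is my assumption, inferred from the data files shown later; the repo readme contains the exact command:

```shell
# Sketch of the Step 1 create-table command (assumed form).
# courseid as the string partition key matches the data files used later.
# PAY_PER_REQUEST billing is the mode required before conversion to a global table.
aws dynamodb create-table \
  --table-name cloudacademy-courses \
  --attribute-definitions AttributeName=courseid,AttributeType=S \
  --key-schema AttributeName=courseid,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region us-west-2
```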
This looks good! Our new “cloudacademy-courses” table has been successfully created. We can confirm this by jumping into the AWS DynamoDB console and then selecting the “tables” view within the Oregon region. Here we can indeed see that the new “cloudacademy-courses” table has been created and has an “Active” status. Ok let’s move on and populate this table.
Before I run the Step 2 command, let's take a quick look at the dataset that we are going to use to populate the new “cloudacademy-courses” table. Using the cat command on the batch.course.data1.json file, we can see that it contains 3 course items that will be inserted into the table. Each item has a courseid, which acts as the primary key, followed by the attributes company, title, URL, duration, and instructor.
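For reference, a batch-write-item data file follows DynamoDB's request-items format: a map keyed by table name, containing a list of PutRequest entries. The attribute values below are illustrative placeholders, not the repo's actual course data:

```json
{
  "cloudacademy-courses": [
    {
      "PutRequest": {
        "Item": {
          "courseid":   {"S": "COURSE-001"},
          "company":    {"S": "CloudAcademy"},
          "title":      {"S": "Example Course Title"},
          "URL":        {"S": "https://example.com/course"},
          "duration":   {"S": "60m"},
          "instructor": {"S": "Jeremy"}
        }
      }
    }
  ]
}
```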
I’ll now execute the Step 2 command, which will populate our “cloudacademy-courses” table with these 3 items. Ok, that looks good: the output contains an empty UnprocessedItems object, meaning every item was written successfully.
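The Step 2 command is presumably a batch-write-item call along these lines (assumed form; the readme has the exact command):

```shell
# Sketch of the Step 2 command (assumed form): bulk-insert the 3 sample
# items from the data file into the us-west-2 table. An empty
# "UnprocessedItems": {} in the response means every item was written.
aws dynamodb batch-write-item \
  --request-items file://batch.course.data1.json \
  --region us-west-2
```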
Jumping back into the AWS DynamoDB console - we can navigate into the table and select the “Items” tab to view the current set of items, and as expected we have the 3 new items that we just populated it with.
We are now ready to convert the “cloudacademy-courses” table into a global table - to do so I’ll copy the Step 3 command and execute it back within the terminal. This command creates a read/write multi-master replica table in the ap-southeast-2 Sydney region.
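With version 2019.11.21 global tables, adding a replica is done via update-table, so the Step 3 command is presumably something like this (assumed form; the readme has the exact command):

```shell
# Sketch of the Step 3 command (assumed form): create a replica in
# ap-southeast-2, converting the table into a global table.
aws dynamodb update-table \
  --table-name cloudacademy-courses \
  --region us-west-2 \
  --replica-updates '[{"Create": {"RegionName": "ap-southeast-2"}}]'
```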
This looks good! We can see in the output that the TableStatus is set to “UPDATING” - we’ll wait a few minutes before executing the Step 4 command - just to give the Step 3 command enough time to propagate the table changes across to the new region. While we are waiting we can jump back into the AWS DynamoDB console and take a look at the “Global Tables” tab. Here we can see that there are 2 configured regions for our “cloudacademy-courses” table. The original us-west-2 region which has an ACTIVE status, and the newly configured ap-southeast-2 region which has a CREATING status.
Ok, let’s now copy the Step 4 command and execute it. This simply displays the details of the newly provisioned “cloudacademy-courses” global table located in the ap-southeast-2 region. Again we can see that it is still in a CREATING status as per the TableStatus attribute.
We need to wait for this table to achieve ACTIVE status - therefore let's poll this table every 30 seconds by executing the Step 5 command. This command re-executes the previous command every 30 seconds, extracting and displaying the TableStatus attribute value - which we can see is still CREATING. We need to pause here until this changes to ACTIVE.
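A polling command like the following would produce the behaviour described (assumed form; the readme has the exact Step 5 command):

```shell
# Sketch of the Step 5 command (assumed form): re-run describe-table against
# the Sydney replica every 30 seconds and print only the TableStatus value.
watch -n 30 "aws dynamodb describe-table \
  --table-name cloudacademy-courses \
  --region ap-southeast-2 \
  --query 'Table.TableStatus' \
  --output text"
```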
I’ll now speed up the demo to the point where the table reaches the ACTIVE status, which we can now see. Ok, this is a great result - it means that our “cloudacademy-courses” global table is ready.
Next, let's watch and observe the replication of data writes to the “cloudacademy-courses” global table setup. To do so I’ll now clear the current terminal and then use the tmux command to split the terminal into 2 panes. Within the terminal, I'm using the key sequence, control plus b together with a double quote. Excellent. To navigate between the two panes, again use the key sequence, control plus b and then the up and down arrow keys.
In the top pane I’ll run the Step 6 pane 1 command. This will set up a watch that continuously performs a read against the ap-southeast-2 hosted “cloudacademy-courses” table every second, looking for a new data item that we will next insert into the us-west-2 hosted “cloudacademy-courses” table from the bottom pane. This will allow us to observe the speed at which global table changes propagate between regions.
Next, I’ll move focus to the bottom pane, into which I’ll execute the Step 6 pane 2 command. This command inserts the new table data that the top pane command is querying for.
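The two pane commands presumably look something like the following (assumed forms; COURSE-004 is a placeholder for whichever courseid actually appears in batch.course.data2.json):

```shell
# Pane 1 sketch (assumed form): read from the Sydney replica once a second,
# waiting for the newly inserted item to appear.
watch -n 1 "aws dynamodb get-item \
  --table-name cloudacademy-courses \
  --region ap-southeast-2 \
  --key '{\"courseid\": {\"S\": \"COURSE-004\"}}'"

# Pane 2 sketch (assumed form): write the new items into the Oregon table.
aws dynamodb batch-write-item \
  --request-items file://batch.course.data2.json \
  --region us-west-2
```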
As you can see, the propagation time is approximately 1-2 seconds which is quite impressive considering the data is being replicated from Oregon in the States to Sydney in Australia.
Let’s now reverse the setup and this time write data into the ap-southeast-2 hosted “cloudacademy-courses” table and watch it get replicated back into the us-west-2 hosted “cloudacademy-courses” table, demonstrating that global tables are indeed configured for multi-master reads and writes.
To perform this, I’ll make a copy of the batch.course.data2.json file and name it batch.course.data3.json. I’ll then use vim to edit the contents of this file, simply updating each of the 3 items' courseid keys with new unique values, and save it back to the file system. Next, I’ll update the watch command to query the us-west-2 region. And then in the other tmux pane I’ll rerun the aws dynamodb batch-write-item command, but this time have it insert into the ap-southeast-2 region.
Again we can observe that the propagation time is quick. But more importantly, this time we have demonstrated that our “cloudacademy-courses” global table is truly configured in a multi-master read/write configuration - this is very cool!!
Finally, let’s jump back into the AWS DynamoDB console and examine the “cloudacademy-courses” table. Refreshing the items view we can see all of the expected 9 items that we populated the “cloudacademy-courses” global table with. Note that this is the view of the items as currently held within the table located in the us-west-2 region.
We can equally view the items held within the table located in the ap-southeast-2 region by clicking on the Global Tables tab and then clicking on the Sydney region. This will open a new browser tab for the current “cloudacademy-courses” table albeit in the Sydney region - then clicking on the items tab we can see the same 9 replicated table items.
Ok, to summarise what we have just demonstrated.
- We created a new table named “cloudacademy-courses” in the us-west-2 region.
- We then populated the “cloudacademy-courses” table with sample course data.
- We then converted the “cloudacademy-courses” table into a Global Table, deploying a second replica table in the alternate ap-southeast-2 region.
- We then confirmed both tables were in an ACTIVE status before proceeding.
- We then added new course data into the us-west-2 table, and confirmed that it was replicated to the ap-southeast-2 table - and that the replication was very quick.
- We then repeated the previous step in the reverse direction - that is, we added new course data into the ap-southeast-2 table, and confirmed that it was replicated to the us-west-2 table - this time confirming that global tables are set up for multi-master reads and writes.
About the Author
Jeremy is the DevOps Content Lead at Cloud Academy where he specializes in developing technical training documentation for DevOps.
He has a strong background in software engineering, and has been coding with various languages, frameworks, and systems for the past 20+ years. In recent times, Jeremy has been focused on DevOps, Cloud, Security, and Machine Learning.
Jeremy holds professional certifications for both the AWS and GCP cloud platforms.