Solution Architect Professional for AWS - Domain Five - Data Storage

Things to Remember



Course Description

In this course, you'll gain a solid understanding of the key concepts for Domain Five of the AWS Solutions Architect Professional certification: Data Storage. We will explore AWS storage services and how we can implement those in the most effective and efficient manner.

Course Objectives

By the end of this course, you'll have the tools and knowledge you need to successfully accomplish the following requirements for this domain, including:

  • Demonstrate ability to make architectural trade-off decisions involving storage options.
  • Demonstrate ability to make architectural trade-off decisions involving database options.
  • Demonstrate ability to implement the most appropriate data storage architecture.
  • Determine use of synchronous versus asynchronous replication.

Intended Audience

This course is intended for students seeking to acquire the AWS Solutions Architect Professional certification. It is necessary to have acquired the Associate level of this certification. You should also have at least two years of real-world experience developing AWS architectures.


As stated previously, you will need to have completed the AWS Solutions Architect Associate certification, and we recommend reviewing the relevant learning path in order to be well-prepared for the material in this one.

This Course Includes

  • Expert-led instruction and exploration of important concepts.
  • Complete coverage of critical Domain Five concepts for the AWS Solutions Architect - Professional certification exam.

What You Will Learn

  • The various data storage services available within the AWS ecosystem. 
  • Domain Five relevant skills and knowledge for passing the exam. 
  • Synchronous vs asynchronous replication.

The size of tables is a cost factor with RDS, so finding ways to reduce table size can be an effective cost-reduction strategy. Introducing ElastiCache can reduce the provisioned read requirements for an RDS database. The synchronous standby replica in an RDS Multi-AZ deployment cannot be connected to or used as a read replica. You can, however, create a read replica from your Multi-AZ master database and use that for your reporting, your read cache, et cetera.

If we need to synchronize data, S3DistCp enables you to synchronize on-premises Hadoop clusters with Elastic MapReduce. Using S3DistCp, you can efficiently copy large amounts of data from Amazon S3 into HDFS, where it can be processed by subsequent steps in your Amazon EMR cluster.

Memcached cache clusters are comprised of one to 20 nodes. Each time you change the number of nodes in your Memcached cache cluster, you must remap at least some of your key space so that it maps to the correct node. When you scale your Memcached cluster up or down, you must create a new cache cluster. Memcached cache clusters always start out empty unless your application populates them. Caching can become a single point of failure in multi-AZ environments. Memcached does not support replication. To mitigate the impact of a node failure, spread your cached data over more nodes.

A common caching strategy with ElastiCache is lazy loading, which loads data into the cache only when it is requested. The disadvantage is that a cache miss results in three trips: the initial request to the cache, the query of the database, and the write of the data to the cache. You can use TTL values and a write-through strategy to tune cache performance.

Now with Redis, the benefits are master-slave replication and eventual consistency. Redis also supports clustering, and it provides more features for how you manage data: you can use hashes, sorted sets, et cetera. A Redis replication group is comprised of a single primary cluster, which your application can both read from and write to, and from one to five read-only replica clusters.
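To make the lazy loading, TTL, and write-through points from this lecture concrete, here is a minimal sketch in Python. It is not ElastiCache or Memcached client code: `TtlCache`, `read_lazy`, and `write_through` are hypothetical names, the cache is a plain in-memory dictionary, and the database is assumed to be a dictionary as well. The sketch only shows where the extra trips on a cache miss come from and how a TTL bounds stale data.

```python
import time


class TtlCache:
    """In-memory stand-in for a cache node (hypothetical, not an AWS API)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so expiry can be simulated
        self.entries = {}           # key -> (value, expires_at)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None             # miss: key was never cached
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[key]   # miss: entry outlived its TTL
            return None
        return value

    def set(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl_seconds)


def read_lazy(cache, database, key):
    """Lazy loading: populate the cache only when a read misses.

    Returns the value plus the list of trips taken, so the miss penalty
    is visible. A stored value of None is treated as a miss in this sketch.
    """
    trips = ["cache read"]
    value = cache.get(key)
    if value is None:
        trips.append("database read")
        value = database[key]
        trips.append("cache write")
        cache.set(key, value)
    return value, trips


def write_through(cache, database, key, value):
    """Write-through: update the cache whenever the database is written."""
    database[key] = value
    cache.set(key, value)
```

With a cold cache, `read_lazy` reports three trips for the first read of a key and one trip for the next read. Data changed in the database without `write_through` stays stale in the cache until the TTL expires, which is the trade-off the TTL value tunes.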
Whenever data is written to the primary cluster, it is also asynchronously updated on the read replica clusters. With Redis you can also enable the append-only file (AOF) option. When AOF is enabled, whenever data is written to your Redis cluster, a corresponding transaction record is written to a Redis append-only file. If your Redis process restarts, ElastiCache creates a replacement cluster and provisions it. You can then run the AOF against the cluster to repopulate it with the data. Now some of the shortcomings of using Redis AOF to mitigate cluster failures are that it's time-consuming, that the AOF can get really big, and that using AOF cannot protect you from all failure scenarios. You can enable Multi-AZ with automatic failover on your Redis replication groups. Whether you enable Multi-AZ with automatic failover or not, a failed primary will be detected and replaced automatically.
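The append-only file idea can be illustrated with a toy key-value store. This is a sketch under stated assumptions, not how Redis or ElastiCache implement AOF: `TinyStore` is a hypothetical class, and the one-JSON-record-per-line log format is chosen here only for readability. It shows the two properties mentioned above: every write appends a record, and a replacement store is repopulated by replaying the records in order.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store with an append-only log (illustrative only)."""

    def __init__(self, aof_path):
        self.aof_path = aof_path
        self.data = {}

    def _append(self, record):
        # Records are only ever appended, never rewritten in place.
        with open(self.aof_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")

    def set(self, key, value):
        self._append({"op": "set", "key": key, "value": value})
        self.data[key] = value

    def delete(self, key):
        self._append({"op": "del", "key": key})
        self.data.pop(key, None)

    @classmethod
    def recover(cls, aof_path):
        """Build a replacement store by replaying the log from the start."""
        store = cls(aof_path)
        if os.path.exists(aof_path):
            with open(aof_path, encoding="utf-8") as log:
                for line in log:
                    record = json.loads(line)
                    if record["op"] == "set":
                        store.data[record["key"]] = record["value"]
                    elif record["op"] == "del":
                        store.data.pop(record["key"], None)
        return store


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "appendonly.log")
        store = TinyStore(path)
        store.set("a", 1)
        store.set("a", 2)
        store.delete("a")
        # Three records are replayed to rebuild an empty data set.
        print(TinyStore.recover(path).data)
```

Because overwritten and deleted keys still occupy records in the log, the file grows with write volume rather than with data size, and recovery time grows with it. That is one way to see why the lecture calls AOF recovery time-consuming and warns that the file can get really big.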

About the Author
Andrew Larkin
Head of Content

Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.