Read Replicas



2h 8m

This course covers the core learning objective to meet the requirements of the 'Designing Database solutions in AWS - Level 3' skill

Learning Objectives:

  • Analyze target AWS database platforms when performing a migration
  • Create and deploy an enterprise-wide scalable RDS database solution to meet and exceed workload performance expectations
  • Create an AWS database solution to withstand AWS global infrastructure outages with minimal data loss

Hello and welcome to this lecture covering read replicas. So, we now know that Multi-AZ provides a feature that allows for the fast recovery of read/write services when your primary RDS instance fails. So, let's look at read replicas. Read replicas are not used for resiliency or as a secondary instance in the event of a failover. Instead, they can be used by your application and other services or users to serve read-only access to your database data via a separate instance, a read replica. So, for example, let's assume we have a primary RDS instance which serves both read and write traffic. Due to the size of the instance and the amount of read-intensive traffic being directed to the database for queries, the performance of the instance is taking a hit. To help resolve this, you can create a read replica. A snapshot will be taken of your database, and if you are using Multi-AZ, then this snapshot will be taken of your secondary database instance to ensure that there are no performance impacts during this process. Once the snapshot is completed, a read replica instance is created from this data.
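As a rough illustration of the creation step just described, here is a minimal Python sketch. It assumes you are using boto3, whose RDS client exposes create_db_instance_read_replica with the DBInstanceIdentifier, SourceDBInstanceIdentifier and DBInstanceClass parameters. The helper names and the instance identifiers are hypothetical and not part of the lecture, and the client is passed in so the sketch can be tried without AWS credentials.

```python
from typing import Any, Dict, Optional


def build_replica_request(source_id: str, replica_id: str,
                          instance_class: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for the RDS create_db_instance_read_replica call."""
    if source_id == replica_id:
        raise ValueError("the replica needs its own identifier")
    params: Dict[str, Any] = {
        "DBInstanceIdentifier": replica_id,       # name of the new read replica
        "SourceDBInstanceIdentifier": source_id,  # instance to replicate from
    }
    if instance_class is not None:
        # Optional: size the replica differently from its source.
        params["DBInstanceClass"] = instance_class
    return params


def create_read_replica(rds_client: Any, source_id: str, replica_id: str,
                        instance_class: Optional[str] = None) -> Any:
    """Ask RDS to create the replica.

    rds_client is assumed to behave like boto3.client("rds").
    """
    request = build_replica_request(source_id, replica_id, instance_class)
    return rds_client.create_db_instance_read_replica(**request)
```

With real credentials you might call create_read_replica(boto3.client("rds"), "orders-primary", "orders-replica-1"), where both identifiers are placeholders. RDS then handles the snapshot and the replication link itself.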

The read replica then maintains a secure asynchronous link between itself and the primary database. At this point, read-only traffic can be directed to the read replica to serve queries, perhaps from business intelligence tools. Implementing read replicas helps to offload this traffic from the primary instance, which in turn helps with overall performance. Do be aware when thinking about deploying read replicas that they are only available for the MySQL, MariaDB, and PostgreSQL database engines. However, for the latest supported engines for read replicas, it is always best to consult the AWS documentation, as this can change over time. Thankfully, it is possible to deploy more than one read replica for a primary database, and there are a number of different reasons why you might want to do this. Adding more than one read replica allows you to scale your read performance across a wider range of tools and applications that need to query the data, without being restricted to a single read replica. It is also possible to deploy a read replica in a different region, which significantly helps to enhance your DR capabilities.
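To make the idea of directing read-only traffic to replicas more concrete, here is a small Python sketch of a router that an application might use. RDS does not provide this class; ReadWriteRouter and the endpoint hostnames are hypothetical, and the SELECT check is deliberately naive.

```python
from itertools import cycle
from typing import Iterator, List


class ReadWriteRouter:
    """Send read-only statements to replicas and everything else to the primary."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        # Fall back to the primary when no replica has been created yet.
        self._replicas: Iterator[str] = cycle(replicas or [primary])

    @staticmethod
    def is_read_only(sql: str) -> bool:
        # Naive assumption: only plain SELECT statements count as reads.
        return sql.lstrip().upper().startswith("SELECT")

    def endpoint_for(self, sql: str) -> str:
        if self.is_read_only(sql):
            return next(self._replicas)  # round-robin across the replicas
        return self.primary
```

Because replication is asynchronous, a query sent to a replica can return slightly stale data, so reads that must see the latest write are usually still sent to the primary.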

It's also possible to promote an existing read replica to replace the primary database in the event of an incident. Also, during any maintenance that is being performed on your primary instance where I/O requests may have been suspended, read traffic can still be served by a read replica.

I now want to talk about read replicas for each DB engine type, and the slight differences between them, starting with MySQL. Read replicas are only supported where the source database is running MySQL 5.6 or later. In addition to this, another prerequisite is that the retention value of the automatic backups of the primary database needs to be set to a value of one or more. Replication is also only possible when using the InnoDB storage engine, which is transactional, as opposed to MyISAM, which is non-transactional. It's also possible to have nested read replica chains. For example, you could have a read replica which replicates from your source database.

This read replica can then act as a source database for another read replica, and so on. However, this chain can only be a maximum of four layers deep. If you do nest your read replicas underneath each other, then the same prerequisites discussed previously must also apply to the source read replica. For example, it must be running MySQL 5.6 or later and have a value of one or greater for automatic backup retention. Also bear in mind that you can only have up to a maximum of five read replicas per source database, but a source database could be another read replica using the nested feature as just explained.

You might be wondering what happens if you have a read replica created from a source database which has Multi-AZ configured, and an outage then occurs and shuts down the primary instance. What happens to the read replicas? Well, the answer is that RDS automatically redirects the read replica source to the secondary database to allow the asynchronous replication of data to continue.

From an operational perspective, it's important to understand how your read replicas are performing and whether they are maintaining a high level of synchronization with their source database instance. Using Amazon CloudWatch, you can monitor this through an Amazon RDS metric called ReplicaLag. This value will show you how many seconds the read replica is behind the source database.
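As a sketch of how the ReplicaLag metric might be read programmatically, the Python below builds the arguments for the CloudWatch get_metric_statistics call and summarizes the datapoints it returns. It assumes boto3 is being used. The helper names, the ten-minute window and the 30-second threshold are illustrative assumptions, not recommendations from the lecture.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional


def replica_lag_query(replica_id: str, minutes: int = 10,
                      now: Optional[datetime] = None) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "ReplicaLag",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": replica_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,               # one datapoint per minute
        "Statistics": ["Average"],
    }


def worst_lag_seconds(datapoints: List[Dict[str, Any]]) -> Optional[float]:
    """Return the largest average lag seen, or None when there is no data."""
    if not datapoints:
        return None
    return max(point["Average"] for point in datapoints)


def is_lagging(datapoints: List[Dict[str, Any]],
               threshold_seconds: float = 30.0) -> bool:
    """True when the worst lag exceeds the (assumed) threshold."""
    worst = worst_lag_seconds(datapoints)
    return worst is not None and worst > threshold_seconds
```

With credentials in place, the datapoints would come from boto3.client("cloudwatch").get_metric_statistics(**replica_lag_query("orders-replica-1"))["Datapoints"], where the replica identifier is a placeholder.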

You want this value to be as close to zero as possible, or ideally, to read as zero.

For the MariaDB engine type, much of the information remains the same as per the MySQL read replica limitations. For example, you still need to have the backup retention period greater than zero, and again, you can only have five read replicas per source database. The same read replica nesting rules apply and you also have the same monitoring metric for CloudWatch. However, you can be running any version of MariaDB for read replicas when running this DB engine.

For PostgreSQL, there are a few more differences. I'll start with the similarities, however. You still need the automatic backup retention to be greater than zero, and the limit is five read replicas per source database. However, the replication process is slightly different. For PostgreSQL version 9.3.5 and later, the native PostgreSQL streaming replication is used to handle the replication and creation of the read replica. The connection between the source instance and the read replica instance allows write-ahead log data to be sent, which replicates data asynchronously between the two instances.

A specific role is also introduced to manage this replication when using PostgreSQL. This role only has the ability to handle and manage the replication, and it doesn't have any permissions to modify or change the data being transmitted across the connection.

Interestingly, you are able to create a Multi-AZ read replica instance, meaning that when you create your read replica, RDS automatically configures a secondary read replica in a different AZ from the source read replica, much like Multi-AZ for your source databases as we discussed in the previous lecture. This feature can be used even if the source database of the first read replica isn't configured for Multi-AZ itself. This means your read replica could be more resilient than your source database if your source database isn't configured for Multi-AZ. It's not possible to have a nested read replica when using PostgreSQL like you can with the MySQL and MariaDB database engines. However, you can still use the same monitoring metric of ReplicaLag.

That now brings me to the end of this lecture covering RDS read replicas.
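To recap the per-engine rules covered in this lecture, here is a small Python sketch of a pre-flight check. The limits are copied from the lecture as stated and may not match current AWS behaviour, so treat them as assumptions. The function and field names are hypothetical. Counting the primary as depth 0 is one interpretation of "four layers deep". Versions are assumed to be dotted numbers such as "5.7.44".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class EngineRules:
    min_version: Optional[Tuple[int, ...]]  # None means any version is accepted
    nesting_allowed: bool


# Limits as stated in the lecture; check the AWS documentation for current values.
RULES: Dict[str, EngineRules] = {
    "mysql": EngineRules((5, 6), True),
    "mariadb": EngineRules(None, True),
    "postgres": EngineRules((9, 3, 5), False),
}
MAX_REPLICAS_PER_SOURCE = 5
MAX_CHAIN_DEPTH = 4


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '9.3.5' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def replica_problems(engine: str, version: str, backup_retention_days: int,
                     existing_replicas: int, source_depth: int) -> List[str]:
    """List the reasons a new replica could not be created from this source.

    source_depth is 0 for a primary, 1 for a replica of the primary, and so on.
    """
    rules = RULES[engine]
    problems: List[str] = []
    if rules.min_version and parse_version(version) < rules.min_version:
        problems.append("engine version too old")
    if backup_retention_days < 1:
        problems.append("automatic backups must be retained for at least one day")
    if existing_replicas >= MAX_REPLICAS_PER_SOURCE:
        problems.append("source already has five read replicas")
    if source_depth > 0 and not rules.nesting_allowed:
        problems.append("engine does not support nested read replicas")
    elif source_depth + 1 > MAX_CHAIN_DEPTH:
        problems.append("replica chain would exceed four layers")
    return problems
```

An empty list means none of the limits mentioned in the lecture would be hit. RDS itself remains the authority on whether the request succeeds.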

About the Author

Stephen is the AWS Certification Specialist at Cloud Academy. His content focuses heavily on topics related to certification on Amazon Web Services technologies. He loves teaching and believes that there are no shortcuts to certification but it is possible to find the right path and course of study.

Stephen has worked in IT for over 25 years in roles ranging from tech support to systems engineering. At one point, he taught computer network technology at a community college in Washington state.

Before coming to Cloud Academy, Stephen worked as a trainer and curriculum developer at AWS and brings a wealth of knowledge and experience in cloud technologies.

In his spare time, Stephen enjoys reading, sudoku, gaming, and modern square dancing.