Scaling with RDS


Lectures in this course:

  • Understanding RDS Scaling & Elasticity
  • Scaling with RDS
  • Configuring Operational Parameters for AWS Databases
  • When to use RDS Multi-AZ & Read Replicas
  • RDS Multi-AZ
  • Read Replicas

Duration: 2h 8m

This course covers the core learning objective to meet the requirements of the 'Designing Database solutions in AWS - Level 3' skill.

Learning Objectives:

  • Analyze target AWS database platforms when performing a migration
  • Create and deploy an enterprise-wide, scalable RDS database solution to meet and exceed workload performance expectations
  • Create an AWS database solution to withstand AWS global infrastructure outages with minimal data loss

One of the greatest benefits of using Amazon RDS is that it is an AWS managed service. This management covers patching, security updates, and other low-level undifferentiated heavy lifting. Offloading this burden removes many of the painful aspects of scaling your database up or out.

Let's begin by taking a look at how RDS scales vertically for both reads and writes.

Scaling your RDS database vertically is probably the simplest way to relieve pressure on your read or write throughput. Switching out the underlying instance for one with more CPU and RAM is literally just a button click away. You can scale vertically up to a maximum of 32 vCPUs and 244 GiB of RAM. It is important to know that this scaling does cause downtime for your database, but you can always schedule it around your normal maintenance windows.
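That button click maps to a single API call. Below is a minimal sketch of a vertical scale-up using the parameters accepted by boto3's `rds.modify_db_instance`; the instance name "mydb" and the target class "db.r5.4xlarge" are hypothetical examples, not values from the course.

```python
# Sketch: vertically scaling an RDS instance (hypothetical identifiers).
def build_scale_up_params(instance_id, target_class, apply_now=False):
    """Build kwargs for rds.modify_db_instance. With ApplyImmediately=False,
    the resize (and its downtime) is deferred to the next maintenance window."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBInstanceClass": target_class,
        "ApplyImmediately": apply_now,
    }

params = build_scale_up_params("mydb", "db.r5.4xlarge")
# To actually apply the change (requires AWS credentials):
# import boto3
# boto3.client("rds").modify_db_instance(**params)
print(params)
```

Deferring the change to the maintenance window is usually the safer default, since the resize restarts the instance.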

When scaling your RDS databases, it's important to note that the storage volume and the instance type are decoupled, which means that when you vertically scale up or down, your storage stays the same size. However, RDS supports storage autoscaling, which can grow the volume for you automatically; otherwise you can increase the allocated storage yourself.
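Storage autoscaling is enabled through the same modify call by setting a storage ceiling. A sketch, again with hypothetical values ("mydb", a 1000 GiB cap), using the `MaxAllocatedStorage` parameter of `rds.modify_db_instance`:

```python
# Sketch: enabling RDS storage autoscaling (hypothetical values).
def build_storage_autoscaling_params(instance_id, current_gib, max_gib):
    """Setting MaxAllocatedStorage above the current allocation lets RDS
    grow the volume automatically when free space runs low."""
    if max_gib <= current_gib:
        raise ValueError("MaxAllocatedStorage must exceed current allocation")
    return {
        "DBInstanceIdentifier": instance_id,
        "MaxAllocatedStorage": max_gib,
    }

autoscaling_params = build_storage_autoscaling_params(
    "mydb", current_gib=200, max_gib=1000)
# boto3.client("rds").modify_db_instance(**autoscaling_params)
print(autoscaling_params)
```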

Vertical scaling is a fine answer to many throughput problems, but it won't be super cost effective if your issues are very read heavy or very write heavy. Since upgrading the hardware provides an increase in both of those dimensions, you only get half of the benefit.

With that in mind let's see what options are available for read heavy workloads.

RDS provides a fantastic way to increase your read throughput without having to change the size of your underlying database instance. It does this by using a horizontal scaling method called read replicas.

A read replica is a copy of your database that gives the user another access point to retrieve data from. This helps to alleviate the bottleneck on your primary database. The read replica is kept in sync with the primary database and only allows its users to read data. If this wasn't the case, there would be synchronization issues and race conditions that are troublesome to deal with.

RDS creates a read replica by taking a snapshot of your primary database instance and creating a full read-only database copy from it. You will experience a brief I/O suspension, lasting about a minute, on your source database while this snapshot is taken.

Amazon RDS then uses the asynchronous replication method for the DB engine to update the read replica whenever there is a change to the source DB instance.

RDS allows you to create up to 5 read replicas for each DB instance. This is supported by Amazon RDS for MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server.
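Creating a replica is likewise one API call. A minimal sketch of the parameters for boto3's `rds.create_db_instance_read_replica`; the identifiers "mydb" and "mydb-replica-1" are hypothetical:

```python
# Sketch: creating a read replica of a source instance (hypothetical names).
def build_read_replica_params(source_id, replica_id):
    """Build kwargs for rds.create_db_instance_read_replica. The replica is
    seeded from a snapshot of the source, then kept current through the
    engine's asynchronous replication."""
    return {
        "DBInstanceIdentifier": replica_id,
        "SourceDBInstanceIdentifier": source_id,
        "PubliclyAccessible": False,  # keep the read endpoint private
    }

replica_params = build_read_replica_params("mydb", "mydb-replica-1")
# boto3.client("rds").create_db_instance_read_replica(**replica_params)
print(replica_params)
```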

And if at any time your primary database were to go down, or become corrupted in some fashion, you have the ability to promote your read replica into a new primary database. Your traffic can migrate over to this copy using Route 53 failover routing and health checks.
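Promotion is also a single call against the replica. A sketch of the parameters for boto3's `rds.promote_read_replica`; the replica name and retention period are hypothetical:

```python
# Sketch: promoting a read replica to a standalone primary (hypothetical values).
def build_promote_params(replica_id, retention_days=7):
    """Build kwargs for rds.promote_read_replica. A BackupRetentionPeriod
    greater than zero enables automated backups on the newly independent
    instance once promotion completes."""
    return {
        "DBInstanceIdentifier": replica_id,
        "BackupRetentionPeriod": retention_days,
    }

promote_params = build_promote_params("mydb-replica-1")
# boto3.client("rds").promote_read_replica(**promote_params)
print(promote_params)
```

Note that promotion is one-way: the promoted instance stops replicating from the old primary.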

You need to use Amazon Route 53 weighted record sets to distribute requests across your read replicas. You do this by creating individual record sets for each DNS endpoint associated with your read replicas and giving them the same weight. Then, direct requests to the endpoint of the record set.
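The weighted record sets described above can be sketched as a `ChangeBatch` for Route 53's `change_resource_record_sets` API. The zone name, record name, and replica endpoints below are hypothetical; the key idea is that each replica gets its own `SetIdentifier` while all records share the same name and weight:

```python
# Sketch: equal-weight Route 53 records pointing at read replica endpoints
# (hypothetical zone and endpoints).
def build_weighted_change_batch(record_name, endpoints, weight=10):
    """Build a ChangeBatch for route53.change_resource_record_sets: one
    weighted CNAME per replica endpoint. Equal weights spread requests
    evenly across the replicas."""
    changes = []
    for i, endpoint in enumerate(endpoints):
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": f"replica-{i}",  # distinguishes records sharing a name
                "Weight": weight,
                "TTL": 60,
                "ResourceRecords": [{"Value": endpoint}],
            },
        })
    return {"Changes": changes}

batch = build_weighted_change_batch(
    "reads.example.com",
    ["mydb-replica-1.abc123.us-east-1.rds.amazonaws.com",
     "mydb-replica-2.abc123.us-east-1.rds.amazonaws.com"])
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="Z_EXAMPLE", ChangeBatch=batch)
print(len(batch["Changes"]))
```

The application then queries the single record name, and Route 53 returns one of the replica endpoints in proportion to the weights.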

Scaling your RDS database for writes can be very difficult. There isn't a simple built-in way for RDS to improve write throughput besides scaling your whole database vertically. However, there is a technique called sharding that can be implemented to get around this.

Sharding is similar in a way to using read replicas, in that you create an additional database to share the load of the primary. This database, however, is a fully working instance that can both read and write. The catch is that each database deals with a different part of your entire dataset.

For example: you could have one shard of your database that deals with all customers whose last name begins with A through M, and a second shard that deals with all customers from N through Z.
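Since RDS does not route queries between shards for you, that routing rule lives in your application code. A minimal sketch of the last-name scheme above, with hypothetical shard names:

```python
# Sketch: application-side shard routing by last-name initial
# (hypothetical shard identifiers).
def shard_for_customer(last_name, shards=("shard-a-m", "shard-n-z")):
    """Return the shard holding this customer's records: last names
    starting A-M go to the first shard, N-Z to the second."""
    initial = last_name.strip().lower()[0]
    return shards[0] if initial <= "m" else shards[1]

print(shard_for_customer("Anderson"))  # shard-a-m
print(shard_for_customer("Zhang"))     # shard-n-z
```

In practice each shard name would map to a separate RDS endpoint, and every read and write would first pass through a function like this one.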

Since they do not share any portion of the dataset in common, there are no worries about synchronization. Additionally, each of these shards has all the scaling capabilities we have already discussed. They too can have read replicas, as well as the ability to scale the underlying instances themselves.

When thinking about sharding your database, it's important to reiterate that RDS does not handle this natively. Your application must contain the logic that determines which database holds the data you are looking for, and it must read from and write to the appropriate one.

Sharding is something that should be considered at the outset of creating your architectures and databases. One of the downsides to sharding is that you lose the ability to easily do joins across these separate datasets. You would have to specifically engineer that ability, and that adds another layer of complexity.

There are many ways you can shard your database. It is extremely important that you design a solution that will work long term. Resharding your databases is possible if they become overburdened again, but it adds more downtime and creates complexity.



About the Author

Stephen is the AWS Certification Specialist at Cloud Academy. His content focuses heavily on topics related to certification on Amazon Web Services technologies. He loves teaching and believes that there are no shortcuts to certification but it is possible to find the right path and course of study.

Stephen has worked in IT for over 25 years in roles ranging from tech support to systems engineering. At one point, he taught computer network technology at a community college in Washington state.

Before coming to Cloud Academy, Stephen worked as a trainer and curriculum developer at AWS and brings a wealth of knowledge and experience in cloud technologies.

In his spare time, Stephen enjoys reading, sudoku, gaming, and modern square dancing.