Database Migration Service
Understanding RDS Scaling & Elasticity
Configuring Operational Parameters for AWS Databases
Amazon RDS Performance Insights
When to use RDS Multi-AZ & Read Replicas
The course is part of this learning path
This course covers the core learning objective to meet the requirements of the 'Designing Database solutions in AWS - Level 3' skill
- Analzy targert AWS database platforms when performing a migration
- Create and deploy an enterprise-wide scalable RDS Database solition to meet and exceed workload performance expectations
- Create an AWS database slution to withstand AWS global infrastructure outages with minimal data loss
Hi and welcome to lecture three, setting up the database migration service. Following this course we will be able to recognize and explain the core steps taken when using the AWS database migration service for a homogeneous database migration. So at a high level when we're using the AWS database migration service, we need to do the following. First we provision a replication server. Second we define a source endpoint database. Third we define a target endpoint database. And fourth we recreate a replication task. The database migration service can be sent out from the AWS management console. Once the EC2 instance has been set up to host the database migration service, the service connects to the target database you defined using a public endpoint you specify. As this service is running on the public domain, we will also need to set up security groups and access permissions to restrict access to the services. The service then requests and reads the source data. Most of the response processing happens in memory. However large transactions may require some buffering on disk. Cached transactions and log files are also written to disk. There are a number of options we can set to improve how the database migration service manages migration of the data. The service only supports migration to an AWS hosted database. So it's not intended to be used to migrate to another cloud provider at this stage. Most common use case is to port your on premise database to a cloud version of the same database. Or to migrate your on premise database to a open source database hosted in the AWS infrastructure. As with everything AWS, the list of supported engines will be updated frequently so you need to check back on the AWS website to see what the current versions supported are. Currently you can use the following databases as a source for data migration using the database migration service. For our on premise and EC2 instance databases, we can use Oracle versions 10.2 and later, 11g and 12c for the Enterprise, Standard, Standard One and Standard Two editions. For Microsoft SQL Server, versions 2005, 2008, 2008R2, 2012 and the 2014 versions and for the Enterprise, Standard, Workgroup and Developer editions. Now do note that the Web and Express editions are not supported as a source type. For MySQL we've got version support for 5.5, 5.6 and 5.7. For MariaDB it's supported as a MySQL-compatible data source. For Postgres version 9.4 and later. We can use the SAP Adaptive Server Enterprise 15.7 and later. Now if we're using the Amazon RDS instance databases as a source, we can support Oracle versions 11g, versions 184.108.40.206 version one and later and 12c for the Enterprise, Standard, Standard One and Standard Two editions. For Microsoft SQL Server we get support for versions 2008R2, 2012 and 2014 for the Enterprise, Standard, Workgroup and Developer editions. Note though that change data capture or CDC operations are not supported. The Web and Express editions are currently not supported as a source running on the Amazon RDS instance. So for MYSQL we've got versions 5.5, 5.6 and 5.7. For MariaDB it's supported as a MySQL-compatible data source. And for Postgres version 9.4 and later. We also get Amazon Aurora, which is supported as a MySQL-compatible data source when running from Amazon RDS as a source. So in terms of our targets, you can use the following databases as a target for data replication. For on premises and Amazon EC2 instance databases, we can use Oracle versions 10g, 11g, 12c for the Enterprise, Standard, Standard One and Standard Two editions. For Microsoft SQL Server we get support for versions 2005, 2008, 2008R2, 2012, 2014 and 2016. That's for the Enterprise, Standard, Workgroup and Developer editions. But the Web and Express editions are not supported. For MySQL we've got support for 5.5, 5.6 and 5.7. For MariaDB it's supported as a MySQL-compatible data target. For Postgres we can support versions 9.3 and later. And again we can use the SAP Adaptive Server Enterprise or ASE version 15.7 and later as a target. If we're running into Amazon RDS instance databases or Amazon Redshift, then we get support for Oracle versions 11g, which is versions currently 220.127.116.11 version one and later and 12c for the Enterprise, Standard, Standard One and Standard Two editions. For Microsoft SQL Server we get support for versions 2008R2, 2012 and 2014 for the Enterprise, Standard, Workgroup and Developer editions. But do note as well that the Web and Express editions are not supported as a target either. For MySQL we get versions 5.5 and 5.6 and 5.7. For MariaDB MySQL support as a compatible data type. For Postgres versions 9.3 and later. We can also support Amazon Aurora and Amazon Redshift. Now with the Amazon web services database migration service, you're only paying for the resources that you use. The database migration service replication instance that you create will be live, not running in a sandbox so you will incur the standard AWS usage fees for the instance until you do terminate it. During any migration, the service tracks changes being made on the source database so that those changes can be applied to the target database to eventually keep the two databases in sync. All other source and target databases can be of the same engine type. They don't need to be. A typical task consists of three major phases. The load phase or full load, the application of cached changes and ongoing replication. During the full load, data is loaded from tables on the source database to tables on the target database. It's done in eight tables at a time, which is the default for the tool. While the full load is in progress, changes made to the tables that are being loaded are cached on the replication server. These are the cached changes. Now with cached changes, it's important to know that the capturing of changes for a given table doesn't begin until the full load for that table starts. So in other words the start of change capture for each individual table will be different. After the load or full load for a given table is complete, you can begin to apply the cached changes for that table immediately. When all the tables are loaded, you can begin to collect changes as transactions for the ongoing replication phase. After all cached changes are applied, your tables are consistent transactionally and you can move to the ongoing replication phase, applying changes as transactions. Upon initial entry into the ongoing replication phase, there will be a backlog of transactions causing some lag between the source and target databases. After working through this backlog, the system will eventually reach a steady state. At this point then when you're ready, you can shut down your applications, allow any remaining transactions to be applied to the target, restart your applications pointing to the new database target. Database migration services will create the target scheme or objects that are needed to perform the migration. The database migration service takes a minimalist approach and creates only those objects required to efficiently migrate the data. In other words database migration services will create tables, primary keys and in some cases, unique indexes, but it won't create secondary indexes, non-primary key constraints, data defaults or other objects that are not required to efficiently migrate the data from the source system. So in most cases when you're performing a migration, you'll also want to migrate some or all of your source schema. Now if you're performing a homogeneous migration, you can accomplish this by using your engine's native tools or performing a no data import/export of the schema. If your migration is heterogeneous, you can use the AWS schema conversion tool to generate a complete target schema for you. Now one thing to keep in mind is that any inter-table dependencies such as foreign key constraints, must be disabled during the full load and cached change application phases of the database migration service processing. Also, if performance becomes an issue, it would be a good idea to remove or disable secondary indexes during the migration process. So what are a few common anti patterns that we should be aware of when we're considering whether or not to use the database migration services? Now the database migration service will only create the objects needed to perform an optimized migration of your data. So it does the least basically. So if you're looking to change engines, you should look to use the AWS schema conversion tool to convert an entire schema from one database engine to another. Now most databases offer a native method for migrating between servers and platforms and sometimes using a simple backup and restore or export/import is gonna be the most efficient way to migrate your data into AWS. So if you're considering a homogeneous migration, you should first access whether a suitable native option exists. And in some situations you might choose to use native tools to perform the bulk load and then use the database migration service to capture and apply changes that occur during that bulk load. As an example when migrating between different flavors of MySQL or Amazon Aurora, creating and promoting a read replica is mostly likely the best way of doing it. If you can successfully set up a replica of your primary database in your target environment by using native tools, then you should consider using a native method for migrating your system. And a few common examples that come to mind are MySQL read right replicas, using a standby database for Oracle or Postgres and the always on availability groups for Microsoft SQL Server.
Stephen is the AWS Certification Specialist at Cloud Academy. His content focuses heavily on topics related to certification on Amazon Web Services technologies. He loves teaching and believes that there are no shortcuts to certification but it is possible to find the right path and course of study.
Stephen has worked in IT for over 25 years in roles ranging from tech support to systems engineering. At one point, he taught computer network technology at a community college in Washington state.
Before coming to Cloud Academy, Stephen worked as a trainer and curriculum developer at AWS and brings a wealth of knowledge and experience in cloud technologies.
In his spare time, Stephen enjoys reading, sudoku, gaming, and modern square dancing.