Migrating Virtual Machines to the AWS Cloud
Migrating Virtual Machines: why?In this blog post, we'll discuss how to migrate your on-premise virtual machines to AWS - focusing mostly on the ...Learn More
The AWS Schema Conversion Tool is a component of AWS Data Migration Service (DMS) which is still in preview mode at the time of writing this post (December 2015). We have discussed the benefits of migrating to AWS in a previous post.
With DMS, Amazon is wooing corporate customers with a low-cost means of moving their database workloads to the cloud. Amazon already has a managed database service called RDS (Relational Database Service). However RDS has its own limitations, and one of those limitations is related to data migration. Until now, there was no simple way of migrating schema and data from an on-premise or EC2-hosted database to RDS. Because RDS is a managed service it doesn’t support a number of features specific to the database engines. Previously, the only solution would require DBAs and developers writing their own migration scripts and testing them in a painful and an iterative manner until all the wrinkles have been ironed out.
Amazon addresses this issue with Data Migration Service. As Amazon’s public website for DMS states, the service will allow seamless and continuous data replication between a source and a target environment during a migration process. These environments may use either the same database engines or different ones as source and target. It is worth noting that the source can be an on-premise database instance, an EC2-hosted instance, or even another RDS instance. The target can be an EC2 instance or another RDS instance. AWS now supports a number of database engines, so that these combinations can support a wide variety of use cases. Migrating data to AWS is progressively easier with each passing week. Here’s a good post about the AWS Data Migration Tool that was written at re:Invent 2015.
Anyone who ever worked in a data migration project knows the process is never simple — never. A number of disciplines are involved here: solutions architects, infrastructure architects, engineers, DBAs, developers, and quality assurance testers. All these roles form components of project teams whose work can span from a few weeks to many months. Successful database migration is generally iterative in nature. A repeatable process may recreate the target database structure every time, and load it with fresh source data repeatedly for quality assurance purposes. This process continues until everything works perfectly and the approvals are given for the next step. In my experience, the sooner this iterative process is put into practice, the better the chance of successful migration.
But a seamless data migration can only happen when the source and destination databases use similar or, at least, compatible structures. It may be relatively easy to migrate an on-premise SQL Server database to an RDS instance running the same engine, but, not so easy if the target is, say, MySQL. That’s because both engines use different methods of defining database objects and their code syntax is also widely different. There is no easy way out here. When the source and target databases are different, DBAs and developers have to bite the bullet and script the schema objects and programming logic in the target environment’s own language. More often than not, this becomes the most time-consuming part of migrating data to AWS.
This is where the AWS Schema Conversion Tool can come in really handy. You are probably wondering, “can the AWS Schema Conversion Tool save time for the database engineer?” The answer is yes! The AWS Schema Conversion Tool can connect to a source database engine and reverse engineer a source database structure to a format suitable for the target database. This includes converting both database schema objects (tables, indexes, constraints etc.) and database logic (functions, stored procedures, triggers etc). Where it can’t convert a source item, it will flag it and recommend remediation steps. You can then apply the converted schema and logic to the target environment from this same tool and make any manual changes to the target as suggested.
Migrating data to AWS has been common for a while and database migration toolkits are nothing new either. Most commercial database vendors offer them for free, but there are few that offer multiple source and target options, and I can’t think of any others available for almost all operating systems. The AWS Schema Conversion Tool is a desktop product that runs on Windows, Mac OS, Fedora and Ubuntu Linux. It’s free and can be downloaded once you sign up for a preview of the Database Migration Service.
Being the first of its kind from AWS, this tool has some limitations in its first release. It’s used for converting databases to RDS instances only — more precisely MySQL or Aurora in RDS. The source database engines can be either MS SQL Server or Oracle. That means there’s no support for PostgreSQL or the more recent addition, MariaDB.
We tested the AWS Schema Conversion Tool by installing it on a Windows 10 machine as well as on a Mac (running Yosemite). The installation process is fairly simple and fast in both cases.
Once the tool is installed, you need to download and install the JDBC drivers for the supported database engines. This is because The AWS Schema Conversion Tool uses JDBC to connect to databases. In our case, we downloaded and installed two:
Once the installation is complete, you need to tell the AWS Conversion Tool where to find those drivers. This is done by selecting the Settings > Global Settings menu option and choosing Drivers from the dialog box. In the image below, we have specified the paths for both the drivers:
With the initial setup complete, you can start a new migration project (every migration has to be part of a project). In our case, we chose to convert the well-known AdventureWorks database running in an EC2 instance to a MySQL 5.6 RDS instance.
The first step is to define the project properties. When in this screen, you see the source can be either SQL Server or Oracle and the destination can be MySQL or Aurora.
Once the project is created, it’s time to connect to the source instance. In the image below we are connecting to our SQL Server 2012 instance in EC2.
Although SQL Servers have instance names (default instances typically have the same name as the Windows box), we did not specify it here. The user account was created before. One thing to note here is, that the user account must be a SQL Server standard login and not a Windows domain or local account. If the SQL Server is running in Windows-only authentication mode, this will not work because no standard logins (even sa) will be available.
Once the source is connected, its databases and schemas under each database are loaded in the left side of the screen.
Now if you are a SQL Server DBA or developer, you will notice a couple of things immediately:
If you are wondering why the tool works this way, remember, the Schema Conversion Tool is built for converting only database components, nothing else. That’s why it shows database objects from the source instance only.
The same thing can be said about Oracle databases. There are many different links for schemas, tables, users, directory objects, jobs or RMAN backups within the Oracle Enterprise Manager Database Control. Again, only database objects for different schemas are viewable from the conversion tool.
The reason for this difference is in the naming convention. The term “database schema” is loosely used to mean two different things. One meaning refers to schema as the overall structural definition of the database. This definition includes the actual command to create the database as well as the commands to create objects within the database. The other meaning of schema refers to a namespace that belongs to a user account. In traditional database engineering terms, a schema is related to a database user — all the objects owned by a user belong to that user’s schema. A database then becomes just the holder for all the schemas in it.
SQL Server did not implement the concept of schemas in its older versions. However it’s very much a part of the database engine now and Microsoft has implemented the ANSI version of schema definition in its database product. On a high-level though, everything is still seen from an individual database perspective.
AWS Schema Conversion Tool sees things differently. It treats “schemas” in the traditional sense, a grouping of database objects from the same user. This means when you convert the source database components, the highest level you can convert from is the schema level, not the SQL Server database level. That means each separate schema under an SQL Server database will end up as a separate MySQL database in the target.
It’s now time to check what components from our source databases (AdventureWorks) can be converted to MySQL format. For this we selected a schema, right-clicked on it and selected “Create Report” from the pop-up menu.
This will kick in the schema conversion engine. AWS Schema Conversion Tool will take its time to reverse engineer every object within the schema and check if the generated code can be run against the target instance without any change. If it can, it will make a note of this finding. If it can’t, it will make a note about why it’s not possible. At the end, the assessment will generate a report. That report is called the Database Migration Assessment Report.
You can access the Database Migration Assessment Report from the View menu of the conversion tool. There are two parts of the assessment report: the Summary part and the Action Items part.
The Summary page gives an overall picture of the migration possibility. It shows how many objects in each type of component it can convert and how many it can’t.
The Action Items page will go deeper into the analysis. It will list every object that can’t be converted to the target engine and the reason for their failure. I find it hugely helpful that the AWS Schema Conversion Tool highlights the particular command or syntax in the generated code that caused the problem. This code is viewable from the lower half of the Action Items screen.
In the following images, we are seeing the summary and action items report for two schemas. You can see the bottom half of the Action Items screen shows the generated code for a particular object that can’t be converted. When you select an individual problem item from the top half of the screen, its related object is highlighted in the left pane and its generated code is shown in the bottom.
With the assessment report giving you the overall migration possibility, you have two choices:
Migrating the schema in the target environment is a two-step process:
With the schemas converted to MySQL format, the target instance will show the databases created. However, they don’t exist yet. To physically create the database objects, we had to apply the schemas from the target database pane:
And that completes the initial schema migration. If you are interested in migrating to virtual machines, you may find our post on Migrating Virtual Machines to the AWS Cloud useful.
We found the AWS Schema Conversion Tool a simple and intuitive piece of software with an elegant installation.
As noted before, there is no option to migrate to or from a PostgreSQL databases. Postgres is run in many corporate database workloads and we don’t know when this functionality will be available in the tool. The same can be said about other database engines like MySQL or MariaDB as sources. The target environment can only be one type and that’s RDS. This may help Amazon push its customers to adopt RDS, but it isn’t helpful if a customer wants to run their database in EC2.
Even if the schema migration runs smoothly and most of the objects can be transferred easily, database engineers still have to support other “moving parts.” This includes a multitude of components like: SQL Server Agent Jobs, Oracle Jobs, custom cron jobs, Windows scheduled tasks, network shares, users and privileges, linked servers, linked databases, partitions, full-text indexes, and so on.
The philosophy behind this tool’s purpose is focused. The tool is not meant to be a magic bullet for a turn-key migration – no tool can claim to have such functionality. Data mapping is the biggest time-sink for most migrations and this tool address that issue very well.
As we saw from our example, it’s not easy to migrate even a sample database in a lift-and-shift manner. That may disappoint many DBAs. After all, everyone wants a smooth migration where the tool does all the heavy lifting. We believe the AWS Schema Conversion Tool’s strengths lie in its ability to pinpoint exactly where the migration problems are, and suggesting ways to address those issues.
Amazon Web Services’ resource offerings are constantly changing, and staying on top of their evolution can be a challenge. Elastic Cloud Compute (EC2) instances are one of their core resource offerings, and they form the backbone of most cloud deployments. EC2 instances provide you with...
AWS's WaitCondition can be used with CloudFormation templates to ensure required resources are running.As you may already be aware, AWS CloudFormation is used for infrastructure automation by allowing you to write JSON templates to automatically install, configure, and bootstrap your ...
As companies increasingly shift workloads to the public cloud, cloud computing has moved from a nice-to-have to a core competency in the enterprise. This shift requires a new set of skills to design, deploy, and manage applications in cloud computing.As the market leader and most ma...
The announcements at re:Invent just keep on coming! Let’s look at what benefits these two new EC2 instance types offer and how these two new instances could be of benefit to you. If you're not too familiar with Amazon EC2, you might want to familiarize yourself by creating your first Am...
Google Cloud Platform (GCP) has evolved from being a niche player to a serious competitor to Amazon Web Services and Microsoft Azure. In 2018, research firm Gartner placed Google in the Leaders quadrant in its Magic Quadrant for Cloud Infrastructure as a Service for the first time. In t...
In order to understand AWS VPC egress filtering methods, you first need to understand that security on AWS is governed by a shared responsibility model where both vendor and subscriber have various operational responsibilities. AWS assumes responsibility for the underlying infrastructur...
Is it possible to create an S3 FTP file backup/transfer solution, minimizing associated file storage and capacity planning administration headache?FTP (File Transfer Protocol) is a fast and convenient way to transfer large files over the Internet. You might, at some point, have conf...
Microservices are a way of breaking large software projects into loosely coupled modules, which communicate with each other through simple Application Programming Interfaces (APIs).Microservices have become increasingly popular over the past few years. The modular architectural style,...
There are many use cases for tags, but what are the best practices for tagging AWS resources? In order for your organization to effectively manage resources (and your monthly AWS bill), you need to implement and adopt a thoughtful tagging strategy that makes sense for your business. The...
Amazon S3 is the most common storage options for many organizations, being object storage it is used for a wide variety of data types, from the smallest objects to huge datasets. All in all, Amazon S3 is a great service to store a wide scope of data types in a highly available and resil...
One of the main promises of cloud computing is access to nearly endless capacity. However, it doesn’t come cheap. With the introduction of Spot Instances for Amazon Web Services’ Elastic Compute Cloud (AWS EC2) in 2009, spot instances have been a way for major cloud providers to sell sp...
A Comparison of Machine Learning Services on AWS, Azure, and Google CloudArtificial intelligence and machine learning are steadily making their way into enterprise applications in areas such as customer support, fraud detection, and business intelligence. There is every reason to beli...