Database Migration Strategies for GCP
Phase 1: Assess

Difficulty: Intermediate
Duration: 27m
Description

In this course, we cover Google’s recommended best practices for migrating any database to the Google Cloud Platform. The four main phases covered are Assess, Plan, Deploy, and Optimize.  

Learning Objectives

  • Choose the best migration tools
  • Select a migration strategy
  • Define a fallback plan
  • Migrate and validate your data
  • Measure and optimize your results

Intended Audience

  • Database administrators
  • Database engineers
  • Anyone preparing for a Google Cloud certification

Prerequisites 

  • Some experience working with databases
Transcript

So, I am going to assume that if you are taking this course, you need to know how to migrate a production database to Google Cloud.  Well, there is a lot to consider.  You have to choose the right technologies and tools.  You have to figure out how to copy your existing data.  And you have to consider how the move will affect your current applications.  There are often many dependencies involved, and that means a lot can go wrong.

To prevent you from getting overwhelmed, I am going to break the process down into four main phases: Assess, Plan, Deploy, and Optimize.  These cover Google’s recommended best practices for migrating any database to Google Cloud Platform.  So let’s start by talking about the Assess phase.

The goal of the Assess phase is to thoroughly document your existing system.   You need to understand all the different components and the environments in which they run.  

This is not limited to just listing out all your databases and applications.  It also includes listing out the dependents and dependencies.  Dependents are components of your system that rely on your database.  Dependencies are components that your database relies upon.  If you are going to make any changes, you need to understand what can potentially break.

You also should include a detailed breakdown of your current costs.  You want to understand the financial impact of any proposed changes.  And you also should have clearly defined performance requirements.  So, if you need to support 10,000 writes per second, you don’t want to migrate to a system that can only support 5,000.  Remember, attempting to blindly migrate any system is a recipe for disaster.  Before implementing the solution, you first need a clear understanding of the problem.

So I recommend you start out by taking a complete inventory.  List out every application, server and resource.  This will make it easier to identify your dependents and dependencies.  Think about all the clients that directly connect to the database.  Are they reading records?  Or writing records?  Or are they doing both?  How often are your records being changed?  Which components of your system are affected when your database goes offline?
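
To make that concrete, here is one simple way you could capture that inventory.  This is only an illustrative sketch in plain Python with made-up component names, not an official template.

    # A hypothetical inventory: for each component, record what depends on it
    # (dependents), what it depends on (dependencies), and how it uses the database.
    inventory = {
        "orders-db": {
            "type": "MySQL 5.7 on an on-prem VM",
            "dependents": ["checkout-api", "reporting-batch"],     # break if the DB goes offline
            "dependencies": ["auth-service", "backup-nfs-share"],  # the DB relies on these
        },
        "checkout-api": {
            "type": "web service",
            "access_pattern": "reads and writes, ~500 writes/sec at peak",
        },
        "reporting-batch": {
            "type": "nightly cron job",
            "access_pattern": "read-only, full table scans at 02:00",
        },
    }

    for name, info in inventory.items():
        print(name, "->", info)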

Next, you need to identify any special requirements.  Think about the kind of data you are storing.   How much data do you have?  Is it growing?  If so, how quickly?  Does any of your data need to be encrypted?  Are there any geographic or compliance requirements?  And what are the requirements for performance, availability, and failover?
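
Again, purely as an illustration, you could record the answers to those questions in a small structured form so nothing gets lost.  All of the numbers below are made up.

    from dataclasses import dataclass

    @dataclass
    class DatabaseRequirements:
        """A checklist of the questions above, captured as data."""
        current_size_gb: int
        growth_gb_per_month: int
        encryption_at_rest_required: bool
        allowed_regions: list          # geographic / compliance constraints
        peak_writes_per_sec: int
        target_availability: str       # e.g. "99.95%"
        max_failover_seconds: int

    orders_db = DatabaseRequirements(
        current_size_gb=800,
        growth_gb_per_month=40,
        encryption_at_rest_required=True,
        allowed_regions=["europe-west1"],
        peak_writes_per_sec=10_000,
        target_availability="99.95%",
        max_failover_seconds=60,
    )
    print(orders_db)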

You need to make sure to document everything.  Write it all down and share it with the rest of your team.  Make sure you all fully understand how your system works.  Only then can you accurately gauge how difficult your migration will be. This will also ensure that you can make the best decisions and minimize any complications.

Now, during the Assess phase, you also want to try to identify the database software and tools that you are going to be using.  Copying a MySQL instance to GCP is going to look a lot different than migrating PostgreSQL to Cloud Spanner.  Once you understand your current system, you need to determine what your new system should look like.

Sometimes you can get away with a simple “lift and shift”.  That is, reproducing your current setup without any modifications.  This can be as easy as simply importing your VM into Compute Engine.  However, if you are interested in adding extra features, you might need to pick something else.

In order to choose the right database, you need to carefully consider the features offered by each product.  And if it looks like multiple products could work, then you should consider building some prototypes.  Maybe you need to export a small sample dataset and test out all your different use cases.  Don’t focus on just finding a single solution that will work.  You also need to consider things like: ease of use, ease of maintenance, scalability, availability, and pricing.  Picking the right database is a significant decision.  Make sure to investigate all options before locking yourself into a final decision.

In addition to picking the right database, you also need to consider how you will be migrating your data.  If your database is small and does not change often, you can probably just do a simple export.  But what if your database is really large?  What if it is constantly updated?  You might need some special tools to copy your data and sync any changes as they occur.

For example, if you are interested in a tool that can do all the heavy lifting for you, you might consider checking out Database Migration Service.  Now the Database Migration Service can automatically migrate a MySQL or PostgreSQL database to Google Cloud.  You just need to provide the connection details to your old instance, and then specify any settings you want for the new instance.  Google is also currently working on adding support for Oracle databases.

Now if you are using an unsupported database, or maybe you need extra control, you might consider using Cloud Storage.  You can use your browser to upload an exported SQL or CSV file to a bucket.  Or you can also use the “gsutil” command line tool to rsync entire directories.  Either way, once you create your new instance on GCP, you can then directly import your records from Cloud Storage.  As long as the files aren’t too big and your internet connection is fast enough, that should be all you need.
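
If you prefer scripting the upload instead of using the browser, a rough sketch with the google-cloud-storage Python client might look like this.  The bucket and file names are placeholders, and the final import into your new instance would still be done with the Cloud SQL import tooling.

    from google.cloud import storage  # pip install google-cloud-storage

    def upload_dump(bucket_name: str, local_path: str, destination_blob: str) -> None:
        """Upload an exported SQL/CSV dump to a Cloud Storage bucket."""
        client = storage.Client()                # uses your default credentials
        bucket = client.bucket(bucket_name)
        blob = bucket.blob(destination_blob)
        blob.upload_from_filename(local_path)
        print(f"Uploaded {local_path} to gs://{bucket_name}/{destination_blob}")

    # Placeholder names -- replace with your own bucket and export file.
    upload_dump("my-migration-bucket", "orders_export.sql", "dumps/orders_export.sql")

    # For whole directories, running `gsutil -m rsync -r ./dumps gs://my-migration-bucket/dumps`
    # from the command line is usually simpler than looping over files in Python.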

Ok, so what do you do if you have a slow internet connection?  Well, in that case, you could look at some services that can help you speed up the transfer.   For example, instead of trying to use the public internet, you could use Cloud Interconnect or Direct Peering.  These two options can potentially increase your upload speeds to Google. 

Now, if your data already exists on another cloud provider, like in an Amazon S3 bucket or Microsoft Azure Blob Storage, then you could use the Storage Transfer Service.  The Storage Transfer Service can copy your data directly between clouds.  So that means you don’t have to download it and then reupload it yourself.
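
As a rough sketch, the google-cloud-storage-transfer Python client can set up an S3-to-Cloud Storage job along these lines.  The bucket names and keys are placeholders, and you should double-check the field names and scheduling options against the current client reference before relying on this.

    from google.cloud import storage_transfer  # pip install google-cloud-storage-transfer

    def transfer_from_s3(project_id: str, s3_bucket: str, gcs_bucket: str,
                         aws_key_id: str, aws_secret: str):
        """Create a transfer job that copies an S3 bucket into Cloud Storage."""
        client = storage_transfer.StorageTransferServiceClient()
        job = {
            "project_id": project_id,
            "description": "One-off migration copy from S3",
            "status": storage_transfer.TransferJob.Status.ENABLED,
            "transfer_spec": {
                "aws_s3_data_source": {
                    "bucket_name": s3_bucket,
                    "aws_access_key": {
                        "access_key_id": aws_key_id,
                        "secret_access_key": aws_secret,
                    },
                },
                "gcs_data_sink": {"bucket_name": gcs_bucket},
            },
        }
        # A schedule (or a manual run) is still needed to actually start the copy.
        return client.create_transfer_job({"transfer_job": job})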

Now what if your connection is fast, but your database is just so big it is going to take forever to copy everything?  Well, in that case, you might consider using a Google Transfer Appliance.  The transfer appliance is essentially a really big storage device that Google will ship to you.  You plug it in, copy your files locally over your own network, and then ship the device back to Google.  This can significantly reduce your transfer times.  I highly recommend using a transfer appliance if you need to move more than 20 terabytes.  Or if you figure out that your transfer will take longer than a week to complete.
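
A quick back-of-the-envelope calculation helps with that decision.  For example, with the made-up numbers below you can estimate how long an online transfer would take and compare it against that one-week rule of thumb.

    def transfer_days(data_terabytes: float, bandwidth_mbps: float, utilization: float = 0.7) -> float:
        """Rough estimate of how many days an online transfer would take."""
        data_bits = data_terabytes * 8 * 10**12           # TB -> bits (decimal units)
        effective_bps = bandwidth_mbps * 10**6 * utilization
        return data_bits / effective_bps / 86_400         # seconds per day

    # Example: 40 TB over a 500 Mbps link at 70% utilization.
    days = transfer_days(40, 500)
    print(f"{days:.1f} days")    # roughly 10-11 days -> a Transfer Appliance is worth considering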

Now some migrations are just going to be more complicated.  And they will require more than just simple replication.  For example, in certain cases, you will need to modify your records as well as copy them.  Cloud Data Fusion can allow you to construct custom ETL or ELT data pipelines.  You extract (or pull) the required data from various sources.  You transform the data into whatever format you need.  And you can load the results to your location of choice.  So this would allow you to do more advanced things like load unstructured data into BigQuery.  Or you could use it to migrate a MySQL database to Cloud Spanner.
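
Data Fusion itself is a visual, no-code tool, but the extract-transform-load pattern it builds is easy to picture in plain Python.  The snippet below only illustrates that pattern with made-up connection details and table names; it is not how Data Fusion is actually configured.

    import mysql.connector                    # pip install mysql-connector-python
    from google.cloud import bigquery         # pip install google-cloud-bigquery

    # Extract: pull the required rows from the source MySQL database.
    conn = mysql.connector.connect(host="10.0.0.5", user="reader",
                                   password="secret", database="shop")
    cursor = conn.cursor(dictionary=True)
    cursor.execute("SELECT id, email, created_at FROM customers")
    rows = cursor.fetchall()

    # Transform: reshape each record into the schema the destination expects.
    transformed = [
        {"customer_id": r["id"],
         "email": r["email"].lower(),
         "created_at": r["created_at"].isoformat()}
        for r in rows
    ]

    # Load: stream the results into an existing BigQuery table (made-up names).
    bq = bigquery.Client()
    errors = bq.insert_rows_json("my-project.analytics.customers", transformed)
    print("insert errors:", errors)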

Now, if you need to do something really complicated, you can use Cloud Composer.  Composer supports building multiple pipelines.  And you can use it to create a unified data environment by connecting across different clouds.  You can build, schedule, and monitor all the workflows you need.
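
Under the hood, Cloud Composer runs Apache Airflow, so those workflows are defined as Python DAGs.  Here is a minimal sketch of a nightly export-then-load pipeline; the task names and logic are just placeholders.

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def export_from_source():
        print("export the source database here")        # placeholder logic

    def load_into_target():
        print("load the exported data into GCP here")   # placeholder logic

    with DAG(
        dag_id="nightly_db_migration_sync",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        export = PythonOperator(task_id="export_from_source", python_callable=export_from_source)
        load = PythonOperator(task_id="load_into_target", python_callable=load_into_target)
        export >> load    # run the export, then the load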

As you can see, Google offers a lot of different options.  Just remember, your first step is to identify your requirements.  This means documenting all the components and dependencies.  Then use this knowledge to identify the appropriate database software and tools.  No matter how simple or complicated your migration path is, Google has the tools to get you to your destination.

About the Author

Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.

Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.

When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.