This course is a "live" scenario discussion in which the Cloud Academy team tackles a migration project. Our customer needs to migrate out of their current data center by a set date and would also like to modernize their business applications.
Our brief in the exercise is to deliver:
- A target architecture that addresses the challenges described by the customer
- A migration plan detailing how to move the service to AWS with minimal interruption
- A recommendation on how to approach DR to achieve RPO of 24 hours and RTO of 4 hours
- An application optimization plan with a proposed enhancement roadmap
As a scenario, this series of lectures is recorded "live" and so is less structured than other Cloud Academy courses. As a cloud professional you often have to think and design quickly, so we have recorded some of the content this way to best emulate the type of conditions you might experience in the working environment. Watching the team approach this brief can help you define your own approaches and style to problem-solving.
Intended audience
This course discusses AWS services, so it is best suited to students with some prior knowledge of AWS.
Prerequisites
We recommend completing the Fundamentals of AWS learning path before beginning this course.
If you have thoughts or suggestions for this course, please contact Cloud Academy at support@cloudacademy.com.
Updates
22-01-2020: Duplicate lecture removed
A structured walk-through is an excellent way to stress-test your design.
It is a simple, high-value exercise that ensures all components of the design have been considered, while helping you work out who is doing what, and by when, for the tasks identified at each stage.
Stage one: lift and shift. We want to move as much of the environment as possible without changing core systems. The first step is to deal with the storage exhaustion issue. We identified a number of options for this, including archiving old objects and creating a hybrid architecture that stores new objects on AWS.
We could also use Amazon EFS or a third-party NAS solution if it wasn't feasible to redevelop the data tier, but the changes required to move from file system reads/writes to S3 object put/get calls are fairly minimal and can be made by the in-house developer. So it is best we focus on transferring the content to S3 as quickly and efficiently as possible.
The customer already has an AWS account, so we need the customer administrator to set up a user account and the correct roles required for us to work on the account. We will enable multi-factor authentication on these user accounts to reduce the risk of compromise during the migration project.
We will also need API access to the account, so we need to request that access keys be generated to enable this.
Once we can log in to the console with the appropriate permissions, we need to create the S3 buckets to store the data files. We then order two 80 TB Snowball appliances from within the AWS console; you need to answer a few questions and choose a shipping provider. We could consider using Snowball Edge if we needed to transform the assets before transit. That's not going to be necessary, so a standard Snowball appliance will suit.
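As a minimal sketch of the bucket setup (the bucket name and Region below are placeholders rather than values from the customer brief), the landing bucket can be created and locked down with boto3 before the Snowball order is placed in the console:

```python
import boto3

# Placeholder bucket name and Region; substitute the customer's actual values.
s3 = boto3.client("s3", region_name="us-east-1")

# Create the landing bucket for the migrated objects.
s3.create_bucket(Bucket="expertiseplease-migration-objects")

# The documents are sensitive, so block all public access on the bucket.
s3.put_public_access_block(
    Bucket="expertiseplease-migration-objects",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```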
In our brainstorm we projected a 10-day migration window for the lift-and-shift stage. Shipping of Snowball devices takes time, and we need to remember we will be working around ExpertisePlease staff availability and office hours. As such, we will do well to budget 5 to 7 days to order, ship and copy content to the Snowball devices.
When the devices arrive, we connect each one to the NAS with an RJ45 lead, select the bucket we want the files transferred into, and begin copying 80 TB of data to each Snowball appliance. Even over a local connection this will take a significant amount of time, as it's still a large amount of data. The devices then need to be picked up and shipped to AWS, so most likely 7 days in total.
A key part of our proposal is to use as many managed services as possible. So in stage one we implement Route 53 for top-level domain management. We can use this for zone configuration, and later to enable geolocation, latency-based or weighted traffic routing. We also enable CloudHSM on the customer's account so we can transfer the encryption keys quickly out of the data center.
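To illustrate the kind of record we could add later for weighted routing, here is a boto3 sketch; the hosted zone ID, record name, weight and target value are placeholders, not values from the customer environment:

```python
import boto3

route53 = boto3.client("route53")

# Placeholder zone ID, record name and target; weighted records need a
# SetIdentifier and a Weight so multiple records can share the same name.
route53.change_resource_record_sets(
    HostedZoneId="Z0123456789EXAMPLE",
    ChangeBatch={
        "Comment": "Send a small share of traffic to the AWS environment",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "www.expertiseplease.com.",
                    "Type": "CNAME",
                    "SetIdentifier": "aws-environment",
                    "Weight": 10,
                    "TTL": 60,
                    "ResourceRecords": [
                        {"Value": "migration-alb-123456.us-east-1.elb.amazonaws.com"}
                    ],
                },
            }
        ],
    },
)
```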
We need to set up a virtual private network (VPN) connection between the customer's VPC and the data center so we can connect back and run a sync task to check and transfer any objects that have changed once we cut over the system.
Now is a good time to set up the environment required for disaster recovery. We need to create a separate VPC dedicated to disaster recovery. With that in place, we can enable EC2 snapshots, RDS point-in-time recovery and S3 versioning; infrastructure automation with CloudFormation to re-create resources will help meet the RTO and RPO requirements.
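As a minimal sketch of switching on some of these protections with boto3 (the bucket, volume and database identifiers are placeholders, and the CloudFormation templates for rebuilding the environment are maintained separately):

```python
import boto3

s3 = boto3.client("s3")
ec2 = boto3.client("ec2")
rds = boto3.client("rds")

# Enable S3 versioning so overwritten or deleted objects can be recovered.
s3.put_bucket_versioning(
    Bucket="expertiseplease-migration-objects",
    VersioningConfiguration={"Status": "Enabled"},
)

# Snapshot an EBS volume as part of the EC2 backup cycle (placeholder volume ID).
ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="DR snapshot to support the 24-hour RPO",
)

# Once the database is on RDS, keep automated backups (which provide
# point-in-time recovery) for long enough to meet the RPO (placeholder ID).
rds.modify_db_instance(
    DBInstanceIdentifier="expertiseplease-db",
    BackupRetentionPeriod=7,
    ApplyImmediately=True,
)
```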
Once the 160 TB of objects have been transferred to our Amazon S3 bucket from the Snowball devices, and with the VPN working, we can create the sync job to run as a Lambda function or a simple cron job.
If the cron job runs under the root user, root will need to have s3cmd configured; to do this we can copy our s3cmd settings (the ~/.s3cfg file) to the root user's home directory and add a crontab entry that runs an s3cmd sync against the bucket.
At this point we can also move static content from the web servers to S3 and implement CloudFront to serve this content. That reduces the load on the front-end web server. We can update the Apache configuration to perform URL rewrites for these objects to avoid having to change the application itself. This also means that the Apache tier will continue to handle internal redirects.
In our design brainstorm we discussed whether to move applications or servers. If the environment were running on VMware, we could have made use of the AWS Server Migration Service to migrate the servers. Multiple third-party solutions also exist to support the migration of existing servers to AWS.
However, we have a bespoke app, so we opt to containerize the existing application and run it on Amazon ECS. This simplifies the migration and enables the customer to work with other AWS partners on configuration and management.
With the stage one priority tasks completed we can look to make two minor improvements to the environment to get us closer to our target architecture.
First we enable Amazon RDS for Oracle and shift the latest iteration of the Oracle database to RDS using the AWS Database Migration Service (DMS). We could use Oracle log shipping to do this, however DMS provides a simple way to transfer the DB schema and data. We plan to migrate off Oracle to Aurora or PostgreSQL, so the sooner we do discovery on the DB the better. The DMS migration report is excellent for planning a migration, and it helps us present and discuss the benefits of migrating the DB off Oracle with the ExpertisePlease management team.
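As a rough sketch of the DMS piece, assuming the source and target endpoints and the replication instance have already been created (the ARNs and schema name below are placeholders), a full-load-plus-change-capture task can be defined and started with boto3:

```python
import json

import boto3

dms = boto3.client("dms")

# Placeholder ARNs: source is the on-premises Oracle endpoint, target is the
# new Amazon RDS for Oracle endpoint.
task = dms.create_replication_task(
    ReplicationTaskIdentifier="oracle-to-rds-full-load",
    SourceEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:SOURCE",
    TargetEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:TARGET",
    ReplicationInstanceArn="arn:aws:dms:us-east-1:111122223333:rep:INSTANCE",
    MigrationType="full-load-and-cdc",  # full load plus ongoing change capture
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-app-schema",
            "object-locator": {"schema-name": "APP", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)

dms.start_replication_task(
    ReplicationTaskArn=task["ReplicationTask"]["ReplicationTaskArn"],
    StartReplicationTaskType="start-replication",
)
```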
Meanwhile, back to the minor improvements: we can implement S3 lifecycle policies to move objects older than 30 days to the S3 Infrequent Access storage class to optimize costs further.
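A minimal sketch of such a lifecycle rule with boto3, reusing the placeholder bucket name from earlier:

```python
import boto3

s3 = boto3.client("s3")

# Transition objects to Standard-Infrequent Access 30 days after creation,
# and tidy up incomplete multipart uploads while we are at it.
s3.put_bucket_lifecycle_configuration(
    Bucket="expertiseplease-migration-objects",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "move-to-ia-after-30-days",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ]
    },
)
```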
Stage Two
Stage Two is all about optimization so in this stage we want to continue to implement as many AWS managed services as possible.
First we remove the Apache web tier, letting CloudFront and S3 handle static content and using the Application Load Balancer's path-based routing feature to determine where dynamic requests should be sent.
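For illustration, a path-based routing rule can be added to the load balancer's listener with boto3; the listener and target group ARNs and the path patterns below are assumptions rather than values from the customer environment:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Requests matching the dynamic paths are forwarded to the application target
# group; everything else falls through to the listener's default action, with
# static content served by CloudFront and S3.
elbv2.create_rule(
    ListenerArn=(
        "arn:aws:elasticloadbalancing:us-east-1:111122223333:"
        "listener/app/expertiseplease/0123456789abcdef/0123456789abcdef"
    ),
    Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/app/*", "/api/*"]}],
    Actions=[
        {
            "Type": "forward",
            "TargetGroupArn": (
                "arn:aws:elasticloadbalancing:us-east-1:111122223333:"
                "targetgroup/app-servers/0123456789abcdef"
            ),
        }
    ],
)
```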
We need to make sure that the applications in the Auto Scaling groups are stateless and update the JBoss application servers to store user sessions in an Amazon ElastiCache for Redis cluster.
Consider: if the application were moved to Tomcat, we could leverage DynamoDB for session storage instead.
The Digitizer instances and the bastion host can be run in an Auto Scaling group to provide self-healing. Because the IP address must be fixed, we use a secondary Elastic Network Interface (ENI) and a combination of CloudWatch Events and a Lambda function to re-attach the ENI each time Auto Scaling replaces an instance.
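A sketch of that Lambda function is shown below. It assumes the function is triggered by a CloudWatch Events rule matching the Auto Scaling group's "EC2 Instance Launch Successful" event, and the ENI ID is a placeholder for the secondary interface that carries the fixed IP address:

```python
import boto3

ec2 = boto3.client("ec2")

# Placeholder ID of the secondary ENI that holds the fixed IP address.
ENI_ID = "eni-0123456789abcdef0"


def handler(event, context):
    """Re-attach the fixed-IP ENI to the instance Auto Scaling just launched."""
    instance_id = event["detail"]["EC2InstanceId"]

    # Attach as a secondary interface; eth0 (DeviceIndex 0) is created with
    # the instance by the Auto Scaling group.
    ec2.attach_network_interface(
        NetworkInterfaceId=ENI_ID,
        InstanceId=instance_id,
        DeviceIndex=1,
    )
    return {"attached_to": instance_id}
```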
We can update the third-party file transfer and batch submission process to leverage S3 directly. The third parties obtain a redirect to a pre-signed URL, which can be either short lived or granted for a longer period of time, allowing them to upload documents via HTTPS directly to S3. Once a file is uploaded, S3 triggers a Lambda function to invoke submission and processing of the file. This change requires the external third parties to move to the new functionality, so the existing SFTP solution would have to be maintained in the interim.
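To make this concrete, here is a hedged sketch of the two pieces: generating the pre-signed upload URL we hand to the third party, and the Lambda handler invoked by the S3 ObjectCreated event. The bucket name is a placeholder, and submit_for_processing stands in for the existing batch submission integration:

```python
import urllib.parse

import boto3

s3 = boto3.client("s3")

# Placeholder bucket name for inbound third-party documents.
UPLOAD_BUCKET = "expertiseplease-inbound-documents"


def presigned_upload_url(object_key: str, expires_in: int = 3600) -> str:
    """Return a time-limited HTTPS URL the third party can PUT a file to."""
    return s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": UPLOAD_BUCKET, "Key": object_key},
        ExpiresIn=expires_in,
    )


def submit_for_processing(bucket: str, key: str) -> None:
    """Placeholder for the call into the existing batch submission process."""
    print(f"Submitting s3://{bucket}/{key} for processing")


def handler(event, context):
    """Lambda entry point for the S3 ObjectCreated event on the upload bucket."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        submit_for_processing(bucket, key)
```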
We can now improve the performance and cost efficiency of the database tier by migrating to Amazon Aurora. We can make use of the AWS Schema Conversion Tool to assess the amount of work involved in making this transition and then use the AWS Database Migration Service to migrate. Amazon Aurora supports encryption at rest, negating the need for Oracle TDE.
Next we can re-implement some of the existing monolithic application modules as micro-services using serverless technologies such as AWS Lambda and Amazon DynamoDB. The registration and login modules are fairly self-contained and are redeveloped to become micro-services. The Lambda functions will be invoked from the application cluster and will allow the application to handle increased login and registration requests without placing significant additional load on the application cluster.
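As an illustration of what one of these micro-services might look like (the table name and item attributes are assumptions rather than the actual ExpertisePlease schema), a registration Lambda function backed by DynamoDB could be as small as this:

```python
import json
import uuid
from datetime import datetime, timezone

import boto3

dynamodb = boto3.resource("dynamodb")
# Placeholder table; assumes "user_id" is the partition key.
users_table = dynamodb.Table("expertiseplease-users")


def handler(event, context):
    """Registration micro-service invoked by the application cluster."""
    payload = json.loads(event["body"]) if "body" in event else event

    item = {
        "user_id": str(uuid.uuid4()),
        "email": payload["email"],
        "display_name": payload.get("display_name", ""),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    users_table.put_item(Item=item)

    return {"statusCode": 201, "body": json.dumps({"user_id": item["user_id"]})}
```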
We can also redevelop the encryption service to leverage the AWS Key Management Service (KMS), which would then allow us to start using S3 server-side encryption with KMS. We can use S3 object metadata to determine which decryption mechanism to call until all of the existing objects are converted to use KMS master keys. Once all of the objects and other data are converted from the old encryption method to KMS-based encryption, we can remove the CloudHSM instances from the architecture and stop paying for the service.
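A sketch of the metadata-driven branch follows. It assumes converted objects carry an x-amz-meta-encryption value of "kms" (anything else is treated as legacy), and legacy_hsm_decrypt is a placeholder for the current CloudHSM-backed routine:

```python
import boto3

s3 = boto3.client("s3")


def legacy_hsm_decrypt(ciphertext: bytes) -> bytes:
    """Placeholder for the existing CloudHSM-backed decryption implementation."""
    raise NotImplementedError("integrate with the existing HSM client here")


def read_object(bucket: str, key: str) -> bytes:
    """Fetch an object and decrypt it according to its metadata."""
    head = s3.head_object(Bucket=bucket, Key=key)
    scheme = head.get("Metadata", {}).get("encryption", "legacy-hsm")

    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    if scheme == "kms":
        # Objects already converted to SSE-KMS are decrypted transparently by
        # S3 on GetObject, so the body is plaintext at this point.
        return body

    # Not yet converted: run the existing client-side decryption routine.
    return legacy_hsm_decrypt(body)
```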
Stage Three
The third and last stage of our transformation approach needs to happen before the 18-month deadline. ExpertisePlease.com has struggled to deliver new features because of the monolithic application. Stage Three consists of re-architecting the application to become even more micro-service oriented, so that each module can be maintained and evolved separately.
Where possible we try to leverage serverless services such as Amazon API Gateway, which allows us to build both web and mobile applications on the same common services, again using AWS Lambda and Amazon DynamoDB. However, for those modules which require more significant change, we can use managed application services such as Elastic Beanstalk to provide Java-based application environments.
We can implement the document manager and payment modules as Lambda functions, with data stored in DynamoDB.
The payment module consists of subscription information, which is fulfilled by a third-party digital wallet, and exposes a number of APIs via API Gateway.
The document module keeps additional metadata relating to objects stored in S3; we could also have leveraged S3 object metadata, but DynamoDB provides a richer set of features for storing and retrieving data. Also, separating these modules moves the majority of the public-facing APIs to serverless services.
The presentation and core module still contains a lot of business logic and can be re-architected over time into additional micro-services. However, to simplify operations and application updates, and to provide scalability, the application will be hosted as an Elastic Beanstalk web application.
The administration module is purely for managing backend settings, so it can also be hosted as an Elastic Beanstalk web application and interact with the other modules via API Gateway.
The Batch Processing and Digitizer modules can be merged into a Document Processing service, with jobs processed by an Elastic Beanstalk worker environment. We use Elastic Beanstalk instead of Lambda because jobs might run longer than the maximum execution time for a Lambda function (currently 15 minutes).
All three of the Elastic Beanstalk environments store application configuration and related data inside the Amazon Aurora database.
We continue to use Amazon ElastiCache for Redis to cache information for performance purposes.
To ensure that access to internal service APIs is protected, we will make use of Amazon API Gateway client certificates and ensure that all requests to internal APIs are digitally signed to provide additional security.
For EC2 based web applications we can also implement a host-based security solution, such as Trend Micro Deep Security, as an additional protection to mitigate vulnerabilities specific to the application environment.
Ok that looks to be a solid plan for the migration project.
Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.