Adding Elasticity with Auto Scaling

Welcome to Domain Seven - Scalability and Elasticity - in the Solution Architect Professional for AWS learning path. In this group of lectures, we will walk through building a flexible, available, and highly resilient application in the Amazon Web Services environment.


Hi, and welcome back to Domain Seven, Elasticity and Scalability. In this lesson, we are implementing Auto Scaling for our application. Auto Scaling is one of the most important services for building highly available applications. When properly used, it can eliminate a single point of failure by distributing traffic across availability zones. It can self-heal when instances stop responding by launching new instances. It is simply an amazing service. There's a lot to cover, so let's get started.

Before we can make a launch configuration and set up Auto Scaling, we need to create an Elastic Load Balancer. An Elastic Load Balancer is a service provided by AWS to distribute incoming traffic evenly across healthy EC2 instances that are under its control. Healthy is the key word here. The Elastic Load Balancer performs periodic, configurable health checks and makes decisions on where to send traffic.

Let's head to the EC2 Dashboard. Under Network and Security, click on the Load Balancers link. Like most resources in AWS, we have to give it a name. We will create our ELB in our one and only VPC. We will leave the Create an internal load balancer option unchecked. This will point the DNS name to a public IP address. If checked, the DNS name would point to a private IP address instead. Let's check the Enable advanced VPC configuration option, which will let us assign subnets to the ELB in a later step. The Listener Configuration allows us to map incoming ELB traffic to EC2 instance ports. The default port 80 mapping will suffice for our application.

Next up, we create our health check. Our options include standard HTTP, TCP, HTTPS, and SSL. We will stick with standard HTTP and direct it to our robots.txt file. If our web server cannot serve up this static request, then we can safely assume something is wrong with the instance, and no further traffic should be sent to it until it becomes healthy.
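To make the listener and health check settings concrete, here is a minimal Python sketch of the same configuration expressed as data. It is an illustration only, not the console's own output: the helper `health_check_target` is hypothetical, and the dictionary keys are assumed to follow the parameter names used by the Classic Load Balancer API. The interval, timeout, and threshold values are the console values used in this lesson. No AWS call is made.

```python
def health_check_target(protocol: str, port: int, path: str = "") -> str:
    """Build a health check target string such as 'HTTP:80/robots.txt'.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    protocol = protocol.upper()
    if protocol not in {"HTTP", "HTTPS", "TCP", "SSL"}:
        raise ValueError(f"unsupported protocol: {protocol}")
    if protocol in {"HTTP", "HTTPS"}:
        # HTTP(S) checks request a path on the instance.
        return f"{protocol}:{port}/{path.lstrip('/')}"
    # TCP and SSL checks only open a connection, so there is no path.
    return f"{protocol}:{port}"


# Listener: forward port 80 on the load balancer to port 80 on the instances.
listener = {
    "Protocol": "HTTP",
    "LoadBalancerPort": 80,
    "InstanceProtocol": "HTTP",
    "InstancePort": 80,
}

# Health check: request robots.txt over plain HTTP.
health_check = {
    "Target": health_check_target("HTTP", 80, "robots.txt"),
    "Interval": 30,           # seconds between checks
    "Timeout": 5,             # seconds allowed for a response
    "UnhealthyThreshold": 2,  # consecutive failures before out of service
    "HealthyThreshold": 10,   # consecutive passes before in service
}

print(health_check["Target"])  # prints HTTP:80/robots.txt
```

If you script this instead of using the console, dictionaries shaped like these are what you would typically hand to your SDK of choice. Check the SDK documentation for the exact call before relying on it.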
With the current settings, an EC2 instance will be checked every 30 seconds. It has five seconds to respond to the request. Failure to respond in the allotted time means the instance could be unhealthy. Two consecutive unhealthy checks will put the EC2 instance into out-of-service status. To become healthy again, it must pass 10 consecutive health checks before it will begin receiving traffic.

Since this ELB will be limited to our web tier, we will add each subnet we created for our web servers. It is important to note that we can only add one subnet per availability zone. Now we need to set a security group for our ELB. We are going to select our pre-configured ELB security group and continue.

The Add EC2 Instances step allows us to add running instances. Since we do not have any currently running, our list is empty. Note that we are not required to add any during the creation process. EC2 instances that are created by Auto Scaling will be attached to this ELB automatically. The settings we are most concerned about on this step are under the Availability Zone Distribution section. We need to ensure that cross-zone load balancing is checked. Without it, our high availability design is pretty useless. The other option, Enable Connection Draining, determines how traffic is handled when an instance is being deregistered or has been declared unhealthy. We're going to leave connection draining enabled and continue. Everything on this review step appears as it should, so we can go ahead and click Create. It takes a few seconds for the ELB to be created. Once it is finished, we are ready to create our launch configuration and Auto Scaling policy.

Before we can set up our Auto Scaling policy, we have to configure what to launch. First, we select the Amazon Machine Image, or AMI, to launch. There are many options available to us. We can choose a basic system, which is an AWS-optimized operating system. We can select an AMI that we've created from an EC2 instance.
This is exactly what we're going to do for our application. The other options include using a Marketplace AMI or a community AMI. Both are AMIs created by a third party, and they may include an hourly cost on top of the standard EC2 instance rates.

Once the AMI is selected, we move on to selecting an instance type. Selecting an instance type is a critical step in balancing performance and cost. Our application can and will run on a T1 Micro instance, albeit slowly, with even the most minimal user loads. Much like the RDS instance we created earlier, a T1 Micro instance is not the best choice for many situations. Be sure to select an instance type that works best for your particular use case.

Next, we name the launch configuration. The purchasing option allows us to specify if we want to use spot instances. In short, spot instances let you bid on unused server capacity to fire up your EC2 instances. You are bidding against other AWS users. If your bid does not exceed the current spot price, no instance is launched. If your bid exceeds the current spot price, your instance gets launched and will run until either you terminate it or the spot price exceeds your bid. The latter means that your instance could be shut down in the middle of processing a request. That is not ideal for our simple web application. Therefore, we're not going to use spot instances.

Our instances do not need an IAM role, since they are not accessing any secured resources. CloudWatch detailed monitoring will take metrics every minute and aggregate data across the instances in the launch configuration. Standard CloudWatch monitoring at five minutes is acceptable for our application. We're not making any changes under Advanced Details since it's not relevant to our high availability design.

The root volume with eight gigabytes is perfect to launch our snapshot. For the volume type, we will accept Standard, which is suited to light I/O, with or without a lot of bursting.
Other options include Provisioned IOPS and General Purpose. Provisioned IOPS volumes are capable of consistently high IOPS performance. Provisioned IOPS and General Purpose storage use solid-state drives, while Standard uses magnetic drives.

Each instance will be configured using our pre-created web app security group, which allows HTTP traffic from our load balancer only. Direct HTTP traffic to our launched instances is not permitted in this configuration, by design. We also do not allow SSH connections to our launched instances, which generates a warning when we attempt to move on. For our application, we are limiting access as a standard security practice. You may do things differently in your environment. We will acknowledge the SSH warning and review our launch configuration.

Everything looks good, so we can click the Create Launch Configuration button. We are presented with a prompt regarding a key pair to use. Our options are to use an existing key pair, create a new key pair, or proceed without one. We will use an existing key pair and acknowledge the disclaimer about our private key.

Once the launch configuration has been created, we begin the setup of our Auto Scaling group. As you can see, the launch configuration we created has already been selected. Next, we will name the Auto Scaling group. Group size tells AWS how many instances to start with. We want to start with three instances. To have our instances evenly distributed among our availability zones, we need to select our VPC and add the subnets to use. In our example, we will add each of the US East 1 availability zones. Under Advanced Details, we need to check the Receive traffic from Elastic Load Balancer(s) option. This will give us the option to select which ELBs to use. We add our one and only load balancer to the list and continue to the next step.

There are two options for scaling policies: keep the group at its initial size, or use scaling policies to adjust the capacity of this group.
If we choose to keep the group at its initial size, we will always have our three instances running, regardless of load. Choosing to use scaling policies allows us to set minimum and maximum instance counts, as well as CloudWatch alarms to increase or decrease the size of our group. This is really useful when our application needs to handle a steady and sustained increase in load. It also helps save money by decreasing the number of running instances as the increased load begins to decline. We do not need a scaling policy for this application. We can change this later if we want to.

Auto Scaling can send out notifications when a new instance is launched, when an instance is terminated, or when an instance fails to launch or terminate. We don't need to do this for our application, though it is recommended that you consider adding notifications to your setup.

Next, we will tag our instances with a name. This will identify which EC2 instances are part of the Auto Scaling group, which is very useful if we have many EC2 instances running in one region. We need to make sure that the Tag New Instances check box is checked to pass this tag through to the created EC2 instances. This is the only tag we will use. We could add up to 10 tags if needed.

Finally, we review and create our Auto Scaling group. It takes a few seconds to create. Once created, we can navigate to our EC2 instance list and see that three instances have been launched, one in each availability zone, per our requirements. Before these instances begin receiving traffic from the ELB, they have to pass the health checks we configured earlier when setting up the Elastic Load Balancer. Our ELB checks our EC2 instances every 30 seconds. Each instance must pass 10 consecutive checks, meaning it will take at least 300 seconds, or five minutes, before traffic flows through. After waiting for at least five minutes, we can see our EC2 instances are in service. A quick check of our ELB DNS name shows our website is operational.
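The five-minute wait follows directly from the health check settings, and a small simulation makes the arithmetic explicit. This is a simplified model written for this lesson, assuming a 30-second interval, a healthy threshold of 10, and an unhealthy threshold of 2. The `HealthTracker` class and `seconds_until_in_service` function are hypothetical names, and this is not a description of how the load balancer is implemented internally.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class HealthTracker:
    """Simplified model of one instance behind the load balancer."""
    interval_s: int = 30
    unhealthy_threshold: int = 2
    healthy_threshold: int = 10
    in_service: bool = False  # new instances start out of service
    streak: int = 0           # consecutive results that contradict the state

    def record(self, passed: bool) -> bool:
        """Record one health check result and return the new state."""
        if passed == self.in_service:
            # The result agrees with the current state, so the streak resets.
            self.streak = 0
            return self.in_service
        self.streak += 1
        needed = (self.unhealthy_threshold if self.in_service
                  else self.healthy_threshold)
        if self.streak >= needed:
            self.in_service = not self.in_service
            self.streak = 0
        return self.in_service


def seconds_until_in_service(tracker: HealthTracker,
                             results: Iterable[bool]) -> Optional[int]:
    """Return elapsed seconds when the instance first goes in service."""
    for check_number, passed in enumerate(results, start=1):
        if tracker.record(passed):
            return check_number * tracker.interval_s
    return None  # never reached the healthy threshold


# Ten passing checks in a row: 10 x 30 seconds.
print(seconds_until_in_service(HealthTracker(), [True] * 10))  # prints 300
```

A single failed check during warm-up resets the count, which is why the transcript says "at least" five minutes: in this model, five passes, one failure, and then ten more passes take 480 seconds.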
With our application running, we can move on to our next lesson on using S3 and CloudFront in order to deliver content faster, as well as meet our service desegregation goals.

About the Author
Andrew Larkin
Head of Content

Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.