Designing Cost-Optimized Architectures in AWS




Optimizing compute services

In this module, we first introduce the concepts of cost optimization and how AWS compute services can be selected and applied to optimize costs. We review the various instance types available and how the purchasing options can be selected and combined to provide a cost-optimized solution.

Next, we review how to optimize storage costs by selecting the appropriate storage services or storage classes, so that objects and data are stored in AWS cloud storage in the most economical way.


Elastic Compute Cloud, or EC2, is usually one of the largest components of any AWS build, so it's the first place to look for ways to optimize and reduce costs. It's important to choose the right instance types and the right purchasing option. There are four different cost models: on-demand, spot, reserved, and scheduled. Make sure you check the Simple Monthly Calculator for the latest available instance pricing.

The pricing options go like this. First, there's on-demand. With on-demand pricing, you pay hourly for however long you run your EC2 instance, at a price set per instance type. Under per-hour billing, if your EC2 instance does not run the full hour, you are still billed for the full hour.

The second option is spot pricing, which is marketplace pricing based on supply and demand. You are bidding for unused AWS capacity. There is no guarantee that you will get a spot instance, and when you do, there is no guarantee that you will have it for any length of time. This makes spot pricing useful where jobs are not time-constrained, i.e., they can spin up and shut down without a negative impact on the system they're interacting with. Keep in mind that spot instances can be terminated at any time.

The third option is reserved instances (RIs). Reserved pricing offers discounted hourly rates per instance type in return for a commitment of either one year or three years. Payment can be all upfront as a one-time payment, which offers the steepest hourly discount, partial upfront, or no upfront at all. RIs suit predictable usage where you can safely expect that a certain level of compute will be required.

The fourth option is scheduled instances. Scheduled instances are like reserved instances; however, you reserve capacity in advance for specific time windows so that you know it is available when you need it. You pay for the time that the instances are scheduled, even if you do not use them. Scheduled reserved instances enable you to purchase capacity reservations that recur on a daily, weekly, or monthly basis, with a specified start time and duration, for a one-year term. Scheduled instances are a good choice for workloads that do not run continuously but do run on a regular schedule. For example, you can use scheduled instances for an application that runs during business hours or for a batch processing job that runs at the end of the week.

For applications that benefit from a low cost per CPU, try compute-optimized instances first. For applications that require the lowest cost per gigabyte of memory, use memory-optimized instances (for example, the R family). If you're running a database, you should also take advantage of EBS optimization or instances that support placement groups. For applications with high inter-node network requirements, choose instances that support enhanced networking.

Placement groups are a logical grouping of instances within a single Availability Zone that offers a low-latency, 10-gigabit-per-second network. You can launch multiple EC2 instances into one placement group, and placement groups can enhance the performance of clusters. The EC2 instances in a placement group should be of the same type, and they must be in the same AZ. Placement groups work with a limited set of instance sizes; they do not support medium instances, for example. For the best performance, use instances with enhanced networking. The most common use for placement groups is EC2 instances that host applications requiring low network latency or high network throughput. There is no additional cost for using placement groups with your EC2 instances.

Reviewing usage types should be a priority. Reserved instances provide a cheaper hourly price and are the more economical option in most scenarios. However, you always need to be sure that any proposed instances meet the stated requirements.

So, let's quickly review our use cases. Spot instances suit applications that have flexible start and end times, or applications that are only feasible at a very low compute price, such as a large data-crunching task whose results we need, but not by a specific date. Spot instances also suit users with an urgent need for a lot of additional capacity, for number-crunching or large database migrations, perhaps.

Reserved instances suit applications with steady-state or predictable usage, which may require reserved capacity to meet demand over a predictable pattern. One thing that really helps with reserved instances is having a clear idea of what that predictable pattern is. So, if we've been running an application for a year or two, and we can see that between Monday and Friday, nine to five, we have a certain usage pattern, we can make an informed decision about making an upfront payment to reduce our total computing cost by using a partial or all-upfront one-year reserved instance.

The other option is on-demand instances, which suit users who want the low cost and flexibility of EC2 without any upfront payments or long-term commitments. That suits just about every use case: any application with short-term, spiky, or unpredictable workloads suits on-demand. Often, it's a blend of all three that gives you the best optimization. The best flexibility comes from on-demand, but the savings you can achieve with reserved instances and spot instances are well worth considering.
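The use-case review above can be reduced to a rough rule of thumb. The following is only a minimal sketch: the function name and its two boolean inputs are hypothetical simplifications, and a real decision would weigh many more factors (and usually ends up as a blend of options).

```python
def recommend_purchase_option(interruptible: bool, predictable: bool) -> str:
    """Map two workload traits to the purchasing option the lecture suggests.

    interruptible: the job has flexible start/end times and tolerates termination.
    predictable:   usage is steady-state or follows a known pattern.
    """
    if interruptible:
        return "spot"        # cheapest, but can be terminated at any time
    if predictable:
        return "reserved"    # discounted rate in return for a 1- or 3-year commitment
    return "on-demand"       # short-term, spiky, or unpredictable workloads


print(recommend_purchase_option(interruptible=True, predictable=False))
print(recommend_purchase_option(interruptible=False, predictable=True))
print(recommend_purchase_option(interruptible=False, predictable=False))
```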

Okay, so let's think this through. Let's say we've got a business app that's been running for a year. It's quite CPU-bound. We've been using a fleet of m1.xlarges. We're up to nine at present, just trying to keep up with current demand, and it's likely that demand is going to double over the next year. So we're thinking about what we need to buy to keep our application running without maxing out at 100% CPU, which is what we've been seeing over the last month or two. Inside our fleet, let's say we've also got a couple of compute-optimized instances, two c3.2xlarges that we bought a month ago, and those are only running at 20 or 30% CPU utilization compared to the 100% utilization we're getting from the m1.xlarges. So, this may be a good opportunity to shift to fewer instances but make those instances compute-optimized.

Again, it's about making sure that your instance type matches your use case. Once you know that you've got a compute-bound or network-bound application, or you can see any other particular constraint or pattern, that can really help you shuffle things around. So, it may be worth reducing the number of m1.xlarges and increasing the number of c3.2xlarges, because they give us significantly more compute power. The c3.2xlarge has eight vCPUs versus four in the m1.xlarge, which equates to 28 EC2 Compute Units (ECUs) versus eight. So while the unit price of the c3.2xlarge is more than that of the m1.xlarge, over time you're probably going to get a better ROI using the larger, more CPU-optimized instance.
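To make the capacity comparison concrete, here is a small sketch that uses the compute-unit figures quoted above (8 for the m1.xlarge, 28 for the c3.2xlarge). Treating compute units as a direct stand-in for application throughput is an assumption; real sizing should be confirmed by load testing. The helper names are illustrative.

```python
import math

# Compute units per instance, as quoted in the lecture.
COMPUTE_UNITS = {"m1.xlarge": 8, "c3.2xlarge": 28}


def fleet_capacity(instance_type: str, count: int) -> int:
    """Total compute units provided by `count` instances of one type."""
    return COMPUTE_UNITS[instance_type] * count


def instances_needed(target_units: int, instance_type: str) -> int:
    """Smallest number of instances whose combined units reach the target."""
    return math.ceil(target_units / COMPUTE_UNITS[instance_type])


current = fleet_capacity("m1.xlarge", 9)   # today's fleet of nine m1.xlarges
doubled = current * 2                      # demand is expected to double

print(current)                                   # 72
print(instances_needed(current, "c3.2xlarge"))   # 3
print(instances_needed(doubled, "c3.2xlarge"))   # 6
```

On these figures, three c3.2xlarges roughly match the nine m1.xlarges, and six would cover a doubling in demand.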

Now, if we looked at using reserved instances, we could lower our overall cost even further. Let's use some numbers anecdotally; they don't reflect current pricing. If you are pricing any solution, you need to check the Amazon Simple Monthly Calculator for the latest pricing. This is just for the sake of our discussion. Let's say that the on-demand cost of our c3.2xlarges for one year would be $3,689. The same capacity as a one-year all-upfront reserved instance would be around $2,170, which equates to a saving of $1,519, so roughly a $1,500 saving over a year. And of course, if we committed further out to a three-year partial or all-upfront reservation, we could be saving $6,000 or more over three years. That's a significant difference. So for steady usage, reserved pricing is generally going to net us a better result.
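The arithmetic behind that comparison can be checked with a few lines. The dollar figures are the illustrative ones from the discussion above, not current AWS prices, and the function name is hypothetical.

```python
def reserved_saving(on_demand_cost: int, reserved_cost: int):
    """Return (absolute saving, percentage saving rounded to one decimal)."""
    saving = on_demand_cost - reserved_cost
    percent = round(100 * saving / on_demand_cost, 1)
    return saving, percent


# One year of c3.2xlarge usage: on-demand vs. one-year all-upfront reserved.
saving, percent = reserved_saving(3689, 2170)
print(saving)    # 1519
print(percent)   # 41.2
```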

Another option to blend in here could be spot instances. Spot instances are perfect for processing that's not time-dependent. Generally, the earlier-generation machine types are cheaper. The spot price is determined by demand, so you can pull a quick summary report of the current pricing and of the pricing trend over the last three months. Spot pricing can reduce our cost even further. So we might use a blend: two c3.2xlarge RIs, one year upfront, for our day-to-day processing; two on-demand c3.2xlarges added to our autoscaling group; and an option to add one or two spot instances to handle regular monthly reports or the like. That could give us a better result than the nine m1.xlarges that we're currently running.

Okay, great. So how do we use spot instances with autoscaling? When you use autoscaling to launch spot instances, you set your bid price in the launch configuration. You can't use a single launch configuration to launch both on-demand instances and spot instances. You can change your bid price, but to do so you must first create a launch configuration with the new bid price and then associate it with your autoscaling group. Take note that the existing instances continue to run as long as the bid price specified in the launch configuration used for those instances is higher than the current spot market price. If the market price for spot instances rises above your bid price for a running instance in your autoscaling group, EC2 terminates your instance. If your bid price exactly matches the spot market price, whether your bid is fulfilled depends on a couple of factors, such as whether spot capacity is available. Keep in mind that spot pricing changes frequently. It's based on demand, and spot instances can be terminated at any point. So spot instances only suit processing jobs that are not time-critical.
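The bid rules just described can be sketched as a small comparison. This is a toy model of the behavior as stated in this lecture, not a call into any AWS API; the function names, outcome labels, and prices are all hypothetical.

```python
from decimal import Decimal
from typing import Iterable, List


def spot_outcome(bid: Decimal, market: Decimal) -> str:
    """Classify one market price against a bid, following the lecture's rules."""
    if market > bid:
        return "terminated"            # market rose above the bid
    if market == bid:
        return "depends on capacity"   # exact match: not guaranteed either way
    return "running"                   # bid is higher than the market price


def replay(bid: Decimal, market_prices: Iterable[Decimal]) -> List[str]:
    """Walk a price history and stop at the first termination."""
    outcomes = []
    for price in market_prices:
        outcome = spot_outcome(bid, price)
        outcomes.append(outcome)
        if outcome == "terminated":
            break
    return outcomes


history = [Decimal("0.06"), Decimal("0.10"), Decimal("0.12"), Decimal("0.05")]
print(replay(Decimal("0.10"), history))
# ['running', 'depends on capacity', 'terminated']
```

Prices are held as `Decimal` values so that the exact-match case is not affected by floating-point rounding.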

In September 2017, AWS introduced per-second billing for on-demand, reserved, and spot Linux-based instances. This means Linux-based instances are billed in one-second increments, with a minimum billable unit of 60 seconds. That optimizes compute costs a great deal. For other instances, pricing is per instance-hour consumed for each instance type. So let's review the per-hour rules, as you're more likely to find a question on per-hour billing than on per-second billing in the certification exam. Partial instance-hours consumed are billed as full hours. EC2 billing can be quite complex, so let's step through this.
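As a rough illustration of the difference between the two billing models, the sketch below prices a single run under each. The hourly rate of $0.36 is a made-up figure and the function names are hypothetical; only the 60-second minimum and the round-up-to-a-full-hour rule come from the text above.

```python
import math
from decimal import Decimal


def cost_per_second_billing(run_seconds: int, hourly_rate: Decimal) -> Decimal:
    """One-second increments with a 60-second minimum charge."""
    billable_seconds = max(60, run_seconds)
    return hourly_rate * billable_seconds / 3600


def cost_per_hour_billing(run_seconds: int, hourly_rate: Decimal) -> Decimal:
    """Partial instance-hours are billed as full hours."""
    billable_hours = math.ceil(run_seconds / 3600)
    return hourly_rate * billable_hours


rate = Decimal("0.36")  # hypothetical hourly price

print(cost_per_second_billing(90, rate))   # 0.009
print(cost_per_hour_billing(90, rate))     # 0.36
print(cost_per_second_billing(30, rate))   # 0.006 (charged for the 60-second minimum)
```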

When you terminate an instance, the state changes to shutting down or terminated, and you are no longer charged for that instance. When you stop an instance, it enters the stopping state, and then the stopped state, and you are not charged hourly usage or data transfer fees for your instance after you stop it, but AWS does charge for the storage of any Amazon EBS volumes. Now, each time you transition an instance from stopped to running, AWS charges a full instance-hour, even if these transitions happen multiple times within a single hour. When you reboot an instance, it doesn't start a new instance billing hour. 
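The per-hour rule for stop and start transitions can be sketched as follows. This assumes the hourly billing model described above; the function name and the input shape (one entry of running minutes per start) are illustrative choices.

```python
import math
from typing import Sequence


def billed_instance_hours(minutes_per_run: Sequence[int]) -> int:
    """Each stopped-to-running transition opens a new billing hour,
    and every partial hour within a run is rounded up."""
    return sum(math.ceil(minutes / 60) for minutes in minutes_per_run)


# Started and stopped three times within one clock hour, 10 minutes each time:
print(billed_instance_hours([10, 10, 10]))   # 3

# One 30-minute run, rebooted in the middle: a reboot is not a new run.
print(billed_instance_hours([30]))           # 1
```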

Let's go through the difference between rebooting, stopping and starting, and terminating. In terms of the host, the instance stays on the same host when we reboot, but the instance may run on a new host computer when we stop and start it (underline "may"). When we terminate, there's no impact.

In terms of public and private IP addresses, when we reboot, the addresses stay the same. When we stop and start in EC2-Classic, the instance gets a new private and a new public IP address. In EC2-VPC, the instance keeps its private IP address and gets a new public IP address unless it has an Elastic IP address (EIP), which doesn't change during a stop and start. The EIP also remains associated with the instance when you reboot it.

For instance store volumes, the data is preserved when we reboot, erased when we stop and start, and erased when we terminate. So remember that with instance store volumes, the data is gone when you stop or terminate the instance. The root device volume is preserved during a reboot and during a stop and start, but by default it is deleted on termination.

And with billing, during a reboot the instance-hour doesn't change. Each time an instance transitions from stopped to running, AWS starts a new instance billing hour. When you terminate an instance, you stop incurring charges for that instance as soon as its state changes to shutting down.
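For revision, the comparison above can be captured as a small lookup table. This is only a restatement of the points in this lecture; the dictionary keys and the helper function are hypothetical names chosen for the sketch.

```python
# Effects of each action, as described in the lecture.
LIFECYCLE_EFFECTS = {
    "reboot": {
        "host": "same host",
        "instance_store_data": "preserved",
        "root_volume": "preserved",
        "new_billing_hour": False,
    },
    "stop/start": {
        "host": "may move to a new host",
        "instance_store_data": "erased",
        "root_volume": "preserved",
        "new_billing_hour": True,
    },
    "terminate": {
        "host": "no impact",
        "instance_store_data": "erased",
        "root_volume": "deleted by default",
        "new_billing_hour": False,
    },
}


def instance_store_survives(action: str) -> bool:
    """True only if instance store data is kept after the given action."""
    return LIFECYCLE_EFFECTS[action]["instance_store_data"] == "preserved"


for action in LIFECYCLE_EFFECTS:
    print(action, instance_store_survives(action))
```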

You can use the consolidated billing feature to consolidate payment for multiple Amazon Web Services accounts, or multiple Amazon Internet Services Pvt. Ltd (AISPL) accounts, within your organization by designating one of them to be the payer account. With consolidated billing, you can see a combined view of AWS charges incurred by all accounts, as well as a cost report for each individual account associated with your payer account. The major benefit is that you get to see things like S3 usage rolled up into one usage amount. How this is set up is quite distinctive. The payer sends an email request to the payee, that is, the account to be linked. So if you own the payer account and you want to link someone else's account, you send an email request to them. They accept the invitation, and then the payee account is added to your payer account.
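The combined view and the per-account report can be pictured with a simple roll-up. The account names, services, and amounts below are invented for illustration, and this sketch does not reflect the format of an actual AWS bill.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (linked account, service, charge in cents) - all values are made up.
Charge = Tuple[str, str, int]


def consolidate(charges: List[Charge]) -> Tuple[Dict[str, int], Dict[str, int], int]:
    """Return per-account totals, per-service totals, and the combined total."""
    per_account: Dict[str, int] = defaultdict(int)
    per_service: Dict[str, int] = defaultdict(int)
    for account, service, cents in charges:
        per_account[account] += cents
        per_service[service] += cents
    return dict(per_account), dict(per_service), sum(per_account.values())


CHARGES = [
    ("dev", "S3", 1200),
    ("test", "S3", 800),
    ("prod", "S3", 5000),
    ("prod", "EC2", 20000),
]

per_account, per_service, total = consolidate(CHARGES)
print(per_service["S3"])   # 7000: S3 usage rolled up across all accounts
print(per_account)         # {'dev': 1200, 'test': 800, 'prod': 25000}
print(total)               # 27000
```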

Now, consolidated accounts are unrelated to hub-and-spoke peering or AWS connectivity; consolidated billing is administrative only. You can't share access to another account's network connections, for example, and consolidated billing doesn't, by default, grant IAM users in the master account access to the linked accounts. So if you consolidated two accounts under your own by inviting your development team manager and your sysops manager in, you don't, by default, get to share IAM roles. You can do this, but you need to use cross-account roles.

So, say you have decided to have separate AWS accounts for dev, test, and production. You have one master account, and you plan to link the bills of the dev, test, and production accounts to your master AWS account using consolidated billing. But you'd also like a bit more control over these accounts: you'd like to be able to stop, start, or terminate any of the instances in the dev, test, or production accounts. The best way to do this would be to create IAM users in the master account, create cross-account roles with full admin permissions in the dev, test, and production accounts, and then grant the master account's users access to those roles.

That concludes this lecture on how we design cost-optimized compute services.
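As a sketch of the cross-account role setup discussed in this lecture, the snippet below builds the two IAM policy documents involved: a trust policy attached to the role in a linked account, and a permission policy for users in the master account. The account IDs and the role name `MasterAdminAccess` are placeholders, not values from the lecture, and the admin permissions themselves would be attached to the role separately.

```python
import json


def trust_policy(master_account_id: str) -> dict:
    """Trust policy for a role in a linked account: lets the master account assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{master_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def assume_role_policy(linked_account_id: str, role_name: str) -> dict:
    """Policy for IAM users in the master account: allows assuming the linked role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": f"arn:aws:iam::{linked_account_id}:role/{role_name}",
            }
        ],
    }


# Placeholder account IDs: 111111111111 = master, 222222222222 = dev.
print(json.dumps(trust_policy("111111111111"), indent=2))
print(json.dumps(assume_role_policy("222222222222", "MasterAdminAccess"), indent=2))
```

The same pair of documents would be repeated for the test and production accounts, each with its own account ID.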

About the Author

Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.