
High Availability and Security


Learning Path Primer

This course serves as a "primer" for the Solution Architect Professional learning path. The objective of the course is to refresh our understanding of baseline concepts before we explore the more advanced topics relevant to the AWS professional certification domains. 

In our first lesson we’ll review some of the terminology and key concepts inherent to the SA professional exam. This first lesson will help us understand questions and scenarios better by ensuring we have a consistent vocabulary.

Next we review aspects of high availability, security and business continuity relevant to the professional certification domains.

We finish with a high-level review of some of the AWS services relevant to the Solution Architect Professional domains.


During our Solutions Architect Associate for AWS learning path we learnt how elasticity and scalability help us design cloud services, and how AWS provides the ability to scale up and down to meet demand rather than having to provision systems based on estimated usage. That ability increases our agility and reduces our cost, as we only pay for what we use. We saw how the four pillars of the AWS Well-Architected Framework can be a guide for designing with best practices.

AWS provides inherent services to make it possible to design for high availability and fault tolerance. Amazon Simple Storage Service, Amazon Simple Queue Service, and Elastic Load Balancing have been built with fault tolerance and high availability in mind. Amazon Elastic Compute Cloud and Amazon Elastic Block Store provide specific features, such as availability zones, Elastic IP addresses and snapshots. However, you need to implement these services correctly to create a highly available system. Exam questions will test your ability to identify the right mix of products to achieve the desired business objective.

So we ran through the 10 AWS components that can help us design cost-efficient, highly available, fault-tolerant systems when used together. Those were, briefly, if you remember: Regions and AZs, which are designed for fault isolation. Having multiple availability zones within one region can often provide a high level of durability and high availability without the need to use more than one region. If we do want to extend our customer's footprint to another region, it's also very possible to migrate AMIs, data services and so on from one region to another.

Then Virtual Private Cloud, which is that secure section of the AWS cloud. It gives us a CIDR block between /16 and /28. The default VPC comes with subnets for your availability zones, an internet gateway, a default route table, a network access control list, and a security group.
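To make the /16 to /28 range concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function names and the example blocks are illustrative assumptions, not part of any AWS SDK, and the address counts are raw block sizes before any addresses AWS reserves in each subnet.

```python
import ipaddress

# Prefix-length limits for a VPC CIDR block, as stated in the lesson.
LARGEST_BLOCK_PREFIX = 16   # a /16 is the largest allowed block
SMALLEST_BLOCK_PREFIX = 28  # a /28 is the smallest allowed block


def is_valid_vpc_cidr(cidr: str) -> bool:
    """Return True if the IPv4 block's prefix length is between /16 and /28."""
    network = ipaddress.ip_network(cidr)
    return LARGEST_BLOCK_PREFIX <= network.prefixlen <= SMALLEST_BLOCK_PREFIX


def address_count(cidr: str) -> int:
    """Total number of addresses in the block (hypothetical helper)."""
    return ipaddress.ip_network(cidr).num_addresses


def split_into_subnets(cidr: str, new_prefix: int) -> list:
    """Carve a VPC block into equally sized subnets, for example one per AZ."""
    network = ipaddress.ip_network(cidr)
    return [str(subnet) for subnet in network.subnets(new_prefix=new_prefix)]
```

For example, `address_count("10.0.0.0/16")` gives 65536 and `address_count("10.0.0.0/28")` gives 16, which shows how wide the allowed range is.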
A subnet is a public subnet if its VPC has an internet gateway and the subnet's route table has a route to that internet gateway.

Then we looked at the Elastic Load Balancer. It's a managed service which detects the health of instances and routes traffic to the healthy ones. Elastic Load Balancer adds another layer of availability and security: as a managed service, ELB can terminate or pass through SSL connections.

And then we had Simple Queue Service, which enables us to increase fault tolerance by decoupling layers, reducing dependence on server state, and helping us manage communications between services.

And of course, Elastic Compute Cloud, EC2. That's on-demand computing, with instance types available in various flavors. With On-Demand we pay hourly. With Reserved Instances we commit to a one- or three-year term, for example with a partial upfront payment, to reduce the cost of predictable usage patterns. Then we have Scheduled Instances, which can be booked for a specific time of the day, week or month. They're ideal where you have patterns of usage that are quite regular, or reports that need to be done on a certain date every month or every year. Spot pricing is marketplace pricing, based on supply and demand, where you're bidding and paying for unused AWS capacity. Often it's a blend of those that can give you the best price. Remember that placement groups must be in the same availability zone, and that placement groups do not support micro or medium sized instances.

Elastic IP addresses allow us to maintain service levels by swapping resources behind an Elastic IP address. By default we can have up to five Elastic IP addresses per region. If you stop an instance, the Elastic IP address remains associated with the instance.

And then Route 53, that powerful DNS service. We can manage our top-level domains. In the event of an outage, it can provide graceful failover to a static site, which can be hosted on S3.
It can do active/active and active/passive failovers, based on Elastic Load Balancer health checks or EC2 health checks. And it can support weighted or geo-targeted traffic distribution.

So the eyes and ears of our environment are our monitoring tools: CloudWatch, CloudTrail and AWS Config. For CloudWatch, you've got basic EC2 monitoring enabled by default. Basic monitoring provides seven metrics at five-minute intervals, and three metrics at one-minute intervals. Elastic Load Balancing reports at one-minute intervals by default. Detailed monitoring enables one-minute intervals on the same metrics, but it comes with a charge, so you have to pay extra to use detailed monitoring. CloudWatch also has an agent, which can send log files to CloudWatch and so provide us with more instance debugging and reporting information. CloudWatch notifies of a change in state, and the three reporting states are OK, Alarm, or Insufficient Data. If an instance or ELB has just started, it would most likely return an Insufficient Data state.

Right, Auto Scaling has three core components: the launch configuration, the Auto Scaling group, and the scaling plan. The launch configuration is your template for what you want your machines to do when Auto Scaling starts them. You can basically configure that machine to do exactly what you want with your launch configuration. The Auto Scaling group is literally the group of services that run inside that group. Then the scaling plan defines how services are added to or removed from that Auto Scaling group.

So, scaling in: we want to make our Auto Scaling group smaller to reduce costs. The whole point of scaling down is to reduce your costs, so you're only paying for what you use. These are the steps that Auto Scaling goes through to determine which machine to terminate first. First off, it checks whether there are instances in more than one availability zone.
If there are, Auto Scaling applies its policy to the availability zone that has the most instances in it. So if you have two AZs, one with three instances running and one with two, Auto Scaling will apply its rule to the AZ with the three instances in it first. That's the first piece of logic. The next logic point is to select the instance with the oldest launch configuration. If there are multiple instances using the oldest launch configuration, then select the instance closest to the next billing hour. If there are multiple instances close to the next billing hour, then select an instance at random.

So, three key steps. First of all, choose the availability zone that has the most instances and apply the rule to that. Second, if there are multiple instances, terminate the one with the oldest launch configuration. If there are multiple instances on that same launch configuration, choose the one closest to the next billing hour. If you still can't find a difference between them, choose one at random. Now remember that the availability zone rule applies even if you have a custom Auto Scaling policy.

Designing for the cloud often means that biggest isn't necessarily best. Decoupling your services and reducing components into loosely coupled units that can run on smaller machines may improve performance and reduce single points of failure. For many solution designs you may achieve an acceptable level of fault tolerance at a lower cost using multiple availability zones in one region. Our goal is to create the best possible outcome for our paying customer. That may mean using smaller, more loosely coupled services, rather than going straight for the biggest and best available. We need to always be looking for ways to reduce single points of failure and to reduce costs. AWS has a global footprint, but we may not need to use the biggest instances in multiple regions.
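The termination steps described above can be sketched in a few lines of Python. This is an illustration of the logic as the lesson states it, not the Auto Scaling service's actual implementation. The `Instance` class and its fields are hypothetical, and the handling of availability zones with equal instance counts (all tied zones stay in play) is an assumption the lesson does not cover.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical record for one instance in an Auto Scaling group."""
    instance_id: str
    az: str
    launch_config_created: int    # illustrative timestamp; smaller means older
    minutes_to_billing_hour: int  # minutes until the next billing hour starts


def choose_instance_to_terminate(instances, rng=random):
    """Apply the lesson's scale-in steps and return the instance to terminate."""
    if not instances:
        raise ValueError("no instances to choose from")

    # Step 1: restrict to the availability zone(s) with the most instances.
    per_az = Counter(inst.az for inst in instances)
    largest = max(per_az.values())
    candidates = [inst for inst in instances if per_az[inst.az] == largest]

    # Step 2: keep only instances using the oldest launch configuration.
    oldest = min(inst.launch_config_created for inst in candidates)
    candidates = [i for i in candidates if i.launch_config_created == oldest]

    # Step 3: keep only instances closest to the next billing hour.
    soonest = min(inst.minutes_to_billing_hour for inst in candidates)
    candidates = [i for i in candidates if i.minutes_to_billing_hour == soonest]

    # Step 4: break any remaining tie at random.
    return rng.choice(candidates)
```

Note that an instance in the smaller availability zone is never picked here, even if it runs an older launch configuration, because the availability zone rule is applied first.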
And it may be that by using multiple availability zones within one region, and by using a blend of On-Demand, Spot, and Reserved Instances, we can create a highly available, cost-efficient solution.

So in exam questions, look for clues to help you determine the business requirements and constraints in any of the scenarios you get. Look for the Recovery Time Objective and the Recovery Point Objective. The Recovery Time Objective is the maximum amount of time the customer can be without the system in the event of a disaster. The Recovery Point Objective is the last possible point in time that the business data must be recoverable to. Remember that the Recovery Point Objective is generally a time value as well.

There are four design patterns we can deploy in AWS to meet RPO and RTO objectives. The first is backup and restore, which is like using AWS as a virtual tape library. It's generally going to have a relatively high Recovery Time Objective, since we're going to have to bring back archives to restore first, which could take four to eight hours or longer. We're going to have a generally high Recovery Point Objective as well, simply because our point in time will be our last backup, and if for example we're using daily backups only, then it could be 24 hours. Cost-wise, backup and restore is very low, and it's easy to implement.

The second option is Pilot Light, and that's where we have a minimal version of our environment running on AWS, which can be lit up and expanded to production size from the Pilot Light. Our Recovery Time Objective is likely to be lower than backup and restore, as we have some services installed already. And our Recovery Point Objective will be since our last data snapshot.

The third option is Warm Standby, where we have a scaled-down version of a fully functional environment always running in AWS.
Now that's going to give us a lower Recovery Time Objective than perhaps Pilot Light, as some services are always running, and it's likely that our Recovery Point Objective will be lower as well, since it will be since our last data write. The benefit of Warm Standby is that we can use the environment for dev and test or for skunkworks projects to offset the cost.

And the fourth option is Multi-Site, where we have a fully operational version of our environment running in AWS or in another region. That's likely to give us our lowest RTO, simply because it could be a matter of seconds if we're using active/active failover through Route 53. The cost and maintenance overhead of running a multi-site environment needs to be factored in and considered. The benefit is that you have a regular environment for testing DR processes.

Another component is AWS Storage Gateway. AWS Storage Gateway connects your on-premises storage with your Amazon S3 storage. There are three options available: Gateway-cached volumes, Gateway-stored volumes, and Gateway-VTL, which presents itself like a virtual tape library. The benefit of all three of those is that, to end users, each of the storage gateway connections looks like an iSCSI connection.

Our choice of database replication is a factor when we're talking about disaster recovery and high availability. Synchronous replication is where we have an atomic update to both databases, and it's bandwidth and latency dependent. So we need very good bandwidth and very fast networking, and synchronous replication of databases generally comes at a higher cost. Asynchronous replication is a non-atomic update that happens to the secondary as network and bandwidth permit.

AWS has a shared security responsibility model. AWS manages the global infrastructure: the regions, the availability zones, and the edge locations. And some of the foundation services such as compute, storage, database and networking.
And then everything else on top of that is managed by us, the customers. So AWS manages security of the cloud, and AWS customers manage security in the cloud.

Now we looked at the four pillars of security in the cloud. Data protection, which is protecting data in transit and at rest. Privilege management, which is ensuring our users have least privilege to resources. Infrastructure protection: keeping the facilities and networks secure is the job of AWS. And those detective controls, that regular monitoring and testing to avoid compromise.

So, some of the tools that AWS makes available to us via IAM. We have multi-factor authentication, which is an additional layer that should be applied to your root account and any privileged users. We can interface with identity providers using IAM roles. We have our passwords, and roles provide a very efficient way for us to connect to applications and third parties without us having to share our security credentials. When we're integrating with other corporate networks we can use single sign-on or directory services, and the AWS Security Token Service, or STS, and a role enable us to connect to AWS via an identity broker. Temporary credentials expire after a given period. We can also use identity providers such as Facebook, Amazon, Google etc. to enable end users to sign into an application using STS and IAM roles. Amazon Cognito provides a service that does a lot of this for you, and Amazon Cognito comes included in the iOS, Fire, Android, Unity, and JavaScript SDKs.

Now platform compliance: PCI DSS, SOC 1, 2 and 3, ISO 9001 and HIPAA compliance, along with a number of frameworks and alignments that make it easier for third parties to check or comply with compliance reporting.
And the AWS Security Center and the Well-Architected Framework can provide some really good guidelines for how third parties can respond to RFPs, run things like penetration testing, or do compliance audits using roles and third-party connectors.

So, securing data in transit. All AWS endpoints support SSL, and one of the key benefits of Elastic Load Balancer is that it can terminate or pass through SSL connections.

If we're securing data at rest, there are two key services: AWS KMS, the Key Management Service, and AWS CloudHSM. Looking at the options we have for using these, there are three. The first is where you control the encryption method and the entire key management infrastructure. So you can take your whole KMI out of AWS and manage that yourself. The second option is where AWS manages key storage for you. You manage the encryption method, you choose whichever way you want to encrypt your content, and you manage your own keys. They're stored in CloudHSM. And the third option is where AWS manages the encryption, the key management, and the KMI for you. So they do everything on your behalf, basically.

Okay, we're going to be looking at threat mitigation. Remember it's about protecting in layers, so we want to reduce our surface area. And it's our responsibility to put in place additional controls to limit access. What additional filtering or blocking can we add on top of security groups and network ACLs to provide additional levels of threat protection?

Okay, so that brings to a close our refresher of services and our Solution Architect Professional Certification Primer.
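As a recap of the RTO and RPO reasoning from the backup and restore pattern covered earlier in this lesson, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the figures in the usage note are the lesson's own examples (daily backups, a restore of up to eight hours) rather than measured values.

```python
def worst_case_data_loss_hours(backup_interval_hours: float) -> float:
    """With periodic backups, a failure just before the next backup
    loses up to one full backup interval of data."""
    return backup_interval_hours


def meets_objectives(
    rto_hours: float,
    rpo_hours: float,
    restore_hours: float,
    backup_interval_hours: float,
) -> bool:
    """Check a backup-and-restore design against the stated RTO and RPO.

    The design passes only if the restore finishes within the RTO and the
    worst-case data loss stays within the RPO.
    """
    recovers_in_time = restore_hours <= rto_hours
    loses_little_enough = worst_case_data_loss_hours(backup_interval_hours) <= rpo_hours
    return recovers_in_time and loses_little_enough
```

With daily backups and an eight-hour restore, `meets_objectives(12, 24, 8, 24)` is true, while a scenario asking for a one-hour RPO, `meets_objectives(12, 1, 8, 24)`, is false. That second case is the kind of clue that points to Pilot Light, Warm Standby or Multi-Site instead.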

About the Author
Andrew Larkin
Head of Content
Learning Paths

Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.
