Please note that this course has been replaced with a new version that can be found here: https://cloudacademy.com/course/management-saa-c03/management-saa-c03-introduction/
This section provides detail on the AWS management services relevant to the Solution Architect Associate exam. These services are used to help you audit, monitor and evaluate your AWS infrastructure and resources. These management services form a core component of running resilient and performant architectures.
Learning Objectives
- Understand the benefits of using AWS CloudWatch and audit logs to manage your infrastructure
- Learn how to record and track API requests using AWS CloudTrail
- Learn what AWS Config is and its components
- Manage your accounts with AWS Organizations
- Learn how to carry out logging with CloudWatch, CloudTrail, CloudFront, and VPC Flow Logs
You've made it through another section; great work. You've definitely covered the lion's share of what you need to know to become a solutions architect. We've just covered a range of topics looking at AWS management services and concepts, and there's a lot to take in. So, let me break down some of the core elements I believe are a must-know for the exam.
So, let's start with CloudWatch. Now, if you're presented with any questions that relate to the health of resources, metrics, monitoring, or logging, then it's likely that CloudWatch will be one of the answers available, and for good reason. Now remember, CloudWatch is the go-to service to understand the operational performance of your resources and applications. At its core, CloudWatch collects metrics from supported services and any custom metrics that you have added, and it can display these in a visual dashboard. It can help you to detect anomalies, review logs, trigger alarms, and also automate responses to help you optimize your infrastructure. So, it's a great tool to help you maintain and monitor your environment.
So, for the exam, you need to be familiar with some of the features that it offers, as you might be asked how you could use CloudWatch to detect, respond to, or identify potential issues that arise in the performance of your infrastructure. So, let's review some of the important features of the service. Firstly, dashboards. These are customizable dashboards that let you build a visual status of your resources using different types of widgets.
Now, the key point is that they are fully customizable, allowing you to design the dashboard how you need to represent your data. So, if you get a question asking how to best represent the status of many of your resources that perhaps relate to a specific project or region or resource type, then you can build a dashboard in CloudWatch for this. Next, you are 100% going to need to be aware of metrics from an exam point of view at least. If you see a question that mentions metrics, then you will almost definitely see something relating to CloudWatch. It's CloudWatch metrics that enable you to monitor a specific element of an application resource using time series data points. So, you might see a question relating to EC2 IO Performance and what metrics you could check, perhaps disk reads or disk writes.
Now by default, metrics are collected every five minutes, but detailed monitoring can be enabled for a small cost, and this will collect metrics every minute. Now, what comes hand in hand with metrics is CloudWatch alarms, and these enable you to implement automatic actions based on specific metric thresholds. So, you might be given a scenario asking the best way to notify your engineering team when your EC2 instance reaches 75% CPU utilization.
So, what could you do to achieve this? Well, with CloudWatch, you could set an alarm that monitors the CPU utilization metric, and then when it reaches 75%, trigger SNS to send an email to the engineering team automatically. So, remember that CloudWatch has this integration with other services for alerting, like SNS. Now, also be sure you're aware of the different states of alarms as well, of which there are three: an OK state, an alarm state, and an insufficient data state.
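To make that scenario concrete, here's a minimal sketch of the parameters you might pass to CloudWatch's PutMetricAlarm API (for example via boto3's put_metric_alarm) to alarm at 75% CPU utilization and notify an SNS topic. The instance ID and topic ARN are invented placeholders:

```python
# Hypothetical sketch: build the PutMetricAlarm parameters that would
# notify an SNS topic when an EC2 instance's average CPU hits 75%.
def build_cpu_alarm(instance_id, topic_arn, threshold=75.0):
    """Return PutMetricAlarm parameters for a CPU utilization alarm."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",                  # EC2's metric namespace
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                           # default 5-minute granularity
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],             # SNS topic that emails the team
    }

params = build_cpu_alarm(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:111122223333:engineering",
)
```

With detailed monitoring enabled, you could drop Period to 60 to evaluate the alarm against per-minute data points instead.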
Now, remember Amazon EventBridge (formerly CloudWatch Events) provides a means of implementing a level of real-time monitoring, allowing you to respond to events that occur in your application as they happen. And lastly, I just want to touch on CloudWatch Logs because these often come up in the exam one way or another. And in fact, logging is assessed at many different levels, which is why we focus on a few logging services in this course. So, what are some of the key things to remember with CloudWatch Logs? Well, it acts as a central repository for real-time monitoring of log data for different AWS services that provide logs as an output, such as CloudTrail, EC2, VPC Flow Logs, etc., in addition to your own applications.
So, data is sent to a log stream within CloudWatch Logs to differentiate between different logs, and you can filter for specific entries within these logs to help you identify potential issues. You can also use the unified CloudWatch agent to collect logs and additional metric data, over and above the default metrics collected by CloudWatch against your EC2 instances. Now, this agent is best installed using AWS Systems Manager, known as SSM.
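As an illustration, a minimal unified CloudWatch agent configuration might look like the sketch below: it ships an application log file to a log group and collects memory usage, a metric EC2 doesn't report by default. The file path and log group name are example values, and the exact schema is worth confirming against the agent documentation:

```python
# Sketch of a unified CloudWatch agent config (normally stored as JSON,
# e.g. in SSM Parameter Store). All paths and names here are examples.
import json

agent_config = {
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [{
                    "file_path": "/var/log/myapp/app.log",  # example app log
                    "log_group_name": "myapp-logs",
                    "log_stream_name": "{instance_id}",     # one stream per instance
                }]
            }
        }
    },
    "metrics": {
        "metrics_collected": {
            # memory is not among the default EC2 metrics, so the agent adds it
            "mem": {"measurement": ["mem_used_percent"]}
        }
    },
}

print(json.dumps(agent_config, indent=2))
```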
Okay, let's look at a couple of example questions relating to CloudWatch. So, the first one: with detailed monitoring enabled on an EC2 instance, how often is metric data sent to CloudWatch? This is a fairly simple question that just assesses your understanding of the difference between default monitoring and detailed monitoring. So remember, default monitoring is every five minutes, whereas detailed monitoring is every minute. So, if we look at the answers, as long as you understand those differences it's a fairly quick and simple answer to get. A, every five minutes: as I just mentioned, that's default monitoring, not detailed. B, every minute: that is the correct answer for detailed monitoring. So, the answer here is B. C, D, and E are all incorrect because metrics aren't sent every 30 seconds, every 15 minutes, or every 45 seconds. Just remember, default monitoring is every five minutes, detailed monitoring is every minute.
Okay, so another example question. So, Amazon CloudWatch, something, allows you to implement automatic actions based on specific thresholds that you can configure related to each metric. This is just assessing your understanding of some of the different components relating to CloudWatch. So, let's take a look at the options that we have here. A, does anomaly detection allow you to implement automatic actions? No. Anomaly detection is used to look at the data coming in and assess whether there are any anomalies in the patterns of that data. So, it's not that. B, rules. Well, we don't really have any rules in CloudWatch as such, so that's not going to help us either. Then C, alarms. Well, alarms certainly allow you to implement automatic actions based on specific thresholds, so I'll definitely highlight that one. And then D, events. Now, as we mentioned earlier, events are related to EventBridge, and this provides a means of implementing a level of real-time monitoring. They don't actually allow you to take automatic actions based on specific thresholds. So, the correct and most appropriate answer here is C, alarms.
Okay, let's leave CloudWatch Logs there and move on to CloudTrail. Now, the key thing to know about CloudTrail is that it's used to log, record, and track all API calls in your environment. If you remember that, you'll be able to eliminate a couple of wrong answers if anything comes up about tracking API calls. Now, API calls are made for pretty much every action, whether initiated by you or by another AWS service, and they are all recorded by CloudTrail. It's a great tool for auditing because of this. Now, in addition to tracking the API call, it also tracks the user or service who initiated it, the time, date, and other metadata, such as the source IP address. And where does CloudTrail send all this data? Well, to S3 of course, as logs. Now S3 is used by many services for storing data, and you probably know that by now anyway.
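To tie those pieces together, here's a hedged sketch of the parameters for CloudTrail's CreateTrail API: a multi-region trail delivering logs to S3, with the optional CloudWatch Logs integration configured as well. All names and ARNs below are invented placeholders:

```python
# Hypothetical CreateTrail parameters: API activity goes to S3, and is
# also forwarded to a CloudWatch Logs log group for filtering and alarms.
def build_trail(name, bucket, log_group_arn, role_arn):
    """Return CreateTrail parameters for an S3 + CloudWatch Logs trail."""
    return {
        "Name": name,
        "S3BucketName": bucket,                      # required: logs land in S3
        "IsMultiRegionTrail": True,                  # record calls in every region
        "CloudWatchLogsLogGroupArn": log_group_arn,  # optional CloudWatch Logs integration
        "CloudWatchLogsRoleArn": role_arn,           # role CloudTrail assumes to write logs
    }

trail_params = build_trail(
    "management-trail",
    "my-cloudtrail-bucket",
    "arn:aws:logs:us-east-1:111122223333:log-group:cloudtrail-logs:*",
    "arn:aws:iam::111122223333:role/cloudtrail-to-cwl",
)
```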
Now, a quick point worth mentioning is that CloudTrail logs can also be sent to CloudWatch Logs for the additional review, triggers, and automated responses that CloudWatch can provide, like we've already talked about. So, this is another great integration between two services. Sometimes CloudTrail can be used as a security analysis tool, for example, identifying APIs that shouldn't be called, or it can be used to assist with auditing, as I mentioned previously. So, let's look at a question again covering this topic. So, what are the benefits of CloudTrail's integration with CloudWatch Logs? This covers a couple of the points that we mentioned in this course. So A, it delivers SDK activity captured by CloudTrail to a CloudWatch Logs log stream. Now, it doesn't actually capture SDK activity; as I mentioned, CloudTrail is used to capture API activity. So, it's not A. B, it delivers API activity captured by CloudTrail to a CloudWatch Logs log stream. Now, we know CloudTrail captures API activity, so that part is correct. But does it have the integration with CloudWatch? Yes, it does. It can send its logs to a CloudWatch Logs log stream in addition to Amazon S3. So, B is certainly correct. Let's keep going. C, it doesn't exist. Well, the integration certainly does exist between CloudTrail and CloudWatch, so that's incorrect. And D, it delivers API activity captured by CloudTrail to an S3 bucket. Again, that is a true statement, but it doesn't answer the question. The question asks about the benefits of CloudTrail's integration with CloudWatch Logs, and D doesn't mention CloudWatch Logs at all. So, although it's a true statement as such, it's not answering the question. The answer here is B, it delivers API activity captured by CloudTrail to a CloudWatch Logs log stream.
Okay, so next up we looked at AWS Config. And I often see CloudTrail and Config appear in the same question or the same set of answers for a particular question, so it's certainly worth noting the main difference between them and what each service is used for. So, let's take a look at Config. AWS Config is designed to record and capture resource changes within your environment. It's a great service for helping you collate and review data about a specific resource type within your environment. You can check a resource's configuration history to see all of the changes that have occurred since you first provisioned it, or you can see a snapshot in time of its current configuration. Now, again, this also has integration with SNS and CloudTrail to offer automated notifications of any resource configuration changes and which API calls triggered those. Now, one great benefit of AWS Config is its ability to implement Config rules, which ensure that your resources meet specific configuration requirements. This is great if you get any questions that relate to compliance. So, let's say the question was asking for ways of ensuring that your EFS file systems were encrypted with KMS at all times, which was perhaps needed to meet specific regulatory requirements. What service could you use to help maintain this?
Well, the answer here would be AWS Config using managed rules. So, Config would assess your EFS file systems and alert you if an EFS file system was deployed without encryption, allowing you to correct the non-compliance it has detected. So, let's take a look at an example question that relates to AWS Config. The question reads: you are a cloud engineer for a mid-sized enterprise organization, and your compliance team has approached you with a request to implement a configuration management solution in your AWS environment. You decide that AWS Config may be the best solution to meet the needs of your compliance team. What three features does AWS Config provide as a part of a comprehensive configuration management solution? (Choose three answers). So, again, this is one of those questions where you can eliminate half the text and just read the last sentence. All we're trying to understand here is the features that AWS Config provides to help you implement a comprehensive configuration management solution. So, let's take a look at the answers and see which are the most appropriate.
So, A, does it provide security analysis? Well, certainly you can set specific Config rules to check for certain security compliance, for example, the encryption scenario that I mentioned earlier. So, A is certainly a feature. B, analysis of identity and access management changes. Well, identity and access management changes, I would say, focus on users, roles, and permissions. They're not actually resources that you can configure, such as an EC2 instance, for example, or an EBS volume. So, AWS Config doesn't look at changes to policies and IAM user settings, and B is not a correct answer. C, managing and troubleshooting configuration changes. Certainly, AWS Config is great for this, because if there's an incident, then you can check your Config timeline to see what changes have been made and identify if any of those changes caused a problem with your environment. So, C is a correct answer as well. D, agent-based configuration management. AWS Config doesn't actually use any agent-based monitoring, so D is not a correct answer. And then we have E, auditing and compliance. Again, we know it can help with compliance, and it's also great for auditing because we can keep track of the full history of configuration changes made on a specific resource. So, the answers here are A, C, and E.
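The managed-rule approach from the EFS encryption scenario could be sketched as a Config rule definition like the one below. EFS_ENCRYPTED_CHECK is AWS's managed rule identifier for this check, though it's worth confirming the exact name in the Config console; the rule name is an invented example:

```python
# Sketch of PutConfigRule parameters using an AWS-managed rule to flag
# any EFS file system that is not encrypted with KMS.
config_rule = {
    "ConfigRuleName": "efs-must-be-encrypted",   # example name
    "Source": {
        "Owner": "AWS",                          # AWS-managed rule, no Lambda required
        "SourceIdentifier": "EFS_ENCRYPTED_CHECK",
    },
    # evaluate only EFS file systems
    "Scope": {"ComplianceResourceTypes": ["AWS::EFS::FileSystem"]},
}
```

Config then marks each file system as COMPLIANT or NON_COMPLIANT, and you can pair the rule with SNS notifications to alert on any drift.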
Okay, so we also looked at management from an AWS account-level perspective as well, and this focused on AWS Organizations. Now this service is certainly mentioned in the exam, so make sure you know when to choose it, what it does, and also some of the components that are used, which make it a really effective account management service. And I'm going to tell you some of the main points to remember. Now, the primary benefit that this service brings is its ability to essentially manage multiple accounts from a single AWS account, known as the master account (now referred to as the management account). By doing this, it helps to maintain security, compliance, and account management under a single umbrella.
Now, there are two options to deploy AWS Organizations: you can either deploy it with all features, which is the default and uses enhanced account management features, or with just consolidated billing features enabled, which gives a subset of features providing basic management tools, essentially enabling you to manage billing across all of your accounts. The organization of your accounts essentially forms a family tree structure, allowing you to group certain accounts with others. So, know the difference between the root object, the organizational unit objects, and also the account objects as well. Now, one benefit of Organizations is that you can use service control policies, or SCPs, to control what services and features are accessible from within an AWS account or group of accounts. So, when a service control policy is applied to an organizational unit, all child accounts that fall under that OU will be under the same controls that are applied within the SCP.
Now, some people think of SCPs as permission policies; however, they don't actually grant permissions, rather they just limit what permissions can be given within the corresponding account. So, they act as a permissions boundary instead. If an SCP denied all S3 access, then no one in the associated account would be allowed to use S3. Even if their IAM permissions allowed it, it would be denied at the SCP level. So, if you get any questions relating to multi-account permissions and restrictions, then it's likely that AWS Organizations will be mentioned, specifically regarding service control policies. Okay, time for another question.
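As a concrete illustration of that S3 example, an SCP denying all S3 actions might look like the sketch below. Attached to an OU, it caps the permissions of every account underneath; it grants nothing by itself:

```python
# Sketch of a service control policy (SCP) document denying all S3
# actions across every account the policy is attached to.
import json

deny_s3_scp = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAllS3",
        "Effect": "Deny",      # SCPs restrict permissions; they never grant them
        "Action": "s3:*",      # every S3 API action
        "Resource": "*",
    }],
}

print(json.dumps(deny_s3_scp, indent=2))
```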
A company is using AWS organizations to manage several organizational units and AWS accounts. Management needs all the accounts to have access to all the services in AWS except Amazon RDS, and wants to use a deny list strategy to control access to AWS services. Which of the following steps would a Solutions Architect apply to meet this requirement? (Choose 2). Okay, so here we're using multiple accounts with AWS organizations. We need all the accounts to have access to all services apart from RDS. So, what is the best way to apply this security strategy using AWS organizations? So, let's take a look.
So, A, leave the default AWS-managed SCP that grants FullAWSAccess in the root account and at all levels. AWS recommends that you don't detach the SCP at the root of your organization without thoroughly testing the impact the policy has on the accounts. And as the question states, we want all accounts to have access to all AWS services apart from RDS. So, I would say we should leave this root policy in place, but then apply some kind of deny SCP for RDS. Let's go through the rest of the answers and see what we have. B, remove the default AWS-managed SCP that grants FullAWSAccess from the root account, all OU levels, and accounts. Well, it's not recommended that we do that, and we need this policy to give access to everything else, so I'd be wary of that option. C, associate an SCP with each OU and account granting RDS access. Well, the question actually asks us to deny RDS, so that's certainly not an option. D, associate an SCP with each OU and account denying RDS access. That's exactly what we need to do, so D is definitely an answer, and if we combine that with A, then we have a solution that allows all accounts within our organization full access to all AWS services other than RDS. So, the answers here are A and D.
Okay. Lastly, I just want to make sure that you know what a VPC Flow Log is and where you can use them, as I've seen this topic come up on the exam a couple of times before. Now, you might be asked how to monitor specific network traffic between your subnets or different interfaces within your infrastructure, and what would be the best solution to do this. Well, VPC Flow Logs will certainly help you here. You'll need to have a basic understanding of what a VPC Flow Log is, what it can capture, and when they can be used to help you answer questions relating to this topic. So, VPC Flow Logs capture all the IP traffic flowing to and from the network interfaces on your resources within your VPC, and this log data is then sent to CloudWatch Logs. Now, once a VPC Flow Log has been created, it can't be changed; the only way to change it would be to delete it and recreate another. VPC Flow Logs can be configured against a network interface, one of the subnets in your VPC, or the VPC itself. So, just remember those three levels of application.
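Putting that together, here's a hypothetical sketch of the parameters for EC2's CreateFlowLogs API, capturing all traffic at the VPC level and delivering it to CloudWatch Logs. The IDs and the role ARN are placeholders, and ResourceType could equally be Subnet or NetworkInterface:

```python
# Sketch of CreateFlowLogs parameters: all IP traffic for one VPC,
# delivered to a CloudWatch Logs log group via a delivery role.
flow_log_params = {
    "ResourceIds": ["vpc-0abc1234"],            # placeholder VPC ID
    "ResourceType": "VPC",                      # or "Subnet" / "NetworkInterface"
    "TrafficType": "ALL",                       # ACCEPT, REJECT, or ALL
    "LogGroupName": "vpc-flow-logs",
    "DeliverLogsPermissionArn": "arn:aws:iam::111122223333:role/flow-logs-role",
}
```

Because a flow log can't be modified after creation, any change to these values means deleting the flow log and creating a new one.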
Okay, let's take a look at our last question for this summary. The question is: a DevOps team hosts an application on resources in an Amazon Virtual Private Cloud that consists of public and private subnets. The public subnet includes a NAT Gateway, and all Internet-bound traffic from EC2 instances running in the private subnet is routed to the NAT Gateway. Now, we should be familiar with NAT Gateways and VPCs at this point, as we covered these extensively in a previous course. Recently, the NAT Gateway's data transfer costs were higher than expected, and the team has asked the Solutions Architect to investigate the problem. What steps can the Solutions Architect take to understand the source of the higher than expected data transfer costs? So, let's go through our options and see what would be most appropriate.
So, A, turn on flow logs for the VPC and use CloudTrail to query instances sending the most traffic to the NAT Gateway. Now, turning on flow logs for the VPC will certainly capture the network data, but CloudTrail doesn't help us here, because CloudTrail is used to track API calls, and we want to look at the actual network data within these flow logs; we can't use CloudTrail to query that data. So, that's not an option. B, set up a CloudWatch alarm to send administrators a notification when the outbound traffic through the NAT Gateway exceeds a specified number of bytes. Again, that's not really going to help us. All that's going to do is let us know how much traffic has gone through; it doesn't actually help us understand the source. So, that's not really an option either. C, create a CloudWatch dashboard to monitor the number of outbound bytes to the Internet destination through the NAT Gateway and the bytes inbound from the destination. Now, this won't tell us which instances are sending traffic to the NAT Gateway, and that's what we're trying to understand here. So, again, C is out of the question. That leaves D, turn on flow logs for the VPC and use CloudWatch Logs Insights to query instances sending the most traffic to the NAT Gateway. VPC Flow Logs will capture all the network traffic, and we can then use CloudWatch Logs Insights to query for the instances sending the most traffic, which will help us identify the source of the higher than expected data transfer. So, the answer here is D, because VPC Flow Logs sends data to CloudWatch Logs, and we can query that data using Logs Insights.
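As an illustration of that last option, a CloudWatch Logs Insights query over VPC Flow Logs to find the top talkers toward the NAT Gateway might look like the sketch below. This assumes the flow logs are delivered to CloudWatch Logs in the default format, where Insights discovers fields such as srcAddr, dstAddr, and bytes, and the NAT Gateway's private IP used here is a placeholder:

```python
# A sketch of a Logs Insights query, held as a string ready to be passed
# to a StartQuery call against the flow log group. '10.0.0.5' stands in
# for the NAT Gateway's private IP address.
insights_query = """
fields srcAddr, dstAddr, bytes
| filter dstAddr = '10.0.0.5'
| stats sum(bytes) as bytesTransferred by srcAddr
| sort bytesTransferred desc
| limit 10
""".strip()
```

The result is a top-ten list of source addresses by bytes sent to the gateway, which points directly at the instances driving the data transfer costs.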
Okay, so we've now reached the end of this course, so here are a few things not to forget. If you need to monitor the health of different resources, set alerts, or gather metrics, then use CloudWatch. If you need to capture API calls being made across your AWS account, then AWS CloudTrail is your go-to answer. If you need to monitor, manage, and assess the configuration state of your resources, then AWS Config can help you here with managed Config rules. If you have to set up management and security controls across multiple AWS accounts, then AWS Organizations is the service you need. And lastly, if you need to capture network traffic at an interface, subnet, or VPC level and review the logs in CloudWatch, then VPC Flow Logs should be at the forefront of your mind. Okay, that's me done. Let's take a step away from the keyboard and take a break before tackling the next section.
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.