This course provides an insight into some of the different logging capabilities and techniques that different AWS services have. Logging is a fundamental element of being able to optimize, identify and isolate issues and incidents, in addition to being a core component to some auditing and governance controls. As a result, it's essential to have an understanding of how AWS is able to implement some of these logging features.
The objectives of this course are to introduce you to the benefits of logging, followed by an understanding of the logging mechanisms used by the following services and features
- The Amazon CloudWatch Logging Agent
- AWS CloudTrail Logging
- Monitoring AWS CloudTrail with Amazon CloudWatch
- Amazon CloudFront Access Logs
- Amazon VPC Flow Logs
This course is designed for those who are in roles such as:
- Cloud Security Architects
- Cloud Administrators
- Cloud Support & Operations
- Compliance Managers
As a prerequisite to this course, you should have a basic understanding of AWS, including Amazon CloudWatch, AWS CloudTrail, Amazon CloudFront and Amazon VPCs.
Hello, and welcome to this lecture focusing on the logging capabilities and configuration of AWS CloudTrail. You should already be familiar with what CloudTrail is and what it does. However, to quickly summarize: It's a service that has a primary function to record and track all AWS API requests made. These API calls can be programmatic requests initiated from a user using an SDK, the AWS Command Line Interface: CLI, from within the AWS Management Console, Or even from a request made by another AWS service. For example, when auto-scaling automatically sends an API request to launch or terminate an instance. These API requests are all recorded by CloudTrail. When an API request is initiated, AWS CloudTrail captures the request as an event and records this event within a log file which is then stored on S3. Each API call represents a new event within the new log file.
CloudTrail also records and associates other identifying metadata with all events. For example, the identity of the caller, which can be the user or the account that made the API call, the timestamp of when the request was initiated, and the source IP address. The logs generated are the output of the CloudTrail service and they hold all of the information relating to the API calls that have been captured. So as a result, it's important to know what you can do with these logs in order to maximize the benefit of the data they contain.
So, what is a log file and what does it look like? Log files are written in a JSON format. Much like access policies within IAM and S3. Every time an API is captured, it's associated with an event and written to a log. And new logs are created approximately every five minutes or so, but they are not delivered to a nominated S3 bucket for persistent storage for approximately 15 minutes after the API was called. So if you expect to see the log file for an API called seven minutes ago, then you may not see the log as expected for potentially another eight minutes. The log files are held by the CloudTrail service until final processing has been completed. Only then will it be delivered to S3, and optionally, AWS CloudWatch, depending on your configuration of the trail.
When an event reflecting an API call is written to a log, a number of attributes are also written to the same event capturing key data about that call. As you can see from this example. Without going through every attribute here, I just want to point out some of the more interesting ones. These being eventName. This refers to the name of the actual API that was called. EventSource. This refers to the service as to which the API called was made against. EventTime. This was the time that the call was made. SourceIPAddress. This displays the source IP address of the requester who made the API call. This is a great piece of information when trying isolate an attacker from a security perspective. UserAgent. This is the agent method that the request was made through. Example values of these are: Signin.amazonaws.com and this is what we have in our example and it simply means that user made this request from within the AWS management console. Console.amazonaws.com, this is the same as the previous, however, if this was displayed, it would mean that the request was made by the root user of the account, and lambda.amazonaws.com, this is fairly obvious, this would reflect that the request was made with AWS lambda. UserIdentity. This contains a larger set of attributes that provides information on the identity that made the API request. Once events have been written to the logs and then delivered and saved to S3, they are given a standard name and format, as shown.
The first three elements of this naming structure are self-explanatory. The AccountID, Name of the Service delivering the log, CloudTrail, and the region that it came from. The next part relates to the date and time. The year, months, and days. The T indicates the next part is the time reflecting hour and minutes. The Z simply means that the time is in UTC. The UniqueString value is a random 16 alphanumeric character string that is simply used by CloudTrail as a unique file identifier to ensure that it doesn't get overwritten with the same name of another file. Currently, the FileNameFormat is defaulted to json.gz which is a compressed GZ version of a JSON text file. While we are looking at structures, let me also talk about the bucket structure way your logs are stored.
You may feel that the logs are all stored in one folder within your S3 bucket. However, there is a lengthy but very useful folder structure as follows: Firstly, you have your dedicated S3 BucketName that you selected during the creation of your Trail. Next, is the prefix that is also configured during Trail creation and is used to help you organize a folder structure for your logs corresponding to different Trails. Following this, is a fixed folder name of AWSLogs. Followed by the originating AWS account ID. Then another fixed folder name of CloudTrail indicating which service has delivered the logs. And after that, the RegionName of where the log file originated from. This is useful for when you have Trails that apply to multiple regions. The last three folders show the year, month and day that the log file was delivered. As you can see, although there are multiple folders underneath your nominated S3 bucket, it does provide an easy navigation method when looking for a specific log file.
This folder structure comes into even greater use if you have multiple AWS accounts delivering logs to the same S3 bucket. Some organizations may be using more than one AWS account, and having CloudTrail logs stored in different S3 buckets across multiple accounts can be inconvenient in certain circumstances and require additional administration to manage. Thankfully, AWS offers the ability to aggregate CloudTrail logs for multiple accounts into a single S3 bucket belonging to one of these accounts. This is why there is an accountID folder within your S3 bucket. Please note that you are unable to aggregate CloudTrail logs for multiple AWS accounts into CloudWatch logs that belongs to a single AWS account.
So to have all your logs from your accounts delivered to just one S3 bucket is a fairly simple process with the end result allowing you to essentially manage all your CloudTrail logs. Let's take a look at how this solution is configured. Firstly, you need to enable CloudTrail by creating a Trial in the AWS account that you want all log files to be delivered to. Permissions need to be applied to the destination S3 bucket allowing cross account access for CloudTrail. And once permissions have been applied to your policy, you need to edit the bucket policy and add an additional line for each AWS account requiring access. Then you need to create a new Trial in your other AWS accounts and select to use an existing S3 bucket for the log files. When prompted, add the bucket name used in step one and when alerted, accept the warning that you want to use a bucket from a different AWS account. An important point to make here when configuring the bucket selection is to ensure that you use the same prefix as the one you used when you configured the bucket in the first step. That is unless you intend to edit the bucket policy to allow CloudTrail to write to the location of a new prefix you wish to use. When you have configured your Trail, click create and your new Trail will now deliver it's log files to the S3 bucket in your AWS account used in the first step. Again, this is a great solution that allows you to essentially manage all of your CloudTrail logs in one single account and S3 bucket. However, there may be uses such as system administrators who manage the other AWS accounts where the logs have come from that might need access to data within these logs. So how would they gain access to the S3 bucket to allow them only to access their CloudTrail logs that originated from their AWS account?
It could be done quite easily by configuring a few elements within IAM. Firstly, in the master account, IAM Roles would need to be created for each of the other AWS accounts requiring read access. Secondly, a policy would need to be assigned to those Roles allowing access to the relevant AWS account logs only. Lastly, users within the requesting AWS accounts would need to be able to assume this Role to gain read access for their CloudTrail logs. The easiest way to show you how to configure the permissions required is by a demonstration while I shall perform the following steps: I shall create a new Role. Apply a policy to this Role to only allow access for AWS account B's folder in S3. Show the Trust Relationship between AWS account A and B. I will then create a new IAM user in account B. And create a Policy and apply the sts:AssumeRole permissions to this user allowing them to assume the new Role we created in account A. So let's take a look at how and where we apply these permissions.
Start of demonstration
Okay, so, as I just said the first thing we need to do is create a new Role in our primary account. So, if we go across to IAM, which is under Security, Identity and Compliance and then once that's loaded, we need to go across to Roles and then Create New Role. So let's give this Role a name. I'll call it 'Cross Account CloudTrail' Click on Next Step. We then need to select a Role type, and what we want to do is select Role for Coss-Account access because we will allow users in another AWS account to access the log files in this primary AWS account. And this will set up the trust relationship between this account and then my secondary account. So, for that we will select this top option of providing access between AWS accounts you own. Then next, I'll need to enter the secondary account ID that I want to create the trust relationship with. So I'll just enter that number.
Okay, then after you have entered your account ID click on next step. And now we need to attach a Policy to this Role. Prior to this demo, I set up my own Policy and this allows cross-account access to read only from my secondary account to the bucket on this primary account. But I'll explain this Policy in a few moments and I'll show exactly what it contains. And from here click on next step and this is just a review of the Role. So we have the Role name, the ARN, the Amazon Resource name, the trusted entities. So this is the secondary account ID that I entered, and then the actual Policy and then the link that we can give to users in the secondary account to allow them to switch Roles. So, create Role. And there we go, the cross-account CloudTrail Role that we just created. So, let's take a look at this.
Firstly, I'll show the trust relationships. So, because we added a cross-account Role access and then we entered the secondary AWS account ID, we can see that this account is trusted by our primary account and that allows entities in this account to assume this Role. Now, I mentioned earlier that I previously set up a Policy with permissions in. So let's take a look at that Policy. I named it Cross-Account read only for CloudTrail. So if I show the Policy, as you can see it. I made a very small Policy. Very simple. Now we have an effect of allow which will allow any S3:Get and any S3:List command, so essentially, read only access on this resource here specified by this line. Now this resource links to the bucket and folder where CloudTrail logs are delivered for our secondary account as you can see here. So, essentially, what this Policy does is allow read only access to any folders within the secondary account's CloudTrail log folders. So this account won't be able to access any other accounts CloudTrail logs, which is important. So, if we come out of this. So let's just have a quick recap of what we've achieved so far.
So, so far, what we've done: We've created a Role in our primary account for our secondary account access. And we've also assigned an access Policy to this Role in order for the secondary AWS account to access the relevant folder in S3. So now what we need to do is assign a user in the secondary account and then apply the permissions to that user to enable them to assume the new Role in the primary account. So let's go ahead and do that.
Okay, so I've now logged into the secondary account where I'll need to create a new user and assign the correct permissions. So, to start with, I'm going to set up a permission Policy to assign to the user. So if I go down to Security, Identity and Compliance and select IAM. And then go across to Policies, and from here I want to create a new Policy. And I am going to create my own Policy, so I'm going to select the bottom option. I'm going to call this AssumeRoleforCloudTrail And description will be Assume role in primary AWS account. And for the Policy document, I'm going to paste in a Policy that I've already created. As you can see, it's only a very small Policy again. And we have an allow effect that allows the Assume Role action from the security token service against the following resource, and this resource links back to a Role on our primary account where we created the Role cross-Account CloudTrail.
So this Policy will allow the user to assume this Role in the primary account. So let's go ahead and create that Policy. Let's validate it first. And then create. Now what we need to do is to assign a user to use that Policy. Now, I created a new user earlier prior to this demo. So, let's just find our new Policy that we just created. And here it is at the bottom, AssumeRoleforCloudTrail. And I'm going to attach a user. And I've called our user CloudTrailUser1. And then attach Policy. And there we go.
So we now have one user attached to this Policy. So that's all the actions and steps necessary to allow a user in a secondary account to access CloudTrail log files that have been delivered to an S3 bucket in a primary account. And it would do this by using the permission Policy that we just applied to that user to access the Role in the primary account. And that Role has a Policy attached that allows S3 read access to it's own CloudTrail logs.
End of demonstration
CloudTrail allows you to enable a feature called Log File Integrity Validation. Which simply allows you to verify that your log files have remained unchanged since CloudTrail delivered them to your chosen S3 bucket. This is typically used for security and forensic investigations where by the integrity of the log files are critical to confirm that they have not been tampered with in any way.
When a log file is delivered to an S3 bucket a hash is created for it by CloudTrail. A hash file is a set of characters that are unique that are created from a data source. In this case, the log file. The hashing algorithms used by CloudTrail are SHA-256. In addition for a hash for every log file created,
CloudTrail creates a new file every hour, called a digest file, which is used to help verify your log files have not changed. The digest file contains details of all the logs delivered within the last hour along with a hash for each of them. These files are stored in the same bucket as the key pair. When it comes to verifying the integrity of your log files, the public key of the same key pair is used to programmatically check that the logs have not been tampered with in any way. Verification of the log files can be achieved via a programmatic access and not via the console. Using the AWS CLI, this can be checked by issuing the following command. The folder structure for the digest is very similar to the CloudTrail logs, as you can see. But the digest files are clearly distinguishable by the CloudTrail digest folder.
That has now taken me to the end of this lecture. Coming up next, I'll explain how you can use CloudTrail and CloudWatch together as a monitoring solution.
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 90+ courses relating to Cloud reaching over 140,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.