This is the third and final installment of our coverage on AWS CloudWatch Logs. In the first two parts, we saw how different sources of logs can be redirected to CloudWatch. Here, we will see what we can do with those logs once they are centralized.
Many businesses and government agencies need to keep their application logs available for a specified period of time irrespective of their business value. This is commonly necessary for organizational compliance with government legislation or industry regulations/practices.
CloudWatch logs can be retained either indefinitely or for a fixed period of up to ten years, which is a safe retention window for most purposes. Retention is set at the log group level, so all log streams under a log group share the same retention setting.
By default, log groups are created with a “Never Expire” setting. Clicking on the “Never Expire” link will open a dialog box where you can set the retention. As you can see below, the granularity starts from 1 day and goes up to ten years:
Once set, any log data older than the retention period will be deleted automatically.
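The same setting can also be applied from code. The following is a minimal sketch, assuming the boto3 SDK is installed and AWS credentials are configured. The list of allowed values is a subset of the retention periods we know the API accepts, so check the current documentation before relying on it. The helper names `nearest_retention` and `set_retention` are ours, not part of any AWS library.

```python
import bisect

# A subset of the retention periods (in days) accepted by CloudWatch Logs.
# This list is an assumption for illustration; the service may accept more.
ALLOWED_RETENTION_DAYS = [
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180,
    365, 400, 545, 731, 1827, 3653,
]


def nearest_retention(days):
    """Return the smallest allowed retention period that covers `days`."""
    if days < 1:
        raise ValueError("retention must be at least 1 day")
    index = bisect.bisect_left(ALLOWED_RETENTION_DAYS, days)
    if index == len(ALLOWED_RETENTION_DAYS):
        raise ValueError("beyond ten years: keep the 'Never Expire' setting")
    return ALLOWED_RETENTION_DAYS[index]


def set_retention(log_group_name, days):
    """Apply the retention policy to one log group (needs AWS credentials)."""
    import boto3  # imported lazily so nearest_retention works without boto3

    logs = boto3.client("logs")
    logs.put_retention_policy(
        logGroupName=log_group_name,
        retentionInDays=nearest_retention(days),
    )
```

For example, a compliance requirement of 45 days would be rounded up to the 60-day setting, since retention can only be set to one of the predefined periods.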
CloudWatch Logs would not be of much use if they were only a place for capturing and storing data. The real power of log management comes when you find critical clues, visualize trends, or receive proactive alerts from thousands of lines of logged events.
Searching for a particular piece of information in a large log file can be a daunting task. This is especially true for applications that generate logs in JSON or other custom formats. Although there are both free and commercial log management solutions on the market, AWS CloudWatch offers comparable capabilities in this area. These include:
We will start with metric filters and see how they can be used to search within log data. A metric filter is essentially a search pattern whose matches are published as data points in a custom metric.
If you have worked with CloudWatch before, you may remember there are out-of-the-box metrics for services like EC2, EBS or RDS. With metric filters, we can create our own metric for a log group.
In our test case, we decided to search for the text “Error” in our SQL Server log files which are being sent to CloudWatch. For more information on how this was done, refer to the second part of this series. Metric filter search strings are case sensitive, so “error” and “ERROR” would return two different search results. Similarly, the position of the searched string can be anywhere within a logged event. We can’t tell CloudWatch to search at the beginning, middle or end of a line.
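To make the matching behaviour concrete, here is a rough local approximation of a single-term filter in Python. It is not the CloudWatch matching engine, only a sketch of the two properties described above: the match is case sensitive and can occur anywhere in the event. The sample messages are made up for illustration and are not real SQL Server log lines.

```python
def matches_term(event_message, term):
    """Approximate a single-term filter: case sensitive, matches anywhere."""
    return term in event_message


# Made-up messages for illustration only.
sample_events = [
    "job step 1: Error raised by the simulated failure",
    "job step 2: error raised in lowercase",
    "job step 3: ERROR raised in uppercase",
    "job step 4: completed without problems",
]

hits = [message for message in sample_events if matches_term(message, "Error")]

for message in hits:
    print(message)
```

Only the first message is selected. To count the other two spellings as well, you would need a pattern that covers each variant.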
To generate errors, we created a SQL Server Agent job that would simulate an error condition every 15 seconds, and log a message.
The following steps show how to create a metric filter.
Step 1. Select the SQL Server log group in the CloudWatch Logs console and click the “Create Metric Filter” button:
Step 2. In the next screen:
Step 3. In the following screen:
Once you click on the “Create Filter” button, the metric will be created. This is visible from the log group’s properties:
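The console steps can also be scripted. Below is a minimal sketch using boto3's `put_metric_filter` call, assuming boto3 is installed and credentials are configured. The metric name and namespace match the ones used in this post. The filter name and the log group name passed in are placeholders that you would replace with your own.

```python
def metric_filter_request(log_group_name):
    """Build the arguments for a filter that counts events containing 'Error'."""
    return {
        "logGroupName": log_group_name,
        "filterName": "SQL_Server_Errors",  # placeholder filter name
        "filterPattern": "Error",           # case-sensitive search term
        "metricTransformations": [
            {
                "metricName": "SQL_Server_Errors",
                "metricNamespace": "LogMetrics",
                "metricValue": "1",  # each matching event adds 1 to the metric
            }
        ],
    }


def create_metric_filter(log_group_name):
    """Create the metric filter on the log group (needs AWS credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    logs = boto3.client("logs")
    logs.put_metric_filter(**metric_filter_request(log_group_name))
```

Keeping the request in a separate function makes it easy to review or unit test the filter definition before anything is sent to AWS.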
Clicking on the “1 filter” link will take you to the metric’s page:
From here, clicking on the metric’s name (“SQL_Server_Errors” in our example) takes you to the custom metric’s page. When you select the metric, its graph is shown in the lower half of the screen just like any other CloudWatch metric:
From the graph in this particular image, you can see there have been some instances of the phrase “Error” in the SQL Server logs. You can create an alarm for this metric from the graph. Such an alarm can send you an e-mail when the number of occurrences exceeds a specified threshold.
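An alarm on the custom metric can be defined in code as well. This sketch uses boto3's `put_metric_alarm` and assumes that e-mail delivery goes through an SNS topic with an e-mail subscription. The topic ARN, alarm name, threshold and period are all illustrative values rather than settings taken from our test setup.

```python
def error_alarm_request(sns_topic_arn, threshold=5, period_seconds=300):
    """Build arguments for an alarm on the SQL_Server_Errors custom metric."""
    return {
        "AlarmName": "SQL_Server_Errors-high",  # placeholder alarm name
        "Namespace": "LogMetrics",
        "MetricName": "SQL_Server_Errors",
        "Statistic": "Sum",                # total matches per period
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no log matches means no alarm
        "AlarmActions": [sns_topic_arn],     # SNS topic that sends the e-mail
    }


def create_error_alarm(sns_topic_arn):
    """Create the alarm (needs AWS credentials and an existing SNS topic)."""
    import boto3  # imported lazily so the request builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**error_alarm_request(sns_topic_arn))
```

With these values the alarm fires when more than five matching events are logged within a five-minute window.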
You can also create dashboards from metrics you create on your log groups. These dashboards essentially show the same type of graph we just saw. If you think about it, CloudWatch log management now offers a whole new way of systems monitoring where you can have:
The following 5 steps show how we created a dashboard for plotting the trend of errors in our SQL Server log.
Step 1. Click on the “Dashboards” link from the CloudWatch console. This will open up an empty page. From here, click on the “Create Dashboard” button, then provide a name for the dashboard.
Step 2. Once the empty dashboard is created, click on its name from the CloudWatch Dashboards console. In the next screen, click on the “Add widget” button with the dashboard selected:
Step 3. A dialog box appears which gives you two options: one to add a metric graph, the other to add a text widget. With the metric graph selected, click on the “Configure” button:
Step 4. From the “Add Graph” screen, choose the namespace from the custom metrics drop-down list. In our case, it was “LogMetrics.”
Step 5. In the next screen, select the metric name from the drop-down list and click on the “Add widget” button. This will create the dashboard as shown below. It can be resized and its time axis can be scaled for different intervals. Once you are happy with how it looks, save the dashboard.
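A dashboard like this can also be created programmatically. The sketch below builds a dashboard body with a single metric widget and passes it to boto3's `put_dashboard`. The widget size, title, statistic and period are our own choices for illustration. The dashboard name and region are parameters you would supply.

```python
import json


def dashboard_body(region):
    """Return the JSON body for a dashboard with one error-count graph."""
    widget = {
        "type": "metric",
        "x": 0,
        "y": 0,
        "width": 12,
        "height": 6,
        "properties": {
            # One [namespace, metric name] entry per plotted line.
            "metrics": [["LogMetrics", "SQL_Server_Errors"]],
            "stat": "Sum",
            "period": 300,
            "region": region,
            "title": "SQL Server errors",  # illustrative title
        },
    }
    return json.dumps({"widgets": [widget]})


def create_dashboard(name, region):
    """Create or overwrite the dashboard (needs AWS credentials)."""
    import boto3  # imported lazily so dashboard_body works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_dashboard(
        DashboardName=name,
        DashboardBody=dashboard_body(region),
    )
```

Because the body is plain JSON, it can be kept in version control and reused across accounts or regions.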
It’s also possible to use CloudWatch Logs as a data source for other AWS services like S3, Lambda or Elasticsearch. Sometimes you may want to export data to S3 for further analysis with a separate tool, or load data into a Big Data workflow. The export process is fairly simple: just select the log group from the CloudWatch Logs console and select the “Export data to Amazon S3” option from the “Action” menu:
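The export can also be started from code with boto3's `create_export_task`, which takes the time range as milliseconds since the Unix epoch. The sketch below is written under a few assumptions: the bucket already exists and its policy allows CloudWatch Logs to write to it, and the task name and prefix are placeholders of our own.

```python
from datetime import datetime, timezone


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return int(moment.timestamp() * 1000)


def export_task_request(log_group_name, bucket, start, end,
                        prefix="sql-server-logs"):
    """Build the arguments for exporting one time range of a log group."""
    task_suffix = log_group_name.strip("/").replace("/", "-")
    return {
        "taskName": f"export-{task_suffix}",  # placeholder naming scheme
        "logGroupName": log_group_name,
        "fromTime": to_epoch_millis(start),
        "to": to_epoch_millis(end),
        "destination": bucket,                # S3 bucket name
        "destinationPrefix": prefix,          # placeholder key prefix
    }


def start_export(log_group_name, bucket, start, end):
    """Start the export task (needs AWS credentials and bucket permissions)."""
    import boto3  # imported lazily so the request builder works without boto3

    logs = boto3.client("logs")
    return logs.create_export_task(
        **export_task_request(log_group_name, bucket, start, end)
    )
```

The export runs asynchronously, so the exported objects may take some time to appear in the bucket after the call returns.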
You can use the same menu to export logs to AWS Lambda or Elasticsearch. These services can further analyze or process the data or pass it on to other downstream systems. For example, a Lambda function can subscribe to an AWS CloudWatch log group and send its data to a third-party log manager whenever a new record is added.
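When a Lambda function is subscribed to a log group, CloudWatch Logs delivers each batch of events as base64-encoded, gzip-compressed JSON under the `awslogs.data` key. The following is a minimal sketch of a handler that unpacks such a batch. Forwarding to a third-party log manager is left out, because the endpoint and its API would depend on the tool you use.

```python
import base64
import gzip
import json


def decode_log_event(event):
    """Unpack the compressed payload CloudWatch Logs sends to Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def handler(event, context):
    """Lambda entry point: extract the messages from one delivered batch."""
    payload = decode_log_event(event)
    messages = [entry["message"] for entry in payload.get("logEvents", [])]
    # A real function would forward `messages` to the downstream system here.
    return {"logGroup": payload.get("logGroup"), "messages": messages}
```

The handler can be exercised locally by compressing and encoding a sample payload in the same way, without deploying anything to AWS.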
AWS CloudWatch has come a long way since its inception, and CloudWatch Logs will probably only improve in future releases. Despite the obvious benefits to AWS customers, there are some limitations:
Many organizations serious about log management will probably build their own syslog server, or use a third-party solution, like Loggly or Splunk. However, it’s still possible to continue using AWS CloudWatch Logs by integrating it with a third-party tool. Most of these solutions already have a CloudWatch connector that can be used to ingest logs from it. For companies or engineering teams still not using a centralized log manager, CloudWatch offers a viable, low-cost alternative.
Cloud Academy offers a hands-on lab, Introduction to CloudWatch, that may be useful in applying many of the concepts I have covered. Cloud Academy also offers a 7-day free trial so that you can explore Courses, Hands-on Labs, Quizzes, and Learning Paths. It is worth checking regularly, as the resources are expanding on a weekly basis. If you have not already done so, we encourage you to review the previous two posts of this series, and we hope you gain valuable insight from this multi-part coverage. Your comments and feedback are welcome.