Amazon RDS: Monitoring
This introductory course provides a solid foundation in monitoring Amazon RDS using AWS tools.
It begins by getting you acquainted with monitoring databases hosted on the Amazon RDS service and then moves on to explore the available AWS tools that can be used for this purpose.
If you have any feedback relating to this course, please reach out to us at firstname.lastname@example.org.
- Learn about database monitoring in AWS
- Understand how monitoring databases in the cloud differs from monitoring them on-premises
- Understand the AWS tools available inside RDS for monitoring
- Become aware of the AWS infrastructure monitoring tools that can be used to monitor RDS
This course is intended for anyone who is new to database monitoring — or monitoring in general — and needs to monitor databases hosted in Amazon RDS.
To get the most out of this course, you should have a basic knowledge of cloud computing (Amazon Web Services in particular) and have a high-level understanding of how relational databases work.
What does it mean to monitor a database? Why should it be monitored at all? I'd like to start with a quick discussion of what monitoring is. Monitoring is a general term and its meaning depends on its context. In information technology, it refers to the process of becoming aware of the state of a system.
As with monitoring, the meaning of the word "state" depends on the context in which it is used. The overall database state could be online, offline, or recovering. When monitoring performance, state could be about the speed of queries or CPU utilization. The state of storage could be available disk space, the number of input/output operations per second, or throughput. State awareness is a process that is both proactive and reactive.
Proactive monitoring is a type of surveillance. It involves watching visual indicators, such as time-series data in dashboards. The techniques used in monitoring systems like relational databases include real-time processing, statistics, and data analysis. Reactive monitoring uses automation to trigger notifications when a system's state has deviated significantly from a baseline. This is often called alerting.
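To make the idea of reactive monitoring concrete, here is a minimal sketch in Python. It assumes that "deviated significantly" means more than three standard deviations from the baseline mean. That threshold, the function name, and the sample CPU readings are all illustrative assumptions, not something prescribed by AWS or by this course.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline, value, max_sigmas=3.0):
    """Return True when value is more than max_sigmas standard
    deviations away from the mean of the baseline samples."""
    center = mean(baseline)
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return value != center
    return abs(value - center) > max_sigmas * spread


# Hypothetical CPU utilization samples (percent) gathered during normal operation.
baseline_cpu = [22.0, 25.0, 24.0, 23.0, 26.0, 24.0, 25.0, 23.0]

for reading in (27.0, 91.0):
    if deviates_from_baseline(baseline_cpu, reading):
        print(f"ALERT: CPU at {reading}% is far outside the baseline")
    else:
        print(f"ok: CPU at {reading}% is within the baseline")
```

In a real monitoring system the print statement would be replaced by a notification, and the baseline would be refreshed as the workload changes.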
The collection of software components used for data collection, its processing, and its presentation is referred to as a monitoring system. Whether migrating existing databases to AWS or building database solutions directly in the cloud, the development of a monitoring system promotes efficient and cost-effective operation.
One way to think about monitoring is to compare it to an intensive care unit in a hospital. Every patient is being monitored, and while most will have the same vital signs being checked, their values and the point at which the alarm sounds are unique to each patient. The acceptable heart rate and blood pressure for one person might be dangerous for another.
The same is true for monitoring databases and workloads inside AWS. Cloud monitoring requires a customizable and flexible approach based on business needs. In a development environment, I might not care about disk space utilization. I'm gonna delete that data every night. In production, I'm going to care, and I need to know immediately if there's an issue. Cloud computing is fluid. It changes rapidly and no solution will last forever.
Monitoring solutions drive change, and also have to adapt to the iterative nature of the cloud.
In the cloud, change is a constant. Adapting to change can be difficult, but it doesn't have to be. Personally, when I hear someone say that change is hard or bad, I have to wonder if they ever change their socks. Changing my socks happens with some regularity. It's predictable or, more accurately, it's predictable until it isn't. When it rains and I step in a puddle, the change happens sooner than I expected. Monitoring systems can't predict the weather, but they can help identify climate changes, send alerts, and help automate responses, like having a spare pair of socks ready. Monitoring is the process of discovering changes in a system's state. Database monitoring specifically is a key part of maintaining a relational database's reliability, availability, performance, and efficiency.
Efficiency is notable because, in the cloud, costs are calculated based on consumption. Improper resource allocation results in money being wasted. However, some waste is to be expected. There's usually a difference between expected usage and actual consumption. Over time, having a monitoring plan brings that usage and consumption into alignment.
AWS has tools for monitoring general cloud usage. Some services, like RDS, have a variety of built-in tools for monitoring specific aspects of usage. They can be used alone or in conjunction with one or more of the other monitoring services to get a larger picture of what's happening inside a cloud environment.
Inside RDS, some of the monitoring systems are on by default and their costs are included as part of the service. These services include Amazon CloudWatch, detailed monitoring, RDS Enhanced Monitoring, and Performance Insights.
There are additional AWS services that can be activated and configured based on need. The cost of these additional services varies. It depends on usage and on the related resources, such as storage, that are consumed.
While outside the scope of this course, it is also possible to use third-party solutions or use the AWS APIs to create custom monitoring and reporting.
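For readers curious about what using the AWS APIs for custom monitoring might look like, here is a hedged sketch in Python. It builds a request for the CloudWatch `get_metric_statistics` operation against the `AWS/RDS` namespace. The helper names, the `example-db` identifier, the one-hour window, and the five-minute period are assumptions chosen for illustration. The function that actually calls AWS requires the third-party boto3 SDK plus configured credentials, so it is defined but not invoked here.

```python
from datetime import datetime, timedelta, timezone


def cpu_metric_request(db_instance_id, hours=1, period_seconds=300):
    """Build keyword arguments for CloudWatch get_metric_statistics
    that ask for the CPU utilization of one RDS instance."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Average", "Maximum"],
    }


def fetch_cpu_datapoints(db_instance_id):
    """Call CloudWatch and return datapoints sorted by time.
    Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the rest of the sketch runs without the SDK

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**cpu_metric_request(db_instance_id))
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


# "example-db" is a placeholder instance identifier, not a real resource.
request = cpu_metric_request("example-db")
print(request["Namespace"], request["MetricName"], request["Period"])
```

Separating the request builder from the network call keeps the parameters easy to inspect and reuse for other metrics.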
A database's state is monitored using metrics. However, what exactly is a metric? A good working definition of a metric is that it is the measure of a variable, something that changes over time. To be a little more specific, a metric is a set of numbers that gives insight into a specific process or activity. Metrics represent activity numerically as time-series data, numbers collected over time. This provides data for mathematical modeling and prediction. What aspects of your database do you need to manage? CPU utilization, number of concurrent users, disk space, IOPS? These can all be represented numerically over time.
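As a small illustration of a metric as time-series data, the Python sketch below pairs each reading with a timestamp and then summarizes the series. The `DataPoint` class, the `summarize` helper, and the IOPS values are made up for this example. They are not part of any AWS API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass(frozen=True)
class DataPoint:
    """One observation of a metric: when it was taken and its value."""
    timestamp: datetime
    value: float


def summarize(points):
    """Reduce a series of data points to its minimum, maximum, and average."""
    values = [point.value for point in points]
    return {"min": min(values), "max": max(values), "avg": mean(values)}


# Hypothetical IOPS readings taken one minute apart.
start = datetime(2024, 1, 1, 9, 0)
iops_readings = [410.0, 455.0, 430.0, 980.0, 445.0]
series = [
    DataPoint(start + timedelta(minutes=offset), reading)
    for offset, reading in enumerate(iops_readings)
]

summary = summarize(series)
print(summary)
```

Keeping the timestamp with every value is what makes later comparison, modeling, and prediction possible.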
Once you decide what needs to be watched, you'll know what to measure and how often to do it. These measurements, observable data that can be watched over time, are metrics. Most organizations have objectives, or benchmarks, related to how a database should perform in production. To track against these benchmarks, database administrators gather performance data from the database, store it, and then compare values over time. Database instance metrics include CPU and memory utilization, disk space utilization, and query response times. There is no single set of metrics that is best for all use cases. It depends on an organization's needs. This is why, before you start collecting metric data, you should have a plan.
Having a systematic monitoring plan, no matter how simple, can help identify and address potential issues with your database before they become incidents that cost time and money. Designing and implementing a monitoring strategy can be an involved process. It is important to choose the appropriate metrics. Doing this is vital to ensuring a database is performant, secure, and available.
When making a monitoring plan, consider these topics: service level agreements, incident types, and escalation paths. A service level agreement, or SLA, defines expectations. It defines how the performance of a service or application is measured and includes penalties for times when those expectations fail to be achieved. It will also establish customer expectations around availability and should include times for scheduled maintenance. In the context of database performance monitoring, having accurate measurements is the difference between when something feels slow and when there's an actual performance issue.
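An availability expectation in an SLA can be turned into a concrete number of minutes. The Python sketch below is a worked example under stated assumptions: a 30-day measurement period, and scheduled maintenance excluded from the measured time. Real agreements define these details themselves, and the function names here are hypothetical.

```python
def downtime_budget_minutes(availability_percent, period_days=30, maintenance_minutes=0.0):
    """Minutes of unplanned downtime allowed in the period.
    Assumes scheduled maintenance does not count as measured time."""
    measured_minutes = period_days * 24 * 60 - maintenance_minutes
    return measured_minutes * (100.0 - availability_percent) / 100.0


def sla_met(unplanned_downtime_minutes, availability_percent,
            period_days=30, maintenance_minutes=0.0):
    """Return True when the observed downtime fits inside the budget."""
    budget = downtime_budget_minutes(availability_percent, period_days, maintenance_minutes)
    return unplanned_downtime_minutes <= budget


# A 99.9% target over 30 days leaves roughly 43.2 minutes of unplanned downtime.
print(round(downtime_budget_minutes(99.9), 1))
print(sla_met(30, 99.9))
print(sla_met(60, 99.9))
```

Having the budget as a number is what separates "it felt unavailable" from "the agreement was actually missed."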
Metrics need to have meaning. They must serve a purpose. Vanity metrics, ones that are captured because they seem interesting, are like dandelions. They look pretty, but in reality, they're simply weeds using resources that add zero value. Data points are abundant, but gaining insight from them can be challenging. Insights are generally defined as actionable, data-driven findings that create or add value to a process or a system. Insights derived from the intelligent use of data are powerful. The lesson here is to be sure to choose metrics that allow you to take actions that improve value. This value could be cost efficiency, performance enhancements, or tightened security.
A monitoring plan will define what is normal. A quick reminder that normal is relative. It is not an absolute value. Normal is also a range. Having baseline measurements for comparison is an important part of a monitoring plan.
This baseline defines what activity is normal for your organization. The challenge when creating a baseline is deciding which metrics have a high level of importance and what the acceptable ranges are. These acceptable ranges become the definition of normal and can be put into a service level agreement, or SLA. Having an SLA helps define what should be monitored and what is expected of normal behavior, and it aids in the development of a responsibility assignment matrix. You cannot always control the data, but you can control what you care about and how you react to changes.
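One possible way to turn baseline measurements into an acceptable range is to take percentiles of historical data. The Python sketch below assumes the 5th and 95th percentiles (nearest-rank method) mark the edges of normal. That choice, the helper names, and the query response times are illustrative assumptions. An organization would pick the boundaries that match its own needs.

```python
def percentile(sorted_values, pct):
    """Nearest-rank percentile for a whole-number pct between 1 and 100.
    Integer arithmetic avoids floating-point rounding surprises."""
    rank = (pct * len(sorted_values) + 99) // 100  # ceiling of pct * n / 100
    return sorted_values[max(rank, 1) - 1]


def normal_range(samples, low_pct=5, high_pct=95):
    """Derive an acceptable (low, high) range from historical samples."""
    ordered = sorted(samples)
    return percentile(ordered, low_pct), percentile(ordered, high_pct)


def is_normal(value, acceptable):
    """Return True when value falls inside the acceptable range."""
    low, high = acceptable
    return low <= value <= high


# Hypothetical query response times in milliseconds, including one outlier.
history_ms = [12, 14, 13, 15, 11, 16, 14, 13, 12, 15,
              14, 13, 17, 12, 14, 15, 13, 48, 14, 9]

acceptable = normal_range(history_ms)
print(acceptable)
print(is_normal(14, acceptable), is_normal(48, acceptable))
```

Using percentiles rather than the raw minimum and maximum keeps a single outlier, such as the 48 ms reading, from stretching the definition of normal.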
A responsibility matrix is used to define, describe, and clarify the roles and responsibilities for monitoring and addressing issues with the database. RACI (R-A-C-I) is an acronym derived from the four key responsibilities that are typically included. These are the people who are responsible, accountable, consulted, and informed.
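A responsibility matrix can be kept as simple structured data and checked automatically. The Python sketch below is one hypothetical representation. The tasks and role names are invented for illustration, and the rule it checks (exactly one accountable role and at least one responsible role per task) is a commonly used RACI convention that is assumed here rather than stated in this course.

```python
RACI_CODES = {"R", "A", "C", "I"}  # Responsible, Accountable, Consulted, Informed

# Hypothetical tasks and roles; replace with your organization's own.
matrix = {
    "Respond to storage alerts": {"DBA": "R", "SysAdmin": "A", "Developer": "I"},
    "Tune slow queries": {"DBA": "A", "Developer": "R", "SysAdmin": "C"},
}


def validate(raci_matrix):
    """Return a list of problems found in the matrix (empty when it is valid)."""
    problems = []
    for task, assignments in raci_matrix.items():
        codes = list(assignments.values())
        unknown = sorted(set(codes) - RACI_CODES)
        if unknown:
            problems.append(f"{task}: unknown codes {unknown}")
        if codes.count("A") != 1:
            problems.append(f"{task}: needs exactly one accountable role")
        if codes.count("R") < 1:
            problems.append(f"{task}: needs at least one responsible role")
    return problems


print(validate(matrix))
```

A check like this can run whenever the matrix changes, so gaps in ownership are caught before an incident exposes them.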
Developing a responsibility matrix will improve communication between the stakeholders in the development chain. This will help ensure that the correct metrics are being chosen and modified as needed.
With perhaps the exception of an organization that has a small number of people, no one person is responsible for monitoring all aspects of a database. DevOps is a philosophy of software development and operations that promotes the continuous delivery of quality software that is high in value. Using this philosophy, systems administrators, database administrators, and application developers need to be jointly involved in monitoring for performance, security, and compliance.
As you build a monitoring plan, be sure to include every part of the development team. Database administrators and development teams can benefit from sharing a common set of performance data to support application databases. Whether high level or granular, the data can and should be relevant to the stakeholders' needs.
Stephen is the AWS Certification Specialist at Cloud Academy. His content focuses heavily on topics related to certification on Amazon Web Services technologies. He loves teaching and believes that there are no shortcuts to certification, but it is possible to find the right path and course of study.
Stephen has worked in IT for over 25 years in roles ranging from tech support to systems engineering. At one point, he taught computer network technology at a community college in Washington state.
Before coming to Cloud Academy, Stephen worked as a trainer and curriculum developer at AWS and brings a wealth of knowledge and experience in cloud technologies.
In his spare time, Stephen enjoys reading, sudoku, gaming, and modern square dancing.