Application Monitoring and Alerting
Platform Monitoring and Alerting
In this lesson, you will learn how to monitor the health of Azure using Azure Health.
We will discuss the three levels of information available through Azure Health:
- Azure Status: a dashboard displaying the global health of different Azure components.
- Azure Service Health: similar to Azure Status but focused only on your Azure resources.
- Azure Resource Health: provides the status of specific instances of Azure resources.
In addition to data reviews, we will discuss analysis and system events using two other resources:
- Azure Advisor: examines all of your Azure resources and identifies ways to optimize them. It focuses on security, availability, performance, and cost.
- Azure Activity Log: a subscription log that records events using data from Azure Resource Manager. It records six specific types of events:
o Administrative
o Service Health
o Security
o Alert
o Autoscale
o Recommendation
You will learn how to access Azure Activity Log from a wide range of tools such as:
- CLI tools
- Azure Monitor REST API
We will also go over the other capabilities the Activity Log provides, including:
- Store data offline and query it with scripts
- Set alerts against specific types of event
- Stream Activity Log data to external event hubs
Finally, we will cover the limitations of the Activity Log.
Our first monitoring priority is just tracking the health of Azure itself. Azure Health is your first line of defense for identifying problems with Azure platform resources. It is the highest-level view of your cloud system and is generally the best place to start if you are experiencing problems and have no other leads.
Azure Health offers three basic levels of information about your infrastructure. At the highest level is Azure Status, which is simply a dashboard with information about the health of all of Azure’s different components. This information is not specific to your account; rather, it will tell you if there is a global issue with a particular Azure service, such as virtual machines or storage. A quick check of the Azure Status dashboard is the fastest way to rule out provider-level problems when debugging failures.
The next level is Azure Service Health. It is similar to Azure Status but focused only on Azure resources in your account. Like Azure Status, it is a dashboard, only this one is customizable. You can arrange elements to focus on specific Azure products and regions. Also like Azure Status, it will give you a general health check for each product category and inform you of issues that need your attention. Azure Service Health will notify you of planned maintenance that may affect your system. It will also warn you if you are approaching a resource quota or if a feature you use is about to be deprecated. After a quick check of the Azure Status dashboard, Azure Service Health is often the next best place to look when you are in the middle of responding to some kind of system failure.
Finally, you have Azure Resource Health, the most granular level of inspection in the Azure Health suite. With Azure Resource Health you can get the status of specific instances of Azure resources, such as a single VM. It will let you know if a given resource is unavailable, and it keeps a very useful log of platform events. You can see if an Azure platform SLA was violated at any point. With its very simple status messages and historical data, Azure Resource Health is the simplest way to monitor individual resources from the Azure platform’s perspective.
The Azure Health set of services gives us a lot of information to track our system’s health over time. However, we often need more than just data; we need some analysis and a more granular view of system events. To address these two issues we will use Azure Advisor and Azure Activity Log, respectively.
Azure Advisor, as the name implies, is a personalized cloud consultant. It automatically examines all of your Azure resources and identifies ways to optimize them. It will spit out tons of recommendations automatically, focusing on security, availability, performance, and cost. The nice thing about Advisor is that it’s really simple to use. Literally all you do is click “Advisor” in the left-pane menu and you’ll get a summary of recommendations. You can then drill down into any of the four categories I just mentioned. So, for example, you might get a cost recommendation telling you to scale down your storage provisioning if you are not actually using much of it.
So Azure Health has given us copious amounts of information about our system and Advisor continually helps us optimize. The final component is tracking system changes over time. For this we have Azure Activity Log.
Azure Activity Log is a subscription log that records events using data from Azure Resource Manager. It records six specific types of events: Administrative, Service Health, Security, Alert, Autoscale, Recommendation. There is also a seventh category called “Policy and Resource Health,” but as of publication of this course, it is not in use.
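To make the categories concrete, here is a minimal sketch of how you might tally exported Activity Log events by category. The sample records are hypothetical, and the record shape (a `category` field that is either a plain string or an object with a `value` key) is an assumption about the exported data, so check it against your own export before relying on it.

```python
from collections import Counter

# The six event categories named in this lesson.
ACTIVITY_LOG_CATEGORIES = (
    "Administrative",
    "ServiceHealth",
    "Security",
    "Alert",
    "Autoscale",
    "Recommendation",
)


def count_by_category(events):
    """Count events per category; anything unrecognized goes under 'Other'."""
    counts = Counter()
    for event in events:
        category = event.get("category")
        # Assumption: the category may be nested as {"value": "..."}.
        if isinstance(category, dict):
            category = category.get("value")
        if category in ACTIVITY_LOG_CATEGORIES:
            counts[category] += 1
        else:
            counts["Other"] += 1
    return counts


# Hypothetical records standing in for events loaded from an offline export.
sample_events = [
    {"category": {"value": "Administrative"}, "operationName": "create-vm"},
    {"category": {"value": "Administrative"}, "operationName": "delete-disk"},
    {"category": "Autoscale", "operationName": "scale-out"},
    {"category": {"value": "SomethingElse"}, "operationName": "unknown"},
]

summary = count_by_category(sample_events)
for name, total in sorted(summary.items()):
    print(f"{name}: {total}")
```

A tally like this is a quick way to see which kinds of events dominate a given time window before you dig into individual entries.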
Azure Activity Log is accessible from a wide range of tools. You can get event data from the console, PowerShell, CLI tools, and the Azure Monitor REST API. You can do a lot of useful things aside from just viewing system event data in the dashboard. You can store the data offline and query it programmatically with scripts. You can set alerts against specific types of events. You can also stream Activity Log data to external event hubs and on to a third-party analytics tool or monitoring system.
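As an illustration of the REST API route, the sketch below only builds the request URL for a time-bounded Activity Log query; it does not send anything. The endpoint path, the `api-version` value, and the `$filter` syntax reflect my understanding of the Azure Monitor REST API and should be verified against the current documentation. The subscription ID is a placeholder, and a real call would also need an Azure AD bearer token in the `Authorization` header, which is not shown here.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote

MANAGEMENT_ENDPOINT = "https://management.azure.com"
API_VERSION = "2015-04-01"  # assumption: confirm against the current API docs


def build_activity_log_url(subscription_id, start, end):
    """Build a URL that lists Activity Log events between two aware datetimes."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_utc = start.astimezone(timezone.utc).strftime(fmt)
    end_utc = end.astimezone(timezone.utc).strftime(fmt)
    odata_filter = (
        f"eventTimestamp ge '{start_utc}' and eventTimestamp le '{end_utc}'"
    )
    return (
        f"{MANAGEMENT_ENDPOINT}/subscriptions/{subscription_id}"
        "/providers/microsoft.insights/eventtypes/management/values"
        f"?api-version={API_VERSION}&$filter={quote(odata_filter)}"
    )


# Placeholder subscription ID and a fixed one-day window for illustration.
end = datetime(2024, 1, 2, 0, 0, 0, tzinfo=timezone.utc)
start = end - timedelta(days=1)
url = build_activity_log_url("00000000-0000-0000-0000-000000000000", start, end)
print(url)
```

Keeping the URL construction in its own function makes it easy to reuse from a scheduled script that saves the results offline for later querying.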
Activity Log is a very flexible tool and is very simple to use. That said, it is important to understand its limitations. It is not a replacement for proper application logging and monitoring. Activity Log is primarily concerned with Azure Resource Manager events. Some of the older “classic” type resources will not by default send events to Activity Log. You will need proxy resource providers to make the operations appear in Activity Log.
So now we are at the end of our dive into platform resource monitoring. You should now have a basic understanding of how Azure can monitor its own assets, track events over time, and automatically help us use resources most efficiently. So next we move on to all of that space in between our Azure resources: the network. In the next lesson we will learn how to properly monitor our network infrastructure in Azure.
Let’s get to it.
Jonathan Bethune is a senior technical consultant working with several companies including TopTal, BCG, and Instaclustr. He is an experienced devops specialist, data engineer, and software developer. Jonathan has spent years mastering the art of system automation with a variety of different cloud providers and tools. Before he became an engineer, Jonathan was a musician and teacher in New York City. Jonathan is based in Tokyo where he continues to work in technology and write for various publications in his free time.