Resource Consumption Alerts and Responses

In this course, we will learn the concepts of microservices and the Spring Framework, with a focus on microservice concerns.

Learning Objectives

  • Health Checks
  • Alerts
  • Error Handling
  • Security

Intended Audience

  • Beginner Java developers
  • Java developers interested in learning how to build and deploy RESTful web services
  • Java developers who want to develop web applications using the Spring Framework
  • Java developers who want to develop web applications with microservices
  • Java developers who wish to develop Spring Boot microservices with Spring Cloud


Prerequisites

  • Basic Java knowledge

Hello, my dear friends. So far, we have learned about health and metrics. Now we'll talk about alerting methodology and monitoring the overall resource consumption of the services. We will use the Prometheus time series database and the Grafana observability tool. Both are open source and widely used. Before moving on to Prometheus and Grafana, I want to make some changes to our services. The first is to make the services publicly accessible. For now, they can be reached only from localhost. To change that, I'm opening the server.xml file, finding the httpEndpoint element, and adding the host="*" attribute. From now on, other computers will be able to access these services. I did it this way because I will run Prometheus and Grafana as Docker containers, and I don't want to face any access problems. I also want to change the application metrics on nin-service. I've defined an application metric on the number service. It counts invocations of the number generator. Now I want to make the metrics a little more comprehensive.

I'm adding a @Timed annotation to measure the execution time of the service, adding a @Metered annotation to measure its throughput rate, and adding an extra description to the existing @Counted annotation. Okay, I changed it. Now let's run the service, send some curl requests, and list the metrics. nin-service is running now. I will send a few curl requests. Okay, done.

Now I am opening the metrics page. Because of a certificate issue, the browser asks me whether I'm sure I want to open the page. I click 'Advanced' and accept the risk. It then asks for a username and password. I enter the credentials that I defined in server.xml earlier: admin and admin. The application metrics we just defined are at the very bottom of the page. You can use them to monitor your service effectively. Now I will launch Prometheus and Grafana using the Docker Compose tool. Docker Compose simplifies deploying several applications and services to the Docker platform at once. I prepared a compose file beforehand, and I want to go over its important parts. In the docker-compose.yaml file, there are two services, Prometheus and Grafana. Under the Prometheus part, I've defined a volume that lets me inject the Prometheus configuration file.

We will have a look at that file soon. I've defined a network named network-net, which allows the services to communicate with each other. I've exposed the needed ports. Under the Grafana part, I've defined the username and password to use when I log in, and I've defined a volume that includes the Prometheus data source connection info. Okay, let's take a look at the Prometheus YAML file. It includes some configuration info. The essential part for us is scrape_configs, which specifies the sources that Prometheus gathers metrics from. According to this configuration, Prometheus gathers metrics from itself, citizen-service, and nin-service. Okay, it's time to go. Run the command docker-compose up -d. Here, -d starts the containers in detached mode, so the terminal is not attached to the log output of the services. Okay, both started. We can check the running containers with the docker container list command. Okay, both are running.
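The two files described above might look roughly like the following. This is only a sketch under stated assumptions: the image names, host paths, host name, service ports, and scrape interval are not given in the lesson and are placeholders, while the network name, the exposed ports 9090 and 3000, and the admin/admin credentials come from the walkthrough.

```yaml
# docker-compose.yaml (sketch; image names and host paths are assumptions)
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      # Inject the Prometheus configuration file into the container
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    networks:
      - network-net

  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    environment:
      # Credentials used on the Grafana login page
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      # Provisioning file holding the Prometheus data source connection info
      - ./grafana/datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml
    networks:
      - network-net

networks:
  network-net:
```

```yaml
# prometheus.yml (sketch; host name, ports and intervals are assumptions)
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  # Prometheus scrapes its own metrics
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: citizen-service
    scheme: https
    metrics_path: /metrics
    basic_auth:            # the metrics page asks for admin/admin
      username: admin
      password: admin
    tls_config:
      insecure_skip_verify: true   # the services use an untrusted certificate
    static_configs:
      - targets: ["host.docker.internal:9443"]   # hypothetical host and port

  - job_name: nin-service
    scheme: https
    metrics_path: /metrics
    basic_auth:
      username: admin
      password: admin
    tls_config:
      insecure_skip_verify: true
    static_configs:
      - targets: ["host.docker.internal:9444"]   # hypothetical host and port
```

Because Prometheus runs inside a container, the targets must be addresses that the container can reach, which is why the services were opened up beyond localhost. The host name host.docker.internal is one common option on Docker Desktop; other setups may need the machine's IP address instead.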

Now I'm opening a new browser tab and connecting to localhost:9090. Okay, the Prometheus page came up. All the metrics that exist on the MicroProfile metrics page are here. For example, let's take one of them, paste it into the box, and press the 'Execute' button. Okay, a value is listed below. You can see the current value of the metric on the far right of the row. When you click the 'Graph' tab, you can see how the metric changes over time. You can also use that box as a filtering tool. For example, let's type up. As you can see, three records are listed. The up metric shows whether a service is alive or not. If the value is 0, the service is down, and if the value is 1, the service is alive. As you see here, nin-service and Prometheus are alive, and citizen-service is not, because we haven't started citizen-service yet. We can filter the results by writing conditions in the box.

For example, when we write up == 1, only the services that are alive are listed, and when we write up == 0, only the services that are down are listed. Okay, we can see alerts on the 'Alerts' tab, but there are none now because we haven't defined any yet. Let's define some alerts with Prometheus. I want to write service-down alerts for our citizen-service and nin-service. We define alert rules in YAML format. First, I'm writing an alert rule for citizen-service. We can write the condition for the alert like this. The alert fires once the condition has held for five seconds. Then I'm writing an alert for nin-service. Okay, I completed the alert rule file. Now I need to reference it in the Prometheus configuration file, so I will add a setting named rule_files. Lastly, I will add the file to the Docker Compose file as a volume of Prometheus.
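A rule file for the two service-down alerts could be sketched as follows. The file name, group name, alert names, labels, and annotations are hypothetical, and the job labels assume that the scrape jobs are named citizen-service and nin-service. The five-second duration follows the walkthrough.

```yaml
# alert.rules.yml (sketch; file name, group name and alert names are assumptions)
groups:
  - name: service-down
    rules:
      - alert: CitizenServiceDown
        # up is 0 when the last scrape of the target failed
        expr: up{job="citizen-service"} == 0
        # the condition must hold this long before the alert fires
        for: 5s
        labels:
          severity: critical
        annotations:
          summary: "citizen-service is down"

      - alert: NinServiceDown
        expr: up{job="nin-service"} == 0
        for: 5s
        labels:
          severity: critical
        annotations:
          summary: "nin-service is down"
```

```yaml
# Addition to prometheus.yml (the path is an assumption and must match
# the location where the compose volume mounts the rule file)
rule_files:
  - /etc/prometheus/alert.rules.yml
```

Until the condition has held for the full duration, Prometheus shows the alert as pending rather than firing, so a short value like five seconds makes the change visible quickly in a demo.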

Okay, I run the Docker Compose command again. I open the Prometheus page again, refresh it, and click on 'Alerts'. As you see, there are two alerts. citizen-service is red because it is down, and nin-service is green because it is running. You can see the details by expanding the alerts. Now I will start citizen-service. Okay, it started. Let's check the alerts again. Okay, citizen-service is up now. Now let's look at Grafana. I'm opening a new browser tab and going to localhost:3000. Grafana asks for a username and password. I enter admin and admin, as we specified in the configuration file. When you log in for the first time, it asks you to change the default password. Grafana can get metrics from many sources and allows you to build your own dashboards and alerting system. We've specified our Prometheus source in our configuration file, so we can see our Prometheus metrics and alerts in Grafana. To see the alerts we have defined in Prometheus, use the left menu. As you see, our alert rules are here. Both services are up and there is no problem. You can define your own alert in Grafana by clicking 'New alert rule'.
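The Prometheus source that Grafana reads from its configuration file could be declared in a provisioning file along these lines. This is a sketch: the data source name is an assumption, and the URL assumes the Prometheus container is reachable as prometheus on the shared Docker network.

```yaml
# datasource.yml (sketch; data source name and URL are assumptions)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    # Grafana's backend proxies the queries to Prometheus
    access: proxy
    # Service name on the shared Docker network, not localhost
    url: http://prometheus:9090
    isDefault: true
```

Inside the Grafana container, localhost refers to the container itself, so a data source like this one points at the Prometheus service name instead.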

As I said, Grafana is an open source platform, and you can find many ready-made dashboards for different use cases. For example, we can use one of them for our MicroProfile metrics. Let's search for one. Search for microprofile prometheus grafana dashboard and click on a suitable link. In Grafana, every dashboard has an ID, and we can import dashboards by ID. So, click the 'Copy ID to clipboard' button. Okay, now open the Grafana page. Hover your mouse over the dashboard item and a menu opens. Here, click on 'Import'. Paste the copied ID and press 'Load'. Okay, some information about the dashboard is listed. Below, select the Prometheus data source and click 'Import'. Okay, our dashboard opened, but as you see, almost nothing is active. The dashboard was built for an earlier version of MicroProfile Metrics, so we need to make some changes. I'll start with the variables. Click the 'Dashboard settings' button, select 'Variables' on the left side, and click service ID. Here, the first parameter is not required, so remove it and click 'Update'. Click 'Variables' again and select instance this time. The first parameter is not required here either, so remove it and click 'Update'. Click 'Save dashboard' and go back to the dashboard.

Now we'll fix the panels here one by one. The first is status. Click 'Edit'. Here, the instance parameter is not needed, so remove it and click 'Apply'. Okay, as you see, the status became up. Now CPU load. Click 'Edit'. As you see, it uses the vendor system CPU load metric. I guess it may have changed in newer versions. Let's check it against the Prometheus metrics. I think it has been renamed to base_cpu_systemLoadAverage. Copy it, paste it into the query, and click 'Apply'. As you see, metric data arrives regularly and the graph is drawn. I will fix the other broken graphs in a similar way. Okay, I fixed all the panels. With Grafana, you can view the metrics visually by service name, which makes it easier to create your own alert rules. You can easily detect a potential bottleneck in your service and take the necessary measures against it. So, I'll see you in the next lesson.


About the Author

OAK Academy is made up of tech experts who have been in the sector for years and years and are deeply rooted in the tech world. They specialize in critical areas like cybersecurity, coding, IT, game development, app monetization, and mobile development.