Logging matters more than ever given the volume and variety of data we deal with across different customer use cases. This course will enable you to take a more proactive approach to identifying faults and crashes in your applications through the effective use of Google Cloud Logging. You will learn how to offload operational overhead to GCP through automated logging tools, resulting in a more productive operational pipeline for your team and organization.
Learning Objectives
Through this course, you will equip yourself with the skills required to stream log data to the Google Cloud Logging service and to use metrics to understand your system's behavior. The course starts with an introduction to the Cloud Logging service and then demonstrates how to stream logs using the Cloud Logging Agent and the Python client library.
Prerequisites
To get the most out of this course, you should already have an understanding of application logging.
Intended Audience
This course is suited for anyone interested in logging using Google Cloud Platform (GCP) Cloud Logging.
Resources
- Source code for this course: https://github.com/cloudacademy/Managing-Application-Logs-and-Metrics-on-GCP
- Google Cloud fluentd source code: https://github.com/GoogleCloudPlatform/google-fluentd
- Google Cloud fluentd additional configurations: https://github.com/GoogleCloudPlatform/fluentd-catch-all-config
- Google Cloud fluentd output plugin configuration: https://cloud.google.com/logging/docs/agent/logging/configuration#cloud-fluentd-config
- Package release history: https://pypi.org/project/google-cloud-logging/#history
- Metrics Explorer pricing: https://cloud.google.com/stackdriver/pricing#metrics-chargeable
We now have a good understanding of the Cloud Logging agent. In this section, we will learn how to install and configure the Cloud Logging agent and send log data to the Cloud Logging service. For this demonstration, we will have an Nginx web server running inside a Google Compute Engine VM instance. We will stream the Nginx access logs from the VM instance to Google Cloud Logging using the Cloud Logging Agent. This is the Google Cloud Console homepage, and I will navigate to the VM Instances page using the navigation menu.
Here, I have a Debian-based VM instance, nginx-webserver, that is running an Nginx web server and serving a simple webpage over the internet. Let's SSH into the VM using the SSH button. This creates a WebSSH session that allows us to run shell commands directly in the web browser. Let's switch to the browser to verify that we can access the Nginx web page. I will click on the External IP, which opens the connection on the HTTP port, and we can see the Welcome to Nginx webpage.
Now let's go back to the WebSSH session and tail the Nginx access log by running tail -f /var/log/nginx/access.log. Here we can see the log entries generated by requests to the webpage. Okay, these are the Nginx logs, and we want to stream them to the Google Cloud Logging service. Now, how do I stream these logs to the GCP Logging service? Once these logs are in GCP, where can I find them? And what can I do with this log data once it is there? We will answer these questions in this demo.
So far, we have an application, Nginx in this case, that writes logs to disk, and we want to send them over to GCP using the Cloud Logging Agent. Google Compute Engine VM instances do not come with the Cloud Logging Agent pre-installed, so first we need to install it on this VM instance. To install the agent, the VM instance needs access to the remote repositories hosting the agent package and its dependencies. If the VM instance runs in a private network without such access, the dependency installation may fail; installing the agent dependencies in a private network is beyond the scope of this course.
Without further ado, let's begin the Cloud Logging Agent installation. First, I will run a cURL command to download a script that adds the agent package repository on this VM instance. We now have the script locally, so let's run it using the bash command and update the apt repositories list. There are many versions of the Cloud Logging Agent available, and to list them all, we run sudo apt-cache madison google-fluentd. The output shows the different versions available, like 1.8.6, 1.7.0, 1.6.30, et cetera. For our demo, we will pick the latest version, which is 1.8.6-1 at the time of recording.
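Collected for reference, the repo-setup steps look roughly like this; the script name add-logging-agent-repo.sh comes from Google's documented installation flow rather than the demo itself, so treat this as a sketch:

```sh
# Download the script that adds the Cloud Logging Agent package repository
curl -sSO https://dl.google.com/cloudagents/add-logging-agent-repo.sh

# Run the script, then refresh the apt package index
sudo bash add-logging-agent-repo.sh
sudo apt-get update

# List the available agent versions
sudo apt-cache madison google-fluentd
```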
Now, we will run sudo apt-get install -y google-fluentd=1.8.6-1 to install the google-fluentd package pinned to version 1.8.6-1. The google-fluentd package is now installed. Next, let's install the configuration files that help stream standard logs like syslog, Apache access logs, Nginx access logs, et cetera. On the same shell, we run sudo apt-get install -y google-fluentd-catch-all-config to install the configurations. A restart is required for these configurations to take effect, so let's run sudo service google-fluentd restart.
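The install-and-restart sequence, exactly as run in the demo:

```sh
# Install the agent, pinned to the version chosen above
sudo apt-get install -y google-fluentd=1.8.6-1

# Install ready-made configs for common applications (syslog, Apache, Nginx, ...)
sudo apt-get install -y google-fluentd-catch-all-config

# Restart the agent so the new configurations take effect
sudo service google-fluentd restart
```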
To verify the status of the Cloud Logging agent, we can run sudo service google-fluentd status, which shows the agent is actively running, or we can tail the agent's own logs and make sure there are no errors. To tail the logs, run tail /var/log/google-fluentd/google-fluentd.log. The log shows that a fluentd worker is running. We also see a few lines saying a log was not found and the agent is continuing without tailing it. Don't worry about these messages: when we installed the agent configurations using catch-all-config, it installed configs for many standard tools like Chef, Cassandra, et cetera. We are not running those tools on this VM, so the Logging Agent cannot find their log files.
What we need to make sure of is that the agent can tail the Nginx log, and we can check that by grepping the agent log file for the nginx keyword. Let's run less /var/log/google-fluentd/google-fluentd.log | grep nginx; its output shows that the Logging agent is tailing the Nginx access log file /var/log/nginx/access.log. Now, the question that comes to mind is: how does the Cloud Logging Agent know the location of the Nginx access log? The answer is its configuration files. Cloud Logging Agent configuration files are located at /etc/google-fluentd/config.d/, and if we list this directory, we see the many config files installed by the catch-all-config package. Going through each config file is beyond the scope of this course, but you are encouraged to look through the different files to get a feel for the logging agent configurations.
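Gathered in one place, here are the checks we just ran:

```sh
# Confirm the agent service is active
sudo service google-fluentd status

# Inspect the agent's own log for errors
tail /var/log/google-fluentd/google-fluentd.log

# Confirm the agent is tailing the Nginx access log
less /var/log/google-fluentd/google-fluentd.log | grep nginx

# List the configs installed by the catch-all-config package
ls /etc/google-fluentd/config.d/
```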
For this course, we will focus on nginx.conf. Let's open this file and see how Cloud Logging Agent configs are defined. The config file has two source blocks: one for the access log and another for the error log. Each source block contains parameters that define the location of the log file and how to read it.
Here, @type tail instructs the agent to tail the log file. The path parameter gives the location of the log file. read_from_head true tells the agent to read the file from the beginning. tag labels the input stream with a name that can be used later at the filter or output stage. pos_file lets the logging agent persist its current read position in the log file. The format parameter can define the format of each log entry; here it is not defined. Let's quickly check syslog.conf to see how format can be declared: there, the message format is defined as a regular expression.
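Based on the parameters we just walked through, the access-log source block in nginx.conf looks something like this sketch; the pos_file path and the tag value are illustrative, following the conventions of the catch-all-config package:

```
<source>
  @type tail                                             # follow the file as it grows
  path /var/log/nginx/access.log                         # log file to tail
  pos_file /var/lib/google-fluentd/pos/nginx-access.pos  # persisted read position (illustrative path)
  read_from_head true                                    # start reading from the beginning of the file
  tag nginx-access                                       # stream name used at the filter/output stage
</source>
```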
So far, we have learned how to install the Cloud Logging Agent and how to configure its input. The next step is to understand how the Cloud Logging Agent streams logs to the Cloud Logging service. But before we jump into that, we need to ensure that the VM instance has sufficient access to write logs to the Cloud Logging service. We can run a cURL command to check the authorization scopes. We can see the logging.write scope in the output, so we should be good to go.
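The demo doesn't show the command on screen, but a standard way to check the scopes granted to a VM's default service account is to query the metadata server, along these lines:

```sh
# Ask the VM metadata server for the default service account's OAuth scopes;
# the output should include https://www.googleapis.com/auth/logging.write
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes"
```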
Writing data to the Cloud Logging service is also controlled by the Logging Agent's output plugin configuration, which we find at /etc/google-fluentd/google-fluentd.conf. At the start of the file, we have another source block; this block exposes fluentd's core metrics for collection by Prometheus, but we can ignore it for now. Using the filter block that follows, the Cloud Logging Agent adds a unique insertId to each log entry, which helps preserve ordering and avoid log duplication.
Coming to the output plugin: under this match block, google_cloud is defined as the output plugin along with a few other parameters. @type gives the name of the output plugin, and we are using google_cloud to stream logs to the Google Cloud Logging service. If you want to stream logs somewhere else, for example to Elasticsearch or Splunk, you would use the appropriate plugin name here. Within the match block, we can set different configurations to control the output plugin's behavior. For example, buffer_type file instructs the logging agent to buffer log entries in a file on disk rather than in memory, buffer_path defines where to store the buffer files, buffer_chunk_limit defines the size of each chunk, and flush_interval defines how long to wait before flushing a buffer chunk. The logging agent flushes a buffer chunk when either the flush interval is reached or the buffer reaches the size defined by buffer_chunk_limit. Other configurations are available as well, such as retry_limit, which tells the logging agent how many times to retry if it fails to flush a chunk, and num_threads, the number of threads used for processing. For a full list of available options, please refer to the Google Cloud documentation linked in the Resources section.
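Putting those pieces together, the output section of google-fluentd.conf looks something like this sketch; the buffer path and the numeric values shown are illustrative, not taken from the demo:

```
<match **>
  @type google_cloud                          # stream matched log entries to Cloud Logging

  # Buffer log entries on disk instead of in memory
  buffer_type file
  buffer_path /var/log/google-fluentd/buffers # illustrative buffer location

  # Flush a chunk when it reaches this size or when the interval elapses
  buffer_chunk_limit 512KB
  flush_interval 5s

  # Retry behavior and parallelism for flushing chunks
  retry_limit 3
  num_threads 8
</match>
```

We are all set from the configuration point of view, so let's exit this config file and move back to the Google Cloud Console.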
So far, we have learned how to install and configure the Cloud Logging Agent to stream logs to the GCP Cloud Logging service. Now, let's look at where to find these logs and what can be done with them. Here on the Google Cloud Console, we navigate to the Cloud Logging service by clicking on the Navigation Menu, scrolling down to Operations, and selecting Logging. This takes us to the Logs Explorer page. As the name suggests, this is where we can find the logs we streamed from the Nginx web server, along with logs from different GCP services.
In the previous lecture, we went over the different components of the Logs Explorer, so please refer back to it if you want a refresher on the options available on this page. In this lecture, we will focus on the log data. Under Query results, we see the logs received by the Google Cloud Logging service within the last hour. The logs here are mostly coming from our Nginx application, because nothing else is running in this gcp-demo project, but you might see different logs on this page in your own GCP project, depending on the applications and services running there.
Looking at this, a question might come to mind: if all the logs land in one place, how do I find my application's logs? This is an obvious question, and GCP provides a way to see only the logs we are interested in. For instance, say we want to see the logs coming from our Nginx web server. For this, we click on Log name, look for nginx-access, and tick the checkbox. Now, click on the Add button. This creates a query in the Query builder. If you are more comfortable writing queries by hand, you can type the query directly in the editor instead of selecting options from the drop-down menus. Let's run this query by clicking Run query to see the results. We now see only the logs coming from the Nginx web server in the Query results.
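For this demo project, the query the builder generates should look something like the following, using the Logging query language and the log name we just selected:

```
logName="projects/gcp-demo-8888/logs/nginx-access"
```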
Now, what more do we get from these logs? We learned in the Cloud Logging Agent lecture that the agent adds a few more fields to the log data to create a log entry. We can click on the arrow button to expand a particular log entry and view all the details. Let me select this log entry and expand it. Here we can see the textPayload, which is the actual log line; the unique insertId; the resource and labels that point us to the log source; the timestamp; and the logName. All this information is very helpful when debugging an application. If you want to download the query results, you can do so by clicking on Actions and then the Download logs option. If you plan to run this query frequently or want to run it later, you can save it using the Save option.
The Logs Explorer UI also has a histogram that shows the frequency of log entries over time. This is very useful for spotting trends in the log entries, such as a peak in a particular log or the moment a particular log entry first appeared. If I hover over a point on the graph, it shows the number of entries within that time range. As we are running our web server on a Debian VM instance, it also generates syslog by default, and the /etc/google-fluentd/config.d/syslog.conf configuration file installed by the catch-all-config package configures the agent to stream /var/log/syslog and /var/log/messages to GCP Cloud Logging. We can view these logs by updating the logName in the query editor to projects/gcp-demo-8888/logs/syslog.
Once I click Run query, it returns the syslog entries from the nginx-webserver VM instance. So far we have used the Google Cloud Console UI to view the logs, but if you are more comfortable with the command-line interface, we can use the gcloud command to read the logs. For example, to read the nginx-access logs, I can run gcloud logging read logName=projects/gcp-demo-8888/logs/nginx-access --project gcp-demo-8888 --limit 2. This returns two log entries from the nginx-access logs; the number of entries retrieved is controlled by the --limit flag. We are at the end of this lecture. To summarize, we learned about the Cloud Logging Agent: how it works under the hood, how to install and configure it on a Compute Engine VM instance, how it streams logs to the GCP Cloud Logging service, where to look for the logs in the GCP Console UI, and how to retrieve logs using the gcloud command.
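For reference, here is that read command in a copy-paste-friendly form; depending on your shell, the logName filter may need quoting as shown:

```sh
# Read the two most recent nginx-access log entries from the demo project
gcloud logging read 'logName="projects/gcp-demo-8888/logs/nginx-access"' \
  --project gcp-demo-8888 \
  --limit 2
```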
In the next lecture, we will learn how to send log data using the Cloud Logging API.
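As a small preview, and not part of this lecture's demo: with the google-cloud-logging Python package linked in the Resources section, writing a log entry can be as simple as the following sketch, where my-log is just a placeholder log name:

```python
# Minimal sketch: write a text log entry to Cloud Logging from Python.
# Assumes the google-cloud-logging package is installed and that default
# credentials are available (for example, when running on a GCP VM).
from google.cloud import logging

client = logging.Client()         # picks up project and credentials from the environment
logger = client.logger("my-log")  # "my-log" is a placeholder log name
logger.log_text("Hello from the Python client library!")
```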
Pradeep Bhadani is an IT consultant with over nine years of experience and holds various certifications related to AWS, GCP, and HashiCorp. He is recognized as a HashiCorp Ambassador and a GDE (Google Developers Expert) in Cloud for his knowledge of and contributions to the community.
He has extensive experience in building data platforms in the cloud as well as on-premises through the use of DevOps strategies and automation. Pradeep is skilled at explaining technical concepts, helping teams and individuals upskill on the latest technologies.