Introduction to Software Debugging and Logging

This Course will introduce new software developers to the concepts of software testing, debugging, and logging. These concepts are common to most programming languages, making this foundational knowledge.

Learning Objectives

  • The purpose of unit testing
  • The purpose of integration testing
  • The concept of code coverage
  • The concept of software debugging
    • And some of the different debugging techniques
  • The concept of software logging
    • And why some information shouldn’t be included inside log entries

Intended Audience 

  • New software developers

Prerequisites

  • You should have at least a conceptual understanding of programming
    • It’s okay if you’re not yet developing complex software
    • As long as you’re comfortable with the concept of functions, classes, and methods, you’re likely ready for this Course

In 1947, operators working with Grace Hopper traced an error in the Harvard Mark II electromechanical computer to a moth trapped in a relay. Meaning one of the first recorded computer bugs was an actual bug.

Software bugs occur when a code defect results in unexpected behaviors such as errors or incorrect results. The process of locating a bug and correcting the defect is referred to as debugging.

The purpose of debugging is to remove software defects from a piece of software.  In this lesson, I’m going to introduce you to the concepts of debugging and logging.

Debugging is my absolute favorite technical task because it’s a very specific puzzle that needs to be solved; and once solved I almost always learn something new about the system I’m debugging. 

Debugging is a core part of the development process. We write some code and run it when we think the code will behave as expected. When the code works as expected we move on to the next bit of code. When it doesn’t work as expected we have to debug the problem. 

To be honest, throughout most of my career my first code attempt probably failed in some way at least 75% of the time. You forget a semicolon, you misspell something, you use the wrong argument type, etc. Because I didn’t get it quite right the first time, I got rather good at debugging. And I came to better understand the systems with which I was working. 

Debugging ranges in complexity, depending on the complexity of the system being debugged. For example, a locally running Python script with 50 lines of code is likely easier to debug than a distributed set of interconnected microservices.

Because debugging ranges in complexity, so do debugging techniques. I’m willing to wager that most developers have used the most primitive form of debugging - referred to as: caveman debugging. 

Caveman debugging relies on using some form of print or display function which displays data to a terminal’s standard output. While caveman debugging is primitive compared to more advanced techniques, it’s also highly effective for a wide range of use cases. 

For example, sometimes you just want to know the last line of code that ran successfully before the error. This has saved me countless hours over my career. 

Some of the systems you’ll have to debug do not make it easy. For example, in some cases the line numbers reported with an error might not be as accurate as you’d like. There are many situations like this across the current technical landscape where caveman debugging is useful.

I’ve spent plenty of debugging time looking in the wrong code for a given error due to ambiguous or less-than-precise error messages. Displaying different letters or numbers makes it easy to identify where the problem actually occurs.
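Here is a minimal sketch of that marker technique in Python. The parse_order function and its input are made up purely for illustration. The last marker that appears tells you which line ran successfully before the error.

```python
def parse_order(raw):
    """Hypothetical function with a hidden bug, used only for illustration."""
    print("A: entered parse_order")
    fields = raw.split(",")
    print("B: split into", len(fields), "fields")
    quantity = int(fields[1])  # raises ValueError if the field is not a number
    print("C: parsed quantity")  # never printed when the line above fails
    return fields[0], quantity


try:
    parse_order("widget,three")
except ValueError as error:
    # Markers A and B were printed, C was not, so the int() call is the culprit.
    print("failed after the last marker shown above:", error)
```

Because marker B shows up and marker C does not, the problem has to be on the line between them, no matter what the error message seems to suggest.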

Sometimes you’ll make an assumption about some external data and an error will hint that your assumption was wrong. Caveman debugging is great for quickly showing an entire data structure. 

If the mistake is caused by an erroneous assumption about the structure of data, this is quite a useful debugging technique, because we can solve our problem by writing one line of code and running the code again to see the data and correct our assumptions.
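As a rough sketch, assuming a made-up response from some external service, that one line of code could be a call to pprint from Python's standard library. The field names here are hypothetical.

```python
from pprint import pprint

# Hypothetical response from an external service. The original code assumed
# "tags" was a list of strings, which turns out to be wrong.
response = {
    "id": 42,
    "name": "example",
    "tags": [{"label": "new"}, {"label": "sale"}],
}

# One line of caveman debugging: show the whole structure.
pprint(response)

# With the real shape visible, the corrected code reads the nested key.
labels = [tag["label"] for tag in response["tags"]]
print(labels)
```

Seeing that each tag is a dictionary rather than a string is enough to correct the assumption and move on.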

Caveman debugging can be highly effective for a wide range of problems. However, it can clutter up your code with code that isn’t required. 

It’s a best practice to remove these sorts of lines of code before the code is ready for production. Leftover print statements can get picked up by application logs, which could take up more drive space than required.

Once we leave the realm of caveman debugging there is a world of different tools and techniques. Most programming languages have some form of debugger, which may or may not be a third-party application.

For example: 

  • Python includes a source code debugging module named pdb. 

  • Java has a debugger named jdb.

  • Go has a debugger named Delve.

  • Compiled languages such as C and C++ have a host of options including gdb.

These tools include much more advanced features than are possible with caveman debugging. For example, these enable us to step through code line by line and inspect variables as we go so that we can more holistically understand and control what’s happening. 

Debuggers exist as both terminal-based and GUI applications. They’re even built into many code editors. These tools can be intimidating at first because they’re so foundational. There is a learning curve to using these sorts of debugging tools. However, if you spend the better part of your day writing code then the time investment is probably worth it. 

While working on a project which spanned nearly a year, I spent at least a third of my day working with Python. I most certainly used caveman debugging when it seemed like the fastest way to resolve the problem. 

Some problems, though, were more difficult to solve. I ran into an issue where the actual problem was being masked by the provided error. 

Using pdb I was able to set a breakpoint on the line just before the line causing the error. Once the code reached the breakpoint it opened up into a terminal-based debugging session. I stepped through code line by line inspecting all of the nearby objects until I found the real source of the error.

I had missed setting an environment variable which was used to configure some code that was interacting with a remote service. The error that I was presented with didn’t make the actual problem clear. 
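To give a feel for what that looks like, here is a small sketch, not the actual project code. The SERVICE_HOST variable and the build_service_url function are hypothetical. Python's built-in breakpoint() function drops into a pdb session at the line where it's called.

```python
import os


def build_service_url(pause=False):
    """Build a URL from an environment variable (SERVICE_HOST is a made-up name)."""
    host = os.environ.get("SERVICE_HOST")  # None when the variable is not set
    if pause:
        # Opens a pdb session here. Useful commands: n (next line),
        # s (step into), p host (print a variable), c (continue).
        breakpoint()
    return f"https://{host}/api"


# When the variable is missing, the URL quietly contains the text "None", and
# the eventual connection error gives no hint that configuration is the cause.
print(build_service_url())
```

Calling build_service_url(pause=True) and typing p host at the pdb prompt would show None right away, which points at the missing environment variable rather than at the remote service.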

The debugger helped me to solve that problem and many others fairly quickly. Some bugs are more difficult to solve than others. Debuggers are specifically designed to help find bugs more quickly. Learning to use a debugger well can be a rather substantial superpower. 

Debugging tools are rather low-level tools which tend to reflect the design of the system being debugged. By learning to use a debugging tool for your given language, system, environment, etc, you’re investing that time into understanding inner workings of the system being debugged. And once your mental model for a system aligns with the way it’s actually designed you become much more proficient when working with that system.

So, debugging ranges in complexity depending on the system being debugged and the type of problem. Between the simple yet effective caveman debugging and more advanced techniques there are a lot of good methods for debugging.

I recommend researching which debuggers are commonly used for the systems that you work with.

When developing software locally we see when it encounters an error. However, the code we write is likely going to be running elsewhere. So, how do you know if it’s running correctly?

Logging is a common way to understand what’s happening inside an application. Logging consists of applications writing messages with different levels of importance to some location, commonly a file or a terminal’s standard output. 

Different programming languages, modules, frameworks, etc, provide different mechanisms for creating and managing logs. The exact structure of a log file will differ, though a log entry will typically contain a timestamp, a message, and a log level. Log levels represent the severity or importance of the log entry. It’s common to have levels such as debug, info, warning, and error. 
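As one example, here is a minimal sketch using Python's standard logging module. The logger name "shop" and the messages are made up. The format string produces entries with a timestamp, a level, and a message.

```python
import logging
import sys


def make_logger(stream=sys.stderr):
    """Return a logger whose entries contain a timestamp, a level, and a message."""
    logger = logging.getLogger("shop")  # "shop" is an arbitrary example name
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.handlers.clear()  # avoid duplicate output if called more than once
    logger.addHandler(handler)
    logger.propagate = False
    return logger


logger = make_logger()
logger.info("order received")
logger.warning("stock is low for item %s", "widget")
logger.error("payment service unreachable")
```

Other languages and frameworks have their own equivalents, so treat this as an illustration of the idea rather than the one way to do it.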

Each log entry contains a message in some developer-determined format. The format depends on factors such as: 

  • Whether the logs are read programmatically or by humans.

    • For logs commonly consumed by code, using a format such as JSON makes for easier automation. 

    • For logs commonly consumed by humans, keeping the message as unstructured or lightly structured text may be best.
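For the programmatic case, one possible approach in Python is a custom formatter that writes each entry as a single JSON object. This is a sketch under my own assumptions; the JsonFormatter class and the key names "time", "level", and "message" are choices made for this example.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def make_json_logger(stream=sys.stderr):
    logger = logging.getLogger("shop.json")  # arbitrary example name
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers.clear()  # avoid duplicate output if called more than once
    logger.addHandler(handler)
    logger.propagate = False
    return logger


make_json_logger().info("order %s shipped", "A100")
```

A tool reading these logs can parse each line with a JSON library instead of relying on fragile text matching.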

The purpose of logging is to capture application messages so that we can gain insights into how an application is behaving.

There’s no one correct way to approach logging. It depends on the needs of the application, use cases, etc. However, there are some considerations that should be made regarding the information in a log entry. It’s important to ensure that private and personally identifiable information isn’t included in log entries.

Imagine that an online retailer is writing credit card numbers to a log file. Or some medical practitioner software is writing people’s health data to a log file. 

Log files aren’t known for their ultra-high security. Once sensitive data is written to log files, that’s a data breach waiting to happen.
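The best protection is to never pass sensitive values to the logger in the first place. As a safety net, some teams also mask values before they're written. Here is a rough sketch of that idea using a Python logging filter. The RedactCardNumbers class is hypothetical, and the pattern is a simplistic assumption that will not catch every card format.

```python
import logging
import re
import sys

# Rough pattern for 13 to 16 digit numbers, optionally separated by spaces or
# hyphens. An assumption for illustration, not a complete detector.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


class RedactCardNumbers(logging.Filter):
    """Mask anything that looks like a card number before it is written."""

    def filter(self, record):
        record.msg = CARD_PATTERN.sub("[REDACTED]", record.getMessage())
        record.args = None  # the message is already fully formatted
        return True  # keep the record, just with the masked text


def make_safe_logger(stream=sys.stderr):
    logger = logging.getLogger("shop.payments")  # arbitrary example name
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.addFilter(RedactCardNumbers())
    logger.handlers.clear()  # avoid duplicate output if called more than once
    logger.addHandler(handler)
    logger.propagate = False
    return logger


# 4111 1111 1111 1111 is a commonly used test number, not a real card.
make_safe_logger().info("charged card %s", "4111 1111 1111 1111")
```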

I recommend doing some research to better understand what might constitute sensitive data for your industry. 

Most programming languages and logging libraries provide a mechanism for setting the minimum log level that an application records. 

Log levels make it possible to filter specific levels out of a log. For example, the debug level is intended to show granular details about the inner workings of the application that would help when debugging, such as the arguments provided to a function.
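Here is a small sketch of how that filtering behaves with Python's logging module. The apply_discount function and the logger name are made up for this example.

```python
import logging
import sys


def make_leveled_logger(level, stream=sys.stderr):
    """Return a logger that drops entries below the given level."""
    logger = logging.getLogger("shop.levels")  # arbitrary example name
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.handlers.clear()  # avoid duplicate output if called more than once
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def apply_discount(logger, price, rate):
    # Hypothetical function; the debug entry records the arguments it received.
    logger.debug("apply_discount called with price=%r rate=%r", price, rate)
    return price * (1 - rate)


# At DEBUG the argument details are written; at WARNING they are skipped.
apply_discount(make_leveled_logger(logging.DEBUG), 100, 0.5)
apply_discount(make_leveled_logger(logging.WARNING), 100, 0.5)
```

The code making the debug call doesn't change between environments. Only the configured level does.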

Due to the granular detail, this could produce a lot of log data that’s only occasionally required. This sort of detailed data could consume a large amount of drive space over time. 

At least a few times in my career I’ve seen servers become unresponsive due to overly verbose logs consuming all of the server’s drive space. Setting the application’s log level to a higher level prevents debug messages from being written to the log, thereby saving drive space. Another way to save space is to rotate log files and delete older files. 
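In Python, one way to rotate logs is the RotatingFileHandler from the standard library. The sketch below uses a tiny size limit and a temporary directory so the rotation is easy to see. The limits, file name, and logger name are arbitrary choices for this example.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

log_dir = tempfile.mkdtemp()  # throwaway directory for the demonstration
log_path = os.path.join(log_dir, "app.log")

# Start a new file before the current one reaches 1024 bytes, and keep at
# most two older files. Anything older than that is deleted.
handler = RotatingFileHandler(log_path, maxBytes=1024, backupCount=2)
logger = logging.getLogger("shop.rotating")  # arbitrary example name
logger.setLevel(logging.INFO)
logger.handlers.clear()
logger.addHandler(handler)
logger.propagate = False

for number in range(200):
    logger.info("processed order %d", number)

# Lists the current log file plus the retained backups.
print(sorted(os.listdir(log_dir)))
```

Real limits would be far larger, and many systems handle rotation outside the application instead, so check what your environment already provides.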

Logged errors provide developers with the data required to start debugging. Application logging is an important part of understanding how your code behaves in environments other than the one in which it’s developed. 
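As a sketch of what a useful logged error can look like in Python, the logger.exception method writes an entry at the error level and appends the traceback. The average function and logger name below are hypothetical.

```python
import logging
import sys


def make_error_logger(stream=sys.stderr):
    logger = logging.getLogger("shop.errors")  # arbitrary example name
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.handlers.clear()  # avoid duplicate output if called more than once
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def average(logger, total, count):
    """Hypothetical function that logs the failure instead of hiding it."""
    try:
        return total / count
    except ZeroDivisionError:
        # Logs at ERROR level and includes the traceback with the entry.
        logger.exception("could not compute average for count=%r", count)
        return None


average(make_error_logger(), 10, 0)
```

The entry records when the error happened, what the input was, and where in the code it occurred, which is a solid starting point for debugging.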

Okay, this seems like a natural stopping point. Here are your key takeaways for this lesson:

  • The purpose of debugging is to remove software defects from a piece of software

  • There are different debugging techniques including:

    • Caveman debugging, which is a primitive way to debug by displaying basic information on the command line.

  • The purpose of logging is to capture application messages so that we can gain insights into how an application is behaving.

  • The exact content of a log entry depends on the application and use case. 

    • However, you’ll want to make sure to keep sensitive data out of logs. 

Okay, that's going to be all for this lesson. Thanks so much for watching. And I’ll see you in another lesson!

About the Author

Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.
