Skip to main content

How DevOps Transforms Software Testing

Testing is arguably the most important aspect of software development. Whether manual or automated, testing ensures the software works as expected. Broken software causes production outages, unsatisfied customers, refunds, decreased trust, or even complete financial collapse. Testing minimizes these types of negative consequences and when done well, enables teams to reach increasingly higher quality thresholds.

DevOps transforms testing by promoting it to a critical concern across all phases of the SDLC and by shifting the responsibilities onto all engineers. DevOps also encourages engineers to answer questions like where and how to test more aspects of their software. This impacts workflows across teams, the deployment pipeline, and encourages exploratory testing. This post covers how DevOps transforms the perspective on software quality and what it means in practice.

Fast Feedback with Trunk-Based Development

The DevOps Handbook defines DevOps with three principles and their associated feedback loops. The Principle of Flow builds a fast feedback loop from development to production by establishing an automated deployment pipeline with tests that check production fitness using trunk-based development.

Trunk-based development coupled with automated testing is the best way to achieve fast feedback from development to production since it drives down batches sizes and ensures all changes are in working order. Adopting this workflow assumes that branches are shorted lived and each commit is tested.

Trunk-based development transforms testing workflows since work happens in a single shared space. There’s not much to understand since commits are simple, but any commit can bring the deployment pipeline to a screeching halt. This workflow avoids merge hell and potential development conflicts.

Here’s an example: performance testing can only happen against an integrated environment, so without it, the tests would happen far later in the process with more negative impact if things go wrong. Trunk-based development enables a “shift left” for any kind of testing, thus providing faster feedback on build quality, enabling faster iterations and ultimately increasing the frequency of production deploys.

This workflow forces teams to adopt a Definition of Done similar to the one found in the DevOps Handbook:

“At the end of each development interval, we must have integrated, tested, working, and potentially shippable code, demonstrated in a production-like environment, created from trunk using a one-click process, and validated with automated tests.”

The Definition of Done removes the need for separate test and stabilization phases towards the end of projects since testing happens continuously. Once testing is automated, teams can turn their attention to identifying and improving other quality indicators earlier in the deployment pipeline.

Continuous Security

Security and compliance checks have traditionally taken place at the end of development and have been done manually. Adopting DevOps integrates infosec into everyone’s daily work as part of the automated deployment pipeline. The shift left also causes teams to engage with infosec concerns as early as possible.

Today, it’s possible to test and mitigate a host of infosec issues by adding the following tests to the deployment pipeline:

  • Static analysis inspects the program for possible run-time behaviors, coding flaws, backdoors, and potentially malicious code like calls to exec. Examples of tools to perform static analysis include CodeClimate and Brakeman.
  • Dynamic analysis consists of tests executed while a program is in operation. These tests monitor aspects like system memory, functional behavior, response time, and overall performance. They can probe for known security vulnerabilities. These type of testing can even be done against a production environment. Examples of frameworks used for dynamic analysis include Arachani and the OWASP ZAP.
  • Dependency scanning checks dependency code and executables for known vulnerabilities. Ruby’s “bundler audit” is one example of a dependency scanner.

Applying these kinds of tests provides immediate and fast feedback on a variety of possible infosec issues. The practice also frees up engineers to focus on different software quality practices. Here’s a story from Etsy on how they took steps to proactively identify security issues in their production environment:

The engineering team added metrics for abnormal production operational events like core dumps or segmentation faults, database syntax errors to indicate potential SQL injection attacks, suspicious SQL queries, and password resets. They graphed the results in real time and found they were being attacked far more often than they thought. Here’s the project lead discussing the impact on their team:

“One of the results of showing this graph was that developers realized that they were being attacked all the time! And that was awesome, because it changed how developers thought about the security of their code as they were writing the code.”

Changing the organization’s relationship to code affects how the code is tested and a careful eye for software quality can confirm or deny a team’s assumptions. This example is not something typically associated with software testing and that’s the point. DevOps changes the way the entire team approaches verifying and testing their software. Modern teams are using fault injection techniques like chaos engineering to build more resilient systems.

Testing in Production

Netflix pioneered chaos engineering. The Principles of Chaos describes chaos engineering as:

“…the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production.”

The practice involves random (or targeted) destructive actions in a production environment to stress test the environment’s reliability. The simplest chaos is randomly killing production instances and seeing how the system behaves. Other forms of chaos are increasing network latency or shutting off access to external systems.

This exercise not only builds more reliability into systems, but it teaches the team how to repair their system. Michael Nygaard refers to the “Volkswagen Microbus” paradox in Release It! (2nd Edition):

“You learn how to fix the things that often break. You don’t learn how to fix the things that rarely break. But that means when they do break, the situation is likely to be more dire. We want a continuous low level of breakage to make sure our system can handle the big things.”

Attempting to bucket chaos engineering with a specific engineering skill set is challenging because it doesn’t fit a specific set of skills. The engineer must understand the system, infrastructure, and have the engineering chops to back it all up. Also, resolving faults found through chaos engineering is not a single person’s responsibility, but rather that of the team. Software testing is no longer purely focused on functionality requirements. It is increasingly moving towards identifying unknowns and adherence to non-functional requirements. It may be obvious that engineers should know how to repair their systems, but they can’t learn to do it without practice. Chaos engineering is an interesting approach to creating a regression test for those sort of non-functional requirements.

The adoption of chaos engineering indicates how DevOps is transforming software testing and the team’s approach to ensuring high-quality software.

Future of Software Testing

DevOps shifts responsibility away from specific individuals to a shared responsibility model backed by automation. That’s news for those working in traditional QA teams, especially if they’re doing manual testing or don’t have much software engineering experience. DevOps obviates the need for dedicated manual QA staff. It also forces every team member to become a software engineer. All forms of automation require writing code, so if traditional QA staff don’t learn to code then they’ll be out in the cold. DevOps replaces that face of QA with a more useful, analytical and exploratory one.

DevOps shifts responsibility away from specific individuals to a shared responsibility model backed by automation. Click To Tweet

Teams will always need engineers to explore ways to break their systems since that’s a fundamentally creative and experimental process. Experimenting and learning is a key component of DevOps. The DevOps Handbook defines it as the “Third Way”:

“practices that create opportunities for learning, as quickly, frequently, cheaply, and as soon as possible. This includes creating learnings from accidents and failures, which are inevitable when we work within complex systems, as well as organizing and designing our systems of work so that we are constantly experimenting and learning, continually making our systems safer.”

Chaos engineering is an example of constant experimentation and learning from real-world operations to make systems safer. DevOps is orientating software testing in a way that facilitates this. There’s something powerful there. Testing isn’t an activity confined to a specific team, feature, or part of an application. It goes wherever the deployment pipeline goes, be it infosec compliance, functional testing, or fault injection. It’s the test’s job to ensure that the deployment pipeline keeps moving and is regression-free. No matter where you are in your DevOps journey — beginner or advanced, just make sure you can write the code and tests to keep up.

Written by

Adam Hawkins

Passionate traveler (currently in Bangalore, India), Trance addict, Devops, Continuous Deployment advocate. I lead the SRE team at Saltside where we manage ~400 containers in production. I also manage Slashdeploy.

Related Posts

— February 7, 2019

Measuring DevOps Success: What, Where, and How

The DevOps methodology relates technical and organization practices so it's difficult to simply ascribe a number and say "our organization is a B+ on DevOps!" Things don't work that way. A better approach identifies intended outcomes and measurable characteristics for each outcome. Let'...

Read more
  • DevOps
— February 5, 2019

2019 DevOps and Automation Predictions

2019 DevOps and Automation PredictionsWe recently released our 2019 predictions for cloud computing and are doing the same here for DevOps and automation predictions.2018 was a great year for software, and DevOps falls somewhere on the slope of enlightenment on the Gartner Hype Cy...

Read more
  • DevOps
— January 17, 2019

Testing Through the Deployment Pipeline

Automated deployment pipelines empower teams to ship better software faster. The best pipelines do more than deploy software; they also ensure the entire system is regression-free. Our deployment pipelines must keep up with the shifting realities in software architecture. Applications a...

Read more
  • DevOps
— December 27, 2018

DevOps and Agile: Understanding the Relationship

Agile development used to be front and center in the conversation about software development. Now, DevOps has taken over the conversation. How do agile and DevOps relate? Both ideas began as ways to improve different aspects of software development. Agile embraced the changing nature of...

Read more
  • DevOps
— December 12, 2018

Getting Started With Site Reliability Engineering

Much has been written and discussed about SRE (Site Reliability Engineering) from what it is, how to do it, and how it's the same (or different) as DevOps. Google coined the term, defined the profession, and wrote the book on it. Their "Site Reliability Engineering" book covers the idea...

Read more
  • DevOps
  • SRE
— December 6, 2018

What DevOps Means for Risk Management

What Does DevOps Mean for Risk Management?Adopting DevOps makes the unfamiliar uneasy in two areas. One, they see an inherently risky choice between speed and quality and second, they are concerned that the quick iterations of DevOps may break compliance rules or introduce security vu...

Read more
  • DevOps
— August 8, 2018

From Monolith to Serverless – The Evolving Cloudscape of Compute

Containers can help fragment monoliths into logical, easier to use workloads. The AWS Summit New York was held on July 17 and Cloud Academy sponsored my trip to the event. As someone who covers enterprise cloud technologies and services, the recent Amazon Web Services event was an insig...

Read more
  • AWS
  • AWS Summits
  • Containers
  • DevOps
  • serverless
Albert Qian
— August 6, 2018

Four Tactics for Cultural Change in DevOps Adoption

Many organizations approach digital transformation and DevOps adoption with the belief that simply by selecting and using the right tools, they will achieve higher levels of automation and gain massive efficiencies as a result. While DevOps adoption does require new tools and processes,...

Read more
  • DevOps
— July 24, 2018

Get Started with HashiCorp Vault

Ongoing threats of data breaches and cyber attacks remain top of mind for every team responsible for securing cloud workloads and applications, especially with the challenge of managing secrets including passwords, tokens, API keys, certificates, and more. Complexity is especially notab...

Read more
  • DevOps
  • HashiCorp Vault
— June 11, 2018

Open Source Software Security Risks and Best Practices

Enterprises are leveraging a variety of open source products including operating systems, code libraries, software, and applications for a range of business use cases. While using open source comes with cost, flexibility, and speed advantages, it can also pose some unique security chall...

Read more
  • DevOps
— June 5, 2018

What is Static Analysis Within CI/CD Pipelines?

Thanks to DevOps practices, enterprise IT is faster and more agile. Automation in the form of automated builds, tests, and releases plays a significant role in achieving those benefits and creates the foundation for Continuous Integration/Continuous Deployment (CI/CD) pipelines. However...

Read more
  • DevOps
— March 29, 2018

What is Chaos Engineering? Failure Becomes Reliability

In the IT world, failure is inevitable. A server might go down, an app may fail, etc. Does your team know what to do during a major outage? Do you know what instances may cause a larger systems failure? Chaos engineering, or chaos as a service, will help you fail responsibly.It almo...

Read more
  • Cloud Computing
  • DevOps