What DevOps Means for Risk Management

DevOps makes newcomers uneasy in two areas. First, they see an inherently risky trade-off between speed and quality. Second, they worry that the quick iterations of DevOps may break compliance rules or introduce security vulnerabilities. Both concerns stem from a core principle of DevOps, which transforms organizational culture and advocates a faster cadence and shorter cycle times. It is easy to assume that moving faster produces less stable systems. This post explores how DevOps instead mitigates risk and yields safer, more stable systems. If you’re generally interested in DevOps, you can also check out my other post on How DevOps Transforms Software Testing.

Choosing Between Speed and Quality is Wrong

In business, the saying goes that outcomes can be cheap, fast, and good, but you’re only allowed to pick two. This framework weighs three common dimensions of project management against each other. It’s intuitive if you assume the dimensions are mutually exclusive, and that’s where DevOps differs. You don’t have to take our word for it.

Accelerate: The Science of Lean Software and DevOps – Building and Scaling High Performing Technology Organizations quantifies IT performance and identifies the best practices. Martin Fowler, Chief Scientist at Thoughtworks, refutes the framing of speed over quality directly in the foreword. He says:

This huge increase in responsiveness does not come at a cost in stability, since these organizations find their updates cause failures at a fraction of the rate of their less-performing peers, and these failures are usually fixed within the hour. Their evidence refutes the bimodal IT notion that you have to choose between speed and stability—instead, speed depends on stability, so good IT practices give you both. [1]

Fowler identifies the dependencies between speed and stability (a dimension of quality). Fowler refers to two important findings in Accelerate. High performing teams have:

  • 170 times faster MTTR (mean time to recover) from downtime. [2]
  • 5 times lower change failure rate (a change is 1/5 as likely to fail). [3]

These two data points demonstrate that DevOps teams are less risky because deploys fail less often and when they do, issues are resolved quickly. Contrast this with systems that are deployed a few times a year where failed changes are more likely and outages are far more costly. How does DevOps do it?
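Before answering that, it helps to pin down how the two metrics are calculated. The following is a minimal sketch, not the method used in Accelerate: the `Deploy` record, its field names, and the sample data are all hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deploy:
    """One production deploy; all fields are hypothetical for this sketch."""
    deployed_at: datetime
    failed: bool = False
    # When service was restored after a failed deploy (None if it did not fail).
    restored_at: Optional[datetime] = None


def change_failure_rate(deploys):
    """Fraction of deploys that caused a failure in production."""
    if not deploys:
        return 0.0
    return sum(d.failed for d in deploys) / len(deploys)


def mean_time_to_recover(deploys):
    """Average time between a failed deploy and service restoration."""
    outages = [d.restored_at - d.deployed_at
               for d in deploys if d.failed and d.restored_at is not None]
    if not outages:
        return timedelta(0)
    return sum(outages, timedelta(0)) / len(outages)


# Made-up sample: five deploys, two of which failed and were later restored.
start = datetime(2019, 1, 1, 9, 0)
deploys = [
    Deploy(start),
    Deploy(start + timedelta(days=1), failed=True,
           restored_at=start + timedelta(days=1, minutes=30)),
    Deploy(start + timedelta(days=2)),
    Deploy(start + timedelta(days=3), failed=True,
           restored_at=start + timedelta(days=3, minutes=50)),
    Deploy(start + timedelta(days=4)),
]

print(f"change failure rate: {change_failure_rate(deploys):.0%}")
print(f"MTTR: {mean_time_to_recover(deploys)}")
```

Tracking both numbers per team over time is one way to check whether a change in process is actually lowering risk.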

Reduce Batch Sizes with Trunk-Based Development

The first goal of DevOps is to reduce the time needed from commit to production. This is called “lead time”. Reducing the batch size is the simplest way to drop lead times.

Imagine batch size as the size of a commit. Working incrementally through smaller commits is easier to understand, develop, and test. Additionally, small commits are more easily verified in production. If the only modification is a 5-line method change that correlates with increased load or other negative operational conditions, then it’s easy to locate the problem. On the flip side, if the change was 5,000 lines across multiple different areas, then the problem becomes much more difficult to isolate and identify.

Reducing batch sizes is the first step. Aligned to this, DevOps encourages changing the relationship with source control. The DevOps Handbook encourages trunk-based development, which means developers check in their code to “trunk” (or master, or mainline) at least once a day. Trunk-based development keeps commits smaller since larger ones will be harder to integrate each day. Most importantly, trunk-based development combined with continuous integration ensures that each commit keeps the entire system in a releasable state.
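One way to make batch size visible is to measure it in the pipeline. The sketch below assumes the pipeline captures the text output of `git diff --numstat` for the change under review (for example against a mainline branch; the branch name and the 200-line limit are assumptions for illustration, not recommendations from The DevOps Handbook).

```python
def batch_size(numstat_output):
    """Total lines added plus deleted, parsed from `git diff --numstat` text."""
    total = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        added, deleted, _path = parts
        # Binary files are reported as "-" for both counts; ignore them here.
        if added.isdigit() and deleted.isdigit():
            total += int(added) + int(deleted)
    return total


def is_small_batch(numstat_output, max_lines=200):
    """True when the change stays within the (hypothetical) size limit."""
    return batch_size(numstat_output) <= max_lines


# Hand-written sample in the numstat format: added<TAB>deleted<TAB>path.
sample = (
    "3\t2\tapp/models/user.py\n"
    "10\t0\ttests/test_user.py\n"
    "-\t-\tdocs/logo.png\n"
)

print(batch_size(sample), is_small_batch(sample))
```

A check like this works better as a warning than as a hard gate, since some legitimate changes (generated files, large renames) are big by nature.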

This mitigates risk in two ways. First, since every commit is kept in a deployable state, there’s no need for separate test and stabilization phases at the end of the project. These late-stage phases are the riskiest and tend to negatively impact delivery. Second, trunk-based development lays the foundation for automated deployment pipelines which expand over time to add increasingly rigorous tests.

By the way, Cloud Academy has also created a DevOps Playbook, divided into Part 1 and Part 2: check them out if you’re interested in learning more about DevOps.

Automating InfoSec

The fast-paced world of DevOps appears to be at odds with the slower-moving world of Information Security (InfoSec). The tension originates from common processes that push InfoSec concerns to the tail end of projects, making security issues more difficult and costly to resolve. This is true of any part of the software development life cycle (SDLC), but it is often harder with InfoSec compliance, where releases must be verified before going into production. A small number of InfoSec engineers can exacerbate the problem. James Wickett, one of the creators of the Gauntlt security tool and organizer of DevOpsDays Austin, says:

One interpretation of DevOps is that it came from the need to enable developers productivity, because as the number of developers grew, there weren’t enough Ops people to handle all the resulting deployment work. This shortage is even worse in InfoSec—the ratio of engineers in Development, Operations, and InfoSec in a typical technology organization is 100:10:1. [4]

Operations and development have faced similar issues. The DevOps solution is to automate as much as possible, from environment provisioning to software deployment. Automation makes processes robust and correct, and it frees up engineering time for other work. DevOps offers the same solution for InfoSec. First, “shift left” by bringing InfoSec goals to feature teams as early as possible in the process. Second, automate as much compliance testing as possible in the deployment pipeline. This frees up InfoSec staff for more exploratory work, exposes concerns to the whole team, and most importantly, ensures each change is compliant.

The DevOps Handbook recommends some ways to start:

  1. Add static analysis tools to the deployment pipeline. Static analysis can catch coding style errors and also identify security vulnerabilities, such as calls to blacklisted system methods like exec.
  2. Add vulnerability scanning to the deployment pipeline. Vulnerability scanning vets application dependencies and system packages for known security vulnerabilities. This can catch Docker images with unpatched OpenSSL packages or unpatched frameworks like Ruby on Rails.
  3. Add dynamic analysis tools such as OWASP ZAP or Arachni that test running applications for known vulnerabilities.
  4. Integrate InfoSec telemetry with other production telemetry. Examples include counters on password resets, logins, or second-factor challenges. Other examples may be core dumps or malformed database queries (indicating an attack). Integrating this information into telemetry reinforces the “shift left”: since the entire team has access to the telemetry, more people can understand and diagnose security issues in real time.
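The first item in the list can be sketched with Python’s standard `ast` module. This is a toy check written for illustration, not a replacement for a real static analysis tool; the blacklist contents are an assumption, and the matcher only recognizes simple names and one-level attribute calls.

```python
import ast

# Hypothetical blacklist for this sketch; a real policy would come from InfoSec.
BLACKLISTED_CALLS = {"exec", "eval", "os.system"}


def call_name(node):
    """Return a dotted name such as "os.system" for a simple call target."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return None  # anything more complex is ignored by this sketch


def find_blacklisted_calls(source):
    """Return sorted (line number, call name) pairs for blacklisted calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = call_name(node)
            if name in BLACKLISTED_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)


# Source text to analyze; it is parsed, never executed.
sample = '''import os

def run(cmd):
    os.system(cmd)
    exec("print(1)")
'''

for lineno, name in find_blacklisted_calls(sample):
    print(f"line {lineno}: call to blacklisted function {name}")
```

In a pipeline, a non-empty result would fail the build, so the finding reaches the developer minutes after the commit instead of at a late security review.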

These are starting points towards the goal of integrating InfoSec objectives into the team’s daily work. Done well, this increases developer and operations efficiency while increasing security.
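The security telemetry idea from the list above can also be illustrated in a few lines. This is a minimal in-memory sketch; the class name, event names, and thresholds are hypothetical, and a real system would send these counters to whatever telemetry backend the team already uses.

```python
from collections import Counter


class SecurityTelemetry:
    """Counts security-relevant events and flags those above a threshold."""

    def __init__(self, thresholds):
        # thresholds: event name -> maximum count tolerated per window
        self.thresholds = dict(thresholds)
        self.counts = Counter()

    def record(self, event):
        """Increment the counter for one observed event."""
        self.counts[event] += 1

    def alerts(self):
        """Return the events whose count exceeds the configured threshold."""
        return sorted(event for event, limit in self.thresholds.items()
                      if self.counts[event] > limit)

    def reset(self):
        """Start a new counting window."""
        self.counts.clear()


telemetry = SecurityTelemetry({"password_reset": 3, "failed_login": 5})
for _ in range(4):
    telemetry.record("password_reset")
for _ in range(2):
    telemetry.record("failed_login")

print(telemetry.alerts())
```

Because the counters sit next to the application’s other metrics, an unusual spike in password resets is visible to developers and operators, not only to InfoSec.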

Adopting these practices also requires committing to continuous improvement. Teams starting out can adopt these practices and rule out an entire class of InfoSec regressions. As the team learns over time, the tests increase in rigor, continually raising the quality floor and ultimately reducing risk across the SDLC.

This post has covered two areas so far: speed versus stability, and InfoSec. However, risk isn’t limited to these two areas. Technical debt is arguably the riskiest part of a long-term project, and applying the DevOps principle of automation may help teams in a new way.

Mitigating Risk in Dependency Upgrades

GitHub recently announced they completed their Rails upgrade from 3.2 to 5.2. Rails 3.2 was released in January 2012 and 5.2 was released in April 2018. GitHub built a system to run the application in different Rails versions allowing an incremental upgrade from 3.x, to 4.x, and finally to 5.x.

The Rails upgrade took a year and a half, for a few reasons. Rails upgrades weren’t always smooth, and some versions had major breaking changes. Rails also improved the upgrade process for the 5 series, so while 3.2 to 4.2 took one year, 4.2 to 5.2 took only five months. [5]

Software upgrades are a necessary evil, and they can be downright sinister when put off. GitHub made things harder for themselves by delaying upgrades, which created a situation where a major upgrade could not be completed in a single go. This problem is also discussed in “Building Evolutionary Architectures” by Thoughtworks employees Neal Ford, Rebecca Parsons, and Patrick Kua.

The authors apply DevOps automation to dependency updates. They propose “fluid dependencies”. The idea is that the deployment pipeline can detect a fluid dependency and attempt a build with the latest version of that dependency. If the build passes, then the application may be upgraded. The deployment pipeline can also automate the commit process to make dependency upgrades seamless. This approach removes a chore from the backlog and mitigates the unchecked technical debt that builds up when dependencies are not upgraded. Unfortunately, there are no such tools available at the time of writing, but the idea is worth exploring.
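The control flow of that idea can be sketched as follows. This is an illustration under stated assumptions, not an existing tool or the authors’ implementation: the function name, the dependency names, and the `build_passes` callback (standing in for “run the build and the test suite with these versions”) are all hypothetical.

```python
def try_fluid_upgrade(pinned, latest, build_passes):
    """Try the newest version of each dependency, one at a time.

    pinned:       dict of dependency name -> currently pinned version
    latest:       dict of dependency name -> newest available version
    build_passes: callable taking a candidate dict and returning True
                  when the build and tests succeed with those versions
    Returns (accepted pins, rejected upgrades).
    """
    accepted = dict(pinned)
    rejected = {}
    for name in sorted(pinned):
        newest = latest.get(name, pinned[name])
        if newest == accepted[name]:
            continue  # already up to date
        candidate = {**accepted, name: newest}
        if build_passes(candidate):
            accepted = candidate      # keep the upgrade
        else:
            rejected[name] = newest   # stay on the old pin and report it
    return accepted, rejected


# Made-up dependencies and a fake build that breaks on libbar 3.0.
pinned = {"libfoo": "1.0", "libbar": "2.3"}
latest = {"libfoo": "1.1", "libbar": "3.0"}


def fake_build(deps):
    return deps["libbar"] != "3.0"


new_pins, failed = try_fluid_upgrade(pinned, latest, fake_build)
print(new_pins)
print(failed)
```

Upgrading one dependency at a time keeps the batch small, so a failed build points directly at the version that caused it.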

Conclusion

This post explored the ways in which DevOps can be used to mitigate risk across the SDLC. First, it addressed the misconception that IT teams must choose between speed and quality, along with the risk associated with moving fast (and breaking things). DevOps done well provides both speed and quality. Second, the post covered how to apply the DevOps mindset of automation and “shift left” to InfoSec. Shifting left brings InfoSec concerns to the attention of everyone on the team, while test automation ensures that every change stays in compliance. Lastly, the post touched on the idea of mitigating, and possibly eliminating, risks around critical dependency upgrades with “fluid dependencies”.

Adopting all these practices may not eliminate risk completely, but they are proven to reduce risk, speed up cycle time, and improve quality. DevOps isn’t the riskiest thing to try; right now it’s simply the way modern IT business is done. Are you ready to implement DevOps now? Get inspired by the 10 Ingredients for DevOps Transformation with Mark Andersen.

  1. Forsgren PhD, Nicole. Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations (Kindle Locations 146-149). IT Revolution Press. Kindle Edition.
  2. Forsgren PhD, Nicole. Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations (Kindle Location 434). IT Revolution Press. Kindle Edition.
  3. Forsgren PhD, Nicole. Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations (Kindle Location 434). IT Revolution Press. Kindle Edition.
  4. Kim, Gene; Humble, Jez; Debois, Patrick; Willis, John. The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations (Kindle Locations 5570-5573). IT Revolution Press. Kindle Edition.
  5. https://githubengineering.com/upgrading-github-from-rails-3-2-to-5-2/

Written by

Adam Hawkins

Passionate traveler (currently in Bangalore, India), Trance addict, Devops, Continuous Deployment advocate. I lead the SRE team at Saltside where we manage ~400 containers in production. I also manage Slashdeploy.

