What DevOps Means for Risk Management

Adopting DevOps makes the unfamiliar uneasy in two areas. First, they see an inherently risky trade-off between speed and quality. Second, they worry that the quick iterations of DevOps may break compliance rules or introduce security vulnerabilities. Both concerns stem from the core principle of DevOps: transforming organizational culture to achieve a faster cadence and shorter cycle time. It’s easy to assume that moving faster produces less stable systems. This post explores the ways DevOps mitigates risk and yields safer, more stable systems.

Choosing Between Speed and Quality Is Wrong

In business, the saying goes, outcomes can be cheap, fast, and good, but you’re only allowed to pick two. This framing covers three common dimensions of project management, and it’s intuitive if you assume the dimensions are mutually exclusive. That assumption is where DevOps differs, and you don’t have to take our word for it.

Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations quantifies IT performance and identifies best practices. Martin Fowler, Chief Scientist at Thoughtworks, refutes the speed-versus-quality framing directly in the foreword. He says:

This huge increase in responsiveness does not come at a cost in stability, since these organizations find their updates cause failures at a fraction of the rate of their less-performing peers, and these failures are usually fixed within the hour. Their evidence refutes the bimodal IT notion that you have to choose between speed and stability—instead, speed depends on stability, so good IT practices give you both.1

Fowler identifies the dependency between speed and stability (a dimension of quality), and he points to two important findings in Accelerate. High performing teams have:

  • 170 times faster mean time to recover (MTTR) from downtime.2
  • 5 times lower change failure rate (changes are 1/5 as likely to fail).3

These two data points demonstrate that DevOps teams are less risky because deploys fail less often, and when they do fail, issues are resolved quickly. Contrast this with systems that are deployed a few times a year, where failed changes are more likely and outages are far more costly. How does DevOps do it?
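To see how those two multipliers compound, here is a back-of-the-envelope comparison. The deploy counts, failure rate, and recovery time below are hypothetical inputs chosen for illustration, not figures from Accelerate; only the 1/5 and 1/170 multipliers come from the findings above.

```python
# Illustrative comparison of annual change-related downtime for a
# low-performing team versus a high-performing DevOps team.
# All absolute numbers are hypothetical; only the 5x failure-rate and
# 170x MTTR multipliers come from the findings cited above.

def expected_downtime_hours(deploys_per_year, change_failure_rate, mttr_hours):
    """Expected hours of change-related downtime per year."""
    return deploys_per_year * change_failure_rate * mttr_hours

# Low performer: quarterly big-bang releases, 30% failure rate, 24h recovery.
low = expected_downtime_hours(4, 0.30, 24.0)

# High performer: daily deploys, 1/5 the failure rate, 170x faster recovery.
high = expected_downtime_hours(250, 0.30 / 5, 24.0 / 170)

print(f"low performer:  {low:.1f} h/year")   # prints 28.8 h/year
print(f"high performer: {high:.2f} h/year")  # prints 2.12 h/year
```

Even deploying roughly 60 times more often, the high performer accrues an order of magnitude less downtime, which is the point Fowler makes: speed depends on stability.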

Reduce Batch Sizes with Trunk-Based Development

The first goal of DevOps is to reduce the time from commit to production, known as “lead time”. Reducing batch size is the simplest way to shorten lead time.

Imagine batch size as the size of a commit. Smaller commits are easier to understand, develop, and test, and they are more easily verified in production. If the only modification is a 5-line method change that correlates with increased load or other negative operational conditions, then it’s easy to locate the problem. On the flip side, if the change was 5,000 lines across multiple areas, the problem becomes much more difficult to isolate and identify.

Reducing batch sizes is the first step. Aligned to this, DevOps encourages changing the relationship with source control. The DevOps Handbook encourages trunk-based development, which means developers check their code in to “trunk” (or master, or mainline) at least once a day. Trunk-based development keeps commits small, since larger commits are harder to integrate every day. Most importantly, trunk-based development combined with continuous integration ensures that each commit keeps the entire system in a releasable state.

This mitigates risk in two ways. First, since every commit is kept in a deployable state, there’s no need for separate test and stabilization phases at the end of the project. These late-stage phases are the riskiest and tend to negatively impact delivery. Second, trunk-based development lays the foundation for automated deployment pipelines which expand over time to add increasingly rigorous tests.
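The “always releasable” invariant is easy to express as a pipeline gate: every commit to trunk must pass the full check suite before it counts as releasable. A minimal sketch follows; the two checks are stand-ins for real build, test, and lint commands, not a real CI configuration.

```python
# Minimal sketch of a continuous-integration gate on trunk.
# Each check is a stand-in for a real command (test suite, linter, build);
# a commit is releasable only if every check exits successfully.
import subprocess
import sys

CHECKS = [
    [sys.executable, "-c", "print('unit tests passed')"],  # stand-in for a test runner
    [sys.executable, "-c", "print('lint passed')"],        # stand-in for a linter
]

def commit_is_releasable(checks=CHECKS):
    """Return True only if every check exits with status 0."""
    for cmd in checks:
        if subprocess.run(cmd).returncode != 0:
            return False  # one red check and trunk is no longer releasable
    return True

print("releasable" if commit_is_releasable() else "blocked: fix before merging")
```

In a real pipeline the same gate runs on every daily check-in, which is what keeps the separate end-of-project stabilization phase from ever being needed.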

Automating InfoSec

The fast-paced world of DevOps appears at odds with the slower-moving world of Information Security (InfoSec). This tension originates from common processes that push InfoSec concerns to the tail end of projects, making security issues more difficult and costly to resolve. This is true of any part of the SDLC, but it’s often worse with InfoSec compliance, where releases must be verified before going into production. The small number of InfoSec engineers can also exacerbate the problem. James Wickett, one of the creators of the Gauntlt security tool and an organizer of DevOps Days Austin, says:

One interpretation of DevOps is that it came from the need to enable developer productivity, because as the number of developers grew, there weren’t enough Ops people to handle all the resulting deployment work. This shortage is even worse in InfoSec—the ratio of engineers in Development, Operations, and InfoSec in a typical technology organization is 100:10:1.4

Operations and development have faced similar issues, and the DevOps solution is to automate as much as possible, from environment provisioning to software deployment. Automation makes processes robust and correct, and frees up engineering time for other work. DevOps offers the same solution for InfoSec: first, “shift left” by engaging InfoSec with feature teams as early in the process as possible; second, automate compliance testing in the deployment pipeline as much as possible. This frees InfoSec staff for more exploratory work, exposes security concerns to the whole team, and most importantly, ensures each change is compliant.

The DevOps Handbook recommends some ways to start:

  1. Add static analysis tools to the deployment pipeline. Static analysis can catch coding style errors and also identify security vulnerabilities like calls to blacklisted system methods such as exec.
  2. Add vulnerability scanning to the deployment pipeline. Vulnerability scanning vets application dependencies and system packages for known security vulnerabilities. This can catch Docker images with unpatched OpenSSL packages or unpatched frameworks like Ruby on Rails.
  3. Add dynamic analysis tools such as OWASP ZAP or Arachni that test running applications for known vulnerabilities.
  4. Integrate InfoSec into production telemetry. Examples include counters on password resets, logins, or second-factor challenges. Other examples are core dumps or malformed database queries (both can indicate an attack). Integrating this information into telemetry reinforces the “shift left”, since the entire team has access to the telemetry and more people can understand and diagnose security issues in real time.

These are starting points toward the goal of integrating InfoSec objectives into the team’s daily work. Done well, this increases developer and operations efficiency while improving security.

Adopting these practices also requires committing to continuous improvement. Teams starting out can adopt these practices and rule out an entire class of InfoSec regressions. As the team learns over time, the tests increase in rigor, continually raising the quality floor and ultimately reducing risk across the SDLC.

This post has covered two areas so far: speed versus stability and InfoSec. However, risk isn’t exclusive to these two areas. Technical debt is arguably the riskiest part of a long-term project, and applying the DevOps principle of automation may help teams in a new way.

Mitigating Risk in Dependency Upgrades

GitHub recently announced they completed their Rails upgrade from 3.2 to 5.2. Rails 3.2 was released in January 2012 and 5.2 was released in April 2018. GitHub built a system to run the application in different Rails versions allowing an incremental upgrade from 3.x, to 4.x, and finally to 5.x.

The Rails upgrade took a year and a half, for a few reasons. First, Rails upgrades weren’t always smooth, and some versions had major breaking changes. Rails improved the upgrade process for the 5.x series, so while 3.2 to 4.2 took a year, 4.2 to 5.2 took only five months.5

Software upgrades are a necessary evil, and they can be downright sinister when put off. GitHub made the work harder for themselves by delaying upgrades, creating a situation where a major upgrade could not be completed in a single go. This problem is also discussed in Building Evolutionary Architectures by Thoughtworks employees Neal Ford, Rebecca Parsons, and Patrick Kua.

The authors apply DevOps automation to dependency updates, proposing “fluid dependencies”. The idea is that the deployment pipeline detects a fluid dependency and attempts a build with the latest version of that dependency. If the build passes, then the application may be upgraded, and the pipeline can also automate the commit process to make dependency upgrades seamless. This approach removes a chore from the backlog and mitigates the unchecked technical debt that accrues from not upgrading dependencies. Unfortunately, no such tools are available right now, but the idea is worth exploring.
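Although no off-the-shelf tool implements fluid dependencies, the pipeline step is simple to sketch. Everything below is hypothetical: the version feed and the build callback stand in for a real package registry lookup and a real CI build.

```python
# Sketch of a "fluid dependency" pipeline step: when a newer version of a
# dependency exists, attempt the build against it and adopt the upgrade
# only on a green build. `latest_versions` and `build_passes` are
# hypothetical stand-ins for a package registry and a CI build.

def try_fluid_upgrade(pinned, latest_versions, build_passes):
    """Return the new pin set after attempting each available upgrade."""
    new_pins = dict(pinned)
    for dep, current in pinned.items():
        latest = latest_versions.get(dep, current)
        if latest != current and build_passes(dep, latest):
            new_pins[dep] = latest  # green build: adopt the new version
    return new_pins

# Hypothetical example: rails has a new patch release, rake does not.
pinned = {"rails": "5.2.0", "rake": "12.3.1"}
latest = {"rails": "5.2.1", "rake": "12.3.1"}
print(try_fluid_upgrade(pinned, latest, lambda dep, ver: True))
# prints {'rails': '5.2.1', 'rake': '12.3.1'}
```

A red build simply leaves the pin where it was, so the team learns about an incompatible upgrade the day it is published instead of years later.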

Conclusion

This post explored the ways DevOps can be used to mitigate risk across the SDLC. First, it addressed the misconception that IT teams must choose between speed and quality, along with the risk associated with moving fast (and breaking things): DevOps done well provides both speed and quality. Second, it covered applying DevOps automation and a “shift left” mindset to InfoSec. Shifting left brings InfoSec concerns to the forefront for everyone on the team, while test automation ensures every change remains in compliance. Lastly, the post touched on mitigating, and possibly eliminating, risks around critical dependency upgrades with “fluid dependencies”.

Adopting all these practices may not eliminate risk completely, but they are proven to reduce risk, speed up cycle time, and improve quality. DevOps isn’t the riskiest thing to try; right now, it’s just the way modern IT business is done!

  1. Forsgren PhD, Nicole. Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations (Kindle Locations 146-149). IT Revolution Press. Kindle Edition. ↩︎
  2. Forsgren PhD, Nicole. Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations (Kindle Location 434). IT Revolution Press. Kindle Edition. ↩︎
  3. Forsgren PhD, Nicole. Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations (Kindle Location 434). IT Revolution Press. Kindle Edition. ↩︎
  4. Kim, Gene; Humble, Jez; Debois, Patrick; Willis, John. The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations (Kindle Locations 5570-5573). IT Revolution Press. Kindle Edition. ↩︎
  5. https://githubengineering.com/upgrading-github-from-rails-3-2-to-5-2/ ↩︎

Written by

Passionate traveler (currently in Bangalore, India), Trance addict, Devops, Continuous Deployment advocate. I lead the SRE team at Saltside where we manage ~400 containers in production. I also manage Slashdeploy.
