SRE Anti-Fragility and Learning from Failure

15m 13s

This lesson looks at anti-fragility and learning from failure. Anti-fragility means understanding disorder and using it to your advantage. Learning from failure helps you understand why things break, how to fix them, and how to prevent the same failure from recurring or reduce its impact. By the end of this lesson, you'll have a clear understanding of both concepts.

Learning Objectives

  • Understand how SRE, DevOps, and anti-fragility help to reduce risk and increase predictability
  • Learn how to reframe failure and mistakes so that you can learn from them
  • Learn about tools that can be used to reduce the risk of failure

Intended Audience

  • Anyone interested in learning about SRE and its fundamentals
  • Software engineers interested in learning how to use and apply SRE within an operations environment
  • DevOps practitioners interested in understanding the role of SRE and how it could be used within their own organization


Prerequisites

To get the most out of this course, you should have a basic understanding of DevOps, software development, and the software development lifecycle.


Link to YouTube video referenced in the lesson:

About the Author
Jeremy Cook
Content Lead Architect

Jeremy is a Content Lead Architect and DevOps SME at Cloud Academy, where he specializes in developing DevOps technical training documentation.

He has a strong background in software engineering and has been coding with various languages, frameworks, and systems for more than 25 years. Recently, Jeremy has focused on DevOps, Cloud (AWS, Azure, GCP), Security, Kubernetes, and Machine Learning.

Jeremy holds professional certifications for AWS, Azure, GCP, Terraform, and Kubernetes (CKA, CKAD, CKS).
