Modern software systems are becoming increasingly complex, to meet quality, availability, and security demands. And these systems are changing rapidly to keep up with the needs of end-users. With all of the changes, how do you ensure stability, quality, security, and innovation? In this Course, we look at how the DevOps philosophy can provide a holistic way to look at software development, deployment, and operations. And we will provide some tenets to help improve quality and stability.
Course Objectives
You will gain the following skills by completing this Course:
- Learn why automation, culture, and metrics are essential to a successful DevOps project
- Learn how DevOps can positively impact your business's bottom line
- Learn which major companies are successfully utilizing DevOps in their own engineering processes
Intended Audience
You should take this Course if you are:
- A newcomer to the DevOps or cloud world
- Looking to upgrade your skills from a conventional software development career
Prerequisites
None specified.
This Course Includes
- Expert-guided lectures about DevOps
- 1 hour of high-definition video
- Solid foundational knowledge for your explorations into DevOps
What You'll Learn
Video lecture | What you'll learn |
---|---|
What Is DevOps? | In this lecture series, you'll gain a fundamental understanding of DevOps and why it matters. |
The Business Value of DevOps | Need to justify the business case for DevOps? This is the lecture series for you. |
Who's Using DevOps? | Find out who's using DevOps in the enterprise - and why their success matters for your own organization. |
If you have thoughts or suggestions for this course, please contact Cloud Academy at support@cloudacademy.com.
Welcome back to our Introduction to DevOps course. I'm Ben Lambert and I'll be your instructor for this lecture. In this lecture, we're going to review some real companies using DevOps practices. We'll talk about Etsy, Netflix, and Amazon. We're going to start with Etsy because when it comes to DevOps, they really get it. Etsy is an online marketplace for handmade and vintage items. Their system supports 54 million members. They have 1.4 million active sellers and 19.8 million active buyers. They deploy code around 50 times per day.
Now, I want to make the distinction here between deployments and releases, because throughout this course, you may have had the wrong idea about the difference between the two. Deployment is the act of pushing changes to an environment, say, production. However, if you're using feature toggles and smart database migrations, then you can deploy code that may not have been released to the user. So, released means that the feature is enabled and usable by your end users. I really like talking about Etsy. I like how they organically changed their development, deployment, and operations pipeline into a highly efficient process that promotes high quality. I respect their culture and their contributions to the open-source community.
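To make the deployment-versus-release distinction concrete, here is a minimal sketch of a feature toggle. The `FeatureToggles` class, the `new_checkout` toggle name, and the `checkout_page` function are all hypothetical names invented for illustration; they are not Etsy's actual tooling.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureToggles:
    """Hypothetical in-memory toggle store (a real one would likely live in config)."""
    enabled: set = field(default_factory=set)

    def release(self, name: str) -> None:
        # Releasing a feature flips the toggle; no new deployment is needed.
        self.enabled.add(name)

    def is_released(self, name: str) -> bool:
        return name in self.enabled


def checkout_page(toggles: FeatureToggles) -> str:
    # Both code paths are already deployed; the toggle decides which one users see.
    if toggles.is_released("new_checkout"):
        return "new checkout"
    return "old checkout"


toggles = FeatureToggles()
before_release = checkout_page(toggles)  # deployed, but not released
toggles.release("new_checkout")
after_release = checkout_page(toggles)   # deployed and released
```

In this sketch the new checkout code ships to production on every deployment, but users only see it once the toggle is switched on.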
Etsy's tech stack started out as Ubuntu, PostgreSQL, Lighttpd, PHP, and some Python. Business logic was contained in SQL stored procedures. Their site's uptime wasn't great, and they had regular maintenance outage windows. They deployed twice per week, and each deployment took about four hours. The way Etsy was deploying and operating was unsustainable at scale and they knew it. Organically, the culture started to shift, and they moved to what we now call DevOps. But to them, it was just a better way of doing things. Silos were out and collaboration, transparency, and shared responsibility were in. They implemented a continuous deployment process using a tool they called Deployinator. It allows for deployments with one push of a button. In fact, engineers perform a deployment on their first day.
They started using Chef for configuration management, and even started open sourcing some of their Chef plugins. They started making small code changes and deploying those frequently, rather than large deployments filled with changes. Small changes allowed engineers to better identify the source of problems. They started using an ORM and avoided having business logic in multiple locations. They switched to a MySQL cluster using master-master replication. And for schema migrations, they run their migration code on half of the nodes, and then, if all goes well, they run it on the other half.
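The half-then-half migration approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `migrate_in_halves`, the node names, and the `fake_migrate` callback are hypothetical, and the sketch says nothing about how Etsy's real migration tooling is built.

```python
from typing import Callable, List


def migrate_in_halves(nodes: List[str], migrate: Callable[[str], bool]) -> List[str]:
    """Migrate the first half of `nodes`; touch the second half only if
    every node in the first half succeeded. Returns the migrated nodes."""
    midpoint = (len(nodes) + 1) // 2
    first_half, second_half = nodes[:midpoint], nodes[midpoint:]

    first_results = {node: migrate(node) for node in first_half}
    if not all(first_results.values()):
        # Something went wrong: leave the second half on the old schema.
        return [node for node, ok in first_results.items() if ok]

    second_results = {node: migrate(node) for node in second_half}
    return first_half + [node for node, ok in second_results.items() if ok]


attempted = []


def fake_migrate(node: str) -> bool:
    # Stand-in for a real migration; pretend it fails on "db2".
    attempted.append(node)
    return node != "db2"


migrated = migrate_in_halves(["db1", "db2", "db3", "db4"], fake_migrate)
```

Because the failure happens in the first half, the remaining nodes are never touched and keep serving traffic on the old schema.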
Their current tech stack is a standard LAMP stack with Memcached for database caching. Using a tech stack that's cutting-edge may be fun to learn, but often the lack of documentation and skilled engineers makes it a tough investment. For Etsy, using well-established tech like a LAMP stack allowed them to build on well-tested and well-documented technologies. Now, that's not to say that you should avoid new tech; rather, you should know when to use new tech and when to use well-established tech.
So, Etsy became a beacon of DevOps without ever thinking about DevOps. For them, it came organically as they grew and evolved. I encourage you to look into Etsy further. Watch some of the talks given by their team members. Hearing how they went from painful, relatively infrequent deployments to deploying 50 times per day, directly from the engineers that made it happen, is pretty inspiring.
Next, let's talk about Netflix and how they do things. Netflix recently announced that they had completed their cloud migration. Their entire operation is now in the cloud. The tech stack used at Netflix seems to center around Java for the most part, though not exclusively. They use Git, Jenkins, and Nebula for continuous integration. Developers test locally using Nebula, and once everything has passed, they commit their code to Git. Jenkins builds, tests, and bundles once again using Nebula.
If the project is an application, then Nebula will produce an installable OS package, and then the artifact gets saved to their artifact repository. If the build was successful and passed all of its tests, then Jenkins calls Spinnaker to do its thing. Netflix uses an immutable server model which means the goal is to prebake the OS with your application code, and never to make changes to the operating system post deployment.
Spinnaker uses Aminator to bake the AMI, installing the previously created artifact. If the baking job was successful, then the build is deployed to a staging environment. Once the build is staged, teams can test it and review it. Once an application is ready for production, teams can use Spinnaker to deploy it using a blue-green deployment model. And they don't stop there. Netflix understands that failures are inevitable, especially at the frequency of changes they're making and at the scale that they're at.
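As a rough picture of the blue-green model mentioned above, the sketch below keeps two environments and only switches traffic once the idle one passes a health check. The `BlueGreenRouter` class and its `healthy` callback are hypothetical and invented for illustration; this is not Spinnaker's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class BlueGreenRouter:
    """Hypothetical router tracking which of two environments is live."""
    versions: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"blue": None, "green": None}
    )
    live: str = "blue"

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version: str, healthy: Callable[[str], bool]) -> bool:
        # The new version always goes to the idle environment first.
        target = self.idle
        self.versions[target] = version
        if not healthy(version):
            return False  # live traffic is never touched
        self.live = target  # cut traffic over to the new version
        return True


router = BlueGreenRouter()
router.versions["blue"] = "v1"
switched = router.deploy("v2", healthy=lambda v: True)
rejected = router.deploy("v3", healthy=lambda v: False)
```

After the first deployment, the previous version is still sitting in the other environment, so rolling back is just a matter of pointing traffic at it again.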
So, Netflix came up with what's called the Simian Army, a suite of tools that includes Chaos Monkey. Chaos Monkey is a tool for ensuring that infrastructure can handle failure. Chaos Monkey identifies groups of systems and randomly terminates one of the systems in the group. By running Chaos Monkey in production, they can prove that their environment is as redundant as it needs to be.
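The core idea behind Chaos Monkey, picking one system at random from each group and terminating it, can be illustrated with a toy simulation. This is not the real Chaos Monkey code; the `terminate_random_instance` function, the group names, and the instance names are assumptions made up for this sketch, and "terminating" here just removes a name from a list.

```python
import random
from typing import Dict, List


def terminate_random_instance(
    groups: Dict[str, List[str]], rng: random.Random
) -> Dict[str, str]:
    """Remove one randomly chosen instance from every non-empty group.
    Returns a mapping of group name to the instance that was terminated."""
    terminated = {}
    for name, instances in groups.items():
        if not instances:
            continue
        victim = rng.choice(instances)
        instances.remove(victim)  # stand-in for a real termination call
        terminated[name] = victim
    return terminated


groups = {
    "web": ["web-1", "web-2", "web-3"],
    "api": ["api-1", "api-2"],
}
terminated = terminate_random_instance(groups, random.Random())
```

If the service behind each group keeps working after a run like this, the group has at least one instance of redundancy.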
There's also Janitor Monkey. Janitor Monkey is a service that runs in the AWS cloud and looks for unused resources to clean up. They have Conformity Monkey which is another great tool. Conformity Monkey is a service which also runs in the AWS cloud, and looks for instances that aren't conforming to a set of predefined best practices. So Netflix mitigates the potential for failure by preparing for and practicing failures. I recommend that you check out the Netflix tech blog for more information about how they do things.
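A conformity check of the kind described above can be thought of as a set of rules applied to every instance. The sketch below is a simplified illustration only: the `find_nonconforming` function, the instance fields (`id`, `owner`, `group`), and both rules are hypothetical and are not taken from Conformity Monkey itself.

```python
from typing import Callable, Dict, List

Rule = Callable[[dict], bool]

# Hypothetical best-practice rules; a real rule set would be defined by the team.
RULES: Dict[str, Rule] = {
    "has_owner": lambda instance: bool(instance.get("owner")),
    "in_scaling_group": lambda instance: instance.get("group") is not None,
}


def find_nonconforming(
    instances: List[dict], rules: Dict[str, Rule]
) -> Dict[str, List[str]]:
    """Return a mapping of instance id to the names of the rules it fails."""
    violations = {}
    for instance in instances:
        failed = [name for name, check in rules.items() if not check(instance)]
        if failed:
            violations[instance["id"]] = failed
    return violations


instances = [
    {"id": "i-1", "owner": "search-team", "group": "search"},
    {"id": "i-2", "owner": "", "group": None},
]
violations = find_nonconforming(instances, RULES)
```

A report like `violations` could then be sent to the owning team or used to flag instances for cleanup.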
Our final company review is going to be Amazon. Amazon.com started out as a monolith like most sites did. Monoliths aren't usually a problem until the site needs to scale out and add a lot of new features. So, Amazon chose to implement two-pizza teams, which basically means small teams, small enough to be fed by two pizzas, which for them was roughly six to eight people.
So, Amazon's monolith was slowly refactored into a microservices model. Each team was responsible for the complete life cycle of their product. They were given the freedom to use whatever tools and technology they thought was best. If a team built something, they ran it in production. So, if their service went down at 3 a.m., they got the call to fix it. This incentivized teams to thoroughly test their code and infrastructure.
The biggest gains in efficiency at Amazon were found when they implemented a completely automated continuous delivery pipeline. This gave new teams the ability to automate their testing and deployments. This automation enabled Amazon to deploy over 50 million times in 2014 across thousands of teams. Just like the other companies we talked about, Amazon has a tech blog where they share information about the things that they're doing and what's been working for them, and I highly recommend that you check it out.
Each company that we've talked about has used different tools, but they've all had a very similar path to gaining their highly efficient development, deployment, and operations pipeline.
In our next lecture, we'll summarize what we've covered in this course and put everything together into an example of a complete development, deployment, and operations pipeline. All right, let's get started.
Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.