Do you remember the days of deploying an N-tier application to on-premises servers? Think of the planning that went into determining the right amount of hardware, so that you weren’t under-provisioned or significantly over-provisioned. Deployments were often problematic because what ran well on the developer’s computer didn’t always work outside of that environment. Deployments were also assumed to cause downtime, so they were scheduled during non-peak hours.
In the event of a hardware failure, your app might have been unavailable depending on how much hardware you had access to, and how the application was designed. Failovers may or may not have been automatic, and frankly, it was all a lot of work.
Well, if you thought that was difficult, imagine trying to do all of this at the scale of Google, Facebook, Twitter, Netflix, or similar companies.
All of the companies I just mentioned found that hyperscale computing required a new way to look at things. And regardless of the actual tools that they used, they all had the same solution, which was to treat their entire data center as a single entity.
And that’s what DC/OS does: it’s a central OS for your data center, and it’s the topic of this course.
- You should understand how DC/OS is used
- You should have a high-level understanding of DC/OS
- You should be familiar with the UI
- You should be familiar with the CLI
- You should be able to install services from the catalog
- DevOps Engineers
- Site Reliability Engineers
- Familiarity with containers
- Comfort with the command line
|Lecture|What you'll learn|
|---|---|
|Intro|What to expect from this course|
|A Brief History|The history of DC/OS|
|Overview|An overview of DC/OS|
|Components|About the components of DC/OS|
|Exploring the UI|How to navigate the UI|
|Installing WordPress (UI)|How to install WordPress from the Catalog using the UI|
|Installing WordPress (CLI)|How to install WordPress from the Catalog using the CLI|
|Summary|How to keep learning|
If you have thoughts or suggestions for this course, please contact Cloud Academy at firstname.lastname@example.org.
Welcome back! In this lesson we’ll take a high-level look at what DC/OS is and how companies such as Autodesk are using it to speed up and standardize deployments.
DC/OS stands for Data Center Operating System. It’s an open-source, distributed operating system based on Apache Mesos.
DC/OS can manage multiple cloud based or on-premises machines from a single interface.
It can deploy containers, distributed services, and legacy applications onto those machines.
It also provides networking, service discovery and resource management to keep services running and communicating with each other.
With DC/OS you can treat all of the servers in the cluster as one giant computer. If your cluster has 100 quad-core servers, the end result is effectively one OS with 400 cores.
To help make the point check out this comparison of the traditional versus the DC/OS approach. The traditional way to handle operations might be to have servers dedicated to a particular task, such as web servers, or some stateful process. Because the services that you’re running don’t use all of a server’s resources all of the time, there’s a lot of additional capacity for any given server that tends to go to waste.
By comparison, DC/OS pools all the resources of the cluster into one place. When you want to start an application or service you determine its resource requirements and DC/OS will determine where to run it.
By distributing the applications and services across the cluster you can more evenly distribute the load, reducing the amount of idle server time.
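To make the pooling idea a bit more concrete, here is a minimal sketch in Python. It is a toy first-fit placement, not how Mesos or DC/OS actually allocates resources; the `Node` class, the `place` function, the agent names, and the 16 GB of memory per server are all assumptions made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One server in the cluster (hypothetical model, not a DC/OS type)."""
    name: str
    free_cpus: float
    free_mem_mb: int
    tasks: list = field(default_factory=list)


def place(nodes, task_name, cpus, mem_mb):
    """Place a task on the first node with enough free CPU and memory."""
    for node in nodes:
        if node.free_cpus >= cpus and node.free_mem_mb >= mem_mb:
            node.free_cpus -= cpus
            node.free_mem_mb -= mem_mb
            node.tasks.append(task_name)
            return node.name
    return None  # no single node has room for this task


# 100 quad-core servers, as in the example above; 16 GB each is an assumption.
cluster = [Node(f"agent-{i}", free_cpus=4, free_mem_mb=16384) for i in range(100)]
total_cpus = sum(n.free_cpus for n in cluster)  # size of the pooled CPU capacity

# You state what each service needs; the placement logic picks the server.
web = place(cluster, "web", cpus=1.5, mem_mb=2048)
db = place(cluster, "db", cpus=2, mem_mb=8192)
cache = place(cluster, "cache", cpus=1, mem_mb=4096)
```

Notice that the web and database tasks end up sharing a server instead of each getting a dedicated one, which is where the reduction in idle capacity comes from. Also note that in this sketch a single task still has to fit on a single server, so the "one giant computer" is a pool of capacity rather than one literal 400-core machine.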
If you’re new to this then it would be easy to think that since DC/OS has OS in the name, it’s a host OS. However, that’s not the case. While DC/OS is a distributed operating system, it’s not a host OS such as Linux, Windows, or macOS. DC/OS runs on top of servers that need a supported host OS. At the time of this recording the supported agent operating systems are RHEL, CentOS, and CoreOS. For more info about which OS versions are supported check out the link in the course description to the installation page.
DC/OS is designed to be an OS for modern applications. Because of that, it runs on any cloud platform, public or private, and on virtual machines or on-premises servers. The intentional byproduct of that flexibility is that it avoids cloud vendor lock-in.
So, at a high level DC/OS does exactly what its name says; it serves as an operating system for your data center.
Having one OS that runs all of your applications and services has a lot of value.
To name just a few:
It allows you to standardize development and deployments.
In cases of hardware failure DC/OS will be able to move services to a new node in the cluster.
You have the ability to start long running or scheduled tasks.
Having one OS that runs everything also gives you a single entry point for managing all of your services via the CLI or web UI.
These are just a few of the perks of having an OS for your data center, and these don’t even get into all the cool features of DC/OS that we’ll be covering throughout this and other DC/OS courses.
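The hardware failure point above can be sketched in a few lines of Python. This is only an illustration of the idea of moving services off a failed node; the `reschedule` function, the task and agent names, and the "maximum tasks per node" capacity model are all hypothetical simplifications, not the DC/OS implementation.

```python
def reschedule(placement, capacity, failed_node):
    """Move every task off failed_node onto healthy nodes with a free slot.

    placement: dict mapping task name -> node name
    capacity:  dict mapping node name -> max number of tasks (a simplification)
    """
    healthy = [n for n in capacity if n != failed_node]
    load = {n: 0 for n in healthy}
    for node in placement.values():
        if node in load:
            load[node] += 1

    new_placement = dict(placement)
    for task, node in placement.items():
        if node != failed_node:
            continue  # tasks on healthy nodes stay where they are
        candidates = [n for n in healthy if load[n] < capacity[n]]
        if not candidates:
            new_placement[task] = None  # stays pending until capacity appears
            continue
        # Prefer the least-loaded node; break ties by name for determinism.
        target = min(candidates, key=lambda n: (load[n], n))
        load[target] += 1
        new_placement[task] = target
    return new_placement


placement = {"web-1": "agent-a", "web-2": "agent-b", "db": "agent-a"}
capacity = {"agent-a": 2, "agent-b": 2, "agent-c": 2}
after = reschedule(placement, capacity, failed_node="agent-a")
```

Here the two tasks that were on `agent-a` are spread across the remaining agents, while `web-2` is left untouched. If the healthy nodes have no room, the task is left unplaced in this sketch rather than overloading a node.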
Now that you have a high-level sense of what DC/OS is, I want to focus on a case study by Autodesk.
Autodesk was looking to improve the efficiency of their infrastructure, standardize deployments, and implement a cross-platform way to run services.
After some consideration they settled on using containers, because containers solve several problems, including providing process isolation and, in the case of Docker, providing dependency management by packaging the app with its dependencies.
Their concern with containers was that container orchestration is a non-trivial task. Their research brought them to Apache Mesos, Marathon, and DC/OS.
They knew that Mesos was used at companies such as Twitter, Netflix and others to manage tens of thousands of servers, and that it was an industrial-strength solution. They also knew that they wanted a solution that was easy to install and well documented, which made DC/OS an easy choice.
After a year of running DC/OS in production they ended up with:
A 66% reduction in AWS Instances
Cost Improvements of up to 57%
A 40-second, zero-downtime deployment
The ability to stand up a new region in 3 minutes
They hit 100% Uptime
To top it all off, they managed to do all of this with 1 DevOps Engineer
Let’s go through these in a bit more detail to provide some context.
The 66% reduction in AWS instances was the result of container density. By being able to run containers on servers with available resources, they didn’t need as many instances. Fewer instances translated into cost savings.
A 40-second, zero-downtime deployment...this is pretty awesome! This sort of speed and efficiency has several contributors. One is a standard deployment process for everything. Another is having well-established deployment models such as rolling, canary and blue/green deployments that are already codified, or bolted onto the platform.
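As a rough illustration of why a rolling deployment can avoid downtime, here is a small Python simulation. It assumes the simplest possible policy, which is to start one new instance, wait for it to become healthy, and only then stop one old instance. The `rolling_update` function is hypothetical and is not how Autodesk or DC/OS implements deployments.

```python
def rolling_update(count):
    """Simulate replacing `count` old instances with new ones, one at a time.

    Returns every (old_running, new_running) state the rollout passes through.
    """
    old_running, new_running = count, 0
    states = [(old_running, new_running)]
    while old_running > 0:
        new_running += 1  # start one new instance and wait until it is healthy
        states.append((old_running, new_running))
        old_running -= 1  # only then stop one old instance
        states.append((old_running, new_running))
    return states


states = rolling_update(3)
# The number of running instances never drops below the original count.
lowest_capacity = min(old + new for old, new in states)
```

The trade-off in this sketch is that the cluster briefly needs room for one extra instance during the rollout, which is easier to find when resources are pooled.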
I’m assuming that having 100% uptime is the result of best practices for HA inside of AWS, including multi-AZ, multi-region environments, as well as DC/OS being able to ensure the health of services.
And then doing all of this with one DevOps engineer is kind of a byproduct of a talented DevOps engineer, a great group of developers, and the ease of operation for DC/OS.
Autodesk isn’t alone in their success with DC/OS. DC/OS is built on top of industrial-strength components to form an easy-to-use and highly robust operating system for running modern applications.
Okay, before we wrap up this lesson you might still be wondering: what is the difference between Mesos and DC/OS?
That’s a great question, and the answer is that DC/OS is built on top of Mesos to add some additional functionality and pull in some additional open source projects that work with Mesos but aren’t part of it. Think of the Linux kernel. There are multiple distributions of Linux that include different UIs, package managers, security settings, etc. They’re all still Linux, though, because they share the same kernel.
Continuing with that comparison, for DC/OS, Mesos is the kernel, and DC/OS is a distribution.
You may also be wondering how DC/OS has anything to do with containers, and Docker in particular.
That’s another great question that we’ll dive into more in another course on container orchestration with DC/OS, but since you asked…
Mesos was in use back before Docker was being run on ALL THE THINGS! It started out using LXC and cgroups to provide process isolation. Now it also supports Docker.
Okay, let’s wrap up the lesson here and in the next lesson we’ll start looking at the components and architecture of DC/OS.
Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.