Understanding the DC/OS Cluster Architecture
As part of the Lab startup procedure, a fully functional DC/OS cluster is provisioned. Provisioning takes around 20 minutes. Everything is ready to use once you see the 100% Setup Complete status message below Open Environment. If the DC/OS graphical user interface (GUI) doesn't load, there are not three nodes displayed in the nodes chart, or not all components are healthy, then the provisioning hasn't fully completed. In the meantime, you will review some DC/OS terminology and the architecture of the cluster to prepare for the Lab. After reading this Lab Step, you can proceed to the next Lab Step.
- Component: A DC/OS system service that is distributed with DC/OS. Examples of DC/OS components include Marathon, diagnostic agents, and Docker garbage collection to clean up orphaned Docker images.
- Master node: A DC/OS node that runs a collection of DC/OS components that manage the rest of the cluster. Each cluster has one or more master nodes. Using one master node is only recommended for development.
- Agent node: A DC/OS node where tasks are run. Agent nodes can be public or private. Private agents don't allow ingress traffic from outside of the cluster, while public agents do. Public agents are typically used only to run reverse proxy and load balancing services that route external traffic to services on private agents.
- Mesos: Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively. DC/OS is built around Mesos.
- Marathon: A container orchestration engine for Mesos and DC/OS.
- Application: A long-running service that may have one or more instances, each of which maps one-to-one with a Mesos task.
- Service: A set of one or more service instances that can be started and stopped as a group and restarted automatically if they exit before being stopped.
- Task: A Marathon task is an application instance created from an application definition.
- Package: A bundle of metadata that describes how to configure, install, and uninstall a DC/OS service using Marathon.
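To make the application, instance, and task terminology concrete, here is a minimal sketch of a Marathon application definition, built in Python for readability. The service ID and Docker image are hypothetical examples, not values from this Lab; Marathon launches one Mesos task per requested instance.

```python
import json

# Minimal Marathon application definition (sketch; the service ID and
# Docker image below are hypothetical examples).
# With "instances": 3, Marathon creates three application instances,
# each running as its own Mesos task.
app_definition = {
    "id": "/example/nginx",    # hierarchical service ID (hypothetical)
    "instances": 3,            # desired number of application instances
    "cpus": 0.1,               # CPUs reserved per instance by Mesos
    "mem": 128,                # memory (MiB) reserved per instance
    "container": {
        "type": "DOCKER",
        "docker": {"image": "nginx:alpine"},
    },
}

# The JSON form is what you would submit to Marathon.
print(json.dumps(app_definition, indent=2))
```

Submitting a definition like this to Marathon (for example, through the GUI or CLI) is what turns a package of metadata into running tasks on agent nodes.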
There are many ways to create ready-to-use DC/OS clusters. Focusing on public clouds, DC/OS provides a "universal" installer for Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform. This Lab uses the universal installer to deploy DC/OS on AWS. The diagram below shows the high-level architecture of the main AWS resources that are provisioned:
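As a sketch of what the universal installer looks like in practice, it is driven by a small Terraform configuration along these lines. The module source follows the dcos-terraform project, and the cluster name, node counts, and version shown are illustrative assumptions, not this Lab's exact configuration:

```hcl
# Illustrative Terraform configuration for the DC/OS universal installer on AWS.
# Values are example assumptions, not this Lab's settings.
module "dcos" {
  source = "dcos-terraform/dcos/aws"

  cluster_name       = "demo-cluster"
  num_masters        = 1   # a single master is recommended only for development
  num_private_agents = 2
  num_public_agents  = 1
  dcos_version       = "2.0.0"
}
```

Running `terraform apply` against a configuration like this provisions the networking, load balancers, and Auto Scaling groups described below.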
All of the nodes are in a single availability zone. This is not the recommended architecture for creating highly available clusters. In a highly available architecture, each availability zone has a replicated cluster and load balancing is performed across the availability zones. Further discussion of configuring highly available DC/OS clusters is outside of the scope of this Lab.
The DC/OS master nodes and public agents exist in a public subnet, while the private agents are in a private subnet. This is a security best practice. In order for private agents to connect to the internet, a NAT instance is included in the public subnet.
All of the node types are contained in Auto Scaling groups. No Auto Scaling policies are created, but adding them would make it easy to dynamically size your cluster in production. The launch configuration of each Auto Scaling group runs scripts to automatically join new nodes to the cluster.
The master nodes and the public agents are accessible from the internet via public load balancers that expose HTTP and HTTPS ports. The master nodes are also accessible via an internal load balancer, which exposes additional ports because internal traffic can be trusted more than traffic from the internet.
The DC/OS Lab environment is configured to allow anonymous access. This is only for simplicity of performing the Lab and is never a good idea in production. In production, you should enable authentication and create users with the least amount of access required to fulfill their role.
Lastly, if you need to use the DC/OS command-line interface (CLI), you must install it on a node outside of the DC/OS cluster. The NAT instance is where you can install the CLI. It has a name tag of NAT Instance (SSH user: centos). When connecting to the NAT instance via SSH, use the username centos and the provided key file.
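The connection and installation steps follow the standard DC/OS CLI setup. The key file path, NAT instance address, master load balancer address, and CLI version below are placeholders, not values from this Lab:

```shell
# Connect to the NAT instance (the key path and address are placeholders)
ssh -i /path/to/key.pem centos@<nat-instance-public-ip>

# Download the DC/OS CLI binary and put it on the PATH
# (the version in the URL is an example; match it to your cluster's version)
curl -fsSL https://downloads.dcos.io/binaries/cli/linux/x86-64/dcos-1.13/dcos -o dcos
chmod +x dcos
sudo mv dcos /usr/local/bin/

# Point the CLI at the cluster through the master load balancer
dcos cluster setup http://<master-load-balancer-dns>
```

Once `dcos cluster setup` completes, CLI commands such as `dcos node` run against the cluster from the NAT instance.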
In this Lab Step, you reviewed the resources that are provisioned for you in this Lab. In the upcoming Lab Steps you will connect to the DC/OS graphical user interface (GUI) on a master node and manage the cluster using DC/OS CLI installed on the NAT instance.