Docker has made great strides in advancing development and operational agility, portability, and cost savings by leveraging containers. You can see a lot of benefits even when you use a single Docker host. But when container applications reach a certain level of complexity or scale, you need to make use of several machines. Container orchestration products and tools allow you to manage multiple container hosts in concert. Docker swarm mode is one such tool. In this course, we’ll explain the architecture of Docker swarm mode, and go through lots of demos to perfect your swarm mode skills.
After completing this course, you will be able to:
- Describe what Docker swarm mode can accomplish.
- Explain the architecture of a swarm mode cluster.
- Use the Docker CLI to manage nodes in a swarm mode cluster.
- Use the Docker CLI to manage services in a swarm mode cluster.
- Deploy multi-service applications to a swarm using stacks.
This course is for anyone interested in orchestrating distributed systems at any scale. This includes:
- DevOps Engineers
- Site Reliability Engineers
- Cloud Engineers
- Software Engineers
This is an intermediate-level course that assumes:
- You have experience working with Docker and Docker Compose
Thanks for joining me for this lesson on Docker swarm mode architecture. You heard about the great benefits swarm mode provides in the previous lesson. In these architecture lessons, we'll understand more about the parts of swarm mode that enable it to accomplish all those great benefits, starting with networking. This lesson and the following architecture lessons build the foundations for using swarm mode. I promise we'll be seeing swarm mode in action in the demos of the next lesson group in the course.
This lesson will cover everything that is unique to swarm mode and networking:
- (Overlay networks) Starting with a Docker network type exclusive to swarm mode, the overlay network.
- (Service discovery) After that, we'll discuss how services in a swarm can be discovered across multi-host swarm networks.
- (Load balancing) On a related note, we'll see how load is balanced across all the replicas of a service.
- (External access) Then the mechanisms for accessing the swarm services from outside the swarm will be explored.
The networking requirements in a swarm are much more complex than using a single Docker host. Services need to communicate with one another and the replicas of the service can be spread across multiple nodes. Fortunately, Docker includes a network driver that makes multi-host networking reliable, secure, and a breeze to set up.
The driver I'm referring to is the overlay network driver. With the overlay driver, multi-host networking in a swarm is natively supported. There is no need to perform any external configuration. You can attach a service to one or more overlay networks, in the same way you would attach a container to one or more user-defined networks when not running in swarm mode.
By default, overlay networks only apply to swarm services and can't be connected to by containers that aren't part of a swarm service, unless the network is created as attachable. Managers automatically extend overlay networks to nodes that run tasks requiring access to a given overlay network.
Network isolation and firewalls
It's a good time to review Docker network isolation and firewall rules. These rules apply to overlay networks just as they do for bridge networks.
Containers within a Docker network are permitted access on all ports of containers in the same network.
Access is denied between containers that don't share a common network.
Traffic originating inside of a Docker network and not destined for a Docker host is permitted. For example, access to the internet. However, any network infrastructure outside of Docker may still deny the traffic.
Ingress traffic, or traffic coming into a Docker network, is denied by default. Ports must be published in order to grant access from outside of Docker.
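The isolation rules above can be sketched as a small toy model. This is only an illustration of the rules as stated in this lesson, not how Docker implements them; the `Container` class, the helper functions, and the container and network names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Container:
    """Hypothetical stand-in for a container attached to Docker networks."""
    name: str
    networks: Set[str] = field(default_factory=set)
    published_ports: Set[int] = field(default_factory=set)


def container_to_container_allowed(src: Container, dst: Container) -> bool:
    """Containers can reach each other on all ports only if they share a network."""
    return bool(src.networks & dst.networks)


def ingress_allowed(dst: Container, port: int) -> bool:
    """Traffic from outside Docker is denied unless the port is published."""
    return port in dst.published_ports


web = Container("web", {"frontend", "backend"}, {80})
db = Container("db", {"backend"})
cache = Container("cache", {"other"})

print(container_to_container_allowed(web, db))    # True: both are on "backend"
print(container_to_container_allowed(db, cache))  # False: no common network
print(ingress_allowed(web, 80))                   # True: port 80 is published
print(ingress_allowed(db, 5432))                  # False: nothing is published
```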
With services distributed across multiple nodes, a service discovery mechanism is required in order to connect to the nodes running tasks for a service. Swarm mode has an integrated service discovery. It is based upon the domain name system (DNS). The DNS is internal to Docker and implemented in the Docker Engine. It is used for resolving names to IP addresses.
Actually, the same service discovery system is used when not running in swarm mode. Service discovery in Docker is scoped to a network. When you are in swarm mode, the network can be an overlay spanning multiple hosts. But the same internal DNS system is used. All nodes in a network store corresponding DNS records for the network. Only service replicas in the network can resolve other services and replicas in the network by name.
Internal load balancing
There are some unique service discovery considerations for Swarm mode. Each individual task is discoverable with a name to IP mapping in the internal DNS. But because services can be replicated across multiple nodes, which IP address should a service name request resolve to? Docker assigns a service a single virtual IP (VIP) address, by default. Requests for the virtual IP address are automatically load balanced across all healthy tasks spread across the overlay network. By using a virtual IP, Docker can manage the load balancing allowing clients to interact with a single IP address without considering load balancing. It also makes the service more resilient since the service can scale and tasks can change the nodes that they are scheduled on but clients are sheltered from the changes.
Internal load balancing example
To illustrate how service discovery and load balancing work in swarm mode, consider two services deployed in a swarm: service A and service B. Service A has a single replica while service B has two replicas. When service A makes a request for service B by name, the virtual IP of service B is resolved by the DNS server. Service A uses the virtual IP to make a request for service B. Using support for IP Virtual Server (IPVS), the request for the virtual IP address is routed to one of the two nodes running service B tasks.
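The service A and service B example can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `SwarmNetwork` class, the IP addresses, and the simple round-robin rotation standing in for IPVS scheduling are all made up for illustration.

```python
from itertools import cycle


class SwarmNetwork:
    """Toy model of name -> virtual IP -> task routing on one overlay network."""

    def __init__(self):
        self.dns = {}   # service name -> virtual IP
        self.ipvs = {}  # virtual IP -> rotating iterator over task IPs

    def add_service(self, name, vip, task_ips):
        self.dns[name] = vip
        # Assumption: plain round robin stands in for the IPVS scheduler.
        self.ipvs[vip] = cycle(task_ips)

    def resolve(self, name):
        """The internal DNS returns the service's single virtual IP."""
        return self.dns[name]

    def forward(self, vip):
        """A request sent to the virtual IP lands on one of the service's tasks."""
        return next(self.ipvs[vip])


net = SwarmNetwork()
net.add_service("service-b", "10.0.0.5", ["10.0.0.6", "10.0.0.7"])

vip = net.resolve("service-b")                  # service A looks up service B by name
targets = [net.forward(vip) for _ in range(4)]  # four requests to the same VIP
print(vip)      # 10.0.0.5
print(targets)  # ['10.0.0.6', '10.0.0.7', '10.0.0.6', '10.0.0.7']
```

Service A only ever sees the one virtual IP, so replicas of service B can come and go without the client changing anything.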
DNS Round Robin
Besides the default virtual IP, you can configure load balancing using DNS round robin (DNS RR). You can configure the load balancing on a per-service basis. When DNS round robin is used, the Docker Engine's DNS server resolves a service name to individual task IP addresses by cycling through the IP addresses of the service's tasks. If you need more control over load balancing than a virtual IP can give you, DNS round robin should be used for integrating your own external load balancer.
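To contrast DNS round robin with the virtual IP approach, here is a minimal sketch of a resolver that hands out task addresses directly and rotates their order on every query. The `RoundRobinDNS` class and the addresses are hypothetical, and the rotation is an assumption about how the cycling might look, not Docker's actual resolver code.

```python
from collections import deque


class RoundRobinDNS:
    """Toy resolver that returns task IPs, rotating the order per query."""

    def __init__(self):
        self._records = {}

    def register(self, name, task_ips):
        self._records[name] = deque(task_ips)

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)  # the client sees real task IPs, not a virtual IP
        ips.rotate(-1)      # the next query starts with the next task
        return answer


dns = RoundRobinDNS()
dns.register("service-b", ["10.0.0.6", "10.0.0.7"])

first = dns.resolve("service-b")
second = dns.resolve("service-b")
print(first)   # ['10.0.0.6', '10.0.0.7']
print(second)  # ['10.0.0.7', '10.0.0.6']
```

Because the client receives the task addresses themselves, an external load balancer can take that list and apply whatever balancing policy it likes.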
We've covered access to services within a Docker network, but what about accessing a service from the outside? With a single Docker host, you would publish a container port on the host to permit access to a container. Similar functionality is still available in swarm. But there are actually two modes for publishing ports in swarm.
The first is the same as you would expect when publishing a port when not running in swarm mode. The container port is published on the host that is running the task for a service. This mode is referred to as host mode service publishing. You need to be careful with specifying a host port in host mode. If you have more tasks than available hosts, tasks will fail to run because the host port can only be bound to one task. You can omit a host port to allow Docker to assign an available port number in the default port range of 30000-32767. However, this can make the service more difficult to work with, because clients have to find out which port was assigned. Also, there isn't load balancing unless you configure it externally. Obviously, that is useful when you don't want load balancing, but what about when you do?
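The host port limitation is easy to see in a toy model. The sketch below is not the swarm scheduler; `schedule_host_mode` and the node names are hypothetical, and it only captures the one rule from the paragraph above: a given host port can be bound by a single task per node.

```python
def schedule_host_mode(nodes, replicas, host_port):
    """Place tasks on nodes where host_port is still free; report the rest."""
    bound = {node: set() for node in nodes}  # node -> host ports already in use
    placed, failed = [], []
    for task_id in range(1, replicas + 1):
        # Pick the first node that has not bound the requested host port yet.
        node = next((n for n in nodes if host_port not in bound[n]), None)
        if node is None:
            failed.append(task_id)  # no node left with a free port
        else:
            bound[node].add(host_port)
            placed.append((task_id, node))
    return placed, failed


placed, failed = schedule_host_mode(["node1", "node2"], replicas=3, host_port=8080)
print(placed)  # [(1, 'node1'), (2, 'node2')]
print(failed)  # [3]
```

With two nodes and three replicas, the third task has nowhere to bind port 8080, which is the situation the lesson warns about.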
Because services can be replicated and tasks can be rescheduled onto different nodes as the state of the swarm changes, it is useful to have the option to load balance a published port across all tasks of a service. This is referred to as ingress mode service publishing. For convenience, all nodes in the swarm publish the port. This is different from host mode where a port is only published if the node is running a task for the service. In ingress mode, requests are round robin load balanced across the healthy instances of the service's tasks regardless of the node that receives the request.
Ingress mode is the default service publishing mode. It's ideal when you have multiple replicas of a service and need to load balance between them. Host mode publishing is useful when you have an external service discovery service and potentially for global services where one task for a service runs on each node. For example, a global service that monitors each node's health shouldn't be load balanced since you want to get the status of a specific node.
At this point, you might be wondering how ingress mode publishing works. The magic happens in what is called the routing mesh. The routing mesh combines two of the swarm components that we discussed earlier: an overlay network, and a service virtual IP.
When you initialize a swarm, the manager creates an overlay network named ingress. Every node that joins the swarm is in the ingress network. The sole purpose of the ingress network is to transport traffic from external clients that is destined to published service ports to the service inside the swarm.
When a node receives an external request on the ingress network, the node resolves the service name to a virtual IP address. This process is carried out using the same internal DNS server that we discussed for internal load balancing. The IP virtual server then load balances the request to a service replica over the ingress network.
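The routing mesh behavior just described can be sketched as follows. This is a simplified model, assuming round-robin balancing as mentioned earlier for ingress mode; the `RoutingMesh` class, node names, and task names are hypothetical.

```python
from itertools import cycle


class RoutingMesh:
    """Toy model: every node accepts a published port and forwards to a task."""

    def __init__(self, nodes):
        self.nodes = set(nodes)  # every swarm node is on the ingress network
        self._published = {}     # published port -> rotating (node, task) pairs

    def publish(self, port, tasks):
        self._published[port] = cycle(tasks)

    def handle(self, receiving_node, port):
        if receiving_node not in self.nodes:
            raise ValueError(f"{receiving_node} is not part of the swarm")
        if port not in self._published:
            raise ConnectionRefusedError(f"port {port} is not published")
        # The receiving node does not need to run a task itself; the request
        # is forwarded over the ingress network to the next task in rotation.
        return next(self._published[port])


mesh = RoutingMesh(["node1", "node2", "node3"])
mesh.publish(8080, [("node1", "web.1"), ("node2", "web.2")])

hops = [mesh.handle("node3", 8080), mesh.handle("node3", 8080), mesh.handle("node1", 8080)]
print(hops)  # [('node1', 'web.1'), ('node2', 'web.2'), ('node1', 'web.1')]
```

Note that node3 runs no task for the service, yet it still accepts requests on port 8080 and hands them to a replica elsewhere.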
Because every node is in the ingress network, every node can handle the external requests. The nodes need to have a couple of ports open for all of this magic to work:
- Port 7946 for both TCP and UDP protocols to enable container network discovery.
- Port 4789 for the UDP protocol to enable the container ingress network.
It's worth mentioning that you could add an external load balancer on top of the load balancing provided by the routing mesh. For example, if you have nodes running in the cloud, you can have the nodes in a private subnet so they aren't directly accessible from the internet. You could provision a cloud load balancer to handle requests from the internet and load balance them across nodes in the swarm. The swarm nodes then load balance again across the nodes running tasks for the service.
As a final note on the routing mesh, if you are planning to use the routing mesh on Windows, you need to be running version 17.09 or greater.
Besides the ingress network, Docker also creates a second network when running in swarm mode called docker_gwbridge. The docker_gwbridge is a virtual bridge that connects the overlay networks (including the ingress network) to an individual Docker daemon's physical network. This interface provides default gateway functionality for all containers attached to the network. Docker creates it automatically when you initialize a swarm or join a Docker host to a swarm, but it is not a Docker device. It exists in the kernel of the Docker host. You can see it if you list the network interfaces on your host.
There were quite a few topics related to networking in swarm mode. Let's recap the main points:
- Swarm mode includes a new type of Docker network, the overlay network. Overlay networks make it easy to use multi-host networking in a swarm.
- The same internal DNS service discovery mechanism used when not running in swarm mode is used in swarm mode. The internal DNS naturally extends to multi-host networks.
- The services in a swarm can be load balanced by using a virtual IP address or by DNS round robin.
- External access to the swarm is made possible by publishing ports. There are two modes for publishing in swarm mode: host and ingress.
  - In host mode, each service replica publishes its container port on the host. No load balancing is used.
  - In ingress mode, every node in the swarm publishes the port and requests are load balanced across all the replicas of a service. Any node can handle requests for the service even if the node doesn't have a replica of the service itself.
- Ingress mode is made possible by the swarm routing mesh, which uses two default swarm networks: the ingress overlay network and the docker_gwbridge network.
In the next lesson, we'll look into swarm mode container orchestration features, including rolling updates and scheduling constraints. When you're ready, continue on to the next lesson to see how swarm can orchestrate containers.
Logan has been involved in software development and research since 2007 and has been in the cloud since 2012. He is an AWS Certified DevOps Engineer - Professional, AWS Certified Solutions Architect - Professional, Microsoft Certified Azure Solutions Architect Expert, MCSE: Cloud Platform and Infrastructure, Google Cloud Certified Associate Cloud Engineer, Certified Kubernetes Security Specialist (CKS), Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), and Certified OpenStack Administrator (COA). He earned his Ph.D. studying design automation and enjoys all things tech.