Kubernetes: Ecosystem & Production Operations

If you have been following our introductory webinar series on Kubernetes, we recently wrapped up part two, Ecosystem & Production Operations, which you can watch here.
In part two, we covered production preparedness, application packaging and cluster ops, and the wider Kubernetes ecosystem and tools. In this recap, I’ll also expand on some of the areas covered in the live event.

Let’s start by digging into the Kubernetes features that are important in a production environment.

Kubernetes Ecosystem: production resources

  • DaemonSet: A DaemonSet ensures that a copy of a pod is automatically scheduled to every node in the cluster. This is especially useful for running per-node monitoring or logging agents, such as a log collector like Fluentd. This is my preferred way to collect telemetry data.
  • StatefulSet: Prior to Kubernetes 1.5, this was referred to as PetSet. A StatefulSet manages pods with ordering and other stateful guarantees, such as stable identities. You can use a StatefulSet to run a database setup. Consider MongoDB: you can define a StatefulSet to bring up an arbiter, primary, and secondary in that order. However, I would recommend that you focus on stateless workloads for now. This area is actively developing and will naturally mature over time.
  • Ingress: As of Kubernetes 1.5, this is a beta API. You can think of an Ingress as a sort of proxy for pods. You may use it as a firewall, to handle virtual hosting, or even as a quasi API gateway. I recommend reading the Ingress guide. Ingress resources will be big going forward and will certainly change the way you deploy applications with Kubernetes!
  • Job & CronJob: What would an application be without some batch processing and recurring reports? A Job creates one or more pods and tracks their execution until they complete. Jobs can also be run on a schedule with the CronJob resource.
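
To make the DaemonSet idea concrete, here is a minimal sketch that builds a DaemonSet definition as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The agent name and image are hypothetical placeholders, and the apps/v1 API version assumes a reasonably recent cluster; older releases, including the 1.5 era discussed here, served DaemonSets from a beta API group instead.

```python
import json


def daemonset_manifest(name: str, image: str) -> dict:
    """Build a DaemonSet manifest that runs one agent pod on every node."""
    labels = {"app": name}
    return {
        # Assumption: apps/v1 is available on the target cluster.
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical agent name and image, for illustration only.
    manifest = daemonset_manifest("log-collector", "example.com/log-collector:1.0")
    print(json.dumps(manifest, indent=2))
```

Saving the output to a file and applying it with kubectl is one way to try this on a test cluster. A real logging agent would typically also need volume mounts for the node’s log directories, which are left out here to keep the sketch short.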

Odds are, you’ll need one or two of these to build a full-featured application. Now, let’s take a look at production practices to keep everything in tip-top shape.

Production Practices

  • Set resource requests and limits: Setting these ensures that pods either get the required compute resources or fail to schedule. Establishing the limit ensures that containers do not consume unexpected compute resources.
  • Separate critical and non-critical workloads: This practice increases resource utilization. Non-critical containers set a limit and may be thrown anywhere in the cluster. Critical loads can be guaranteed their minimum compute resources and may be scheduled appropriately.
  • Node selectors and node name: This point is a continuation of the previous two. These two settings let you match pods to node characteristics. For example, one entire CPU on a c4.xlarge instance is not the same as one on a t2.small. You can use these settings to place containers on an appropriate node. This is especially useful for heterogeneous clusters or when workloads are divided into critical/non-critical.
  • Set liveness and readiness probes: This is mandatory for production environments! Liveness probes can automatically restart broken containers. Readiness probes test that a container is ready to receive traffic. With probes in place, container health is judged by checks that you define rather than by generic container semantics, such as whether the process is still running.
  • Add telemetry: Remember, it’s not in production until it is being monitored, and you cannot have monitoring without telemetry data. Given the large variety of metric collection tools, it doesn’t matter which one you use; just pick one and go with it. It’s easy to deploy a DaemonSet for your chosen agent. Be sure to collect cluster CPU/memory percentage and the same for pods. Decide how much headroom you need and create an action plan for when that threshold is breached.
  • Prep for Cluster Admin: Read and understand the administration guide. It covers Kubernetes version upgrades, node maintenance, and managing API versions. I can assure you that you’ll need the first two at some point during production operations. It’s best to prepare before something happens so that you have a general idea of how it may impact your system.
  • Plan for Availability: You may consider a multi-master setup or even federated clusters. You will want to make sure that you have a solid understanding of your availability requirements, the potential risks, the failure modes that you can expect, and a plan for resolving issues.
  • Keep resource definitions (YAML/JSON) under source control: These are important files, and changes must be tracked. Ideally, they are kept in the same repo as the source code. Remember, these are just YAML/JSON files. They should be linted and verified during CI. You don’t want mistakes in these breaking your deployments!
  • Secure your cluster: The Kubernetes official documentation provides in-depth coverage of this topic. There are multiple ways to implement authorization. Choose one and set it up before going to production!
  • Back up etcd data: Kubernetes stores all data in etcd. Trust me; you don’t want to lose this.
  • Configure centralized logging: You can install Fluentd or something similar to collect logs from all containers and ship them off to another system. You can even run that system (ELK, for example) on Kubernetes! Just make sure you have a strategy in place to collect logs from all nodes and containers in a central place. You will need this!
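
Several of these practices can be checked mechanically during CI. The sketch below is one possible check, not an official tool: it takes a container definition (as it would appear after loading a YAML/JSON resource file) and reports missing resource requests, limits, and probes. The function name, the example container, its image, ports, and probe paths are all hypothetical; the field names (resources, requests, limits, livenessProbe, readinessProbe) are the ones used in the Kubernetes container spec.

```python
from typing import Dict, List


def check_container(container: Dict) -> List[str]:
    """Return a list of problems found in a single container definition."""
    problems = []
    resources = container.get("resources", {})
    # Both requests and limits should cover CPU and memory.
    for section in ("requests", "limits"):
        for resource in ("cpu", "memory"):
            if resource not in resources.get(section, {}):
                problems.append(f"missing resources.{section}.{resource}")
    # Both probe types should be defined for production workloads.
    for probe in ("livenessProbe", "readinessProbe"):
        if probe not in container:
            problems.append(f"missing {probe}")
    return problems


# Hypothetical container that follows the practices above.
GOOD_CONTAINER = {
    "name": "web",
    "image": "example.com/web:1.0",
    "resources": {
        "requests": {"cpu": "250m", "memory": "128Mi"},
        "limits": {"cpu": "500m", "memory": "256Mi"},
    },
    "livenessProbe": {
        "httpGet": {"path": "/healthz", "port": 8080},
        "initialDelaySeconds": 10,
    },
    "readinessProbe": {
        "httpGet": {"path": "/ready", "port": 8080},
    },
}

if __name__ == "__main__":
    for problem in check_container({"name": "bare", "image": "example.com/bare:1.0"}):
        print(problem)
```

A CI job could run a check like this over every container in every resource file and fail the build when the list is non-empty, so that a missing probe or limit is caught before it reaches the cluster.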

Kubernetes Ecosystem

There is a vibrant ecosystem around Kubernetes. Here, we’ll focus on tools for cluster infrastructure management and application packaging.

Cluster operations, or cluster ops, generally refers to the work required to provision, maintain, and scale Kubernetes clusters. I’ll be honest, this is one of my favorite technical areas, and I think you’ll like it too. Let’s start at the beginning. Clusters don’t just spring into existence. They must be created.

Kubernetes is a distributed system in itself. It’s non-trivial to build from scratch. Kelsey Hightower’s tutorial “Kubernetes the Hard Way” covers everything you need to build and run K8S from scratch. This is a fabulous resource if you want to get really down and dirty and learn it all. Most of us, myself included, consider this a reference manual rather than a tutorial. Check it out. It’s detailed and long. For day-to-day use, building clusters is better served with automation.

Kops (short for “Kubernetes Operations”) is as official as you can get for open-source Kubernetes tools. It is “the way” to bootstrap clusters on AWS. This is essentially kubectl for clusters. If you want to bootstrap and manage a new cluster, this is the place to start. kube-up is a popular script for bootstrapping a new cluster that you will see referenced in documentation and old posts. However, it has been deprecated over time and you’ll want to stick with kops going forward.

If you don’t want to run Kubernetes yourself, there are a variety of hosted solutions available. Google Cloud Platform provides Google Container Engine. It’s the easiest and most straightforward way to get Kubernetes in production. I recommend this option if you’re using GCP. I also recommend switching cloud providers just to use Google Container Engine. Tectonic is a Kubernetes solution from CoreOS. It wraps the official Kubernetes releases in a tight package covering setup, scaling, and general administration. It also includes the kube-aws CLI for managing Kubernetes clusters on AWS.

Kismatic from Apprenda is a useful suite of tools for provisioning, maintaining, and testing Kubernetes clusters. It includes kuberang, a tool that may be used to smoke test a Kubernetes cluster (especially useful if you’re following “Kubernetes the Hard Way”!). Again, this is a small sample. You’ll find plenty more by doing a Google search or following the ecosystem. KubeWeekly can send you the latest news, projects, tutorials, and other good stuff in their weekly newsletter.

There are also many projects that target end users (and not system administrators). Kubernetes’ package manager Helm is the most useful and important of these projects. You write packages called “charts,” and then you can use the helm CLI to install/upgrade charts as “releases” on your cluster. Charts contain all of the resources required to run a particular application, including services and deployments.

Here are a couple of examples. You can use the MySQL chart to run mysql for application A, and deploy it again with a different release name for application B. Or, you can create a chart for an entire microservice application and deploy it that way. The official charts repository is one of the most active repositories in the Kubernetes organization. It’s also a fantastic resource for tips and tricks, such as learning how other users write various Kubernetes resources. The Helm docs list a bunch of related tools and tutorials to help you get started. Odds are, you’ll find a few blog posts about Helm in each KubeWeekly issue. Keep an eye on Helm, and try it out for yourself.
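
The “same chart, different release name” idea can be sketched as follows. This only builds the command lines and does not run them. It assumes the Helm 3 argument order, where the release name comes before the chart (older Helm versions passed the release name differently), and the chart reference and release names are hypothetical placeholders.

```python
from typing import List


def helm_install_command(release: str, chart: str) -> List[str]:
    """Return the argv for installing `chart` as a named release (Helm 3 order)."""
    return ["helm", "install", release, chart]


# Hypothetical chart reference; substitute whichever repository hosts your chart.
CHART = "example-repo/mysql"

# One chart, two independent releases: one database per application.
commands = [
    helm_install_command("app-a-db", CHART),
    helm_install_command("app-b-db", CHART),
]

if __name__ == "__main__":
    for argv in commands:
        print(" ".join(argv))
```

Each release is tracked separately by Helm, so the database for application A can be upgraded or removed without touching the one for application B.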

How to get involved

The ecosystem is a product of the wonderful Kubernetes community. Here are some other ways that you can get involved:

Stay connected. The community and many of those who maintain it are active on the Kubernetes Slack channel. This is the best place to be if you are even remotely interested in Kubernetes and the ecosystem.
Get involved. As with any rapidly changing technology, staying involved and talking with others is the best way to succeed. Kubernetes is no exception. Look into local meet-ups and conferences. You may also join the various planning meetings and weekly calls with Kubernetes developers. These are great forums to voice your opinion on project direction, collaborate on issues, and learn what other people are up to. You will learn something and you’ll help others in the process.

Kubernetes guides. The official Kubernetes guides are a fantastic supplement to the webinars in our two-part series. They provide more information on use cases and functionality and a great abstract overview of Kubernetes. I also suggest you watch the videos from previous KubeCon events. These will help you get a handle on how people are using Kubernetes in the field, and you’ll learn about the cool stuff that the larger Kubernetes ecosystem is working on.

Coming soon: Helm and Kops

I hope you have enjoyed this webinar series. If you want a refresher on Kubernetes and its main features, check out part one, “Hands on Kubernetes,” by viewing the webinar and the recap.

We’ve received lots of positive feedback from these events, so I am planning a second two-part series specifically on Helm and Kops. Stay tuned to the Cloud Academy webinars page for scheduling information. Until then, you can find me @ahawkins on the Kubernetes Slack. Good luck out there, and happy shipping!

Cloud Academy