In mid-April, we hosted the first Cloud Academy “office hours” webinar. This was intended to be an open Q&A session for all things cloud computing, allowing us to address some of your most common questions and topics, from cloud computing in general to Docker, Kubernetes, AWS, and more. In all, we received 37 questions! (Thanks to everyone who participated and submitted questions!) In this post, I’m going to feature some of my favorite questions and elaborate on some of the topics discussed in the live event.
If you’d like to see more “office hours” type webinars and posts, please let us know in the comments. Now, let’s dive into the questions!
What’s the best way to learn how to use Docker and its capabilities?
Learning how to use Docker (and containers in general) is no different from learning any other technology. My preferred approach is to mix educational material, like courses and blog post tutorials, with hands-on experience building your own prototypes. This helps you understand the concepts and apply them to a problem that you’re familiar with. I recommend that you pick your own application and get going. Mix that with my course on Docker fundamentals, then learn to deploy with Kubernetes. You can also go through the webinar back catalog for more great learning material.
What skills are required to learn Docker/Kubernetes? Are coding skills required?
Coding skills are not required to learn Docker or Kubernetes because, technically, you don’t need to code to use either of them. However, the technology and its benefits will be lost on you if you have no prior experience building applications. Here’s an example: how will you know what to put in a Dockerfile if you’ve never prepared a development environment or compiled some software? The short answer is you won’t. I suggest that you put containers and container orchestration further down your list until you’ve built and deployed a few applications. If you have a web application or two under your belt, then dive right in.
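To make the point concrete, here is a minimal sketch of a Dockerfile for a hypothetical Python web app. Every instruction mirrors a step you would otherwise perform by hand when preparing a development environment, which is exactly why that prior experience matters:

```dockerfile
# Hypothetical Dockerfile for a simple Python web app.
FROM python:3.11-slim           # base image: OS plus language runtime
WORKDIR /app                    # working directory inside the container
COPY requirements.txt .         # copy the dependency list first (layer caching)
RUN pip install --no-cache-dir -r requirements.txt
COPY . .                        # copy the application source code
EXPOSE 8000                     # document the port the app listens on
CMD ["python", "app.py"]        # the process to run when the container starts
```

If you have never installed dependencies or started a dev server manually, none of these lines will mean much to you, which is the point of the answer above.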
We are still at the virtualization state – what modifications do you suggest before we involve any container technology?
This depends on how stateful your virtualization-based solution is. The solutions that are currently available for deploying containerized solutions are best suited for stateless, horizontally scalable application containers. They do support stateful applications but this is a more difficult road to go down. Here is my recommendation: First, make sure that your current infrastructure is horizontally scalable. Next, you can move from running your processes on “bare metal” inside the virtual machine (VM) to containers inside the VM. This gives you a place to get acquainted with the technology before replacing the underlying infrastructure. Finally, you can replace your infrastructure and deployment process with a container orchestration solution.
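The middle step above can be sketched with a single `docker run` on an existing VM (this is a transcript-style example that assumes a running Docker daemon; the image name and port are hypothetical):

```shell
# Step 2: run the same process you previously ran on the VM's "bare metal"
# as a container on that same VM. Nothing else about the VM changes.
docker run -d \
  --name billing-api \
  --restart unless-stopped \
  -p 8080:8080 \
  registry.example.com/billing-api:1.4.2
```

The VM, its provisioning, and its monitoring stay exactly as they were; only the packaging of the process changed. That isolation is what makes this a low-risk place to get acquainted with containers before replacing the infrastructure itself.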
Which container orchestration tool (e.g. DCOS or Kubernetes) is best suited for serverless microservices?
First, I need to unpack the question a bit. Container orchestration software does not generally apply to serverless solutions. Serverless and container solutions focus on different artifacts. Consider AWS Lambda or Azure Functions: you publish individual functions/methods and the cloud provider runs them for you. Containers, by contrast, focus on process isolation and dependency management. There are some projects that run something like “functions as a service” on top of container orchestration platforms like Kubernetes. This is possible because an individual function may be executed in a container. It is still very early days for these projects, and my guess is that supporting this type of application is not core to the container orchestration projects right now.
Now, let’s talk about microservices applications deployed as containers. Remember that microservices do not mandate containers, but it is safe to assume this association in the current climate. Personally, I don’t think that one orchestration tool is technically better than another for microservices. The best one comes down to your own personal preference and unique technical requirements.
What is the best way to monitor your containers and what is the difference between monitoring containers vs. virtual machines?
I’m excited that this question came up because it’s so common and an important part of any production application. Here, I’ll elaborate on the answer given in the session. The best way to monitor containers is to apply the existing technologies and methodologies for monitoring cloud-based systems. Odds are that you have experience with something like Nagios, collectd, Ganglia, or another system that collects data via an agent. In my experience, this is the most common and easiest way.
Conversely, you can use a system like Prometheus to pull metrics from target systems. At this point, you have data. Now, ship this off to your ingress points such as Graphite, or a SaaS like Librato. Next, work with the data just like any other data. The point is that the technical approach does not change much, just the scope of your data.
Let’s assume for a moment that you have a VM running somewhere. You can collect data about that VM from the hypervisor itself or from an agent running in the VM. Now, assume that there is a container running in that VM. A container is just another running process on your system.
How would you monitor something like Apache? You may watch CPU usage, memory, or IO metrics. These metrics come directly from the Docker daemon. An agent may connect to the Docker daemon and ship this data off. The difference compared to VMs arises in how dynamic your overall infrastructure is. If a new VM comes up, it must be monitored. The same goes for containers. If a new container comes up then it should be monitored as well.
It’s easy to automatically collect CPU/memory/disk metrics. You will need to put more effort into deciding how to monitor the processes themselves. Consider a web server running in a container. You may want to continually check that the server handles requests. Does this go into your monitoring tool, or do you leverage the health checks built into Docker? How do new containers automatically pick up the specific types of monitoring they need? These are the types of questions you need to ask, depending on how dynamic your solution is. The answers will also show you where monitoring containers differs from monitoring virtual machines.
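The built-in Docker health check mentioned above is declared in the Dockerfile. Here is a minimal sketch (the image and endpoint are hypothetical, and the base image must have `curl` available):

```dockerfile
FROM nginx:1.25
# Every 30 seconds, fail the check if the server doesn't answer within 3s;
# after 3 consecutive failures the container is marked "unhealthy".
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost/ || exit 1
```

You can then read the health state with `docker inspect --format '{{.State.Health.Status}}' <container>`, or let an orchestrator act on it. Whether this replaces or merely supplements your external monitoring tool is exactly the kind of decision discussed above.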
Among the currently available orchestration tools, is there a clear market leader?
I don’t have any hard numbers to declare the market leader in terms of the number of deployed applications. However, Kubernetes is definitely the market leader by community size.
Networking in Docker: How do you persist data in Docker, and how do containers talk to each other?
Let’s unpack this into two questions: 1) How do you persist data with Docker containers? 2) How do Docker containers talk to each other? I’ll start with #1 because it’s shorter and easier to answer. You should use Docker volumes for persistent data. Volumes are independent of containers and may be reused across different containers.
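Here is a short transcript-style sketch of volumes in action (it assumes a running Docker daemon; the volume and path names are made up for illustration):

```shell
# Create a named volume; it exists independently of any container.
docker volume create app-data

# Mount it into a container. Data written under /var/lib/data
# survives the container being removed.
docker run --rm -v app-data:/var/lib/data alpine \
  sh -c 'echo hello > /var/lib/data/greeting'

# A second, completely separate container sees the same data.
docker run --rm -v app-data:/var/lib/data alpine \
  cat /var/lib/data/greeting    # should print "hello"
```

This is the property that matters for stateful workloads: the container is disposable, the volume is not.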
Question #2 is more complicated. I’ll do my best to cover the high-level points with a bit of hand-wavy explanations. (The Docker networking guide can fill in everything behind the hand-wavy magic.) Docker networking, broadly speaking, operates at two different levels: host and non-host networks. Containers on the host network operate just like any other process running on the Docker host. You can access their exposed ports on the host IP or hostname.
Non-host networks are more powerful and more complicated. Modern Docker versions use software-defined networking (SDN) to accommodate many different use cases. This boils down to creating a network for all of the containers in an application. Give each container a name; then each container on the network can resolve the others by hostname (via DNS). Docker sets up things like IPs and hosts file entries.
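The name-resolution behavior described above looks like this in practice (another daemon-dependent transcript; `my-web-app` is a hypothetical image):

```shell
# Create a user-defined bridge network. Containers attached to it can
# resolve each other by container name via Docker's embedded DNS.
docker network create app-net

docker run -d --name db  --network app-net postgres:16
docker run -d --name web --network app-net my-web-app

# Inside "web", the database is reachable simply as "db".
docker exec web ping -c 1 db
```

No IP addresses appear anywhere in the application configuration, which is what makes containers on the same network feel like hosts on a small private LAN.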
This topic gets much more complicated when you talk about cross-host networking like multiple Docker hosts inside a Docker Swarm cluster. I won’t touch on that because it’s not in my expertise. I would recommend that you check the networking guides for each orchestration platform if you want to learn more about these solutions.
Are containers a threat to virtual machines?
This post gives me the opportunity to refine my answer from the webinar. I see containers replacing VMs for deploying applications, but they are not at all a threat to virtual machines as a technology. That isn’t actually possible, since the two solve completely different problems.
Here’s an example. Our previous deployment infrastructure at Saltside used golden AMI images. Each commit built an AMI, pinned to that commit’s SHA, with all run-time dependencies. The AMI also included all of the monitoring agents and various other things that we needed. Then, we put that AMI into an auto-scaling group behind a load balancer. This was simple and worked well for years. However, it’s not the most resource-efficient approach: a single EC2 instance could run multiple processes instead of just one. Eventually, we moved to containers, which allowed us to pack more processes onto a single machine. This did not replace virtual machines completely. It just changed what we did with them.
Containers are not intended to run multiple processes; virtual machines are. VMs are not going anywhere. We’ll always need VMs to build and scale infrastructure, regardless of whether they run standard processes, “serverless” functions, or containers.
How do you see Docker or Kubernetes or any containers in the next two years or so?
Cool, another personal opinion question! I predict a world that focuses on orchestration rather than runtimes. We are in the middle of that transition. The industry focuses on the runtime when discussing the development phase because this is what developers are interacting with. Now, we’ve moved out of this phase because teams are more interested in deploying the containerized application that they’ve built.
This is where orchestration tools come into the picture. Kubernetes (my preferred tool) is actively working to define interfaces for the different components in their internal stack. This makes things like networking and runtime interchangeable. Hopefully, in the next few years, we will be more focused on how we deploy containerized applications rather than how they’re run behind the scenes.
We’ve seen this before with VMs. These days, we don’t care about the VM technology itself—”just give me a VM”. My guess is that, in the next couple of years, Docker (or Moby or whatever they’ve been rebranded as) will be decreasingly important. I also predict that the communities will develop a “functions as a service” solution built on top of container orchestration.
What’s your preferred stack?
These days I work with microservice-style applications rather than monoliths. My preferred approach is to keep each application in its own code repo and keep a mono-repo for “releases” of all applications. A “release” is a change in the configuration or code of any of the applications. This repo is packaged up as a Helm chart and deployed to Kubernetes. Application-level concerns, such as which language or web framework to use, are irrelevant to the underlying infrastructure. Application developers can use whatever technology they like, and everything is deployed in the same way (containers via Kubernetes). Things like which monitoring system to use are context specific.
That’s a wrap on this post. I hope it clarifies some of my answers in the session. Stay tuned to the Cloud Academy blog and webinars for more helpful content on Docker, Kubernetes, AWS, and all things cloud.