Introduction to Google Kubernetes Engine (GKE)

Storage

Difficulty: Intermediate
Duration: 55m

Description

Kubernetes has become one of the most common container orchestration platforms. It has regular releases, a wide range of features, and is highly extensible. Managing a Kubernetes cluster requires a lot of domain knowledge, which is why services such as GKE exist. Certain aspects of a Kubernetes cluster vary based on the underlying implementation.

In this course, we’ll explore some of the ways that GKE implements a Kubernetes cluster. Having a basic understanding of how things are implemented will set the stage for further learning.

Learning Objectives

  • Learn how Google implements a Kubernetes cluster
  • Learn how GKE implements networking
  • Learn how GKE implements logging and monitoring
  • Learn how to scale both nodes and pods

Intended Audience

  • Engineers looking to understand basic GKE functionality

Prerequisites

To get the most out of this course, you should have a general knowledge of GCP, Kubernetes, Docker, and high availability.

Transcript

Hello and welcome. In this lesson, we'll be talking about two storage abstractions provided by Kubernetes. By the end of this lesson, you'll be able to list the types of Kubernetes storage abstractions, describe some of the different volume types, and describe how GKE implements persistent storage for persistent volumes.

Alright, there's no shortage of storage options these days. Every cloud vendor has dozens of storage services all targeting different storage needs: blob storage, relational databases, document databases, Git repositories, Docker container registries, etc.

While important, these aren't the types of storage we're going to cover. The storage we're going to cover is a bit lower-level: filesystem and block storage for pods. We're covering these two and not the other types because these options can vary based on the underlying implementation, which means it's valuable to understand how GKE implements this functionality.

Kubernetes includes different storage abstractions, two of which are called Volumes and PersistentVolumes.

Volumes are an abstraction for different types of ephemeral storage, meaning that the storage only exists while the pod exists on a node. When the pod is removed from a node, the volumes that were attached are removed permanently. Volumes allow multiple containers running inside of a pod to access the same data.

Let's cover three types of volume which are empty directories, configuration maps, and secrets.

The empty directory type, emptyDir, provides all containers inside of a pod with access to an empty directory. There are different types of empty directories, such as an in-memory filesystem. By default, however, an empty directory uses some of the spare disk space from the node itself. That means our choice of boot disk for our nodes is a point of consideration. GKE supports nodes with hard disks, solid-state drives, and, as of GKE 1.10, local SSDs.
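To make the empty directory idea concrete, here is a minimal sketch in Python that builds a pod manifest as a plain dictionary. The pod name, container names, the `busybox` image, and the `/scratch` mount path are all illustrative assumptions, not values from the course.

```python
import json


def empty_dir_pod(name, in_memory=False):
    """Build a Pod manifest whose two containers share one emptyDir volume."""
    # An empty spec uses spare disk space on the node; medium "Memory"
    # asks for an in-memory filesystem instead.
    empty_dir = {"medium": "Memory"} if in_memory else {}
    mount = {"name": "scratch", "mountPath": "/scratch"}  # hypothetical path
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "volumes": [{"name": "scratch", "emptyDir": empty_dir}],
            "containers": [
                {
                    "name": "writer",  # hypothetical container
                    "image": "busybox",
                    "command": ["sh", "-c", "date > /scratch/started; sleep 3600"],
                    "volumeMounts": [mount],
                },
                {
                    "name": "reader",  # hypothetical container
                    "image": "busybox",
                    "command": ["sh", "-c", "sleep 3600"],
                    "volumeMounts": [mount],
                },
            ],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, so the output can be saved and applied.
    print(json.dumps(empty_dir_pod("scratch-demo"), indent=2))
```

Both containers mount the same volume, so a file written by one is visible to the other for as long as the pod stays on its node.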

Kubernetes has a mechanism called ConfigMap and it's used to store basic application configuration info that we can allow pods to access. The config data can be used in different ways in the pod spec and one of the ways that we can access it is through volumes. Now, there's not really any specific GKE integration with config maps but I did want to mention it because it is a useful way to access configuration files from inside a pod.
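As a rough illustration of accessing config data through a volume, the sketch below builds a ConfigMap and a pod that mounts it. The names `app-config` and `/etc/app`, the `nginx` image, and the helper functions are assumptions made up for this example.

```python
import json


def config_map(name, files):
    """Build a ConfigMap manifest; each key in `files` becomes a file name."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(files),
    }


def pod_with_config(pod_name, config_map_name, mount_path):
    """Build a Pod manifest that mounts the ConfigMap as a read-only volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "volumes": [
                {"name": "config", "configMap": {"name": config_map_name}}
            ],
            "containers": [
                {
                    "name": "app",  # hypothetical container
                    "image": "nginx",
                    "volumeMounts": [
                        {"name": "config", "mountPath": mount_path, "readOnly": True}
                    ],
                }
            ],
        },
    }


if __name__ == "__main__":
    cm = config_map("app-config", {"app.properties": "log.level=info\n"})
    pod = pod_with_config("config-demo", "app-config", "/etc/app")
    print(json.dumps([cm, pod], indent=2))
```

With these manifests applied, the container would see the key `app.properties` as a file under the mount path.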

Similar to the ConfigMap, Kubernetes has an object called a Secret. It works like a ConfigMap except it's used for sensitive data. By default, GKE handles the encryption at rest. However, some workloads are more sensitive and require us to manage the encryption at the application layer.

The way in which GKE implements this is by allowing us to enable a cluster-level setting called application-layer secrets encryption, which allows secrets to be encrypted using keys managed in Cloud KMS.
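It helps to see why encryption matters for secrets. In a Secret manifest, the values under `data` are base64-encoded, which is an encoding and not encryption. The sketch below, with a made-up secret name and value, shows that anyone who can read the stored object can decode it.

```python
import base64
import json


def secret_manifest(name, values):
    """Build an Opaque Secret manifest with base64-encoded values."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


def decode_value(manifest, key):
    """Recover the original text, showing that base64 hides nothing."""
    return base64.b64decode(manifest["data"][key]).decode("utf-8")


if __name__ == "__main__":
    # "db-credentials" and "hunter2" are placeholder values for illustration.
    manifest = secret_manifest("db-credentials", {"password": "hunter2"})
    print(json.dumps(manifest, indent=2))
    print(decode_value(manifest, "password"))
```

This is the gap that encryption at rest, and for more sensitive workloads application-layer encryption, is meant to close.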

So volumes are ephemeral pod storage allowing for shared data access to all the containers in a pod.

In contrast with volumes, GKE PersistentVolumes are backed by Compute Engine persistent disks. Kubernetes manages the life cycle of these volumes and they exist independently of any of the pods.

Let's talk about three objects that tie together to allow us to configure PersistentVolumes for GKE. The three objects are PersistentVolumes, StorageClasses, and PersistentVolumeClaims.

Here's a high-level overview. PersistentVolumes are Kubernetes objects that, on GKE, represent specific disks. Since there are different types of disks, the StorageClass defines the type of disk that we want to use, and PersistentVolumeClaims are used by pods to claim access to a disk.

Kubernetes allows us to specify the storage class for our persistent volumes and it allows us to set a default storage class to be used. And every GKE cluster has a default storage class that is set to use the standard hard disk.
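As a sketch of what a custom storage class might look like, the code below builds a StorageClass manifest for either disk type. It assumes the in-tree `kubernetes.io/gce-pd` provisioner; newer clusters may use the Compute Engine persistent disk CSI driver instead, so treat the provisioner value as an assumption to check against your cluster. The class name `fast` is hypothetical.

```python
import json

# Disk types discussed in this lesson: standard hard disk and SSD.
DISK_TYPES = ("pd-standard", "pd-ssd")


def storage_class(name, disk_type="pd-standard", default=False):
    """Build a StorageClass manifest backed by Compute Engine persistent disks."""
    if disk_type not in DISK_TYPES:
        raise ValueError(f"unsupported disk type: {disk_type}")
    manifest = {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Assumption: in-tree provisioner; verify what your cluster uses.
        "provisioner": "kubernetes.io/gce-pd",
        "parameters": {"type": disk_type},
    }
    if default:
        # This annotation marks a class as the cluster default.
        manifest["metadata"]["annotations"] = {
            "storageclass.kubernetes.io/is-default-class": "true"
        }
    return manifest


if __name__ == "__main__":
    print(json.dumps(storage_class("fast", "pd-ssd"), indent=2))
```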

PersistentVolumes are a Kubernetes abstraction that can change based on the underlying implementation. The abstraction itself defines three different access modes which are ReadWriteOnce, ReadOnlyMany, and ReadWriteMany.

Since Compute Engine persistent disks don't support multiple writers, if we need ReadWriteMany, then we need to find another storage option.
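The access mode rule can be written down as a small lookup. This is only a sketch of the rule as described in this lesson; the function name is made up for illustration.

```python
# The three access modes defined by the PersistentVolume abstraction.
ACCESS_MODES = {"ReadWriteOnce", "ReadOnlyMany", "ReadWriteMany"}

# Modes available with Compute Engine persistent disks, per this lesson:
# a single writer, or many readers, but not many writers.
PERSISTENT_DISK_MODES = {"ReadWriteOnce", "ReadOnlyMany"}


def persistent_disk_supports(mode):
    """Return True if a Compute Engine persistent disk can serve this mode."""
    if mode not in ACCESS_MODES:
        raise ValueError(f"unknown access mode: {mode}")
    return mode in PERSISTENT_DISK_MODES


if __name__ == "__main__":
    for mode in sorted(ACCESS_MODES):
        print(mode, persistent_disk_supports(mode))
```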

Pods access persistent volumes through a PersistentVolumeClaim. So we have a PersistentVolume, which is a Compute Engine disk whose type is determined by the storage class, either the default or a custom one. To actually use that persistent volume, a pod needs to make a claim to it.

A persistent volume claim requests a size, access mode, and storage class, and Kubernetes will either provide an existing unclaimed disk that meets those requirements or it will create a new disk dynamically. Now, by default, dynamically created disks are not retained after the claim is removed which means the word persistent is contextual.
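The three things a claim requests map directly onto fields in the manifest. Here is a minimal sketch; the claim name `data` and the 10 GiB size are placeholder values.

```python
import json


def persistent_volume_claim(name, size_gi, access_mode="ReadWriteOnce",
                            storage_class=None):
    """Build a PersistentVolumeClaim requesting a size, mode, and class."""
    spec = {
        "accessModes": [access_mode],
        "resources": {"requests": {"storage": f"{size_gi}Gi"}},
    }
    # Leaving storageClassName out lets the cluster's default class apply.
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    print(json.dumps(persistent_volume_claim("data", 10), indent=2))
```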

Both PersistentVolumes and StorageClasses include a property to specify the reclaim policy. Setting this value to Retain allows the disk to remain even if there are no longer claims to it. Regardless of how you create PersistentVolumes, whether directly or dynamically, pay attention to the reclaim policy, because disks that are retained are still billable resources even if they're not in use.
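One way to keep an eye on this is to look for volumes that are retained but no longer claimed. The sketch below works on plain dictionaries shaped like PersistentVolume objects; the sample volumes are invented, and in practice the list would come from the cluster, for example from the JSON output of a listing command.

```python
def retained_unused_volumes(volumes):
    """Return names of volumes kept by a Retain policy after their claim is gone."""
    names = []
    for volume in volumes:
        policy = volume["spec"].get("persistentVolumeReclaimPolicy")
        phase = volume.get("status", {}).get("phase")
        # "Released" means the claim was deleted but the volume still exists.
        if policy == "Retain" and phase == "Released":
            names.append(volume["metadata"]["name"])
    return names


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = [
        {"metadata": {"name": "pv-a"},
         "spec": {"persistentVolumeReclaimPolicy": "Retain"},
         "status": {"phase": "Released"}},
        {"metadata": {"name": "pv-b"},
         "spec": {"persistentVolumeReclaimPolicy": "Delete"},
         "status": {"phase": "Bound"}},
    ]
    print(retained_unused_volumes(sample))
```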

PersistentVolumes have a wide range of use cases, though the way they interact with pods will shape their usage.

Recall that deployments are intended for stateless applications and create multiple pod replicas. In a deployment, each replica includes the same persistent volume claim, so all pods in the deployment will use the same disk. This implies that the optimal access mode for deployments is ReadOnlyMany.
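To picture replicas sharing one disk, here is a sketch of a deployment whose pod template mounts a single claim read-only. The claim name `site-content`, the `nginx` image, and the mount path are assumptions for the example, and the claim itself would need to request the ReadOnlyMany access mode.

```python
import json


def read_only_deployment(name, claim_name, replicas):
    """Build a Deployment whose replicas all mount the same claim read-only."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Every replica gets this same volume definition.
                    "volumes": [{
                        "name": "content",
                        "persistentVolumeClaim": {
                            "claimName": claim_name,
                            "readOnly": True,
                        },
                    }],
                    "containers": [{
                        "name": "web",  # hypothetical container
                        "image": "nginx",
                        "volumeMounts": [{
                            "name": "content",
                            "mountPath": "/usr/share/nginx/html",
                            "readOnly": True,
                        }],
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(read_only_deployment("web", "site-content", 3), indent=2))
```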

The optimal way to use persistent disks in ReadWriteOnce access mode would be to use a StatefulSet. Pods and replicas inside of a stateful set are given unique identifiers and they're updated in a predictable order which ensures that a replica of a given ID has access to the matching persistent volumes.
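The matching between replicas and volumes comes from a naming convention: a stateful set creates one claim per replica, named from the claim template, the set name, and the replica's ordinal. A small sketch, using made-up names:

```python
def stateful_set_pod_names(set_name, replicas):
    """Pods in a StatefulSet are named <set-name>-<ordinal>, starting at 0."""
    return [f"{set_name}-{ordinal}" for ordinal in range(replicas)]


def stateful_set_claim_names(set_name, template_name, replicas):
    """Each replica's claim is named <template-name>-<set-name>-<ordinal>."""
    return [f"{template_name}-{pod}"
            for pod in stateful_set_pod_names(set_name, replicas)]


if __name__ == "__main__":
    # "db" and "data" are hypothetical set and claim template names.
    for pod, claim in zip(stateful_set_pod_names("db", 3),
                          stateful_set_claim_names("db", "data", 3)):
        print(pod, "uses", claim)
```

Because the ordinal is stable, a replacement for the pod `db-1` picks up the claim `data-db-1` again, and with it the same disk.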

So the takeaway here is to make sure you consider the type of workload you're using when thinking about Persistent Volumes.

Let's wrap up here and summarize what we've covered. Kubernetes has two storage abstractions called Volumes and PersistentVolumes. Volumes are ephemeral storage that only exists while a pod exists on a node. The empty directory is useful for sharing a directory with all of the containers inside of a pod. ConfigMap volumes provide access to configuration data and secrets provide access to sensitive data. Also, secrets can integrate with Cloud KMS, allowing us to manage the keys ourselves. PersistentVolumes on GKE use Compute Engine persistent disks. They're configured through storage class resources, they support standard disks and solid-state drives, and currently, they lack support for the ReadWriteMany access mode.

Okay, that's going to do it for this lesson. Thank you so much for watching and I will see you in the next lesson.

About the Author

Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.