Knative Serving

1h 1m

Interested in knowing what Knative is and how it simplifies Kubernetes?

Knative is a general-purpose serverless orchestration framework that sits on top of Kubernetes, allowing you to create event-driven, autoscaled, and scale-to-zero applications. 

This course introduces you to Knative, taking you through the fundamentals, particularly the components Serving and Eventing. Several hands-on demonstrations are provided in which you'll learn and observe how to install Knative, and how to build and deploy serverless event-driven scale-to-zero workloads.

Knative runs on top of Kubernetes, and therefore you’ll need to have some existing knowledge and/or experience with Kubernetes. If you’re completely new to Kubernetes, please consider taking our dedicated Introduction to Kubernetes learning path.

For any feedback, queries, or suggestions relating to this course, please contact us at

Learning Objectives

By completing this course, you will: 

  • Learn about what Knative is and how to install, configure, and maintain it
  • Learn about Knative Serving and Eventing components
  • Learn how to deploy serverless event-driven workloads
  • And finally, you’ll learn how to work with and configure many of the key Knative cluster resources

Intended Audience

  • Anyone interested in learning about Knative and its fundamentals
  • Software Engineers interested in learning about how to configure and deploy Knative serverless workloads into a Kubernetes cluster
  • DevOps and SRE practitioners interested in understanding how to install, manage, and maintain Knative infrastructure


The following prerequisites will be helpful for this course:

  • A basic understanding of Kubernetes
  • A basic understanding of containers, containerization, and serverless based architectures
  • A basic understanding of software development and the software development life cycle
  • A basic understanding of networks and networking


The knative-demo GitHub repository used within this course can be found here:


Welcome back. In this lesson I'm going to introduce you to the Knative Serving component, explaining how it works and how and when to use it. Knative Serving provides flexible features for application scaling and routing. With the Knative Serving component installed, you'll have the ability to: one, automatically deploy applications with public routes in one hit; two, maintain point-in-time snapshots of deployments; three, configure automatic pod scaling, including scale to zero; and four, perform traffic splitting, enabling blue/green deployments.

The Knative Serving component provides the following resources: Service, Route, Configuration, and Revision. These resources work together to ease the burden of setting up and managing network routes to your serverless workloads. A Knative Service is distinct from the standard Kubernetes Service.

Installing the Knative Serving component results in another resource named Service, which provides a different perspective on the concept of a service. The Service defines the container image to run, how it should be scaled, and can even define routing and traffic splitting options. When you create a Knative Service, multiple resources get created as displayed in this diagram.

Creating a Service resource creates both Route and Configuration resources, both of which are directly managed by the Service itself. The Configuration resource maintains a history of Revisions, with the Route resource being configured to route traffic, by default, to the latest Revision, although this routing behavior can be altered.

Every time you modify the Service, a new Revision of the Configuration is automatically snapshotted, providing a point-in-time record of the configuration. Revisions are immutable objects that exist indefinitely or until such time that they are no longer of use. The following Service resource illustrates just how easy it is to create a new service which, when deployed into your Kubernetes cluster, wires up all of the networking, including making it externally callable by default.
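The manifest shown in the lesson is not reproduced in this transcript, so here is a minimal sketch of what such a Service resource could look like, built as a plain Python dict and printed as JSON. The service name, namespace, and image reference are hypothetical placeholders, not values from the course.

```python
import json


def knative_service(name, namespace, image):
    """Return a minimal Knative Service manifest as a plain dict."""
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The template describes the Revision that Knative snapshots
            # each time this section of the Service changes.
            "template": {
                "spec": {
                    "containers": [{"image": image}],
                }
            }
        },
    }


# Hypothetical name, namespace, and image, for illustration only.
manifest = knative_service("hello", "default", "example.com/demo/hello:latest")
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, output like this could be piped into `kubectl apply -f -`, assuming Knative Serving is already installed in the cluster.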

Expanding on the previous Service resource, we could, for example, add in some traffic splitting to split traffic across two named revisions. With the traffic splitting capabilities exposed directly within the Service resource, it becomes trivial to perform blue/green deployments. To do so, you would start by deploying two revisions of the service, with 100% of the traffic going to the blue revision. You can then update the routing configuration on the Service to split the traffic 50/50 across both revisions. This allows you to test and observe the behavior before committing all traffic to the green revision.

Once you have confirmed that the green revision is behaving as expected, you can again update the Service routing to send 100% of the traffic to the green revision. If the green revision does not accomplish what it was designed to do, rolling back to the previous version is simple, with 100% of the traffic being delivered back to the blue revision.
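The blue/green sequence described above can be sketched as a series of values for the Service's `spec.traffic` list. This is only an illustration: the revision names are hypothetical, and the helper function is not part of any Knative tooling.

```python
def traffic_split(blue_revision, green_revision, green_percent):
    """Return a Knative `spec.traffic` list that splits traffic
    between two named revisions. The percentages always sum to 100."""
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")
    return [
        {"revisionName": blue_revision, "percent": 100 - green_percent},
        {"revisionName": green_revision, "percent": green_percent},
    ]


# Hypothetical revision names, for illustration only.
BLUE, GREEN = "hello-blue", "hello-green"

stages = {
    "start": traffic_split(BLUE, GREEN, 0),       # all traffic on blue
    "observe": traffic_split(BLUE, GREEN, 50),    # 50/50 while testing green
    "cutover": traffic_split(BLUE, GREEN, 100),   # all traffic on green
    "rollback": traffic_split(BLUE, GREEN, 0),    # back to blue if needed
}

for stage, traffic in stages.items():
    print(stage, traffic)
```

Each stage corresponds to one update of the Service; only the percentages change, which is why rolling back is as simple as restoring the earlier split.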

The key point here to highlight is that this blue/green traffic splitting capability exposed within the Knative Service can be leveraged directly by developers themselves, which is extremely useful when it comes to rolling out updates to production. It removes all the pain points and administration burden involved with running side-by-side versions of production.

Knative services are, by default, set up with publicly accessible HTTP endpoints and routes. When a Knative service is deployed, it is immediately available to use from anywhere outside of the cluster over the Internet. Each deployed Knative service is given its own FQDN (Fully Qualified Domain Name), which follows the format servicename.namespace.domain. For example, if our Knative setup is configured with a custom domain, and we install two Knative services, named shoppingcart and payment, into the cloudacademy namespace within the cluster, then both services would be contactable at HTTP endpoints of that form.

Each deployed service can additionally be tagged such that the tag also becomes part of the service's FQDN. For example, if we deployed a payment service which has two revisions, payment-v1 tagged as prod and payment-v2 tagged as staging, then both revisions can be contacted directly using the fully qualified domain name format tag-servicename.namespace.domain.
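As a small illustration of the two naming formats just described, the following sketch assembles the hostnames for the example services. The domain `example.com` is a stand-in assumption, since the actual custom domain used in the course is not given in this transcript.

```python
def service_fqdn(service, namespace, domain, tag=None):
    """Build a hostname in the servicename.namespace.domain format,
    or tag-servicename.namespace.domain when a tag is supplied."""
    host = f"{service}.{namespace}.{domain}"
    return f"{tag}-{host}" if tag else host


# `example.com` is a placeholder domain, not the one used in the course.
DOMAIN = "example.com"
NAMESPACE = "cloudacademy"

for name in ("shoppingcart", "payment"):
    print("http://" + service_fqdn(name, NAMESPACE, DOMAIN))

# Tagged revisions of the payment service get their own hostnames.
for tag in ("prod", "staging"):
    print("http://" + service_fqdn("payment", NAMESPACE, DOMAIN, tag=tag))
```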

Keep in mind that when using the tag-expanded FQDN, traffic splitting will not be in effect, which makes sense because the tagged name addresses a single revision.

Knative Serving can be easily set up to serve using a custom domain. To do so, you need to update the domain within the config-domain ConfigMap located in the knative-serving namespace.

For example, if we wanted to configure a custom domain, then the corresponding ConfigMap setting would be applied.

Next, you need to determine the public IP address for the Kubernetes cluster. To do so, you can query the istio-ingressgateway Service located within the istio-system namespace, using a jsonpath expression to navigate within the response to the ingress's assigned public IP address.

Finally, you need to add a new wildcard A record into your DNS zone for the domain in question. The wildcard needs to be set up to include the cluster namespace. Therefore, if our cluster's namespace were cloudacademy, the wildcard A record would cover that namespace and point to the public IP address assigned to the Istio ingress gateway, as determined by the previous command.
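The three custom-domain steps above can be sketched as follows. Everything specific here is an assumption for illustration: `example.com` stands in for the custom domain, `203.0.113.10` is a documentation-range IP address, and `sample_service` is a hand-written fragment in the shape that `kubectl get service istio-ingressgateway -n istio-system -o json` would return for a LoadBalancer Service.

```python
def domain_configmap(domain):
    """Step 1: the config-domain ConfigMap, with the custom domain as a key.
    An empty value makes it the default domain for all services."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "config-domain", "namespace": "knative-serving"},
        "data": {domain: ""},
    }


def ingress_ip(service):
    """Step 2: follow the same path as the jsonpath expression
    {.status.loadBalancer.ingress[0].ip} on the parsed Service JSON."""
    return service["status"]["loadBalancer"]["ingress"][0]["ip"]


def wildcard_a_record(namespace, domain, ip):
    """Step 3: a zone-file style wildcard A record that includes the namespace."""
    return f"*.{namespace}.{domain}. IN A {ip}"


# Hand-written sample data (assumption), not real cluster output.
sample_service = {"status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}}}

config = domain_configmap("example.com")
ip = ingress_ip(sample_service)
record = wildcard_a_record("cloudacademy", "example.com", ip)
print(config["data"])
print(record)
```

The exact record syntax depends on your DNS provider; the zone-file form is used here only to make the wildcard name and its target explicit.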

Another extremely useful feature that Knative Serving brings to the table is the concept of scale to zero. Scaling to zero is the idea that when your serverless application has no demand on it, then the serverless components should scale themselves all the way back to zero.

In the Knative world, this would result in all pods for a particular service being removed from the cluster. Scaling to zero in a Kubernetes cluster has the advantage of freeing up cluster resources for other applications and workloads, and in the serverless world you don't waste money on running idle processes. At the other end of the scaling spectrum you may have unpredictable bursts of activity and that requires your serverless application to scale upwards.

To accomplish automatic pod scaling in either direction, Knative introduces the Knative Pod Autoscaler (KPA), which is designed to automatically scale pods up or down and provides fast request-based autoscaling capabilities out of the box.
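To give a feel for request-based scaling, here is a deliberately simplified sketch of how a pod count can be derived from in-flight requests. It is not the KPA's actual algorithm, which also averages metrics over time windows; the function name and the numbers are hypothetical.

```python
import math


def desired_pods(concurrent_requests, target_per_pod, min_scale, max_scale):
    """Simplified model: enough pods to keep each one at or below its
    concurrency target, clamped to the configured scale bounds."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    wanted = math.ceil(concurrent_requests / target_per_pod)
    return max(min_scale, min(wanted, max_scale))


# Hypothetical load levels with a target of 10 requests per pod.
for load in (0, 7, 35, 500):
    print(load, "->", desired_pods(load, target_per_pod=10, min_scale=0, max_scale=5))
```

With a minimum of zero, no load means no pods, which is the scale-to-zero behavior; the upper bound caps how far a burst can scale the service out.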

Knative can also be configured to use the default Kubernetes Horizontal Pod Autoscaler (HPA), if you prefer to run with that one. Beyond this, it even provides a plug-and-play option for custom-created pod autoscalers. The following example shows how to configure a Service with a scale-to-zero policy.

This example uses the concurrency-based (request-based) scaling behavior provided by the Knative Pod Autoscaler. When sustained bursts of activity come in, the Service will scale out the pods to a maximum of 20. As activity begins to drop off, pods will start to be removed one by one, all the way back down to zero if all activity has stopped.
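A configuration along the lines described above could be sketched like this, with the autoscaling settings expressed as annotations on the revision template. The service name, image, and the concurrency target of 10 are assumptions; only the scale-to-zero minimum and the maximum of 20 pods come from the lesson. The annotation keys follow current Knative naming, and older releases spell the last two `minScale` and `maxScale`.

```python
import json


def autoscaled_service(name, image, target, max_scale):
    """Return a Knative Service manifest whose revisions use the KPA
    with a concurrency metric and are allowed to scale down to zero."""
    annotations = {
        "autoscaling.knative.dev/class": "kpa.autoscaling.knative.dev",
        "autoscaling.knative.dev/metric": "concurrency",
        # Annotation values must be strings.
        "autoscaling.knative.dev/target": str(target),
        "autoscaling.knative.dev/min-scale": "0",
        "autoscaling.knative.dev/max-scale": str(max_scale),
    }
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "metadata": {"annotations": annotations},
                "spec": {"containers": [{"image": image}]},
            }
        },
    }


# Hypothetical name, image, and target; max_scale=20 matches the lesson.
service = autoscaled_service("hello", "example.com/demo/hello:latest", target=10, max_scale=20)
print(json.dumps(service, indent=2))
```

Placing the annotations on the template rather than on the Service's own metadata means they are captured in each Revision, so different revisions can carry different scaling bounds.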

Okay, that concludes this lesson. In summary, you learnt that the Knative Serving component provides a number of middleware primitives that collectively help to route and scale your Kubernetes hosted serverless workloads. 

Go ahead and close this lesson and I'll see you shortly in the next one.

About the Author

Jeremy is a Content Lead Architect and DevOps SME here at Cloud Academy where he specializes in developing DevOps technical training documentation.

He has a strong background in software engineering, and has been coding with various languages, frameworks, and systems for the past 25+ years. In recent times, Jeremy has been focused on DevOps, Cloud (AWS, Azure, GCP), Security, Kubernetes, and Machine Learning.

Jeremy holds professional certifications for AWS, Azure, GCP, Terraform, and Kubernetes (CKA, CKAD, CKS).