Interested in knowing what Knative is and how it simplifies Kubernetes?
Knative is a general-purpose serverless orchestration framework that sits on top of Kubernetes, allowing you to create event-driven, autoscaled, and scale-to-zero applications.
This course introduces you to Knative, taking you through the fundamentals, particularly the components Serving and Eventing. Several hands-on demonstrations are provided in which you'll learn and observe how to install Knative, and how to build and deploy serverless event-driven scale-to-zero workloads.
Knative runs on top of Kubernetes, and therefore you’ll need to have some existing knowledge and/or experience with Kubernetes. If you’re completely new to Kubernetes, please consider taking our dedicated Introduction to Kubernetes learning path.
For any feedback, queries, or suggestions relating to this course, please contact us at email@example.com.
By completing this course, you will:
- Learn about what Knative is and how to install, configure, and maintain it
- Learn about Knative Serving and Eventing components
- Learn how to deploy serverless event-driven workloads
- Learn how to work with and configure many of the key Knative cluster resources
This course is intended for:
- Anyone interested in learning about Knative and its fundamentals
- Software Engineers interested in learning about how to configure and deploy Knative serverless workloads into a Kubernetes cluster
- DevOps and SRE practitioners interested in understanding how to install, manage, and maintain Knative infrastructure
The following prerequisites will be useful for this course:
- A basic understanding of Kubernetes
- A basic understanding of containers, containerization, and serverless based architectures
- A basic understanding of software development and the software development life cycle
- A basic understanding of networks and networking
The knative-demo GitHub repository used within this course can be found here:
Okay, so I'm now going to deploy a custom namespace called cloudacademy, and this is where we'll deploy the rest of our serverless workload resources. Okay, that's done. I'm going to make that namespace the default, moving forward.
So now onto step three. In step three we're going to demonstrate an example Knative service. So in step 3.1, I'm going to install our first Knative Serving service. Now, note that this command runs inside a for loop, so we're going to install this service two times. And the reason for that is to generate two revisions of this particular service. So let's copy this block, back in the terminal, paste and execute it. So it's going to create our service and it's going to configure it twice, because it's inside the for loop. If we now take a look at the Knative service resources, note here that it's KSVC, not SVC. So you can see our Knative service has been created: hellosvc. It's been given a URL, and that URL is using the xip.io wildcarded DNS service that we set up earlier.
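The manifest applied in this step isn't shown in the transcript, but based on the description it might look roughly like the following sketch — the container image, the SENDER env var name, and the revision naming are all assumptions standing in for whatever the demo repository actually uses:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hellosvc
  namespace: cloudacademy
spec:
  template:
    metadata:
      name: hellosvc-v1               # revision name, stamped per loop iteration
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go   # placeholder image
          env:
            - name: SENDER                              # assumed env var name
              value: cloudacademy.knative.v1            # v$version on each pass
```

Applying this twice with the version substituted (v1, then v2) is what produces the two revisions observed with `kubectl get revision`.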
Back within our runbook, I'll extract this URL by running the following command. So if we now echo out that hello service URL variable, you can see we have indeed captured the URL for this Knative service. The final thing to do is now run a curl command against it. In this case I'm hitting the /hello path on the host. And here we can see we've got a valid response back: Hello from: cloudacademy.knative.v2.
Now, where did the cloudacademy.knative.v2 come from? So in our service definition, we specified an environment variable, and the second time this for loop ran, the value for the sender environment variable was set to v$version, where $version comes from this sequence. So it's the second value out of that sequence. So the final thing to do here is to acknowledge that we've actually got two revisions.
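The loop-and-substitute pattern described above can be sketched in shell. The template string and names here are illustrative stand-ins, not the actual runbook commands — in the real demo, the rendered manifest would be piped into `kubectl apply -f -` rather than echoed:

```shell
# Template with a VERSION placeholder; the loop renders it once per revision.
template='name: hellosvc-vVERSION  SENDER=cloudacademy.knative.vVERSION'
for version in $(seq 1 2); do
  # sed stamps the current version number into the revision name and env var.
  rendered=$(printf '%s' "$template" | sed "s/VERSION/$version/g")
  echo "$rendered"
done
```

Because the second pass renders v2, the v2 revision is the latest one, which is why the curl response reports cloudacademy.knative.v2.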
So if we run kubectl get revision, you can see the first revision and the second revision, and the second revision is the one that is actually being served at the moment behind this URL. So that's one of the cool things about Knative: it tracks and maintains a history of revisions, serving the latest one.
Okay. So moving on to step 3.2, we're going to examine the concept of traffic splitting. So in this case we're going to redeploy the same service, hellosvc. We're going to update the revision name to be hellosvc-v3, and we're also going to update the sender environment variable, this time to be cloudacademy.knative.v3. And then under the traffic configuration property we've got a prod tag pointing to this revision, hellosvc-v3. We've got a staging tag pointing to hellosvc-v2. And we've also got a latest tag, and that uses latestRevision set to true.
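Based on the traffic configuration described, the updated spec might look something like this sketch — the percentages are inferred from the 50/50 split observed in the next step, and the image and env var name are placeholders, not the demo repository's actual values:

```yaml
spec:
  template:
    metadata:
      name: hellosvc-v3
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go   # placeholder image
          env:
            - name: SENDER                              # assumed env var name
              value: cloudacademy.knative.v3
  traffic:
    - tag: prod
      revisionName: hellosvc-v3
      percent: 50
    - tag: staging
      revisionName: hellosvc-v2
      percent: 50
    - tag: latest
      latestRevision: true
      percent: 0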
So if we copy this, go back to our terminal, I'll clear it. I'll paste the command. And that has now updated the hellosvc. So again we'll run kubectl get ksvc to look at the Knative services. We can see that we've still got the same URL, but this time it is mapped to the latest revision, hellosvc-v3. So again, I'll extract the hellosvc URL, and this time I'm going to run a curl command ten times against it.
So when we run this command ten times, we should see a 50/50 split over the two revisions, v3 and v2, which we do. So that's great. So that means 50% of the traffic went to this tag and 50% of the traffic went to this tag: hellosvc-v3 and hellosvc-v2. So that's traffic splitting. So the next thing we'll do is send traffic to the explicit tags that have been tagged on the revisions. So we'll run these two commands. So what this is doing is it's taking our hellosvc URL and then it's updating hellosvc to be prod-hellosvc, likewise with staging-hellosvc.
So if we then echo out these two updated URLs. So now if I curl to the first one, to the prod one, you can see that we have indeed got a response from v3. And then if we curl to the staging equivalent one, getting a response from v2. So, that is tag-based URLs.
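The tag-based URL rewriting described above can be sketched in shell. The service URL here is a made-up example of the xip.io shape from earlier in the demo, not the actual cluster address:

```shell
# Assumed example URL; in the demo this is extracted from the ksvc resource.
HELLO_SVC_URL="http://hellosvc.cloudacademy.10.0.0.1.xip.io"

# Knative exposes each traffic tag by prefixing the tag onto the hostname.
PROD_URL=$(echo "$HELLO_SVC_URL" | sed 's/hellosvc/prod-hellosvc/')
STAGING_URL=$(echo "$HELLO_SVC_URL" | sed 's/hellosvc/staging-hellosvc/')

echo "$PROD_URL"      # tag-based route pinned to the v3 revision
echo "$STAGING_URL"   # tag-based route pinned to the v2 revision
```

Curling the prod URL then always hits v3 and the staging URL always hits v2, regardless of the percentage split on the main URL.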
Finally, again if we look at our revisions, we should see we've got three revisions, which we do. So let's move on to step 4, and in this step we're going to look at the Knative pod autoscaler. So the way this is set up, it's concurrency based and it's got a target of two. So any time that we send more than two concurrent requests to this service, it will launch extra pods so that we're sending no more than two requests per pod. So let's copy this command. Again I'll clear the terminal. I'll execute the command. And this time we now have hellosvc-v4 and it's set up with the Knative pod autoscaler.
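The autoscaler behaviour described above is driven by annotations on the revision template. A sketch of what the relevant fragment might look like is below — the annotation keys are standard Knative autoscaling annotations, but their presence and values here are inferred from the demo's description (concurrency-based, target of two, scale-to-zero):

```yaml
spec:
  template:
    metadata:
      name: hellosvc-v4
      annotations:
        autoscaling.knative.dev/class: kpa.autoscaling.knative.dev  # Knative Pod Autoscaler
        autoscaling.knative.dev/metric: concurrency                 # scale on in-flight requests
        autoscaling.knative.dev/target: "2"                         # ~2 concurrent requests per pod
        autoscaling.knative.dev/minScale: "0"                       # allow scale-to-zero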
Okay, so, to see this in action, I'm going to split this terminal pane and I'm going to run a watch across the pods in the cloudacademy namespace, like so. So at the moment, we can see we've got our single pod and it's running. I'm now going to copy this command here, which will send ten requests in parallel to our service. So because we're sending ten in parallel, this should cause the Knative pod autoscaler to introduce more pods. So you can see here, because we haven't sent any traffic yet, it's actually autoscaling the existing pod down to zero, as there are no requests being sent to it. So now if I run this command, we're generating traffic. And here you can see, indeed, pods are being created to service these ten concurrent requests.
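The parallel fan-out used above can be sketched with xargs. So the pattern runs anywhere, `echo` stands in for the real request command — in the demo, something like `curl -s "$HELLO_SVC_URL/hello"` would replace it:

```shell
# Fire ten invocations, up to ten at a time in parallel (-P 10).
# Each "request N" line stands in for one curl against the service.
count=$(seq 1 10 | xargs -n 1 -P 10 echo request | wc -l)
echo "$count"   # ten requests were issued
```

With a concurrency target of two, ten simultaneous requests push the Knative Pod Autoscaler to spin up additional pods so no pod carries more than roughly two in-flight requests.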
Now, if we leave it for a while, for approximately thirty seconds without sending any more traffic to it, it will scale to zero, as it's designed to do, because we've got a minimum scale of zero on it. And this is one of the features of designing serverless within Knative: you have the ability to scale down to zero, and this frees up resources for other workloads running within the same cluster. And indeed, here we can see the pods are now being terminated. So this is the scaling down, and it's scaling down to zero.
Okay, if we cancel the current watch and this time we just do a kubectl get pods, we can indeed see that there are no pods in a running status, so we have indeed scaled down to zero. If we put the watch back on, go back to our top pane, and re-run the same request, we're now scaling up.
So again, the reason we scaled up is because our pod autoscaler is using a concurrency setting and it has a target of two concurrent requests per pod, and we sent in ten in parallel. I'll wait again to watch the scale-to-zero event occur: because we're no longer sending any demand to it, within approximately 30 seconds of the last request it should scale back down, as it is doing right now.
So that's really cool. It's a really cool feature and you get it for free.
About the Author
Jeremy is the DevOps Content Lead at Cloud Academy where he specializes in developing technical training documentation for DevOps.
He has a strong background in software engineering, and has been coding with various languages, frameworks, and systems for the past 20+ years. In recent times, Jeremy has been focused on DevOps, Cloud, Security, and Machine Learning.
Jeremy holds professional certifications for both the AWS and GCP cloud platforms.