Configuring Application Traffic

The course is part of this learning path

Google Associate Cloud Engineer Exam Preparation
Duration: 1h 6m


Google Cloud Platform has become one of the premier cloud providers on the market. It offers the same rich catalog of services and massive global hardware scale as AWS, as well as a number of Google-specific features and integrations. Mastering the GCP toolset can seem daunting given its complexity. This course is designed to help people familiar with GCP strengthen their understanding of GCP’s compute services, specifically App Engine and Kubernetes Engine.

The Managing Google Kubernetes Engine and App Engine course is designed to help students become confident at configuring, deploying, and maintaining these services. The course will also be helpful to people seeking to obtain Google Cloud certification. The combination of lessons and video demonstrations is optimized for concision and practicality to ensure students come away with an immediately useful skillset.

Learning Objectives

  • Learn how to manage and configure GCP Kubernetes Engine resources
  • Learn how to manage and configure GCP App Engine resources

Intended Audience

  • Individuals looking to build applications using Google Cloud Platform
  • People interested in obtaining GCP certification

Prerequisites

  • Some familiarity with Google Kubernetes Engine, App Engine, and Compute Engine
  • Experience working with command-line tools

Transcript

Now that we've gotten our feet wet with App Engine and actually deployed a basic app using the command line, let's go a bit deeper by looking at how configuration works. Our goal in this lesson is to understand how to configure traffic to different parts of our App Engine environment. However, before we do that we should review a bit about App Engine generally, just to set the foundation.

Recall that there are two types of App Engine environments, standard and flexible. The core difference between the two is that the flexible environment gives you direct control over your application runtime via Dockerfiles. You can also enable root access to the underlying VM instances, so overall this is a better option if you have an unusual environment use case, hence the name "flexible". The main tradeoff is that the flexible environment is slower and less resilient. Instances take minutes to deploy and start up, instead of seconds, and the instances are automatically restarted more frequently by GCP. You also don't get access to all of the same App Engine APIs, such as the Users and Images APIs, which handle user sign-in and image manipulation. And yet another difference is that the standard environment has a bit more flexibility with its autoscaling options. We'll go into this in the next lesson.

Generally speaking, if you're using App Engine to go serverless, you're probably gonna want to use the standard environment for its greater speed, reliability, and feature richness. The flexible environment is more for niche use cases. For most of the remaining lesson and demo content we will be making use of the standard environment, but do keep in mind these differences with the flexible environment in case you have a need for it.

Now, whichever environment you use, you're going to configure your app with YAML files. Only one YAML file is absolutely required, and that is the app.yaml file. This is for application-level settings. An application may be made up of multiple services, and these are configured in service.yaml files. You can have more than one service.yaml file in your application's root directory. For a fairly simple app this is fine, but for something more complex it is better practice to have separate subdirectories for each service.

Here is a sample app.yaml file. This is the one for the Python app we launched in the last lesson.

runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /.*
  script: main.app

As you can see, our app.yaml file is pretty simple. The only thing app.yaml really needs to do is specify a runtime and a version parameter (api_version here). The rest is optional. Here, the url and script lines describe how the app will handle requests with the source code provided in the application directory.

If we wanted to, we could break this out into a separate service by using a service.yaml file. It would look quite similar with the only difference being that a service.yaml file starts with the field "service". So it might look like this.

service: python-service
runtime: python27
api_version: 1 
threadsafe: true

A more complex app might have several of these service.yaml files to create a microservices architecture within App Engine.

So you would name the YAMLs in an intelligent way. There are five other optional configuration files that can help extend your app's overall functionality, and those five are the dispatch.yaml, queue.yaml, index.yaml, cron.yaml, and dos.yaml.

So real quick, let's go over what each of these does. Dispatch.yaml is for overriding routing rules. You set this file in your application root directory and use it to route incoming requests to specific services based on specific paths or hostnames in the URL. So this is an important configuration file for custom routing, if you have multiple services running.
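
To make those routing rules a bit more concrete, here is a minimal sketch of what a dispatch.yaml might look like. The service names (default, api) and the hostname are made up for illustration; they are not from the app in this course.

```yaml
# dispatch.yaml (sketch; service names and hostname are hypothetical)
dispatch:
  # Requests for this hostname go to the default service.
  - url: "www.example.com/*"
    service: default

  # On any host, paths under /api/ go to a service named "api".
  - url: "*/api/*"
    service: api
```

Dispatch rules are order dependent, so it is usually safest to think about which pattern should win when two of them could match the same request.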

Queue.yaml is the configuration for push and pull task queues. This lets us define retry parameters such as minimum backoff time in seconds and maximum task age. We won't be doing any queue-related services in this course, but I will link to the documentation. It might be relevant to your use case if you're using queuing services within GCP.
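
As a rough illustration of the retry parameters just mentioned, a queue.yaml could look something like the sketch below. The queue names, rate, and values are assumptions picked for the example.

```yaml
# queue.yaml (sketch; queue names and values are hypothetical)
queue:
# A push queue that retries failed tasks with a backoff.
- name: email-queue
  rate: 5/s
  retry_parameters:
    min_backoff_seconds: 10   # wait at least 10 seconds before a retry
    task_age_limit: 2d        # time limit for retrying a failed task

# A pull queue, which workers lease tasks from.
- name: report-queue
  mode: pull
```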

Then you have dos.yaml, D-O-S, dos.yaml. This is a security feature. This lets you blacklist IP addresses or subnets to protect your application from Denial of Service attacks. So possibly useful if that's a concern. 
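
A dos.yaml along these lines would block one address and one subnet. This is only a sketch; the addresses come from the ranges reserved for documentation, so swap in whatever you actually need to block.

```yaml
# dos.yaml (sketch; addresses are from documentation-only ranges)
blacklist:
- subnet: 192.0.2.10
  description: a single IPv4 address
- subnet: 198.51.100.0/24
  description: an IPv4 subnet
```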

And then we have cron.yaml. As you may guess, this lets you define scheduled tasks. You set a schedule, such as "every 24 hours" or "every Monday at six", then give it a URL for the task. On that schedule, App Engine sends an HTTP GET request to the URL, so the handler that does the work should live at the specified path. This is very handy for maintenance, for monitoring, and other standard automated tasks that you might want to configure and have in one easy to work with location.
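
The two example schedules above could be written roughly like this in a cron.yaml. The URLs and descriptions are hypothetical; your app would need handlers at those paths.

```yaml
# cron.yaml (sketch; URLs and descriptions are hypothetical)
cron:
- description: "daily cleanup job"
  url: /tasks/cleanup
  schedule: every 24 hours

- description: "weekly report"
  url: /tasks/weekly-report
  schedule: every monday 06:00
```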

And then finally there is the index.yaml file. You may have noticed that in our simple Python app from last lesson, an index.yaml file was automatically generated. This file defines the Datastore indexes your app's queries need, listing the properties to index for each kind of entity. It's not something you'll need to edit manually all the time. For simple applications, you won't have to touch it at all. App Engine will update it for us.
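
If you do open the file, an entry tends to look something like the sketch below. The Task kind and its properties are invented for illustration and do not come from the sample app.

```yaml
# index.yaml (sketch; the Task kind and its properties are hypothetical)
indexes:
- kind: Task
  properties:
  - name: owner
  - name: created
    direction: desc
```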

So now that we have some background on App Engine configuration and the environments, let's talk about traffic management. There are a few different use cases we need to consider. As mentioned above, we know that we can control service routing by using the dispatch.yaml file. But what about running multiple versions of a service for A/B testing? Or, what if we want to migrate traffic to a single specified version, such as an upgrade?

To do these two specific things, we need to talk about traffic splitting and traffic migration. Splitting refers to taking a percentage of our traffic and directing it to one or more distinct versions of a service within our app. Migration refers to the opposite process, moving traffic split among different versions to a single specified new version. So let's start by talking about splitting. The first thing to know is that traffic splitting is automatically applied if the URLs in your service do not specify an app version.

So if you have multiple service.yaml files with different versions of the same service, and the URLs in your app.yaml don't have version parameters, then traffic will be split among those versions randomly. So that's one way to do it if you just need random splitting among different versions of a service.

If you wish to be more precise, you can explicitly enable traffic splitting in the console, with the gcloud CLI, or through the GCP API. So for example, if we did this in the console, we would just go to the Versions page in App Engine, select the versions we want to split, click on "Split Traffic", and then put in the percentage each version should receive. We could do the same thing with the gcloud CLI using the gcloud app services set-traffic command. The --splits flag takes the versions and their relative weights, for example --splits v1=.5,v2=.5 (v1 and v2 being example version IDs), and the --split-by flag picks the splitting method. There are a number of additional flags and arguments, which you can look through in the command's help.

Now, probably the most important option to note here is the --split-by flag, which can be set to ip or cookie. We have to tell App Engine whether to split traffic using IP addresses or using cookies. IP address splitting is easier to do. App Engine just hashes the request's IP address to a value from 0 to 999 and routes the request based on that value. This isn't as precise as cookie-based routing, though, because IP addresses are somewhat ephemeral, particularly for cell phone traffic coming from the public internet, so the same user can end up on different versions. IP address splitting is also really not effective if your traffic is coming from internal GCP services, because those services tend to utilize a very small set of internal IP addresses over and over again. So you'll end up getting stuck with the same version of your app, and it won't split.

So for better precision, you should use cookie-based splitting. The only problem here is that it will take a bit more setup, because App Engine will look for a specific cookie, named GOOGAPPUID, in the HTTP request header. So you may need to make a code change in your app to provide this, to help ensure you get a more precise split for your traffic.

So that's the basics of splitting. Finally, let's talk about traffic migration. In the standard App Engine environment, we can choose to migrate traffic either immediately or gradually. In the flexible environment, we can only do it immediately. There is no option for gradual migration. When we do an immediate migration, we will generally see a spike in latency, as it causes all instances of the older versions of our service or services to shut down. For a latency-sensitive application, this could be a deal breaker, as you might see requests hang or even drop as the traffic is re-routed. For the standard environment, the solution is to use gradual migration. This is configured in your app.yaml file with one setting: under inbound_services, you add warmup. And with that, you can have your traffic migrate gradually instead of immediately.
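
Here is a sketch of that warmup setting added to the sample app.yaml from earlier in this lesson. Sending the warmup path to main.app is an assumption for illustration; the app needs some handler that responds at /_ah/warmup.

```yaml
# app.yaml (sketch based on the sample app from this lesson)
runtime: python27
api_version: 1
threadsafe: true

# Enable warmup requests so new instances can load before taking traffic.
inbound_services:
- warmup

handlers:
# Warmup requests arrive at this reserved path (handler choice is assumed).
- url: /_ah/warmup
  script: main.app
- url: /.*
  script: main.app
```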

So that's basically all there is about deployment and traffic routing in App Engine. We made it through. It was a pretty deep dive on the configuration as well. Congrats, go you. You're almost ready to do some damage. So in the next lesson, we'll keep it short, we'll talk about autoscaling and deployment a bit more. If you're ready, let's get to it.

About the Author


Jonathan Bethune is a senior technical consultant working with several companies including TopTal, BCG, and Instaclustr. He is an experienced devops specialist, data engineer, and software developer. Jonathan has spent years mastering the art of system automation with a variety of different cloud providers and tools. Before he became an engineer, Jonathan was a musician and teacher in New York City. Jonathan is based in Tokyo where he continues to work in technology and write for various publications in his free time.