Configuring Application Traffic
Start course
1h 9m

*** PLEASE NOTE: This content is outdated, and the course should be retired ***


Google Cloud Platform has become one of the premier cloud providers on the market. It offers the same rich catalog of services and massive global hardware scale as AWS, as well as a number of Google-specific features and integrations. Mastering the GCP toolset can seem daunting given its complexity. This course is designed to help people familiar with GCP strengthen their understanding of GCP’s compute services, specifically App Engine and Kubernetes Engine.

The Managing Google Kubernetes Engine and App Engine course is designed to help students become confident at configuring, deploying, and maintaining these services. The course will also be helpful to people seeking to obtain Google Cloud certification. The combination of lessons and video demonstrations is optimized for concision and practicality to ensure students come away with an immediately useful skillset.

Learning Objectives

  • Learn how to manage and configure GCP Kubernetes Engine resources
  • Learn how to manage and configure GCP App Engine resources

Intended Audience

  • Individuals looking to build applications using Google Cloud Platform
  • People interested in obtaining GCP certification

Prerequisites

  • Some familiarity with Google Kubernetes Engine, App Engine, and Compute Engine
  • Experience working with command-line tools

Section Two Part Three: Configuring Application Traffic. Welcome to Part Three. Now that we've gotten our feet wet with App Engine and actually deployed a basic app using the command line, let's go a bit deeper by looking at how configuration works. Our goal in this lesson is to understand how to configure traffic to different parts of our App Engine environment. However, before we do that, we should review a bit about App Engine generally just to set the foundation.

So recall that there are two types of App Engine environments, Standard and Flexible. The core difference between the two is that the Flexible environment gives you direct control over your application runtime via Dockerfiles. You can enable root access to the underlying VM instances. So overall, this is a better option if you have a more unique environment use case, hence the name Flexible. Now the main trade-off is that the Flexible environment is slower and it's less resilient. Instances take minutes to deploy and start up, and the instances are automatically restarted more frequently by GCP. You also don't get access to all the same App Engine APIs, such as the Users and Images APIs, which are useful for scripting and environment automation. Another difference is that the Standard environment has a bit more flexibility with its autoscaling options. Now we'll go into that in the next lesson.

But generally speaking, if you're using App Engine to go serverless, you'll probably want to use the Standard environment for its greater speed, reliability, and feature richness. The Flexible environment is more for niche use cases. For most of the remaining lesson and demo content, we'll be making use of the Standard environment. But keep in mind these differences with the Flexible environment in case you have need of it.

Now whichever environment you use, you're going to configure your app with YAML files. Only one YAML file is absolutely required, and that is the app.yaml file. This is for application-level settings. An application may be made up of multiple services, and these are configured in service.yaml files. You can have more than one service.yaml file in your application's root directory. Now for a fairly simple app, this is okay. But for something more complex, it is better practice to have separate subdirectories for each service.

So here's a sample app.yaml file. This is the one for the Python app we launched in the last lesson. One line: "runtime: python37". As you can see, our app.yaml file is pretty simple. The only thing app.yaml really needs is a runtime specification, and potentially a version parameter.

Now if we wanted to, we could break this out into a separate service by using a service.yaml file. And it would look pretty similar, with the only difference being that the service.yaml file generally starts with a field called "service". So it would look something like this.

service: python-service
runtime: python37
api_version: 1
threadsafe: true

We have a service name and a runtime, and we can also put in an API version and some other config. Now a more complex app might have several of these service.yaml files to create a microservices architecture within App Engine. Aside from this, there are five other optional configuration files that can help you to extend your app's overall functionality. And these are dispatch.yaml, queue.yaml, index.yaml, cron.yaml, and dos.yaml.

So real quick, let's go through them. Dispatch.yaml is for overriding routing rules. You set this file in your application root directory and you use it to route incoming requests to specific services based on specific paths or host names in the URL.
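To make that concrete, here is a small dispatch.yaml sketch. The service names and URL patterns are hypothetical, just to show the shape of the file:

```
# dispatch.yaml — route requests to services by URL pattern
dispatch:
  # Send all API traffic to a dedicated backend service
  - url: "*/api/*"
    service: python-service

  # Route a specific host name to a mobile frontend service
  - url: "mobile.example.com/*"
    service: mobile-frontend
```

Rules are evaluated in order, and any request that matches no rule falls through to App Engine's default routing.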

Queue.yaml is for configuring push and pull task queues. This lets us define retry parameters, such as a minimum backoff time in seconds or a maximum task age, for example. We won't be doing any queue-based services in this course, but we'll link to the documentation. If it's relevant to your use case, definitely take a look at queue.yaml.
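For reference, a minimal queue.yaml might look like this. The queue name and values are illustrative, not from the course:

```
# queue.yaml — define a task queue with retry behavior
queue:
  - name: email-queue
    rate: 5/s                  # process up to 5 tasks per second
    retry_parameters:
      task_age_limit: 1d       # give up on tasks older than one day
      min_backoff_seconds: 10  # wait at least 10s between retries
```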

Now dos.yaml is a security feature that lets you blacklist IP addresses or subnets to protect your application from DoS attacks.
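A dos.yaml sketch, using documentation-reserved example addresses rather than real ones:

```
# dos.yaml — block abusive clients by IP or subnet
blacklist:
  - subnet: 192.0.2.1
    description: a single abusive IP address
  - subnet: 192.0.2.0/24
    description: an abusive subnet
```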

And then cron.yaml lets you define scheduled tasks. You set a schedule, such as every 24 hours, or every Monday at six p.m. and then give it a URL for a task definition, much like a typical cron job. The task definitions should live in the specified path. Now, this is very handy for maintenance, monitoring, and other standard automated tasks that you might want to configure and have in one easy to work with location.
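Here's what those scheduled tasks look like in a cron.yaml file. The URLs and descriptions are hypothetical examples:

```
# cron.yaml — scheduled tasks that App Engine invokes via HTTP
cron:
  - description: "daily cleanup job"
    url: /tasks/cleanup
    schedule: every 24 hours

  - description: "weekly report"
    url: /tasks/report
    schedule: every monday 18:00
    timezone: Asia/Tokyo
```

Each entry is just a URL in your app plus a human-readable schedule; App Engine issues an HTTP GET to that URL on the schedule you define.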

And then finally, there is the index.yaml. Now you may have noticed that in our simple Python app from last lesson, an index.yaml file was automatically generated. This file defines Datastore indexes, listing the entity kinds and properties your application queries against. For simple applications, you will not have to manually edit this file at all. App Engine will automatically update it for you.
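An auto-generated index.yaml typically looks something like this (the entity kind and properties here are hypothetical):

```
# index.yaml — Datastore composite indexes
indexes:
  - kind: Task
    properties:
      - name: done
      - name: created
        direction: desc
```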

So, now that we have some strong background on App Engine configuration and environments, let's talk about traffic management. There are a few use cases that we need to consider. As mentioned above, we know that we can control service routing by using the dispatch.yaml file. But what about running multiple versions of a service for A/B testing or migrating traffic to a single specified version?

To do these two things, we need to talk about traffic splitting and traffic migration. Splitting refers to taking a percentage of our traffic and directing it to one or more distinct versions of a service within our app. Migration refers to the opposite process, moving traffic that's split among different versions to a single specified new version. So let's start by talking about splitting. The first thing to know is that traffic splitting is automatically applied if the URLs in your service do not specify an app version. So if you have multiple service.yaml files with different versions of the same service, and the URLs in your app do not have version parameters at all, then traffic will be split automatically and randomly. If you wish to be more precise, you can explicitly enable traffic splitting in the console, or by using the gcloud CLI or the GCP API.

So for example, if we did this in the console, we would just go to the application page in App Engine and select the versions we want to split. Click on Split Traffic, and then just put in the percentage each service should receive.

Now we could do the same thing with the gcloud CLI using a command like this. Here we have "gcloud app services set-traffic" and a number of flags. Now if you look at these flags, there's one option you really need to note: the "IP_OR_COOKIE" option in the --split-by flag. We have to tell App Engine whether to split traffic using IP address splitting or cookies.
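The on-screen command isn't reproduced in this transcript, but a representative invocation would look like this. The version names v1 and v2 are hypothetical:

```
# Split traffic 80/20 between two versions of the default service,
# using cookie-based routing
gcloud app services set-traffic default \
    --splits v1=0.8,v2=0.2 \
    --split-by cookie
```

The --splits flag takes a comma-separated list of version=weight pairs, and --split-by accepts ip, cookie, or random.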

IP address splitting is easier to do. It makes App Engine hash the request's IP address to a value from zero to 999 and then route based on whatever value it gets. This isn't as precise as cookie routing, though, because IP addresses tend to be somewhat ephemeral, particularly for cellphone traffic and traffic from the public internet. IP address splitting is also bad for traffic coming from internal GCP services, because those services tend to utilize a small set of internal IP addresses, which will get stuck to the same version of your app.

So for better precision, you should use cookie-based splitting. The trade-off here is that it takes a bit more setup, because App Engine looks for a specific cookie (GOOGAPPUID) in the HTTP request. So you may need to make a code change in your app to forward that cookie, but it will help ensure you get a precise split for traffic.

And then finally, let's talk a little bit about traffic migration. In the Standard App Engine environment, we can choose to migrate traffic either immediately or gradually. In the Flexible environment, we can only do it immediately; there's no option for gradual migration. When we do an immediate migration, you'll generally see a spike in latency, as it causes all instances of the older version of your service or services to shut down. For a latency-sensitive application, this could be a deal-breaker, as you could see requests hang or drop as the traffic is rerouted.

For the Standard environment, the solution is to use gradual migration. This is configured in your app.yaml file with this one setting here, "inbound_services: - warmup". And with that, you are informed and ready to deal with migration, splitting, and deployment in the App Engine world.
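That warmup setting sits in app.yaml and looks like this:

```
# app.yaml — enable warmup requests so gradual migration can
# start new instances before any live traffic is routed to them
runtime: python37
inbound_services:
  - warmup
```

With warmup enabled, a gradual migration can then be triggered with the --migrate flag, for example: gcloud app services set-traffic default --splits v2=1 --migrate (version name hypothetical).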

Phew, okay, so we made it through a deep dive into App Engine environments and configuration and traffic. You're almost ready to really do some damage. In the next short lesson, we'll talk about autoscaling and deployment. If you're ready, let's get going. Cheers.



Introduction - Section One Introduction - Kubernetes Concepts - Cluster and Container Configuration - Working with Pods, Services, and Controllers - Kubernetes Demo - Section Two Introduction - Creating a Basic App Engine App - Configuring Application Traffic - Autoscaling App Engine Resources - App Engine Demo - Conclusion

About the Author

Jonathan Bethune is a senior technical consultant working with several companies including TopTal, BCG, and Instaclustr. He is an experienced devops specialist, data engineer, and software developer. Jonathan has spent years mastering the art of system automation with a variety of different cloud providers and tools. Before he became an engineer, Jonathan was a musician and teacher in New York City. Jonathan is based in Tokyo where he continues to work in technology and write for various publications in his free time.