Deploying an Application to App Engine


NOTICE: This course is outdated and has been deprecated


Modern software systems have become increasingly complex. Cloud platforms have helped tame some of the complexity by providing both managed and unmanaged services. So it’s no surprise that companies have shifted workloads to cloud platforms. As cloud platforms continue to grow, knowing when and how to use these services is important for developers. 

This course is intended to help prepare individuals seeking to pass the Google Cloud Professional Cloud Developer Certification Exam. The Cloud Developer Certification requires a working knowledge of building cloud-native systems on GCP. That covers a wide variety of topics, from designing distributed systems to debugging apps with Stackdriver. 

This course focuses on the third section of the exam overview, more specifically the first five points, which cover deploying applications using GCP compute services.

Learning Objectives

  • Implement appropriate deployment strategies based on the target compute environment
  • Deploy applications and services on Compute Engine and Google Kubernetes Engine
  • Deploy an application to App Engine
  • Deploy a Cloud Function

Intended Audience

  • IT professionals who want to become cloud-native developers
  • IT professionals preparing for Google’s Professional Cloud Developer Exam


Prerequisites

  • Software development experience
  • Docker experience
  • Kubernetes experience
  • GCP experience



Hello and welcome. In this lesson, we'll explore some of App Engine's functionality for deployments. Google's exam guide lists four learning objectives: scaling configuration, versions, traffic splitting, and blue/green deployments. Now, that's a lot to cover, so let's dive in and talk about scaling configuration. App Engine makes it easy to scale services by allowing us to specify the scaling configuration in the app.yaml file. The scaling configuration determines how many instances are running at a given time. With App Engine, instances differ between the standard and flexible environments. Standard instances are lightweight, which lets them start quickly, and they run inside a sandbox that isolates them from other instances. Now, this is a closed system that conceptually seems like a giant, Google-run container orchestration service, where each instance is a container of sorts. Flexible instances are Docker-based and run on Compute Engine virtual machines, which have reasonably fast start-up times, though they're certainly slower to start than standard instances.

Now, there are three different scaling methods, one of which only applies to the standard environment. Both the standard and flexible environments support automatic and manual scaling. The standard environment also has a third option called basic. Let's start with manual scaling because the settings are exactly the same for both environments. With manual scaling, you specify how many instances you want to run, and that's it. Automatic scaling is the default, and it tries to make scaling easier by scaling in response to certain events. This is a case where the differences between environments impact the settings. The standard environment uses a custom autoscaling algorithm that can be tuned by adjusting the autoscaling properties. The flexible environment uses the Compute Engine autoscaler, though not all of its scaling metrics are available. Now, the standard environment has a basic scaling option, which works well for apps that are infrequently accessed. Picture it as working a bit like a Cloud Function. Instances are started in response to a request and are shut down once they have been idle for a set amount of time, which is configurable, though it defaults to five minutes. Okay, to reiterate, scaling config is set in app.yaml, so making changes requires deploying an updated app.yaml file.

Let's talk about versions next, because service versioning is the reason we're able to perform rollbacks as well as traffic splitting and blue/green deployments. Whenever we deploy a change to a service, it's created as a new version while the previous versions are retained. We as engineers get to control which version or versions of a specific service accept traffic. Now, this may not sound like much; however, it forms the basis for some really useful functionality.
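To make the three scaling methods concrete, here is a minimal sketch that prints what the scaling section of an app.yaml might look like for the standard environment. The helper name `render_app_yaml` and every value in `SCALING_EXAMPLES` are assumptions chosen for illustration; they are not taken from the lesson and are not recommendations.

```python
# Illustrative only: renders an app.yaml with one scaling stanza for the
# standard environment. All numeric values below are placeholder assumptions.

SCALING_EXAMPLES = {
    # Manual scaling: a fixed number of instances.
    "manual_scaling": {"instances": 3},
    # Basic scaling: start on demand, shut down after an idle period.
    "basic_scaling": {"max_instances": 5, "idle_timeout": "5m"},
    # Automatic scaling: the default; tuned through autoscaling properties.
    "automatic_scaling": {
        "min_instances": 1,
        "max_instances": 10,
        "target_cpu_utilization": 0.65,
    },
}


def render_app_yaml(runtime: str, scaling_kind: str) -> str:
    """Return app.yaml text containing the runtime and one scaling stanza."""
    if scaling_kind not in SCALING_EXAMPLES:
        raise ValueError(f"unknown scaling kind: {scaling_kind}")
    lines = [f"runtime: {runtime}", f"{scaling_kind}:"]
    for key, value in SCALING_EXAMPLES[scaling_kind].items():
        lines.append(f"  {key}: {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for kind in SCALING_EXAMPLES:
        print(render_app_yaml("python37", kind))
```

Only one of the three stanzas would appear in a real app.yaml at a time, and changing it means deploying the updated file.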

Let's use the console to demo the rest of the topics for this lesson. I'm going to use the Python 3.7 runtime on the standard environment. The app.yaml file only includes one line, which defines the runtime. Since this is the Python runtime, it supports any WSGI-compliant framework, such as Flask; other runtimes have their own gateway interfaces. This app just shows the text 'Hello World!' and since this project doesn't have an existing app, this is going to be the first version of the default service. Now, running gcloud app deploy from the directory that contains the app.yaml file will deploy the service. Each time you deploy code, App Engine creates a new version of your service, and here's our newly created service. By default, App Engine promotes a newly deployed version of a service, which makes it the live version. So, we can use the live URL and see that it does show 'Hello World!' Let's create a new version. Let's add V2 to the end of our 'Hello World!' message and deploy that. Refreshing the versions page shows that we have a new version at the top of the list and it's serving 100% of the traffic, which shows that new versions are promoted by default. Let's create one more version, V3, and we're going to deploy this one with the --no-promote flag so that it gets deployed and we can access it through its own URL, but not on the live URL. Refreshing this shows our new version at the top of the list, and it's not serving any traffic on the live URL. Using the version-specific URL, we can see our V3 message. So let's pause here and recap what we've done.
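The lesson doesn't show the demo app's source, so the following is only a sketch of what a 'Hello World!' app for the Python runtime could look like. To stay dependency-free it uses a bare WSGI callable rather than Flask; the file name `main.py` and the callable name `app` are assumptions made for this example.

```python
# main.py (assumed file name for this sketch)
# A bare WSGI callable; a framework such as Flask would provide an
# equivalent WSGI-compliant object.


def app(environ, start_response):
    """Return the lesson's greeting for every request."""
    body = b"Hello World!"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Local smoke test only; a deployed service would be run by a WSGI server.
    from wsgiref.simple_server import make_server

    with make_server("127.0.0.1", 8080, app) as server:
        server.serve_forever()
```

Changing the body to b"Hello World! V2" and deploying again would correspond to the second version created in the demo.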

We've created three versions of our service. We deployed the first two with gcloud app deploy, which promoted each one to be the active version by default. The first one was active. We deployed the second one, and it became active. The third one we deployed with --no-promote, so it's not active, but it is deployed, which allows us to test it.

In App Engine, traffic splitting supports multiple deployment methods because it allows traffic to be divided between different versions. For canary deployments, you could divert a small amount of traffic to a new version and, once ready, move to 100% of the traffic being served by that canary. For A/B testing, we could set up a 50-50 split between two versions and see which one users prefer. App Engine has three ways that it can split traffic. Splitting by IP address takes the sender's IP and turns it into a hash that decides which version serves the request. Splitting by cookie requires you to create a cookie in your service, which puts the business logic for how we actually want the traffic split on us. And finally, random does its best to just divide the traffic up according to the percentages we've allocated. If we set these to 50%, then we should be able to test the live URL and see version two roughly half the time and version three roughly half the time, and setting version three to 100% will make version three the live version.
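The difference between IP-based and random splitting can be sketched in a few lines. This is an illustration of the general idea only, not App Engine's actual algorithm: the choice of SHA-256, the function names, and the example IP address are all assumptions.

```python
import hashlib
import random
from typing import Mapping


def _version_at(point: float, splits: Mapping[str, float]) -> str:
    """Pick the version whose cumulative share covers a point in [0, 1)."""
    total = sum(splits.values())
    if not splits or total <= 0:
        raise ValueError("splits must contain at least one positive weight")
    cumulative = 0.0
    chosen = None
    for version, weight in splits.items():
        if weight <= 0:
            continue  # versions with no allocation never receive traffic
        chosen = version
        cumulative += weight / total
        if point < cumulative:
            return version
    return chosen  # guards against floating-point rounding at the top edge


def pick_by_ip(ip: str, splits: Mapping[str, float]) -> str:
    """Hash the IP so the same address keeps landing on the same version."""
    digest = hashlib.sha256(ip.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64
    return _version_at(point, splits)


def pick_at_random(splits: Mapping[str, float], rng=random) -> str:
    """Choose a version per request, weighted by the allocated shares."""
    return _version_at(rng.random(), splits)


if __name__ == "__main__":
    splits = {"v2": 0.5, "v3": 0.5}
    print(pick_by_ip("203.0.113.7", splits))
    print([pick_at_random(splits) for _ in range(6)])
```

With IP-based selection a given client sees a consistent version, while random selection can show the same client different versions on successive requests, which is why refreshing the live URL in the demo alternates between version two and version three.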

Notice the version names here. It's kind of tough to tell which is which, but they were generated by App Engine because we didn't specify them. The -v flag for the gcloud app deploy command allows you to specify a version name of your own. Let's create a fourth version and deploy it with the -v flag to set the version name. We can see in the console that this is easier to work with, because the versions in the dropdown now have more meaningful names. Now, with traffic splitting, you can implement canary deployments, A/B testing, and so on. Traffic migration is a little different. Traffic migration is a switch that you can use to cut traffic over between versions. This can support blue/green deployments because we can deploy a version without promoting it and then cut traffic over when we're ready. Now, the cutover by default happens immediately and it doesn't warm up any instances, which means we could see some latency if a lot of traffic was being served by the current version before the switchover. There is an option you can enable in the app.yaml file to warm up instances, which can help avoid that. All right, that is going to do it for this lesson. Thank you so much for watching, and I will see you in a future lesson.
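The deploy and cutover steps from this lesson can be sketched as a small helper that assembles the gcloud command lines without running them. The function names, the version name "v4", and the 50-50 split are assumptions for illustration; the flags used are the gcloud app deploy and gcloud app services set-traffic flags as best understood here, so check them against the current gcloud reference before relying on them.

```python
import shlex
from typing import List, Mapping, Optional


def deploy_command(version: Optional[str] = None, promote: bool = True) -> List[str]:
    """Build a `gcloud app deploy` command, optionally named and unpromoted."""
    cmd = ["gcloud", "app", "deploy"]
    if version:
        cmd += ["--version", version]  # long form of the -v flag
    if not promote:
        cmd.append("--no-promote")  # deploy without taking live traffic
    return cmd


def set_traffic_command(
    service: str,
    splits: Mapping[str, float],
    split_by: Optional[str] = None,
    migrate: bool = False,
) -> List[str]:
    """Build a `gcloud app services set-traffic` command for a split or cutover."""
    split_arg = ",".join(f"{name}={share}" for name, share in splits.items())
    cmd = ["gcloud", "app", "services", "set-traffic", service, "--splits", split_arg]
    if split_by:
        cmd += ["--split-by", split_by]  # ip, cookie, or random
    if migrate:
        cmd.append("--migrate")  # request traffic migration instead of an instant switch
    return cmd


if __name__ == "__main__":
    # Blue/green style flow: deploy unpromoted, test it, then cut over.
    print(shlex.join(deploy_command("v4", promote=False)))
    print(shlex.join(set_traffic_command("default", {"v4": 1})))
    # A/B style flow: a 50-50 random split between two versions.
    print(shlex.join(set_traffic_command("default", {"v2": 0.5, "v3": 0.5}, "random")))
```

Keeping the commands as argument lists means they could later be passed to subprocess.run without shell quoting concerns; this sketch only prints them.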

About the Author

Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.