GCP Architect Case Studies
This course will help you prepare for the Professional Cloud Architect Exam. We cover the 4 case studies presented in the exam guide, explain what they are, why they are important, and how to use them to prepare for the exam.
Examine the 4 case studies presented in the exam guide:
- EHR Healthcare
- Helicopter Racing League
- Mountkirk Games
- TerramEarth
This course is intended for anyone planning to take the Professional Cloud Architect Exam.
A basic knowledge of GCP is assumed.
In this lesson, I am going to walk you through the case study for a fictional company called “Mountkirk Games”. Let’s read the company overview:
“Mountkirk Games makes online, session-based, multiplayer games for mobile platforms. They have recently started expanding to other platforms after successfully migrating their on-premises environments to Google Cloud. Their most recent endeavor is to create a retro-style first-person shooter (FPS) game that allows hundreds of simultaneous players to join a geo-specific digital arena from multiple platforms and locations. A real-time digital banner will display a global leaderboard of all the top players across every active arena.”
Alright, so it appears they are going to be making online games for mobile platforms, so you might want to brush up on Google services that are designed to support mobile devices. I can also see that there are going to be hundreds of simultaneous players, so they are going to require infrastructure that can handle a lot of different players at once. This is a video game, so it has to be real-time, and latency has to be as low as possible.
This geo-specific arena concept is interesting. So I assume they mean that all the players in Germany would be playing together, and all the players in Japan would be playing together. So expect some questions about detecting user location. And this of course implies you are going to be supporting multiple regions and zones across your services.
So it sounds like these arenas are going to be kept separate. But you also need to be able to access data from each of them and use that to create the global leaderboard. So maybe each arena is going to be a separate project, and the leaderboard will have to access data across those projects. So I could see some IAM permission questions popping up here. If the leaderboard is real-time, you can’t rely on exporting the data to a bucket and then importing it later on. So maybe this will involve some APIs as well. Let’s read the solution concept:
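To make the cross-project idea concrete, here is one way access like that could be granted at the command line. This is only a sketch; the project IDs and the service account name are made up for illustration:

```shell
# Hypothetical sketch: allow a leaderboard service account in one project
# to read game data owned by a separate arena project.
# "arena-europe-prod" and the service account address are illustrative.
gcloud projects add-iam-policy-binding arena-europe-prod \
    --member="serviceAccount:leaderboard@global-leaderboard.iam.gserviceaccount.com" \
    --role="roles/spanner.databaseReader"
```

The key idea is that the binding lives on the arena project, while the identity belongs to the leaderboard project, which is exactly the cross-project pattern the exam likes to test.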
“Mountkirk Games is building a new multiplayer game that they expect to be very popular. They plan to deploy the game’s backend on Google Kubernetes Engine so they can scale rapidly and use Google’s global load balancer to route players to the closest regional game arenas. In order to keep the global leader board in sync, they plan to use a multi-region Spanner cluster.”
Ok, so again we see there could be a lot of simultaneous players. Now, this sounds like we need things to be very scalable. You don’t want to launch a new video game and have no one able to play it because you can’t handle the influx of new customers. Also, it appears that they are going to start using Google Kubernetes Engine. So I could think of quite a few questions about that. How to migrate a cluster. How to autoscale a cluster. How to create a new cluster. You should be familiar with all of that.
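As a refresher, creating a regional GKE cluster with node autoscaling enabled can look something like the following. The cluster name, region, and node counts are all illustrative:

```shell
# Hypothetical sketch: a regional GKE cluster that autoscales its
# node count between 1 and 10 per zone as game traffic changes.
gcloud container clusters create game-backend \
    --region=us-central1 \
    --num-nodes=1 \
    --enable-autoscaling \
    --min-nodes=1 \
    --max-nodes=10
```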
Oh, it looks like you could get questions on load balancers as well. You might want all players to enter at the same point, and then be forwarded to the appropriate regional arena. So think about how you would do that. And they also specifically call out that they plan to use Cloud Spanner. That makes sense given the requirements. So make sure you read up on Cloud Spanner. You should understand the difference between it and, say, Cloud SQL. You are going to have to use it to support multiple regions, so make sure you understand that. And you might even want to think about the types of data that should be stored in Cloud Spanner, as well as the types of data that should NOT be stored in Spanner. Alright, next let’s go through the technical environment:
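For Spanner specifically, it is worth knowing what a multi-region instance looks like. A sketch, with an illustrative instance name and node count ("nam-eur-asia1" is one of Google's multi-region instance configurations):

```shell
# Hypothetical sketch: a multi-region Cloud Spanner instance that could
# back a globally synchronized leaderboard.
gcloud spanner instances create leaderboard \
    --config=nam-eur-asia1 \
    --description="Global leaderboard" \
    --nodes=3
```

Notice the cost implication here: every node in a multi-region configuration is billed across all its replicas, which is part of why Spanner is powerful but expensive.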
“The existing environment was recently migrated to Google Cloud, and five games came across using lift-and-shift virtual machine migrations, with a few minor exceptions. Each new game exists in an isolated Google Cloud project nested below a folder that maintains most of the permissions and network policies. Legacy games with low traffic have been consolidated into a single project. There are also separate environments for development and testing.”
So I see mention of lift-and-shift VMs. You might be asked how to do that, so just be aware. You should also probably think about how you auto-scale these. Both up and down. You also should know that lift-and-shift usually doesn’t take advantage of any GCP-specific features, so you might want to start thinking about what it would take to optimize these. Maybe at some point, they want to containerize these VMs and run them on Kubernetes. So think about what that would take.
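For lifted-and-shifted VMs, autoscaling typically means putting them behind a managed instance group. A sketch, with made-up group and zone names:

```shell
# Hypothetical sketch: enable autoscaling on a managed instance group
# running a lifted-and-shifted game server, scaling on CPU load.
gcloud compute instance-groups managed set-autoscaling legacy-game-mig \
    --zone=us-central1-a \
    --min-num-replicas=2 \
    --max-num-replicas=20 \
    --target-cpu-utilization=0.6
```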
Here they explicitly say that each game is running in its own project. And they also use folders. So you should be familiar with the resource hierarchy of organization, folders, and projects. You need to understand permissions and policy inheritance. You should be prepared in case you get a question on how Project A can access something in Project B. And I see there could be potential questions on network policies as well. It also looks like you need to support multiple environments per game. So you want to understand how to set those up and maintain them. Your developers might need full control over development. They might have limited permissions in testing. And they probably have no permissions or at least very limited permissions for production. The business requirements are:
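Since they nest projects under folders, remember that IAM bindings set on a folder are inherited by every project inside it. A sketch of per-environment permissions, with made-up folder IDs and group names:

```shell
# Hypothetical sketch: developers get broad access in the dev folder
# and read-only access in the test folder. Projects nested under each
# folder inherit these bindings automatically.
gcloud resource-manager folders add-iam-policy-binding 123456789 \
    --member="group:game-devs@mountkirk.example" \
    --role="roles/editor"

gcloud resource-manager folders add-iam-policy-binding 987654321 \
    --member="group:game-devs@mountkirk.example" \
    --role="roles/viewer"
```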
- “Support multiple gaming platforms
- Support multiple regions
- Support rapid iteration of game features
- Minimize latency
- Optimize for dynamic scaling
- Use managed services and pooled resources
- Minimize costs”
So we already have seen multiple platforms. This could imply mobile phones, tablets, and computers, but it could also include gaming consoles. I’m not sure. It definitely seems like you need to think about juggling a huge array of different devices. Now, we already covered multiple regions. Rapid iteration is new. Software updates are going to need to be pushed out pretty frequently. Bug fixes are going to be very common, especially in games. And they are going to want to add new game features as well. So to me, this means you are going to have to think about things like Continuous Integration/Continuous Deployment. With all these rapid changes, versioning is going to be critical. So you might get questions on Cloud Source Repositories, Container Registry, and Artifact Registry. Basically, any solution you suggest has to be able to handle a constant stream of small changes.
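To tie the CI/CD and versioning pieces together, here is a sketch of creating an Artifact Registry repository and pushing a versioned container image through Cloud Build. The repository, project, and tag names are all illustrative:

```shell
# Hypothetical sketch: a Docker repository in Artifact Registry plus a
# versioned image build submitted to Cloud Build.
gcloud artifacts repositories create game-images \
    --repository-format=docker \
    --location=us-central1

gcloud builds submit \
    --tag=us-central1-docker.pkg.dev/my-project/game-images/fps-backend:v1.2.3
```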
We already covered scaling. And we see managed services are going to be important. Pooled resources are going to be important. Definitely, you want to understand load balancers. And you are probably going to get at least one question about optimizing for cost. Autoscaling is great for this: you will use fewer resources whenever you can. And basically, you just want to understand the cost differences between similar services. So, for example, you should realize that Cloud Spanner is very powerful, but it is also very expensive. For every option, you want to know the associated cost. Now let’s read the technical requirements:
- “Dynamically scale based on game activity
- Publish scoring data on a near real-time global leaderboard
- Store game activity logs in structured files for future analysis
- Use GPU processing to render graphics server-side for multi-platform support
- Support eventual migration of legacy games to this new platform”
We already covered scale. We covered the leaderboards. Ok, here we see the potential for there being a lot of logs. That means you need to know how to store logs, how to organize logs, and how to search logs. I also notice that logs will be stored in structured files. So to me, that sounds like they are hinting at BigQuery. I would say you should be familiar with building a data warehouse and writing queries. Understand all the best practices for BigQuery.
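One common pattern worth knowing here is routing logs into BigQuery with a log sink. A sketch, with an illustrative sink name, project, dataset, and filter:

```shell
# Hypothetical sketch: route game-server container logs into a BigQuery
# dataset for later analysis. The project, dataset, and filter are
# made up for illustration.
gcloud logging sinks create game-activity-sink \
    bigquery.googleapis.com/projects/my-project/datasets/game_logs \
    --log-filter='resource.type="k8s_container" AND labels.app="game-server"'
```

After creating the sink, you would still need to grant its writer identity access to the destination dataset, which is another spot where an IAM question could sneak in.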
Now they are specifically calling out GPUs here. So you might get some questions on that. You are going to want to know how to launch a GPU VM. And since they are planning on using GKE, you also want to know how to create node pools that are equipped with GPUs as well. And it looks like they want to migrate some of their older games, so you could get questions about containerizing VMs. On to the executive summary:
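For the GKE side of the GPU requirement, a node pool with attached accelerators can be sketched like this. The cluster name, pool name, GPU type, and sizes are illustrative:

```shell
# Hypothetical sketch: a GKE node pool with NVIDIA T4 GPUs for
# server-side rendering, scaling down to zero when idle.
gcloud container node-pools create gpu-render-pool \
    --cluster=game-backend \
    --region=us-central1 \
    --accelerator=type=nvidia-tesla-t4,count=1 \
    --machine-type=n1-standard-8 \
    --enable-autoscaling \
    --min-nodes=0 \
    --max-nodes=4
```

Being able to scale a GPU pool to zero nodes is a useful cost lever, which connects back to the "minimize costs" business requirement.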
“Our last game was the first time we used Google Cloud, and it was a tremendous success. We were able to analyze player behavior and game telemetry in ways that we never could before. This success allowed us to bet on a full migration to the cloud and to start building all-new games using cloud-native design principles. Our new game is our most ambitious to date and will open up doors for us to support more gaming platforms beyond mobile. Latency is our top priority, although cost management is the next most important challenge. As with our first cloud-based game, we have grown to expect the cloud to enable advanced analytics capabilities so we can rapidly iterate on our deployments of bug fixes and new functionality.”
So this makes it look like they want to analyze player behavior. Make sure you think about collecting telemetry data and then using that for deriving insights about the players. I’ve already covered potential migration scenarios. Gaming platforms beyond mobile heavily implies gaming consoles to me. Latency and cost will be key requirements to optimize for. Ok here, advanced analytics capabilities heavily suggests that Big Data is going to be very important. So make sure you brush up on BigQuery and any other Google product with the word “data” in it: Dataprep, Dataflow, Data Studio. You could get questions on any of these. And we covered “rapidly iterate” already, so I think that covers the Mountkirk Games case study pretty well.
Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.
Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.
When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.