GCP Architect Case Studies
This course will help you prepare for the Professional Cloud Architect Exam. We cover the 4 case studies presented in the exam guide, explain what they are, why they are important, and how to use them to prepare for the exam.
Examine the 4 case studies presented in the exam guide:
- EHR Healthcare
- Helicopter Racing League
- Mountkirk Games
- TerramEarth
Anyone planning to take the Professional Cloud Architect Exam.
Basic knowledge of GCP.
In this lesson, we are going to dive into the case study for a fictional company called “EHR Healthcare”. I am going to read each section, and then I’ll point out key requirements and topics for study. Alright, so let's start with the company overview:
“EHR Healthcare is a leading provider of electronic health record software to the medical industry. EHR Healthcare provides their software as a service to multi-national medical offices, hospitals, and insurance providers.”
Now there are a few phrases that jump out at me when I read this. First of all, I see that this company will be storing “health records”. When I hear that I think about needing to store people’s private medical information. This means you need to think very carefully about security, encryption and compliance. Of course you should always be careful with customers’ data, but medical information is in a class of its own. There are all kinds of laws regulating this stuff.
That means you should understand how to properly set up IAM roles to prevent your employees from accessing your customers’ data. You need to be familiar with picking the right services to store and transmit this information securely. Know how to detect if private information is being accidentally written out to your logs. On your exam, you could get questions about any of these topics.
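To make the logging point a little more concrete, here is a minimal sketch of scanning log lines for things that look like private data. The two patterns and the function names are my own assumptions for illustration. On GCP you would more likely reach for a managed inspection service such as Cloud Data Loss Prevention, and real detection needs far broader coverage than this.

```python
import re

# Hypothetical patterns, for illustration only. Real PII detection
# needs many more detectors than an email and a US SSN format.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_sensitive(line):
    """Return the sorted names of every pattern that matches the log line."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(line))


def redact(line):
    """Replace every match with a placeholder before the line is stored."""
    for name, pattern in PATTERNS.items():
        line = pattern.sub(f"[REDACTED {name}]", line)
    return line
```

A check like `find_sensitive` could run against a sample of log output to flag leaks, while `redact` shows the kind of scrubbing you would want to happen before anything is written out.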
Now the next thing that I see is “multinational”. If your customers are going to be all over the world, that implies you probably need to support multiple regions. So a single instance running in a single zone is not going to work. You will need multiple instances across multiple zones in multiple regions. Also, this suggests that you might have to think about different jurisdictions and laws in various countries. For example, the GDPR in the EU (which is the General Data Protection Regulation). So start thinking about all the different services and options you would need to support customers all over the globe. Next, let’s go through the solution concept:
“Due to rapid changes in the healthcare and insurance industry, EHR Healthcare’s business has been growing exponentially year over year. They need to be able to scale their environment, adapt their disaster recovery plan, and roll out new continuous deployment capabilities to update their software at a fast pace. Google Cloud has been chosen to replace their current colocation facilities.”
Ok, so the first thing I notice here is “rapid changes”. This implies agility and flexibility will be very important. So to me this means that you will generally want to avoid custom solutions. Your answers should probably stick to the built-in services and things that are easy to change. So if the question involved VMs, you’d want to stick with a Google base image instead of trying to roll your own. If the question was about storage, you’d be better off using Cloud SQL instead of trying to maintain your own MySQL server. Things like that.
Also, I see that things have been growing “exponentially” and that things need to be able to “scale”. So for any questions involving this case study, you are going to want to make sure that your solutions are scalable. Now for compute, that means using things like managed instance groups. Or for GKE, you would use a cluster autoscaler. For storage, you need to pick solutions that won’t run out of space and can automatically handle higher levels of I/O. All of your answers need to be able to automatically scale with demand and avoid common bottlenecks.
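If it helps to see what “automatically scale with demand” means in practice, here is a small sketch of a target-utilization rule. It follows the formula that the Kubernetes Horizontal Pod Autoscaler documents for pods (desired = ceil(current × current metric / target metric)); managed instance group autoscalers work on a similar idea, but I am not claiming this is their exact algorithm. The function name and the default limits are assumptions.

```python
import math


def desired_replicas(current_replicas, current_percent, target_percent,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count so utilization moves toward the target.

    Utilization values are percentages (for example 90 for 90% CPU).
    The min/max defaults are hypothetical limits for illustration.
    """
    raw = math.ceil(current_replicas * current_percent / target_percent)
    # Clamp to the configured bounds so we never scale to zero or without limit.
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 replicas running at 90% against a 60% target would be scaled out to 6, and the same 4 replicas at 30% would be scaled in to 2.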
I did notice that they do have a disaster recovery plan, and so you may be asked how to migrate this over to GCP. You might also get questions about dealing with lost data, or how to deal with a major outage. You should be familiar with Google best practices for disaster recovery. I see here that they are currently using continuous deployment, so be aware of how to set up CI/CD on Google. Know the common services involved, and how to maintain and troubleshoot issues. Alright, next let’s go through the existing technical environment:
“EHR’s software is currently hosted in multiple colocation facilities. The lease on one of the data centers is about to expire. Customer-facing applications are web-based, and many have recently been containerized to run on a group of Kubernetes clusters. Data is stored in a mixture of relational and NoSQL databases (MySQL, MS SQL Server, Redis, and MongoDB). EHR is hosting several legacy file- and API-based integrations with insurance providers on-premises. These systems are scheduled to be replaced over the next several years. There is no plan to upgrade or move these systems at the current time. Users are managed via Microsoft Active Directory. Monitoring is currently being done via various open source tools. Alerts are sent via email and are often ignored.”
So we see they currently have multiple co-locations. As I mentioned before, this implies that any solution would need to be able to support multiple regions. So if you get a question about storage solutions, you might need to think about things like data replication. It looks like they are using containers on Kubernetes, so you could get some questions about GKE. You should be familiar with migrating applications from one cluster to another. They also previously mentioned scalability so understand how to automatically scale containers using GKE. Basically, you could get any number of Kubernetes questions on the exam.
It also looks like you could get any number of database questions as well. They currently are using both SQL and NoSQL databases. So you should be familiar with how to migrate data over into the various offerings. You want to be able to know how to pick the right type of database depending on the data. I also notice Redis is mentioned as well, so be prepared to get questions about caching. When should data be stored temporarily vs. permanently? How do you set up a Redis cache on GCP? Etc.
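On the temporary vs. permanent question, a common answer is the cache-aside pattern: the database stays the permanent source of truth, and the cache holds short-lived copies. Here is a minimal sketch of that idea. The `TTLCache` class is a hypothetical in-memory stand-in for a Redis-style cache, not a real client, and `load_from_database` is whatever function you would use to read the permanent record.

```python
import time


class TTLCache:
    """In-memory stand-in for a Redis-style cache (hypothetical, for illustration)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The entry is stale, so drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)


def read_through(cache, key, load_from_database):
    """Cache-aside read: try the cache first, fall back to the database on a miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_database(key)
        cache.set(key, value)
    return value
```

The expiry time is the design choice to think about: anything in the cache can disappear or go stale, so only data you can rebuild from the permanent store belongs there.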
Now this says that the APIs on-prem will NOT be migrated. So they might NOT ask you anything about building APIs, but you probably DO need to know how to connect your on-prem environment to GCP. That means you need to be familiar with things like Cloud VPN and Cloud Interconnect. Basically, any way in which your GCP services can securely access your on-prem APIs. They might even ask you how to handle DNS routing between the two.
The company is currently using Active Directory, so you need to be familiar with syncing to AD and using LDAP. They might even ask you to replace Active Directory with Google Cloud Identity. Also, the company is currently using monitoring and alerting, and it looks like the alerts are not effective. So you will be expected to understand how to set up effective monitoring and alerting in GCP. Next, let’s go through the business requirements:
- “On-board new insurance providers as quickly as possible
- Provide a minimum 99.9% availability for all customer-facing systems
- Provide centralized visibility and proactive action on system performance and usage
- Increase ability to provide insights into healthcare trends
- Reduce latency to all customers
- Maintain regulatory compliance
- Decrease infrastructure administration costs
- Make predictions and generate reports on industry trends based on provider data”
So here, on-boarding quickly means that your solutions should not need a lot of setup. You don’t have to manually provision resources every time you add a new provider. The entire system needs to be highly available, so understand how to achieve that. Be familiar with services that automatically provide 99.9% availability. If a VM dies, another should automatically replace it. If a region goes down, requests should automatically be re-routed. Things like that.
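It is worth knowing what 99.9% actually buys you. Here is a quick sketch of the arithmetic: how much downtime a given availability target allows, and how redundancy improves the combined figure. The function names are mine, and the second function assumes replicas fail independently of each other, which is a strong assumption that real systems often do not fully meet.

```python
def allowed_downtime_minutes(availability_percent, period_days=30):
    """Minutes of downtime permitted in the period at the given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


def combined_availability(single, replicas):
    """Availability of N replicas where any one of them can serve traffic.

    Assumes failures are independent (an assumption, not a guarantee).
    `single` is a fraction, for example 0.99 for 99%.
    """
    return 1 - (1 - single) ** replicas
```

At 99.9%, the budget works out to roughly 43.2 minutes in a 30-day month, or about 8.76 hours in a 365-day year. Under the independence assumption, two replicas that are each 99% available combine to about 99.99%, which is why spreading across zones and regions matters so much.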
You could also be asked about system performance and usage, so know how to track and monitor that. And know how to set up alerts, so if your performance dips or if there is a usage spike, you can be notified. It looks like they want to track insights and identify trends. So I would imagine this could involve logging, creating dashboards, and maybe even Big Data. You might be asked about using Bigtable here, so I’d be familiar with that.
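Since the case study says alerts are often ignored, one idea worth understanding is alerting only when a threshold is breached for a sustained period, instead of on every brief blip. Here is a minimal sketch of that check. The function name and parameters are assumptions for illustration, and this is not the API of any particular monitoring product.

```python
def sustained_breach(samples, threshold, min_consecutive):
    """Return True if `min_consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in samples:
        # Extend the current run on a breach, otherwise start over.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

With a 90% threshold and three required samples, a series like 50, 95, 60, 96, 97, 98 would trigger an alert on the last three values, while a single spike to 95 would not.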
Low latency is important, so understand how to reduce latency to your customers spread out all over the world. This sounds like supporting multiple regions. You might get questions on Cloud CDN or load balancers. Anything that might impact latency. You need to be aware of how to handle regulatory compliance. We’ve already covered this. There could be questions that require you to pick a solution based on cost. Here it talks about administration costs, so I assume this means automating as much as possible. You don’t want to pick solutions that require an employee to manually build and maintain them. Managed services are your friend.
This last part talks about making predictions and generating reports. So that could imply either AI or ML questions. It definitely seems to indicate Big Data at the very least. So know how to collect data, store data, and generate custom reports. Let’s go through the technical requirements:
- “Maintain legacy interfaces to insurance providers with connectivity to both on-premises systems and cloud providers
- Provide a consistent way to manage customer-facing applications that are container-based
- Provide a secure and high-performance connection between on-premises systems and Google Cloud
- Provide consistent logging, log retention, monitoring, and alerting capabilities
- Maintain and manage multiple container-based environments
- Dynamically scale and provision new environments
- Create interfaces to ingest and process data from new providers”
So, we see the need to establish and maintain a connection between GCP and the on-prem environment. That implies VPNs, Cloud Interconnect, routing, DNS, all that and more. We are reminded that containers are being used. I think we covered that. Again, we see the connection to on-prem needs to be both secure and fast. So you should understand the different options and be able to pick the best one based upon security and speed. We know that there is going to be logging, monitoring and alerting.
Here we see there will be multiple container-based environments, so you might need to deal with multiple GKE clusters. Either in different regions, or for different environments such as testing vs. staging vs. production. You might also be asked about dynamically provisioning and scaling environments. Or even creating APIs. So finally, let’s read the executive statement:
“Our on-premises strategy has worked for years. But it has required a major investment of time and money in training our team on distinctly different systems, managing similar but separate environments, and responding to outages. Many of these outages have been a result of misconfigured systems, inadequate capacity to manage spikes in traffic, and inconsistent monitoring practices. We want to use Google Cloud to leverage a scalable, resilient platform that can span multiple environments seamlessly and provide a consistent and stable user experience that positions us for future growth.”
They are mostly just repeating themselves here. I again see they mention multiple environments. I see there may be questions about dealing with outages. There could be questions about fixing a misconfigured system. Scaling should already take care of inadequate capacity and spikes in traffic. And we already mentioned monitoring. And again, things need to be scalable, resilient and we should support multiple environments. So I think we covered everything here. You can see that this case study covers a lot of ground. But that should be it for the EHR Healthcare case study.
Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.
Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.
When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.