Data Protection and Encryption



The course is part of these learning paths

Google Professional Cloud Developer Exam Preparation
Google Data Engineer Exam – Professional Certification Preparation
Google Cloud Platform for Solution Architects
Duration: 1h 6m


Google Cloud Platform (GCP) lets organizations take advantage of the powerful network and technologies that Google uses to deliver its own products. Global companies like Coca-Cola and cutting-edge technology stars like Spotify are already running sophisticated applications on GCP. This course will help you design an enterprise-class Google Cloud infrastructure for your own organization.

When you architect an infrastructure for mission-critical applications, not only do you need to choose the appropriate compute, storage, and networking components, but you also need to design for security, high availability, regulatory compliance, and disaster recovery. This course uses a case study to demonstrate how to apply these design principles to meet real-world requirements.

Learning Objectives

  • Map compute, storage, and network needs to Google Cloud Platform services
  • Create designs for high availability and disaster recovery
  • Use appropriate authentication, roles, service accounts, and data protection
  • Create a design to comply with regulatory requirements



Protecting data is critical in any organization. Google Cloud Platform is very strong in this area because of its default encryption policies. Before we get into encryption, though, let's look at access control lists, or ACLs.

ACLs specify who has access to Cloud Storage buckets and to the objects in those buckets. I'm not going to cover this topic in depth, but there are a few things to keep in mind when you're deciding what ACLs to apply to your Cloud Storage buckets.

First, there are actually five different mechanisms for controlling access to Cloud Storage: IAM permissions, ACLs, signed URLs, signed policy documents, and Firebase security rules. With so many different ways to control access, you have to be careful not to create conflicting permissions. Let's start with the first two, IAM permissions and ACLs.

IAM permissions work at the project level. For example, you can specify that a user has full control of all the objects in all the buckets in your project but cannot create, modify, or delete the buckets themselves. So they're a nice way to grant broad access to buckets and objects. If you want to set fine-grained access, such as which buckets or objects a particular group can read, then you need to use ACLs.

The confusing thing about using these two mechanisms is that you have to look at both of them to get a complete picture of access permissions. For example, you could list the ACLs for a bucket and see that only Bob has been granted write access, but the listing wouldn't show that Joe has also been granted write access to all buckets by IAM. For this reason, whenever possible, you should try to use either IAM or ACLs, but not both.

Another potential source of confusion is that bucket and object ACLs are independent of each other. The ACLs on a bucket do not affect the ACLs on objects inside that bucket. For example, you might think that Jane doesn't have access to the objects in a bucket because she hasn't been granted access to the bucket itself, but she could have been granted access to any of the objects in the bucket.
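To make the last two points concrete, here is a minimal sketch of a toy access model. It is not how Cloud Storage evaluates permissions internally; the `Bucket` class, the `bucket_access` and `object_access` helpers, the bucket and object names, and the simple "read"/"write" permission strings are all assumptions made up for illustration. It only shows why listing a bucket's ACL gives an incomplete picture.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    name: str
    # ACL on the bucket itself: user -> set of permissions
    acl: dict = field(default_factory=dict)
    # ACLs on individual objects: object name -> (user -> set of permissions)
    object_acls: dict = field(default_factory=dict)


def bucket_access(project_iam, bucket, user):
    """Permissions a user holds on the bucket: IAM grants plus the bucket ACL."""
    return project_iam.get(user, set()) | bucket.acl.get(user, set())


def object_access(project_iam, bucket, object_name, user):
    """Permissions on one object: IAM grants plus that object's own ACL.

    The bucket ACL is deliberately not consulted, because bucket and
    object ACLs are independent of each other.
    """
    object_acl = bucket.object_acls.get(object_name, {})
    return project_iam.get(user, set()) | object_acl.get(user, set())


# Hypothetical data matching the examples in the lesson.
project_iam = {"joe": {"write"}}  # Joe can write to all buckets via IAM
bucket = Bucket(
    name="reports",
    acl={"bob": {"write"}},  # the bucket ACL lists only Bob
    object_acls={"q1.csv": {"jane": {"read"}}},  # Jane was granted one object
)

# Listing the bucket ACL alone shows only Bob as a writer...
writers_per_acl = [user for user, perms in bucket.acl.items() if "write" in perms]
print(writers_per_acl)

# ...but Joe can also write, because of the project-level IAM grant.
print(bucket_access(project_iam, bucket, "joe"))

# Jane has nothing on the bucket, yet she can still read one object in it.
print(bucket_access(project_iam, bucket, "jane"))
print(object_access(project_iam, bucket, "q1.csv", "jane"))
```

The effective permissions are the union of what each mechanism grants, which is why the lesson recommends sticking to one mechanism where you can.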

So you should keep a couple of principles in mind. First, apply the principle of least privilege: grant users and groups only as much access as they need. Second, keep your access control as simple as possible: try to use as few control mechanisms as you can.

If GreatInside decides to replace its internal file server with Cloud Storage, then the best way to secure the files would be to use ACLs. You could create groups to match the teams in the company and create ACLs that give those groups access to the appropriate resources. For example, you could create a bucket for each group, and then for each bucket you would make the associated group a writer of the bucket. Finally, you would set the default object permissions so that any new object uploaded to the bucket would get the same permissions and everyone in the group would have full access. If the company's needs aren't that simple, then you would set more complex ACLs.
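The bucket-per-group layout described above can be sketched as plain data. This is a planning sketch only and makes no API calls. The team names, the `example.com` domain, the `greatinside-` bucket prefix, and the `acl_plan_for_team` helper are all hypothetical. The `group-<email>` entity format and the `WRITER` and `OWNER` role names follow my understanding of the Cloud Storage JSON API, so check them against the current documentation before relying on them.

```python
def acl_plan_for_team(team, domain="example.com"):
    """Describe the bucket and ACL entries for one team's group."""
    group_entity = f"group-{team}@{domain}"  # hypothetical group address
    return {
        "bucket": f"greatinside-{team}",  # hypothetical naming scheme
        # The group can add and remove objects in the bucket.
        "bucket_acl": [{"entity": group_entity, "role": "WRITER"}],
        # Every newly uploaded object gives the group full access.
        "default_object_acl": [{"entity": group_entity, "role": "OWNER"}],
    }


teams = ["design", "sales", "engineering"]  # hypothetical team names
plans = [acl_plan_for_team(team) for team in teams]

for plan in plans:
    print(plan["bucket"], plan["bucket_acl"], plan["default_object_acl"])
```

Keeping one group per bucket means each ACL has a single entry to review, which fits the "as simple as possible" principle.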

Now let's move on to encryption. To ensure that your data is encrypted at all times, it needs to be encrypted when it's in storage, also known as at rest, and when it is being sent over a network, also known as in flight. Google Cloud Platform takes care of both of these situations.

Encryption in flight is handled very simply. All of the Cloud Platform services are accessible only by API. Even when you're using other methods, such as the Cloud Console or the gcloud command, they're making API calls under the hood, and all API communication is encrypted using SSL/TLS channels. Furthermore, every request has to include a time-limited authentication token, so the token cannot be used by an attacker after it expires. Of course, for any communications between your Google Cloud infrastructure and outside parties, such as website visitors, you have to use SSL/TLS yourself to encrypt the traffic.

Encryption at rest is just as simple if you're willing to leave it to Google because Cloud Platform encrypts all customer data at rest by default.

So without you having to do anything, all of your data will be encrypted both at rest and in flight. Then, why isn't this the end of the lesson? Well, because your organization might want to take on some of the encryption responsibilities itself.

There are actually two layers of encryption for data at rest. First, the data is broken into subfile chunks, and each chunk is encrypted with an individual data encryption key, or DEK. These keys are stored near the data to ensure low latency and high availability. The DEKs are then encrypted with a key encryption key, or KEK. The keys are AES-256 symmetric encryption keys.
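The two layers can be sketched as follows. This is an illustration of the DEK/KEK structure only, not Google's implementation. To keep the example runnable with just the Python standard library, an insecure XOR keystream function (`toy_cipher`, made up for this sketch) stands in for AES-256, and the 16-byte chunk size is an arbitrary choice. Do not use this code to protect real data.

```python
import hashlib
import secrets

CHUNK_SIZE = 16  # tiny chunks so the example is easy to follow (an assumption)


def toy_cipher(key, data):
    """XOR data with a SHA-256-derived keystream. NOT secure; stands in for AES-256.

    XOR is its own inverse, so the same call both encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_at_rest(data, kek):
    """Split data into chunks, encrypt each with its own DEK, wrap each DEK with the KEK."""
    stored = []
    for start in range(0, len(data), CHUNK_SIZE):
        dek = secrets.token_bytes(32)  # one 256-bit data encryption key per chunk
        stored.append({
            "ciphertext": toy_cipher(dek, data[start:start + CHUNK_SIZE]),
            "wrapped_dek": toy_cipher(kek, dek),  # kept next to the chunk
        })
    return stored


def decrypt_at_rest(stored, kek):
    """Unwrap each DEK with the KEK, then decrypt the matching chunk."""
    chunks = []
    for record in stored:
        dek = toy_cipher(kek, record["wrapped_dek"])
        chunks.append(toy_cipher(dek, record["ciphertext"]))
    return b"".join(chunks)


kek = secrets.token_bytes(32)  # the key encryption key
data = b"Protecting data is critical in any organization."
stored = encrypt_at_rest(data, kek)
print(len(stored), "chunks stored")
print(decrypt_at_rest(stored, kek) == data)
```

Note that only the wrapped DEKs sit beside the data. Whoever controls the KEK controls access to every chunk, which is why the next part of the lesson focuses on who manages that key.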

Google always manages the data encryption keys, but your organization can manage the key encryption keys if that's your preference. There are two options for doing this: customer-managed encryption keys or customer-supplied encryption keys. With the customer-managed option, you use the Cloud KMS service to create, rotate (manually or automatically), and destroy your encryption keys. The keys are hosted on Google Cloud. You can have as many keys as you want, even millions of them if you actually need that many. You can set user-level permissions on individual keys using IAM and monitor their use with Cloud Audit Logging.

Cloud KMS is a nice service, but why wouldn't you just let Google manage your key encryption keys and not have to deal with it yourself? The biggest reason is compliance with standards or regulatory requirements, such as HIPAA for health information or PCI DSS for credit card information.

If your organization requires that you generate your own keys and/or that they're managed on premises, then you have to use customer-supplied encryption keys, or CSEK. Be aware that this option is only available for Cloud Storage and Compute Engine.

With CSEK, Google doesn't store your key. You have to provide your key for each operation, and your key is purged from Google Cloud after the operation is complete. Google only stores a SHA-256 hash of the key as a way to uniquely identify the key that was used to encrypt the data. When you make a request to read or write the data in the future, your key can be validated against the hash. The hash cannot be used to decrypt your data.
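Here is a minimal sketch of the key-plus-hash idea, using only the standard library. The helper names `csek_headers` and `key_matches` are made up for this example. The `x-goog-encryption-*` header names reflect my understanding of how Cloud Storage accepts customer-supplied keys over its HTTP API, so treat them as an assumption and confirm them in the current documentation. The validation function is a guess at the kind of check a server could perform, not Google's actual code.

```python
import base64
import hashlib
import hmac
import secrets


def csek_headers(key):
    """Build request headers carrying a 256-bit key and its SHA-256 hash."""
    if len(key) != 32:
        raise ValueError("expected a 256-bit (32-byte) key")
    key_hash = hashlib.sha256(key).digest()
    return {
        "x-goog-encryption-algorithm": "AES256",
        "x-goog-encryption-key": base64.b64encode(key).decode("ascii"),
        "x-goog-encryption-key-sha256": base64.b64encode(key_hash).decode("ascii"),
    }


def key_matches(stored_hash_b64, presented_key):
    """Check a presented key against the stored hash without storing the key."""
    presented_hash = base64.b64encode(hashlib.sha256(presented_key).digest())
    return hmac.compare_digest(stored_hash_b64.encode("ascii"), presented_hash)


key = secrets.token_bytes(32)  # generated and kept on your side
headers = csek_headers(key)

# Only the hash would be retained with the object's metadata.
stored_hash = headers["x-goog-encryption-key-sha256"]

print(key_matches(stored_hash, key))  # the right key validates
print(key_matches(stored_hash, secrets.token_bytes(32)))  # any other key does not
```

Because SHA-256 is a one-way function, holding `stored_hash` identifies which key was used but does not let anyone recover the key or decrypt the data.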

There is a big risk in using this method, though. If you lose your keys, you won't ever be able to read your data again, and you'll end up deleting it so that you aren't paying storage charges for unreadable data.

So far, all of the encryption methods we've covered, including default encryption, Cloud KMS, and CSEK, have been examples of server-side encryption. This is where your data is encrypted after Google Cloud receives it. The only major difference between the three methods is where the key comes from. But there is another way: client-side encryption. This means you encrypt the data before you send it to Google Cloud. Google won't even know that it's already encrypted, and it will encrypt it again. When you read your data back, Google Cloud will decrypt it on the server side first, and then you'll decrypt your own layer of encryption on the client side. The same warning applies: if you lose your keys, your data will effectively be gone.
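The layering of client-side and server-side encryption can be sketched like this. The `ToyCloudStore` class is a made-up stand-in for a storage service that encrypts whatever it receives, and `toy_cipher` is an insecure XOR keystream function used only so that the example runs with the standard library. Neither reflects how Google Cloud actually implements encryption.

```python
import hashlib
import secrets


def toy_cipher(key, data):
    """Insecure XOR keystream cipher, used only to illustrate layering.

    XOR is its own inverse, so the same call both encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyCloudStore:
    """Hypothetical storage service that encrypts everything it is given."""

    def __init__(self):
        self._server_key = secrets.token_bytes(32)
        self._blobs = {}

    def write(self, name, payload):
        # Server-side layer: applied no matter what the payload contains.
        self._blobs[name] = toy_cipher(self._server_key, payload)

    def read(self, name):
        # The server removes only its own layer before returning the payload.
        return toy_cipher(self._server_key, self._blobs[name])


client_key = secrets.token_bytes(32)  # never leaves the client
plaintext = b"card number 4111-0000-0000-0000"  # hypothetical sample data

store = ToyCloudStore()
uploaded = toy_cipher(client_key, plaintext)  # client-side layer first
store.write("payment.txt", uploaded)

downloaded = store.read("payment.txt")
print(downloaded == uploaded)  # still client-side ciphertext
print(toy_cipher(client_key, downloaded) == plaintext)  # client removes its layer
```

The service only ever sees `uploaded`, so losing `client_key` leaves you with data that the service can return but nobody can read.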

Since our case study includes credit card information, we'll need to be PCI DSS compliant, so we should use Cloud KMS to manage our keys. I'll talk more about PCI compliance in the next lesson.

About the Author

Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).