image
Storage
Start course
Difficulty
Intermediate
Duration
1h 8m
Students
7870
Ratings
4.7/5
starstarstarstarstar-half
Description

Google Cloud Platform (GCP) lets organizations take advantage of the powerful network and technologies that Google uses to deliver its own products. Global companies like Coca-Cola and cutting-edge technology stars like Spotify are already running sophisticated applications on GCP. This course will help you design an enterprise-class Google Cloud infrastructure for your own organization.

When you architect an infrastructure for mission-critical applications, not only do you need to choose the appropriate compute, storage, and networking components, but you also need to design for security, high availability, regulatory compliance, and disaster recovery. This course uses a case study to demonstrate how to apply these design principles to meet real-world requirements.

Learning Objectives

  • Map compute, storage, and networking requirements to Google Cloud Platform services
  • Create designs for high availability and disaster recovery
  • Use appropriate authentication, roles, service accounts, and data protection
  • Create a design to comply with regulatory requirements

Intended Audience

This course is intended for anyone looking to design and build an enterprise-class Google Cloud Platform infrastructure for their own organization.

Prerequisites

To get the most out of this course, you should have some basic knowledge of Google Cloud Platform.

Transcript

Each of the instances for the Tomcat and IIS servers will come with a standard persistent boot disk by default, but we might need something different. There are many options for instance storage, including Standard Persistent Disk, SSD Persistent Disk, Local SSD, RAM Disk, and Cloud Storage.

Standard Persistent Disks are magnetic drives. Their main advantage is low cost.  SSD Persistent Disks (or solid state disks) have up to 4 times the throughput and up to 40 times the I/O operations per second of a Standard Persistent Disk, so if you need high performance, SSDs are a must.

But SSD Persistent Disks aren’t even your fastest option.  Local SSDs are up to 600 times as fast as Standard Persistent Disks in IOPS and up to 15 times as fast in throughput.

Why are Local SSDs so much faster than SSD Persistent Disks, which are obviously both using SSD technology? Well, it’s because Local SSDs are not redundant and are directly attached to an instance. That gives them major speed advantages, but with high risk because if they suffer a hardware failure, then your data will be gone. Furthermore, Local SSDs disappear when you stop or delete an instance, so you should only use them for temporary data that you can afford to lose, such as a cache.

There are a couple more disadvantages of Local SSDs too. First, they are only available in one size -- 375GB, which is kind of an awkward number. Second, they can’t be used as boot disks.

If you need even faster storage, then you can use a RAM disk, which essentially makes a chunk of memory look like a filesystem. Although RAM disks are the fastest option, they’re even less durable than Local SSDs, so they’re only suitable for temporary data. It’s also an expensive option because RAM is much more expensive than SSDs.

One more option is Cloud Storage. This is kind of a weird way to add storage to an instance because a bucket is object storage rather than block storage. That means it can’t be used as a root disk and it may be unreliable as a mounted filesystem. So why would you ever use it? The first advantage of using Cloud Storage is that multiple instances can write to a bucket at the same time. You can’t do that with persistent disks, which can only be shared between instances in read-only mode. The danger is that one instance could overwrite changes made by another instance, so your application would have to take that into account.

The second advantage is that an instance can access a bucket in a different zone or region, which is great for sharing data globally, especially if it’s read-only data, which would avoid the overwriting problem.

However, Cloud Storage usually isn’t a good option for instance storage. It is good for general-purpose file serving, though, so it would be a potential choice for replacing GreatInside’s internal file server if they want to move it to the cloud. To do this, you’d need to use Cloud Storage FUSE, which is open source software that translates object storage names into a file and directory system. Essentially, it makes Cloud Storage buckets look like network file systems. A better choice, though, would be Cloud Filestore, which is a fully-supported file sharing service that’s designed specifically for this purpose. It’s compatible with NFS version 3.

So, which instance storage option should we use for our instances? Since performance is important, we should use something faster than Standard Persistent Disks. SSD Persistent Disks are many times faster than standard ones, so they’d be a good choice. Should we consider Local SSDs or RAM disks? Well, neither of those can be boot disks, so we would have to use them in addition to a persistent boot disk. The higher performance wouldn’t outweigh the extra cost and complexity of using one of these options, though, so we should just stick with SSD Persistent Disks. Furthermore, since persistent disks are redundant, we don’t need to have two mirrored disks on each instance like GreatInside does in its existing data center. We can just have a single persistent boot disk on each instance.

As for the size, we can specify the exact amount we need, so for the Tomcat servers, we should use one 200GB disk on each instance, and for the IIS servers, we should use one 250GB disk on each.

Next, we need to look at our database options. Google Cloud has 5 different database services: Cloud SQL, Cloud Datastore, Bigtable, BigQuery, and Cloud Spanner.

Cloud SQL is a relational database. It’s a managed service for MySQL, PostgreSQL, or Microsoft SQL Server. It’s suitable for everything from blogs to ERP and CRM to ecommerce.

Cloud Datastore is a NoSQL database service. Unlike a relational database, such as Cloud SQL, it is horizontally scalable. A relational database can scale vertically, meaning that you can run it on a more powerful VM to handle more transactions, but there are obviously limits to the size of a VM. You can also scale a relational database horizontally for reads by using read replicas, but most relational databases can’t scale horizontally for writes. That is a major problem that is solved by NoSQL databases.

Because of this and because it’s an eventually consistent database, Cloud Datastore is faster than Cloud SQL. It’s best suited to relatively simple data and queries, especially key-value pairs. Typical examples include user profiles, product catalogs, and game state. For complex queries, Cloud SQL is a better choice.

Cloud Bigtable is also a NoSQL database. It’s designed to scale into the petabyte range with high throughput and low latency. It does not support ACID transactions, so it shouldn’t be used for transaction processing. It’s best suited for storing huge amounts of single-keyed data. If you have less than one terabyte of data, then Bigtable is not the best solution. It can handle big data in real-time or in batch processing. Typical examples are Internet of Things applications and product recommendations.

BigQuery also handles huge amounts of data, but it’s more of a data warehouse. It’s something you use after data is collected, rather than being a transactional system. It’s best suited to aggregating data from many sources and letting you search it using SQL queries. In other words, it’s good for OLAP (that is, Online Analytical Processing) and business intelligence reporting.

Google’s newest database service is Cloud Spanner, which seems to combine the best of all worlds. It’s a relational database that also scales horizontally. That is, it combines the best features of traditional databases like Cloud SQL and the best features of NoSQL databases like Cloud Datastore. So why wouldn’t you use it for all of your database needs? Well, mostly because it’s more expensive than the other options. Also, if your application is written specifically for a particular database, such as MySQL, then Cloud SQL would be a better choice, unless you can rewrite it to work with Cloud Spanner. 

So use Cloud Spanner when you need a relational database that is massively scalable. Typical uses are financial services and global supply chain applications.

Now, which database services should GreatInside use? It currently has two production databases -- MySQL for the interior design application and SQL Server for payment processing. There are two ways you could migrate the MySQL database to Google Cloud. You could use Cloud SQL or run MySQL on a regular instance. Considering that GreatInside wants to reduce system management tasks, Cloud SQL would be the best choice since it’s a fully managed MySQL service, with automatic replication and backups.

For SQL Server, you have the same two options. You could use Cloud SQL, or you could run it on a regular instance. Again, Cloud SQL is the best choice.

GreatInside does have one more database -- their experimental NoSQL datastore. Since the development team is still evaluating this technology, you should talk to them about trying Cloud Datastore. They should also try App Engine because Cloud Datastore works best when used with App Engine.

And that’s it for storage and databases.

 

About the Author
Students
201667
Courses
97
Learning Paths
162

Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).