Designing a Google Cloud Infrastructure

Storage

The course is part of these learning paths

Google Cloud Platform for Solution Architects

Overview
Difficulty: Intermediate
Duration: 1h 1m
Students: 1269

Description

Google Cloud Platform (GCP) lets organizations take advantage of the powerful network and technologies that Google uses to deliver its own products. Global companies like Coca-Cola and cutting-edge technology stars like Spotify are already running sophisticated applications on GCP. This course will help you design an enterprise-class Google Cloud infrastructure for your own organization.

When you architect an infrastructure for mission-critical applications, not only do you need to choose the appropriate compute, storage, and networking components, but you also need to design for security, high availability, regulatory compliance, and disaster recovery. This course uses a case study to demonstrate how to apply these design principles to meet real-world requirements.

Learning Objectives

  • Map compute, storage, and network needs to Google Cloud Platform services
  • Create designs for high availability and disaster recovery
  • Use appropriate authentication, roles, service accounts, and data protection
  • Create a design to comply with regulatory requirements

 

Transcript

Each of the instances for the Tomcat and IIS servers will come with a Standard Persistent Boot Disk by default, but we might need something different. There are many options for instance storage, including Standard Persistent Disks, SSD Persistent Disks, Local SSDs, RAM Disks, and Cloud Storage.

Standard Persistent Disks are magnetic drives. Their main advantage is low cost. SSD Persistent Disks, or solid-state disks, have up to four times the throughput and up to 40 times the I/O operations per second (IOPS) of a standard persistent disk. So if you need high performance, SSDs are a must. But SSD Persistent Disks aren't even your fastest option. Local SSDs are up to 600 times as fast as standard persistent disks in IOPS and up to 15 times as fast in throughput.
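To put those multipliers side by side, here is a minimal sketch that treats a standard persistent disk as the baseline (1x) and works out how local SSDs compare with SSD persistent disks. The dictionary keys and the function name are made up for illustration, and the numbers are only the "up to" figures quoted above, not measurements.

```python
# Multipliers relative to a standard persistent disk, as quoted in the transcript.
# These are "up to" figures, not benchmark results.
RELATIVE_SPEED = {
    "standard-pd": {"iops": 1, "throughput": 1},
    "ssd-pd": {"iops": 40, "throughput": 4},
    "local-ssd": {"iops": 600, "throughput": 15},
}


def speedup(faster: str, slower: str) -> dict:
    """Return how many times faster one disk type is than another, per metric."""
    fast, slow = RELATIVE_SPEED[faster], RELATIVE_SPEED[slower]
    return {metric: fast[metric] / slow[metric] for metric in fast}


local_vs_ssd = speedup("local-ssd", "ssd-pd")
print(local_vs_ssd)  # {'iops': 15.0, 'throughput': 3.75}
```

So by these figures, a local SSD is up to 15 times as fast as an SSD persistent disk in IOPS and up to 3.75 times as fast in throughput.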

Why are local SSDs so much faster than SSD persistent disks, when both are obviously using SSD technology? Well, it's because local SSDs are not redundant and are directly attached to an instance. That gives them major speed advantages, but with high risk, because if they suffer a hardware failure, your data will be gone. Furthermore, local SSDs disappear when you stop or delete an instance. So you should only use them for temporary data that you can afford to lose, such as a cache.

There are a couple more disadvantages of local SSDs too. First, they are only available in one size, 375 GB, which is kind of an awkward number. Second, they can't be used as boot disks.
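Because of the fixed size, scratch capacity comes in 375 GB steps. The sketch below assumes you can attach more than one local SSD to an instance (the transcript doesn't cover that) and simply rounds up. The function names are hypothetical.

```python
import math

LOCAL_SSD_SIZE_GB = 375  # the single size mentioned in the transcript


def local_ssds_needed(capacity_gb: int) -> int:
    """Number of 375 GB local SSDs needed to hold capacity_gb of scratch data."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return math.ceil(capacity_gb / LOCAL_SSD_SIZE_GB)


def provisioned_gb(capacity_gb: int) -> int:
    """Capacity you actually end up with after rounding up to whole devices."""
    return local_ssds_needed(capacity_gb) * LOCAL_SSD_SIZE_GB


print(local_ssds_needed(1000), provisioned_gb(1000))  # 3 1125
```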

If you need even faster storage, then you can use RAM disks, which essentially make a chunk of memory look like a file system. Although RAM disks are the fastest option, they are even less durable than local SSDs. So they are only suitable for temporary data. It's also an expensive option, because RAM is much more expensive than SSDs.

One more option is Cloud Storage. This is kind of a weird way to add storage to an instance, because a bucket is object storage rather than block storage. That means it can't be used as a root disk and it may be unreliable as a mounted file system. So why would you ever use it? Well, the first advantage of using Cloud Storage is that multiple instances can write to a bucket at the same time. You can't do that with persistent disks, which can only be shared between instances in read-only mode. The danger is that one instance could overwrite changes made by another instance, so your application would have to take that into account.
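One way an application can "take that into account" is optimistic concurrency: only write if the object hasn't changed since you last read it. Cloud Storage supports this idea through generation preconditions. The sketch below is not the client library. It is a toy in-memory stand-in (the class and method names are invented, and real generation numbers are not small sequential integers) that shows how the second writer's stale update gets rejected instead of silently overwriting the first.

```python
class PreconditionFailed(Exception):
    """Raised when an object changed since the writer last read it."""


class InMemoryBucket:
    """Toy stand-in for a bucket; each object carries a generation number."""

    def __init__(self):
        self._objects = {}  # name -> (generation, data)

    def read(self, name):
        # Generation 0 means "the object does not exist yet".
        return self._objects.get(name, (0, None))

    def write(self, name, data, if_generation_match):
        current_generation, _ = self.read(name)
        if current_generation != if_generation_match:
            raise PreconditionFailed(name)
        self._objects[name] = (current_generation + 1, data)
        return current_generation + 1


bucket = InMemoryBucket()
bucket.write("config.json", "v1", if_generation_match=0)

# Two instances read the same generation, then both try to write.
gen_a, _ = bucket.read("config.json")
gen_b, _ = bucket.read("config.json")
bucket.write("config.json", "from instance A", if_generation_match=gen_a)
try:
    bucket.write("config.json", "from instance B", if_generation_match=gen_b)
    conflict_detected = False
except PreconditionFailed:
    conflict_detected = True  # instance B must re-read and retry

print(conflict_detected)  # True
```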

The second advantage is that an instance can access a bucket in a different zone or region, which is great for sharing data globally, especially if it's read-only data, which would avoid the overwriting problem.

However, Cloud Storage usually isn't a good option for instance storage. It is good for general purpose file serving though. So it would be a good choice for replacing GreatInside's internal file server if they want to move it to the cloud. The best way would be to use Cloud Storage FUSE, which is open source software that translates object storage names into a file and directory system. Essentially, it makes Cloud Storage buckets look like network file systems.
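To get a feel for what "translating object names into a file and directory system" means, here is a simplified sketch. A bucket has no real directories, only object names that may contain slashes, so the function below splits each name on "/" and builds a nested tree. This only illustrates the idea, it is not how Cloud Storage FUSE is implemented, and the object names are made up.

```python
def build_tree(object_names):
    """Turn flat object names into a nested dict; None marks a file."""
    tree = {}
    for name in object_names:
        parts = name.split("/")
        node = tree
        for directory in parts[:-1]:
            node = node.setdefault(directory, {})
        if parts[-1]:  # skip names that end in "/" (directory placeholders)
            node[parts[-1]] = None
    return tree


# Hypothetical object names in a bucket.
names = [
    "designs/kitchen/plan.pdf",
    "designs/kitchen/photo.jpg",
    "readme.txt",
]
tree = build_tree(names)
print(sorted(tree))                        # ['designs', 'readme.txt']
print(sorted(tree["designs"]["kitchen"]))  # ['photo.jpg', 'plan.pdf']
```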

So which instance storage option should we use for our instances? Since performance is important, we should use something faster than standard persistent disks. SSD persistent disks are many times faster than standard ones, so they'd be a good choice. Should we consider local SSDs or RAM disks? Well, neither of those can be boot disks, so we would have to use them in addition to a persistent boot disk. The higher performance would not outweigh the extra cost and complexity of using one of these options though. So we should just stick with SSD persistent disks. Furthermore, since persistent disks are redundant, we don't need to have two mirrored disks on each instance like GreatInside does in its existing data center. We can have a single persistent boot disk on each instance.
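The reasoning above can be written down as a small decision function. This is just a sketch of the rules of thumb from this section, with invented parameter names and labels. It ignores cost, which in practice is what tips the decision for GreatInside.

```python
def choose_instance_storage(*, boot_disk: bool, disposable_data: bool, performance: str) -> str:
    """Pick a storage type from the rules of thumb in this section.

    performance is one of "standard", "high", or "extreme" (labels assumed here).
    """
    if performance not in ("standard", "high", "extreme"):
        raise ValueError(f"unknown performance level: {performance}")
    # Local SSDs and RAM disks can't be boot disks and can lose their data,
    # so they only fit non-boot, throwaway data such as a cache.
    if not boot_disk and disposable_data:
        if performance == "extreme":
            return "ram-disk"
        if performance == "high":
            return "local-ssd"
    if performance == "standard":
        return "standard-persistent-disk"
    return "ssd-persistent-disk"


# GreatInside's servers: a boot disk holding data we can't lose, performance matters.
greatinside_choice = choose_instance_storage(
    boot_disk=True, disposable_data=False, performance="high"
)
print(greatinside_choice)  # ssd-persistent-disk
```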

As for the size, we can specify the exact amount we need. So for the Tomcat servers, we should use one 200 GB disk on each instance, and for the IIS servers, we should use one 250 GB disk on each.

Next we need to look at our database options. Google Cloud has five different database services: Cloud SQL, Cloud Datastore, Bigtable, BigQuery, and Cloud Spanner.

Cloud SQL is a relational database. It's a managed MySQL or PostgreSQL service. It is suitable for everything from blogs to ERP and CRM to e-commerce.

Cloud Datastore is a NoSQL database service. Unlike a relational database, such as Cloud SQL, it's horizontally scalable. A relational database can scale vertically, meaning you can run it on a more powerful VM to handle more transactions, but there are obviously limits to the size of a VM. You can also scale a relational database horizontally for reads by using read replicas. But most relational databases can't scale horizontally for writes. That is a major problem that is solved by NoSQL databases.
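Here is a rough sketch of why read replicas help with reads but not writes. The router below (a made-up class, with server names as plain strings) spreads SELECT statements across the replicas in round-robin fashion, but every write still lands on the single primary. A real router would need more than a prefix check to classify statements.

```python
import itertools


class ReplicatedDatabase:
    """Toy router: reads rotate across replicas, writes always hit the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary for reads if there are no replicas.
        self._replica_cycle = itertools.cycle(replicas or [primary])

    def route(self, statement: str) -> str:
        is_read = statement.lstrip().upper().startswith("SELECT")
        return next(self._replica_cycle) if is_read else self.primary


db = ReplicatedDatabase("primary", ["replica-1", "replica-2"])
statements = [
    "SELECT * FROM orders",
    "INSERT INTO orders VALUES (1)",
    "SELECT 1",
    "UPDATE orders SET total = 0",
]
targets = [db.route(s) for s in statements]
print(targets)  # ['replica-1', 'primary', 'replica-2', 'primary']
```

Adding more replicas spreads the reads further, but the primary remains the bottleneck for writes.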

Because of this, and because it is an eventually consistent database, Cloud Datastore is faster than Cloud SQL. It's best suited to relatively simple data and queries, especially key-value pairs. Typical examples include user profiles, product catalogs, and game state. For complex queries, Cloud SQL is a better choice.

Cloud Bigtable is also a NoSQL database. It is designed to scale into the petabyte range with high throughput and low latency. It does not support ACID transactions so it shouldn't be used for transaction processing. It's best suited for storing huge amounts of single key data. If you have less than one terabyte of data then Bigtable is not the best solution. It can handle big data in real time or in batch processing. Typical examples are Internet of Things applications and product recommendations.

BigQuery also handles huge amounts of data, but it's more of a data warehouse. It's something you use after data is collected rather than being a transactional system. It's best suited to aggregating data from many sources and letting you search it using SQL queries. In other words, it is good for OLAP. That is, Online Analytical Processing and business intelligence reporting.
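To show what an OLAP-style query looks like, here is a tiny example that uses Python's built-in sqlite3 module as a stand-in. It is not BigQuery, and the table and values are invented. The point is only the shape of the query: it aggregates over already-collected rows rather than inserting or updating individual records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0)],
)

# Analytical query: total sales per region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(totals)  # [('east', 150.0), ('west', 250.0)]
```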

Google's newest database service is Cloud Spanner, which seems to combine the best of all worlds. It's a relational database that also scales horizontally. That is, it combines the best features of traditional databases like Cloud SQL and the best features of NoSQL databases like Cloud Datastore. So why wouldn't you use it for all of your database needs? Well, mostly because it's more expensive than the other options. Also, if your application is written specifically for a particular database, such as MySQL, then Cloud SQL would be a better choice, unless you can rewrite it to work with Cloud Spanner. Finally, Cloud Spanner is Google's newest database service, so you may want to wait a while until it has proven itself.

So use Cloud Spanner when you need a relational database that is massively scalable. Typical uses are financial services and global supply chain applications.
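The guidance on the five services can be condensed into a rough decision function. Treat it as a sketch of the rules of thumb from this section rather than an official selection guide. The parameter names are invented, and it leaves out cost and application compatibility, which also matter.

```python
def choose_database(
    *,
    relational: bool,
    analytics: bool = False,
    horizontal_write_scaling: bool = False,
    data_size_tb: float = 0.0,
    needs_transactions: bool = True,
) -> str:
    """Map workload traits to one of the five managed database services."""
    if analytics:
        # Data warehouse / OLAP workloads.
        return "BigQuery"
    if relational:
        # Spanner only when a relational database must also scale writes horizontally.
        return "Cloud Spanner" if horizontal_write_scaling else "Cloud SQL"
    # NoSQL: Bigtable for a terabyte or more of data that doesn't need ACID transactions.
    if data_size_tb >= 1 and not needs_transactions:
        return "Cloud Bigtable"
    return "Cloud Datastore"


print(choose_database(relational=True))                                 # Cloud SQL
print(choose_database(relational=True, horizontal_write_scaling=True))  # Cloud Spanner
print(choose_database(relational=False, data_size_tb=50, needs_transactions=False))  # Cloud Bigtable
```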

Although Google Cloud seems to provide a database service for every need, you may still want to manage your own databases using Compute Engine. One of the most common reasons for doing this is if your application is written to work with Microsoft SQL Server. As I mentioned earlier, the easiest way to do this is to build your VMs with one of the pre-configured SQL Server boot disks.

Now, which database services should GreatInside use? It currently has two production databases: MySQL for the interior design application and SQL Server for payment processing. There are two ways you could migrate the MySQL database to Google Cloud. You could use Cloud SQL or run MySQL on a regular instance. Considering that GreatInside wants to reduce system management tasks, Cloud SQL would be the best choice, since it's a fully managed MySQL service with automatic replication and backups.

In contrast, Google does not have a managed Microsoft SQL Server offering. So you have no choice but to implement it on regular instances. The preferred way to do this is to use Google's pre-configured SQL Server images, which are standardized and take care of pay-per-use licensing for both SQL Server and Windows Server.

GreatInside does have one more database: their experimental NoSQL datastore. Since the development team is still evaluating this technology, you should talk to them about trying Cloud Datastore. They should also try App Engine, because Cloud Datastore works best when used with App Engine.

And that's it for storage and databases.

About the Author

Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).