If you’re going to work with modern software systems, then you can’t escape learning about cloud technologies. And that’s a rather broad umbrella. Across the three major cloud platform providers, we have a lot of different service options, and there’s a lot of value in them all.
However, the area that I think Google Cloud Platform excels in is providing elastic, fully managed services. To me, Google Cloud Platform is the optimal cloud platform for developers. It provides so many services for building out highly available, highly scalable web applications and mobile back-ends.
Google Cloud Platform has quickly become my personal favorite cloud platform. Now, opinions are subjective, but I’ll share why I like it so much.
I’ve worked as a developer for years, and for much of that time, I was responsible for getting my code into production environments and keeping it running. I worked on a lot of smaller teams where there were no operations engineers.
So, here’s what I like about Google Cloud Platform: it allows me to think about the code and the features I need to develop without worrying about the operations side, because many of the service offerings are fully managed.
So things such as App Engine allow me to write my code, test it locally, run it through the CI/CD pipeline, and then deploy it. And once it’s deployed, for the most part, unless I’ve introduced some software bug, I don’t have to think about it. Google’s engineers keep it up and running, and highly available. And having Google as your ops team is really cool!
Another thing I really like is the ease of use of things such as BigQuery and the Machine Learning APIs. If you’ve ever worked with large datasets, you know that some queries take forever to run. BigQuery can query massive datasets in just seconds, which allows me to get the data I need quickly so I can move on to other things.
And with the machine learning APIs, I can use a REST interface to do things like language translation or speech to text with ease. That lets me integrate these features into my applications, which gives end users a better experience.
So I love that I can focus on building out applications and spend my time adding value for end users.
If you’re looking to learn the fundamentals of a platform that’s not only developer-friendly but also cost-friendly, then this is the right course for you!
By the end of this course, you'll know:
- The purpose and value of each product and service
- How to choose an appropriate deployment environment
- How to deploy an application to App Engine, Kubernetes Engine, and Compute Engine
- The different storage options
- The value of Cloud Firestore
- How to get started with BigQuery
This is an intermediate-level course because it assumes:
- You have at least a basic understanding of the cloud
- You’re at least familiar with building and deploying code
This course is for:
- Anyone who would like to learn how to use Google Cloud Platform
Welcome back to Google Cloud Platform: Fundamentals. I'm Ben Lambert and I'll be your instructor for this lesson. In this lesson, we'll talk about some of the different options for data storage on Google Cloud. We'll cover Cloud Storage, Bigtable, and Cloud SQL. Let's start with Cloud Storage.
Cloud Storage is a blob storage service. It allows you to store your files using Google's highly scalable, incredibly durable storage infrastructure. By default, Cloud Storage encrypts our data both at rest and in flight. It offers three different storage classes, and the way Cloud Storage is set up allows us to store data using the same service and the same APIs, but with different access pricing.
First up we have Standard storage, which is meant for data that's accessed frequently or for data that doesn't need to be stored for very long periods of time. It has the highest availability, but it also comes with the highest price tag.
Then we have Nearline storage, which is for files that are accessed infrequently. We're talking maybe monthly or so. And this would be useful for certain types of backup files.
And then finally we have Coldline storage which offers a way to store archives and backups that we really don't need to access all that often. Maybe it's a compliance issue. And these sorts of documents might be accessed yearly or even not at all. They might just need to be there for compliance reasons. And unlike some cold storage services, you don't have to wait days to retrieve the files.
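As a rough illustration of the three classes just described, the sketch below picks a class from how often the data is expected to be read. The function name and the 30-day and 365-day thresholds are assumptions made for this example, loosely based on the "monthly" and "yearly" guidance above; they are not official cutoffs.

```python
def choose_storage_class(days_between_reads: float) -> str:
    """Suggest a Cloud Storage class from the expected gap between reads.

    The thresholds are illustrative assumptions, not Google's official rules.
    """
    if days_between_reads < 0:
        raise ValueError("days_between_reads must be non-negative")
    if days_between_reads < 30:
        return "STANDARD"  # accessed frequently
    if days_between_reads < 365:
        return "NEARLINE"  # accessed roughly monthly
    return "COLDLINE"      # accessed yearly, or almost never


if __name__ == "__main__":
    examples = [("web assets", 1), ("monthly backups", 30), ("compliance archive", 365)]
    for label, days in examples:
        print(f"{label}: {choose_storage_class(days)}")
```

In practice the choice also depends on storage and retrieval pricing, so treat this only as a starting point.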
There are three options for where your data gets stored. To maximize an application's performance, you can choose to store your data in the same region as the services that are going to consume that data. Even though it's only stored in one region, it still does have a fairly high level of availability because it's replicated across multiple zones in that region.
If you wanna have the same performance benefits, but you also wanna increase your data's availability, you can choose the dual-region option and that will give you geo-redundancy.
Now finally there's the multi-region option which is the best way to make your data available around the world with very low latency and it's great for using with website content distribution, video streaming and those sorts of things.
If you wanna learn more about Cloud Storage basics, I recommend you check out the documentation at cloud.google.com/storage/docs.
Okay. Let's create a storage bucket. When we're prompted, we're gonna select a name and we're gonna call this ca-storage.
And we have options for a location and a storage class. Okay, let's create this.
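The lesson creates the bucket in the console. For reference, here is a sketch of what a command-line equivalent might look like, built as an argument list for `gsutil mb` with its `-l` (location) and `-c` (storage class) flags. The location and class values are example choices, not the ones picked in the demo, and the helper name is hypothetical.

```python
import shlex


def make_bucket_command(bucket: str, location: str, storage_class: str) -> list:
    """Build a `gsutil mb` (make bucket) command as an argument list."""
    return ["gsutil", "mb", "-l", location, "-c", storage_class, f"gs://{bucket}"]


if __name__ == "__main__":
    # "ca-storage" is the bucket name from the lesson; the other values are assumed.
    cmd = make_bucket_command("ca-storage", "us-central1", "standard")
    print(shlex.join(cmd))
    # To actually create the bucket, pass the list to subprocess.run(cmd, check=True).
```

Bucket names are globally unique, so a name like ca-storage may already be taken when you try it yourself.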
Now, let's check out some of the functionality of buckets. Buckets offer things like versioning through the SDK. Let's create a file. We're gonna use the echo command and we'll redirect some text into the file demo.txt. And now we'll use the gsutil command to upload it to Cloud Storage.
And once we refresh the page, there it is, perfect. If we click it, we'll see in another tab that it has the contents we entered. Okay. Let's check if versioning is enabled and we can do that with the get call. Okay, it's not currently enabled. So let's enable it with the gsutil command. And I will double-check to make sure that we set it.
Okay, it's set. Let's echo some different text into that demo.txt file and now let's upload it again, and if we refresh and view it again, we can see that it has the new text. So, if we use the gsutil command to list all of the files in our bucket, we can see that we have a demo.txt file, and it's the current version.
However, if we run that command again, but we pass in the -a flag, we can see all of the versions for all of the files. Now, if we use the cat command to view the latest file, we can see that its content is what we'd expect, but if we use the cat command again, passing in the original version, we can see the original content.
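The versioning walkthrough above can be sketched as a sequence of gsutil commands. The helper below only builds the commands as argument lists so they can be inspected before running. The `GENERATION` value is a placeholder: the real generation number is whatever `gsutil ls -a` prints after the `#` for the version you want.

```python
import shlex


def versioning_walkthrough(bucket: str = "gs://ca-storage",
                           generation: str = "GENERATION") -> list:
    """Return the gsutil commands for the versioning demo, in order."""
    obj = f"{bucket}/demo.txt"
    return [
        ["gsutil", "cp", "demo.txt", bucket],           # upload the first version
        ["gsutil", "versioning", "get", bucket],        # check whether versioning is on
        ["gsutil", "versioning", "set", "on", bucket],  # enable versioning
        ["gsutil", "cp", "demo.txt", bucket],           # upload the edited file
        ["gsutil", "ls", bucket],                       # current versions only
        ["gsutil", "ls", "-a", bucket],                 # every version, with #generation
        ["gsutil", "cat", obj],                         # content of the latest version
        ["gsutil", "cat", f"{obj}#{generation}"],       # content of one specific version
    ]


if __name__ == "__main__":
    for cmd in versioning_walkthrough():
        print(shlex.join(cmd))
```

Printing through `shlex.join` quotes the `#generation` suffix, which keeps the command safe to paste into a shell.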
So, we can use this to track and utilize different versions of our files. There's additional functionality, such as lifecycle management, that allows us to have files deleted on specific dates or after a set number of days, and we can even save just a set number of versions of a given file and then delete the others.
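To make the lifecycle idea concrete, here is a minimal sketch that generates a lifecycle configuration with two delete rules: one based on age in days and one based on how many newer versions exist. The helper name and the numbers used are assumptions for illustration. A file like this can be applied with `gsutil lifecycle set lifecycle.json gs://ca-storage`.

```python
import json


def lifecycle_config(max_age_days: int, keep_versions: int) -> dict:
    """Build a Cloud Storage lifecycle configuration with two Delete rules."""
    return {
        "rule": [
            # Delete objects once they are older than max_age_days.
            {"action": {"type": "Delete"}, "condition": {"age": max_age_days}},
            # Delete any version that has at least keep_versions newer versions,
            # which leaves the newest keep_versions versions in place.
            {"action": {"type": "Delete"},
             "condition": {"numNewerVersions": keep_versions}},
        ]
    }


if __name__ == "__main__":
    # Example values only: 30 days and 3 versions are not from the lesson.
    with open("lifecycle.json", "w") as fh:
        json.dump(lifecycle_config(30, 3), fh, indent=2)
    print(json.dumps(lifecycle_config(30, 3), indent=2))
```

For deletion on a specific date, the age condition could be swapped for a `createdBefore` condition with a `YYYY-MM-DD` value.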
So, there's a lot of functionality baked into Cloud Storage. There are two things I really like about it. The first is the ability to transfer objects from other buckets or from AWS S3. Being able to transfer from S3 into a Cloud Storage bucket allows for a very simple backup solution or a great way to switch your blob storage to Google Cloud.
And the other feature that I really like is the ability to create static websites. This gives you a highly scalable, inexpensive static website, which is something I find comes in handy for things like deploying marketing sites, demos, and wireframes. Cloud Storage integrates with many of the other Google Cloud services and is sort of a central hub for getting data into different Google Cloud services.
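The lesson doesn't demo the static website feature, so the following is only a sketch of one way it might be set up with gsutil: sync the files, make them publicly readable, and set the main and error pages. The bucket name `www.example.com`, the local `./site` directory, and the page names are all hypothetical.

```python
import shlex


def static_site_commands(bucket: str, main_page: str = "index.html",
                         not_found_page: str = "404.html") -> list:
    """Build gsutil commands that publish a local directory as a static site."""
    url = f"gs://{bucket}"
    return [
        # Copy the local site directory (assumed to be ./site) into the bucket.
        ["gsutil", "rsync", "-r", "./site", url],
        # Let anyone on the internet read the objects.
        ["gsutil", "iam", "ch", "allUsers:objectViewer", url],
        # Tell Cloud Storage which objects act as the index and 404 pages.
        ["gsutil", "web", "set", "-m", main_page, "-e", not_found_page, url],
    ]


if __name__ == "__main__":
    for cmd in static_site_commands("www.example.com"):
        print(shlex.join(cmd))
```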
The next storage option we'll talk about is Bigtable, which is like Cloud Datastore in that it's a NoSQL database. However, they're pretty different. Bigtable is a sparsely populated database supporting billions of rows and thousands of columns. It's ideal for storing very large amounts of single-keyed data with very low latency.
Let's create a cluster and check it out. We'll set the name of the instance and then we're gonna set the zone that we wanna run it in. And then we can change the storage from solid-state to standard hard disk drives. However, we'll leave it as SSDs, since there's a pretty noticeable difference between the two for IOPS. Okay, let's create this.
And now let's use some Python and interact with it just a bit. First, we're gonna clone the git repo that has some sample code, and we'll cd into the Bigtable directory and we'll need to install some additional dependencies, so we'll use pip for that. And once that's done, let's run the code sample that lives in main.py.
Okay, so what's happening with this code is that it's creating a table. It's adding some data and then it's scanning the table for the data and listing it out before it deletes the table. Let's look over that Python code. It starts by importing some libraries, things like argparse and gcloud, and if we scroll down, what it's doing is creating a table, and then it's gonna add some greetings to the table, and then just to demonstrate fetching a single row, it's gonna fetch the first row, and then it scans through all of the rows and prints them out, and finally it deletes the table.
Now, we're not really using this to its full potential. If you were to use Bigtable, you'd wanna use it with large data sets, and by that, what I mean is at least a terabyte. Otherwise, you'd be better off using some other storage option most likely. If you do use it, you can stream data into it with Cloud Dataflow Streaming, Spark Streaming, or Apache Storm, or you can use batch processes such as Hadoop MapReduce, Dataflow, or Spark.
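The sample's main.py isn't reproduced here, and running it needs a real Bigtable instance. To show the same flow (create, write, read one row, scan, delete) without any cloud dependency, here is a toy in-memory stand-in. It is not the Bigtable client library: the class, the row-key format, the column name, and the greeting strings are all assumptions for illustration.

```python
from collections import defaultdict


class ToySparseTable:
    """In-memory stand-in for a wide-column table. NOT the Bigtable client."""

    def __init__(self):
        # row key -> {column name -> value}; cells that were never written
        # simply do not exist, which is what "sparsely populated" means.
        self._rows = defaultdict(dict)

    def write(self, row_key, column, value):
        self._rows[row_key][column] = value

    def read_row(self, row_key):
        # Look up a single row by its one key.
        return dict(self._rows.get(row_key, {}))

    def scan(self):
        # Yield rows in sorted row-key order, as a key-ordered store would.
        for key in sorted(self._rows):
            yield key, dict(self._rows[key])


def run_demo():
    table = ToySparseTable()                      # "create the table"
    greetings = ["Hello World!", "Hello Bigtable!", "Hello Python!"]
    for i, text in enumerate(greetings):          # "add some greetings"
        table.write(f"greeting{i}", "cf1:greeting", text)
    first = table.read_row("greeting0")           # fetch a single row
    everything = list(table.scan())               # scan all of the rows
    del table                                     # "delete the table"
    return first, everything


if __name__ == "__main__":
    first_row, all_rows = run_demo()
    print("first row:", first_row)
    for key, cells in all_rows:
        print(key, cells)
```

The point of the toy is the access pattern: every read goes through the row key, either one key at a time or as an ordered scan.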
So, you have options for how you interact with it. Bigtable is the same technology that Google uses for Google Analytics and for Gmail. It's a highly proven technology and it's worth a look if you deal with large data sets. Let's switch to relational databases. Google Cloud Platform offers Cloud SQL, which is a managed relational database service.
There are plenty of tasks where SQL databases are the right tool for the job, so Google offers two managed SQL options. They currently have options for MySQL and Postgres. Cloud SQL has evolved over time. The first version only supported MySQL, so for MySQL, there are now two options, the first and second generations.
They offer a MySQL instance with vertical scaling for reads and writes and horizontal scaling for reads. The first generation instances can have up to 16 gigs of RAM and 500 gigs of data storage, and the second generation instances can have up to 104 gigs of RAM and 10 terabytes of storage for data. So, let's see how easy it is to create an instance.
We'll start by selecting the database engine and for this demo, let's use MySQL. Let's also select the Second Generation. Okay. The first thing we need to do is set the instance ID. Perfect. The next setting allows us to select the MySQL version, which can be either 5.6 or 5.7, and then we can set the region and zone.
And we can also change the machine type. You can see here that there are a lot to choose from, and we have an option for storage type. We can use SSDs or hard disks, and we can set an auto backup time frame. And we can allow for binary replication logging here, and we can also create a failover with just a click of a mouse, which is a really cool feature.
Next up, we can set the maintenance window, which allows you to determine which day and roughly the time of day that any Google applied updates will happen, and that doesn't happen all that often. And we can also set the Root password here. Okay, let's create this. This is going to take a few minutes, so what I'll do is fast forward to once this is complete.
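The console settings walked through above map fairly directly onto flags of `gcloud sql instances create`. The sketch below builds such a command as an argument list. The instance ID, region, tier, backup time, and maintenance window are example values that were not stated in the lesson, and the helper name is hypothetical.

```python
import shlex


def create_mysql_instance_command(instance_id, region, tier, *,
                                  storage_type="SSD",
                                  backup_start_time="03:00",
                                  maintenance_day="SUN",
                                  maintenance_hour=4):
    """Build a `gcloud sql instances create` command for a MySQL 5.7 instance."""
    return [
        "gcloud", "sql", "instances", "create", instance_id,
        "--database-version=MYSQL_5_7",                  # MySQL version
        f"--region={region}",                            # where the instance runs
        f"--tier={tier}",                                # machine type
        f"--storage-type={storage_type}",                # SSD or HDD
        f"--backup-start-time={backup_start_time}",      # daily backup window start
        "--enable-bin-log",                              # binary logging
        f"--maintenance-window-day={maintenance_day}",   # day for Google-applied updates
        f"--maintenance-window-hour={maintenance_hour}", # hour for those updates
    ]


if __name__ == "__main__":
    # All argument values below are assumed for the example.
    cmd = create_mysql_instance_command("ca-sql-demo", "us-central1", "db-n1-standard-1")
    print(shlex.join(cmd))
```

The root password is left out on purpose so that it doesn't end up in shell history or logs.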
Okay, there we are. This took about five minutes in the end, and now clicking on the instance takes us to the dashboard where we can monitor the instance itself. Scrolling down you can see we have some details about the instance that's running MySQL, and this chart here allows you to change the property and get some information about different data points.
Clicking on access control, you can see we have the ability here to limit the access to our database. Under the authorization sub-tab, we can determine which networks can access the database. The users option allows us to create some MySQL database users, and then the SSL tab allows you to manage client certs as well as determine if all connections should be over SSL or not.
The database tab allows you to create a new database from the console, which is a nice feature. On the backups tab, we can perform manual backups or adjust the schedule. On the replicas tab, we can add and delete replicas, and the operations tab is going to show you basically a list of what's happened.
Now, you can also import and export data using these import and export links at the top, and this allows you to use SQL or CSV formats. So, in a nutshell, that's Cloud SQL. Let's wrap up with a comparison of the storage options on Cloud Platform. Cloud Storage is a blob storage option, which is good for structured or unstructured binary or object data.
That's things like images, media files, et cetera. We have two NoSQL options. The first is Cloud Datastore, which is great for App Engine applications and works well for things like product catalogs, user profiles, and things like this. The other NoSQL option is Bigtable, which is great for things that will perform a lot of reads and writes, and for analytical and event data as well.
It's often used by ad agencies for ad engines, and for financial data and data from disparate Internet of Things devices. Then we have Cloud SQL, which is a relational database using MySQL, and it's a great option for things like existing web frameworks and existing MySQL-based applications. Now, storage is a big part of applications, and few companies have perfected storage at the scale Google has. So, regardless of your storage needs, Google most likely will have you covered.
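The comparison above can be summarized as a small lookup table. The category labels ("blob", "nosql", "relational") and the helper name are choices made for this sketch; the use cases are the ones mentioned in the lesson.

```python
# Service name -> (category, typical use cases from the lesson)
STORAGE_OPTIONS = {
    "Cloud Storage": ("blob", "images, media files, and other binary or object data"),
    "Cloud Datastore": ("nosql", "App Engine apps, product catalogs, user profiles"),
    "Bigtable": ("nosql", "heavy reads and writes, analytical, event, and IoT data"),
    "Cloud SQL": ("relational", "web frameworks and existing MySQL-based applications"),
}


def options_of_kind(kind: str) -> list:
    """Return the services in a category, in the order the lesson covers them."""
    return [name for name, (category, _) in STORAGE_OPTIONS.items() if category == kind]


if __name__ == "__main__":
    for name, (category, uses) in STORAGE_OPTIONS.items():
        print(f"{name} ({category}): {uses}")
```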
In our next lesson, we're gonna talk about Kubernetes Engine. So, when you're ready to keep learning, I will see you in the next lesson.
Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.