This course introduces machine learning on Google Cloud Platform.
Learning Objectives
- What machine learning actually means
- The types of problems machine learning can solve
- The basics of how machine learning works
Intended Audience
- Anyone interested in machine learning
Prerequisites
- A basic understanding of computers
Machine learning can seem almost magical. It can do some amazing things. However, it does have limits. In this lesson, we are going to discuss the kinds of problems machine learning is used to solve today.
Now at a very basic level, machine learning works via pattern recognition. A machine learning algorithm is fed a large amount of data and then it tries to identify any patterns. However, there are different ways to solve different problems. Currently, these techniques can be broken up into four main approaches: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.
Supervised learning is sort of like learning by example. I want you to think back to being a child and learning to identify different kinds of animals. When you were young, your parents probably did not describe all the physical attributes of a dog. Instead, they pointed at a dog and said “dog”. Then they pointed at a cat and they said “cat”. You might have looked at pictures in books or images on the internet. But eventually, after seeing enough examples of cats and dogs, you were able to tell the difference.
Supervised machine learning is very similar. It has a training phase and a testing phase. During the training phase, you feed in a labeled dataset (like, say, pictures of dogs). The algorithm will then look for patterns in the dataset to learn how to identify a dog. Then in the testing phase, you provide some new data (like photos it has never seen before) to verify that the algorithm can correctly identify them. If it scores high enough on the tests, then it is considered successful. But if it gets things wrong too often, then you will need to either tweak the data or tweak the algorithm.
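To make the training and testing phases a little more concrete, here is a minimal sketch in Python. It is not the method used by any particular Google Cloud product. It assumes a made-up dataset where each animal is described by two numbers (weight and height), and it swaps in a very simple technique, a nearest-centroid classifier, in place of a real image model. The names `train`, `predict`, and `evaluate` are hypothetical.

```python
# Toy labeled dataset: (weight_kg, height_cm) -> label. All values are made up.
training_data = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((3.5, 24.0), "cat"),
    ((20.0, 55.0), "dog"),
    ((30.0, 60.0), "dog"),
    ((25.0, 58.0), "dog"),
]

# Examples held back for the testing phase; the model never trains on these.
test_data = [
    ((4.5, 26.0), "cat"),
    ((28.0, 57.0), "dog"),
    ((22.0, 50.0), "dog"),
]


def train(examples):
    """Training phase: compute the average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))


def evaluate(model, examples):
    """Testing phase: fraction of unseen examples that are labeled correctly."""
    correct = sum(1 for features, label in examples if predict(model, features) == label)
    return correct / len(examples)


model = train(training_data)
accuracy = evaluate(model, test_data)
print(f"accuracy on unseen examples: {accuracy:.2f}")
```

If the accuracy came out too low, that would be the signal to go back and adjust either the data or the algorithm.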
The key to supervised learning is that it requires a human supervisor. You need someone to create the labeled dataset to use for training. And you need someone to create tests to verify that the algorithm is working properly. So supervised learning works great when you have a clearly defined standard for success. You understand what a dog is. You know what a dog looks like. And you just need to teach that to a computer.
Typically, supervised learning is used for solving both classification and regression problems.
Classification problems ask you to place something into a category.
So it is questions like:
- Is it a dog or a cat?
- Is the current temperature hot or cold?
- Is this email legitimate or is it a scam?
Regression problems ask you to predict a specific number or amount.
So it is questions like:
- What temperature will it be tomorrow?
- How many customers will purchase my new product?
- What is the fair market value of this house?
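A regression problem like the house price question can be sketched in a few lines of Python. This is only an illustration under some loud assumptions: the sizes and prices are invented, price is assumed to depend on size alone, and the technique is ordinary least squares with a single straight line. The names `fit_line` and `predict_price` are hypothetical.

```python
# Made-up training data: house size in square meters and sale price in thousands.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Predict a price (in thousands) for a house of the given size."""
    return slope * size + intercept


print(f"estimated price for a 90 square meter house: {predict_price(90.0):.1f}")
```

Notice that the answer is a number on a continuous scale, not a category. That is the difference between regression and classification.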
The second approach is called unsupervised learning. Unsupervised learning does not require labeled data to work. Therefore, it also does not require a supervisor. So instead of needing to compile a huge list of dog photos, in unsupervised learning you would just feed in a bunch of random photos. The algorithm will still look for patterns, but it discovers them on its own instead of being told what to look for.
So it is sort of similar to you learning which foods taste good as a child. Your parents probably fed you many different meals, and you learned to group foods into certain categories. Maybe you think beets are delicious, but eggplant is not. You might enjoy the taste of white chocolate, but you think dark chocolate is just ok. No one taught you what you should like. Well, maybe your parents tried. But really you figured it out on your own.
Unsupervised learning is the same idea. You feed in unlabeled data (like, say, family photos). Then it will look for patterns and group those photos into categories. So it might group all the pictures of your mom together. And then all the photos of your dog together. You are not telling it what your mom or dog looks like. It can just tell which pictures look similar.
Unsupervised learning is most useful when you do not have a specific result in mind. So if you want to organize your photos, but you don’t want to define the categories, you can let the algorithm do it for you.
Unsupervised learning is great at solving both clustering and association problems.
Clustering problems require you to group large sets of unorganized data. So let's say you have millions of customers and you want to break them up into marketable groups. You can use unsupervised learning to discover which groups exist and to assign your customers to the appropriate ones.
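Here is a rough sketch of how customer grouping might look in Python, using k-means as one common clustering technique. The customer numbers are invented, the number of groups (two) is an assumption chosen by hand, and the helper names `assign`, `update`, and `k_means` are hypothetical. Notice that no labels appear anywhere in the data.

```python
# Made-up customers: (yearly spend in dollars, store visits per month). No labels.
customers = [
    (100.0, 1.0), (120.0, 2.0), (110.0, 1.0),
    (900.0, 10.0), (950.0, 12.0), (1000.0, 11.0),
]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(point, centroids[i]))
        clusters[nearest].append(point)
    return clusters


def update(clusters, old_centroids):
    """Move each centroid to the average of its cluster (keep it if the cluster is empty)."""
    return [
        tuple(sum(column) / len(cluster) for column in zip(*cluster)) if cluster else old
        for cluster, old in zip(clusters, old_centroids)
    ]


def k_means(points, initial_centroids, iterations=10):
    centroids = list(initial_centroids)
    for _ in range(iterations):
        centroids = update(assign(points, centroids), centroids)
    return centroids, assign(points, centroids)


# Assumption: seed the two groups with the first and last customer.
centroids, clusters = k_means(customers, [customers[0], customers[-1]])
for centroid, cluster in zip(centroids, clusters):
    print(f"group centered at {centroid} has {len(cluster)} customers")
```

The algorithm was never told that "low spenders" and "high spenders" exist. It found those two groups just by looking at which customers sit close together.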
Association problems ask you to find relationships between different variables. So think about being able to predict what your customers like based on their previous behavior. This is most commonly seen in recommendation engines. For example, Amazon recommends new products based upon your purchase history. And YouTube suggests new videos based upon what you previously watched.
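As a very small illustration of the association idea, the sketch below counts which products show up together in the same order and uses those counts to make a suggestion. This is not how Amazon or YouTube actually build their recommendations; it is a simple co-occurrence count, and the orders, product names, and the functions `count_pairs` and `recommend` are all made up for the example.

```python
from collections import Counter
from itertools import combinations

# Made-up purchase history: each set is one customer order.
orders = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "sleeping bag"},
    {"novel", "bookmark"},
    {"lantern", "batteries"},
]


def count_pairs(orders):
    """Count how often each pair of items appears in the same order."""
    pair_counts = Counter()
    for order in orders:
        for a, b in combinations(sorted(order), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def recommend(item, pair_counts, top_n=2):
    """Suggest the items most often bought together with the given item."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a == item:
            scores[b] += count
        elif b == item:
            scores[a] += count
    return [other for other, _ in scores.most_common(top_n)]


pair_counts = count_pairs(orders)
print(recommend("tent", pair_counts))
```

Nobody told the program that tents and sleeping bags go together. That relationship came out of the purchase data.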
Semi-supervised learning is a hybrid approach and it uses parts of both supervised and unsupervised learning. Basically, you provide a small incomplete labeled dataset for training. However, the majority of the data processed will still be unlabeled. Now, this is useful for situations when you want to provide some guidance, but you don’t have the time or expertise to create and label an entire dataset.
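One way to picture this is a technique called self-training (sometimes called pseudo-labeling), sketched below in Python. It is only one of several semi-supervised approaches, and it is chosen here as an assumption for illustration. The data is invented: just two labeled measurements and six unlabeled ones. The functions `centroids_from`, `nearest_label`, and `self_train` are hypothetical names.

```python
# A tiny labeled set (made up) and a larger unlabeled set.
labeled = [(1.0, "small"), (9.0, "large")]
unlabeled = [1.5, 2.0, 2.5, 8.0, 8.5, 7.5]


def centroids_from(examples):
    """Average the values seen for each label."""
    groups = {}
    for value, label in examples:
        groups.setdefault(label, []).append(value)
    return {label: sum(values) / len(values) for label, values in groups.items()}


def nearest_label(centroids, value):
    """Pick the label whose average is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def self_train(labeled, unlabeled, rounds=3):
    """Start from the labeled data, then repeatedly guess labels for the rest."""
    centroids = centroids_from(labeled)
    for _ in range(rounds):
        # Guess a label for every unlabeled value using the current model.
        pseudo_labeled = [(value, nearest_label(centroids, value)) for value in unlabeled]
        # Retrain on the real labels plus the guessed ones.
        centroids = centroids_from(labeled + pseudo_labeled)
    return centroids


centroids = self_train(labeled, unlabeled)
print(centroids)
print(nearest_label(centroids, 3.0))
```

The two hand-labeled examples provide the guidance, and the unlabeled values refine where each group is really centered.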
Reinforcement learning works a little bit differently from the other three. Here you are trying to teach the algorithm to make a series of decisions. And these decisions need to be based on certain criteria. So for example, you might want an algorithm that can figure out the best way to travel to a certain destination. Should you take a plane? Or should you take a boat? Perhaps you need to fly part of the way, and then switch to a train. In reinforcement learning, the algorithm makes some decisions and then calculates a cost. It tries many different variations, calculating the cost each time, and then it picks the optimal solution. So it’s trial and error.
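The travel example can be sketched as a tiny trial-and-error program. The sketch below is written in the style of tabular Q-learning, simplified to minimize cost and to assume that every trip leg has a fixed, known-after-trying cost. The travel network, the costs, and the names `ROUTES`, `learn_costs`, and `best_plan` are all invented for illustration.

```python
import random

# Hypothetical travel network: state -> {action: (cost, next_state)}.
ROUTES = {
    "home": {
        "direct flight": (9.0, "destination"),
        "boat": (12.0, "destination"),
        "flight to hub": (4.0, "hub"),
    },
    "hub": {
        "train": (3.0, "destination"),
        "taxi": (7.0, "destination"),
    },
}


def learn_costs(episodes=500, seed=0):
    """Estimate the total cost to the destination for every action by trial and error."""
    rng = random.Random(seed)
    # q[state][action] = current estimate of the total remaining cost.
    q = {state: {action: 0.0 for action in actions} for state, actions in ROUTES.items()}
    for _ in range(episodes):
        state = "home"
        while state != "destination":
            action = rng.choice(list(ROUTES[state]))  # try something at random
            cost, next_state = ROUTES[state][action]
            # Cheapest known way to finish the trip from wherever we landed.
            future = min(q[next_state].values()) if next_state in q else 0.0
            # Costs are fixed in this toy example, so overwrite the old estimate.
            q[state][action] = cost + future
            state = next_state
    return q


def best_plan(q):
    """Follow the cheapest estimated action from home until the destination."""
    plan, state = [], "home"
    while state != "destination":
        action = min(q[state], key=q[state].get)
        plan.append(action)
        state = ROUTES[state][action][1]
    return plan


q_table = learn_costs()
print(best_plan(q_table))
```

Real reinforcement learning problems usually have uncertain outcomes and far more states, so the estimates are nudged gradually instead of overwritten. But the loop is the same idea: try a decision, observe the cost, and update what you believe.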
As you can see, machine learning can be used to solve a pretty wide range of problems. Some of these problems are well understood, and we simply want a way to automate them. But some problems are simply too complex for a human to solve. Machine learning can provide answers that we would otherwise not have.
Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.
Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.
When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.