Common Machine Learning Models & How to Train Them

How to Choose?

Developed with
Calculated Systems




This course explores the core concepts of machine learning, the models available, and how to train them. We’ll take a deeper look at what it means to train a machine learning model, as well as the data and methods required to do so. We’ll also provide an overview of the most common models you’re likely to encounter, and take a practical approach to understand when and how to use them to solve business problems.

In the second half of this course, you will be guided through a series of case studies that will show you how to apply the concepts covered in this course to real-life examples.

If you have any feedback relating to this course, feel free to contact us at support@cloudacademy.com.

Learning Objectives

  • Understand the key concepts and models related to machine learning
  • Learn how to use training data sets with machine learning models
  • Learn how to choose the best machine learning model to suit your requirements
  • Understand how machine learning concepts can be applied to real-world scenarios in property prices, health, animal classification, and marketing activities

Intended Audience

This course is intended for anyone who is:

  • Interested in understanding machine learning models on a deeper level
  • Looking to enrich their understanding of machine learning and how to use it to solve complex problems
  • Looking to build a foundation for continued learning in the machine learning space and data science in general


To get the most out of this course, you should have a general understanding of data concepts as well as some familiarity with cloud providers and their managed services, especially Amazon or Google. Some experience in data or development is preferable but not essential.


Finally, before we dive into some case studies, I wanna leave everybody with one handy tool to help narrow down the broad field of machine learning models. This flow chart, if you go down it, will guide you to at least the right category of model to start looking at.

The flow chart starts with a very simple question: do you understand the problem well enough to begin to model the inputs and outputs? If you don't, you need to go back to thinking about what exactly you're trying to answer.

Remember, you have to ask the machine learning model a good question such as, is it going to rain? Is it going to snow? How tall is this person? Once you can do that, you can move on to, do you have labeled data? Can you even generate labeled data?

If you're not able to get labeled data, you have to go down the path of unsupervised or maybe reinforcement-based learning. However, if you do have labeled data, and in many cases you will, you can go down the route of what's called supervised learning as we previously discussed.
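As a rough sketch, the first two questions in the flow chart could be written as a small Python function. The function name, parameter names, and returned strings are made up for illustration and are not part of the course material.

```python
def learning_family(understands_problem, has_labeled_data):
    """Walk the first two questions of the flow chart.

    Both arguments are plain booleans that you answer yourself;
    nothing here inspects real data.
    """
    # Question 1: can you model the inputs and outputs yet?
    if not understands_problem:
        return "go back and define the question"
    # Question 2: do you have (or can you generate) labeled data?
    if has_labeled_data:
        return "supervised learning"
    return "unsupervised or reinforcement learning"


print(learning_family(True, True))
print(learning_family(True, False))
```

The order of the checks mirrors the chart: the labeled-data question is only asked once the problem itself is understood.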

Because there are so many different types of supervised learning, let's dive in just a little bit more. Once you've determined that you can get labeled data, you need to think: what are your labels?

Remember, the labels are what you're expecting out of the model. Are you expecting categorical data or numeric data? Numeric data is self-explanatory. It is something along the lines of a score in a video game or the speed of a car or the height of a person. It's literally just numbers.

On the flip side is categorical data. This is where the information is more like left-handed or right-handed, or top or bottom. There's no clear number to put to it. Now, where it gets a little fuzzy is with ordered categorical data, such as bad, neutral, and positive reviews, because you might be able to convert that to numeric.
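To make the ordered-category case concrete, here is a minimal sketch of turning review labels into numbers by rank. The label names follow the example above; the function name and the choice of 0, 1, 2 as codes are assumptions made for illustration.

```python
# Assumed ordering for the example: bad < neutral < positive.
REVIEW_ORDER = ["bad", "neutral", "positive"]


def encode_ordinal(labels, order=REVIEW_ORDER):
    """Replace each ordered category with its rank (0, 1, 2, ...)."""
    rank = {category: index for index, category in enumerate(order)}
    # A label that is not in `order` raises KeyError, which is intentional:
    # silently guessing a rank would hide a data problem.
    return [rank[label] for label in labels]


reviews = ["positive", "bad", "neutral", "positive"]
print(encode_ordinal(reviews))  # [2, 0, 1, 2]
```

Keep in mind that evenly spaced integers imply that the step from bad to neutral is the same size as the step from neutral to positive. If that doesn't hold for your data, treating the labels as plain categories may be the safer choice.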

So if that's the case, you need to decide whether you'd rather treat it as numeric or as categorical. But at the high level, numeric versus categorical will drive you towards either regression or classification-type machine learning, respectively. And finally, within classification, there are really two types of problems. There's binary and multi-class.

Binary problems are simply those with two categories. Is the switch on or is the switch off? Is it true? Is it false? On the flip side, you have multi-class. This is when there are more than two categories, such as: is it positive, neutral, or negative?
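The lower half of the flow chart can be sketched as a check on the labels themselves. This is only an illustration under simple assumptions: the function name is hypothetical, numeric labels are taken to mean regression, and anything else is treated as a category.

```python
from numbers import Number


def supervised_task_type(labels):
    """Guess the kind of supervised problem from a list of labels."""
    distinct = set(labels)
    if len(distinct) < 2:
        raise ValueError("need at least two distinct label values")
    # bool is a subclass of int in Python, so exclude it explicitly:
    # True/False labels are categories, not measurements.
    all_numeric = all(
        isinstance(value, Number) and not isinstance(value, bool)
        for value in distinct
    )
    if all_numeric:
        return "regression"
    if len(distinct) == 2:
        return "binary classification"
    return "multi-class classification"


print(supervised_task_type([1.62, 1.75, 1.80]))                     # heights
print(supervised_task_type(["on", "off", "on"]))                    # switch state
print(supervised_task_type(["positive", "neutral", "negative"]))    # reviews
```

One caveat with this shortcut: categories that happen to be stored as numbers, such as class IDs 0 and 1, would be read as regression here, so the final call still depends on what the labels mean.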

Hopefully, this chart helps you begin to narrow down what machine learning algorithms you want to use. But remember, always follow the process of defining the problem, understanding the inputs and outputs, picking which type of model you're going to use, and then picking a specific algorithm within that category.

Hopefully, this tool has helped you narrow down at least from hundreds of potential models to just a few. And now let's dig into some of the case studies to help you see how this process can be used in real life for answering real problems.

About the Author

Calculated Systems was founded by experts in Hadoop, Google Cloud and AWS. Calculated Systems enables code-free capture, mapping and transformation of data in the cloud based on Apache NiFi, an open source project originally developed within the NSA. Calculated Systems accelerates time to market for new innovations while maintaining data integrity. With cloud automation tools, deep industry expertise, and experience productionalizing workloads, development cycles are cut down to a fraction of their normal time. The ability to quickly develop large-scale data ingestion and processing decreases the risk companies face in long development cycles. Calculated Systems is one of the industry leaders in Big Data transformation and education of these complex technologies.