Handling Missing Data

Lab Steps

Logging in to the Amazon Web Services Console
Opening the Lab's Jupyter Notebook
Solutions to Handling Missing Data

This hands-on lab is part of the following learning path:

AWS Machine Learning – Specialty Certification Preparation


Difficulty: Intermediate
Time Limit: 1h
Students: 114
Rating: 5/5

Description

What do you do when there are unknown or missing values in your data?

This lab walks you through several ways to handle missing data, including substituting a default value and building a model that predicts the missing values from other variables present in the data set.
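The two approaches named above can be sketched roughly as follows. This is a minimal illustration, not the lab's own solution: the toy data, the column names height and weight, and the choice of a least-squares line as the prediction model are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 'weight' has two missing values, 'height' is complete.
df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0, 190.0, 200.0],
    "weight": [50.0, 60.0, np.nan, 80.0, np.nan, 100.0],
})

# Approach 1: replace missing values with a default (here, the column median).
default_filled = df["weight"].fillna(df["weight"].median())

# Approach 2: predict missing values from another variable.
# Split the rows into those with a known target and those without.
known = df[df["weight"].notna()]
missing = df[df["weight"].isna()]

# Fit weight = slope * height + intercept by ordinary least squares.
# (A plain linear fit is an assumption; any regression model could stand in.)
X_known = np.column_stack([known["height"], np.ones(len(known))])
coef, *_ = np.linalg.lstsq(X_known, known["weight"].to_numpy(), rcond=None)

# Predict only the missing entries and write them back by index.
X_missing = np.column_stack([missing["height"], np.ones(len(missing))])
model_filled = df["weight"].copy()
model_filled.loc[missing.index] = X_missing @ coef
```

A default value gives every gap the same number, while the model-based fill lets each gap vary with the other variables in its row. Which one is appropriate depends on how strongly the missing column relates to the others.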

Learning Objectives

Upon completion of this lab you will be able to:

  • Import data using pandas
  • Check for missing values
  • Drop rows with missing data
  • Replace missing values with default values
  • Impute missing values using a prediction model
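As a rough sketch of the first four objectives, assuming a small hypothetical CSV in place of the lab's data set (the column names and default values below are invented for illustration):

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the lab's data file.
csv_text = """name,age,city
Ann,34,Boston
Bob,,Denver
Cara,29,
Dan,41,Austin
"""

# Import data using pandas (read_csv also accepts a file path).
df = pd.read_csv(io.StringIO(csv_text))

# Check for missing values: count the missing entries in each column.
missing_counts = df.isna().sum()

# Drop rows with missing data: keep only rows where every column is present.
dropped = df.dropna()

# Replace missing values with defaults, chosen per column:
# the mean for the numeric column, a placeholder label for the text column.
filled = df.fillna({"age": df["age"].mean(), "city": "Unknown"})
```

Dropping rows is the simplest option but discards the values that were present in those rows. Filling with defaults keeps every row at the cost of introducing values that were never observed.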

Intended Audience

This lab is intended for:

  • Machine learning engineers
  • Anyone interested in evaluating machine learning model performance

Prerequisites

You should possess:

  • A basic understanding of Python

About the Author

Students: 7287
Labs: 31
Courses: 13
Learning paths: 17

Calculated Systems was founded by experts in Hadoop, Google Cloud, and AWS. Calculated Systems enables code-free capture, mapping, and transformation of data in the cloud based on Apache NiFi, an open source project originally developed within the NSA. Calculated Systems accelerates time to market for new innovations while maintaining data integrity. With cloud automation tools, deep industry expertise, and experience productionalizing workloads, development cycles are cut down to a fraction of their normal time. The ability to quickly develop large-scale data ingestion and processing decreases the risk companies face in long development cycles. Calculated Systems is one of the industry leaders in Big Data transformation and in education on these complex technologies.