
Get Started with Amazon SageMaker Data Wrangler, Data Pipeline, Feature Store and Ground Truth

The course is part of these learning paths

Start Modelling Data with Amazon SageMaker: 2 course steps, 1 certification, 3 lab steps
AWS Machine Learning – Specialty Certification Preparation: 37 course steps, 14 certifications, 11 lab steps
Introduction to SageMaker Data Wrangler
Overview
Difficulty: Beginner
Duration: 32m
Students: 153
Rating: 3.5/5

Description

Get started with the latest Amazon SageMaker services (Data Wrangler, Data Pipeline and Feature Store) released at re:Invent in December 2020. We also look at SageMaker Ground Truth and how it can help us sort and label data.

Get a head start in machine learning by learning how these services can reduce the effort and time required to load and prepare data sets for analysis and modeling. Data scientists will often spend 70% or more of their time cleaning, preparing, and wrangling their data into a state where it’s suitable for training machine learning algorithms. It’s a lot of work, and these new SageMaker services provide an easier way.

Transcript

Hello and welcome, I'm Andy Larkin, and in this fast-track course, I'm going to introduce you to the new Amazon SageMaker Data Wrangler, SageMaker Pipelines and the SageMaker Feature Store. Now, these three SageMaker services are real game changers for budding data engineers and data scientists. So I'll show you how you can get started using these three services, the Data Wrangler, the Data Pipelines, and the Feature Store, within Amazon SageMaker Studio.

So what are they? Data Wrangler is a way to fast-track the loading and normalizing of data sets. Data Pipelines enables you to integrate that cleansing and normalization process with modeling and combine them into a workflow that can be shared across teams in a very visual interface. The SageMaker Feature Store enables you to save all of this process, the data loading, selection, cleansing, exploration, and visualization processes, as a library so they can be used and reused by other team members.
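To make "cleansing and normalizing" concrete, here is a minimal sketch in plain Python. It is not the Data Wrangler API and uses no AWS calls. The function names `cleanse` and `min_max_normalize` and the sample values are hypothetical, and it assumes that cleansing means dropping missing values and that normalizing means min-max scaling into the range 0 to 1.

```python
from typing import List, Optional


def cleanse(values: List[Optional[float]]) -> List[float]:
    """Drop missing entries (None) so only usable numbers remain."""
    return [v for v in values if v is not None]


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample column with one missing value.
raw_ages = [20.0, None, 30.0, 40.0]
normalized_ages = min_max_normalize(cleanse(raw_ages))
print(normalized_ages)  # [0.0, 0.5, 1.0]
```

A visual tool such as Data Wrangler lets you apply transforms of this kind without writing them by hand each time, which is the repetition the course is talking about.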

These three services make the job of data engineers and data scientists much easier. They reduce some of the heavy lifting and repetition that we tend to get stuck with in importing and cleansing data. These three new services are available within SageMaker Studio. Data Wrangler requires a little bit of one-time configuration. For example, you have to use a specific EC2 instance for your notebook, but once you get that set up and going, the process is a game changer in the way that you load and normalize data sets.

So the Data Wrangler service lets you complete each step of a data preparation workflow, so you can have data ready for modeling sooner. You can do the four steps of data preparation from this one place. What I like most about that preparation stage is that the SageMaker visualization tool allows you to preview the data that you've loaded and normalized, just to check its readiness and completeness before you begin any modeling or pass it to another part of a team to run analysis on.
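As a rough illustration of what checking "readiness and completeness" can mean, the sketch below reports how much of each column is filled in. It is plain Python with the standard `csv` module, not a SageMaker feature. The `completeness_report` function and the sample data are hypothetical, and it assumes that an empty cell counts as a missing value.

```python
import csv
import io

# Hypothetical sample data: one missing age and one missing country.
SAMPLE_CSV = """customer_id,age,country
1,34,NZ
2,,AU
3,51,
"""


def completeness_report(csv_text):
    """Return the fraction of non-empty values for each column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    report = {}
    for column in rows[0].keys():
        # Count cells that contain something other than whitespace.
        filled = sum(1 for row in rows if (row[column] or "").strip())
        report[column] = filled / len(rows)
    return report


report = completeness_report(SAMPLE_CSV)
for column, fraction in report.items():
    print(f"{column}: {fraction:.0%} complete")
```

A column that comes back well below 100% is a signal to go back and cleanse or impute before handing the data on for modeling.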

Okay, so we're gonna look at these features and get familiar with how to use them, so you can start using them to prepare data for modeling and for visualization.

Lectures

Getting Started with Data Wrangler - Setting Up SageMaker to Run Data Wrangler - Using Data Wrangler - Introduction to SageMaker Ground Truth - Service and Cost Review

About the Author
Students: 101,790
Courses: 98
Learning paths: 82

Head of Content

Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession", as everything at AWS starts with the customer. Passions around work are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few startups.