AWS Data Wrangler or SageMaker Data Wrangler

Learning Objectives

This is an introductory-level AWS development course. You will learn what the AWS Data Wrangler library is, what it does, and how to set it up for use.

Intended Audience

This course is intended for AWS Python developers familiar with the Pandas and PyArrow libraries who are building non-distributed pipelines using AWS services. The AWS Data Wrangler library provides an abstraction for connectivity, extract, and load operations on AWS services. 


To get the most out of this course, you must meet the AWS Developer Associate certification requirements or have equivalent experience.

This course expects that you have an existing Python development environment and have set up the AWS CLI or SDK with the required configuration and keys. Familiarity with Python syntax is also required. We walk through the basic setup for some of these but do not provide detailed explanations of the process.

For fundamentals and additional details about these skills, you can refer to the following courses here at Cloud Academy:  

1) Python for Beginners 

2) Data Wrangling With Pandas

3) Introduction to the AWS CLI 

4) How to Use the AWS Command-Line Interface



AWS Data Wrangler or SageMaker Data Wrangler. Before we close this discussion, it's important to clarify one detail: what is AWS Data Wrangler, and how is it different from Amazon SageMaker Data Wrangler? The short answer is that they both wrangle data but are otherwise different things used in different places. Amazon SageMaker Data Wrangler is a recently introduced SageMaker Studio feature that has a similar name but a different purpose than the AWS Data Wrangler open-source project. For additional details regarding AWS Data Wrangler and the open-source project, you can contact the AWS Professional Service Open Source Initiative via email. So, in short, Amazon SageMaker Data Wrangler is specific to the SageMaker Studio environment and has no relation to AWS Data Wrangler, which is open source, runs anywhere Python does, and is focused on developers and the integration between the Python pandas library and AWS data services in general.
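
To make that last point concrete, here is a minimal sketch of the kind of pandas-to-AWS integration the open-source library offers. It assumes the awswrangler package is installed and AWS credentials are already configured. The function name, the example bucket path, and the optional wr parameter are hypothetical and exist only for illustration; wr.s3.to_parquet and wr.s3.read_parquet are calls from the library's S3 module.

```python
def s3_parquet_round_trip(df, path, wr=None):
    """Write a pandas DataFrame to S3 as Parquet, then read it back.

    `wr` defaults to the awswrangler module. The parameter is a hypothetical
    hook so the function can be exercised without AWS access.
    """
    if wr is None:
        import awswrangler as wr  # assumes: pip install awswrangler
    wr.s3.to_parquet(df=df, path=path)    # upload the DataFrame as a Parquet object
    return wr.s3.read_parquet(path=path)  # download it again as a DataFrame


# Example call (placeholder bucket, requires real AWS credentials):
#   import pandas as pd
#   frame = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
#   s3_parquet_round_trip(frame, "s3://my-bucket/demo/frame.parquet")
```

Nothing here depends on SageMaker Studio: the same few lines would run from a laptop, a Lambda function, or any other environment where Python and the library are available.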


About the Author
Jorge Negrón
AWS Content Architect

Experienced in the architecture and delivery of cloud-based solutions, the development and delivery of technical training, defining requirements and use cases, and validating architectures for results. Excellent leadership, communication, and presentation skills with attention to detail. Hands-on administration and development experience with the ability to mentor and train on current and emerging technologies (Cloud, ML, IoT, Microservices, Big Data & Analytics).