AWS Big Data Specialty - Processing


Course Introduction
1h 15m

In this course for the Big Data Specialty Certification, we learn how to identify the appropriate data processing technologies needed for big data scenarios. We explore how to design and architect a data processing solution, and explore and define the operational characteristics of big data processing. 

Learning objectives

  • Recognize and explain how to identify the appropriate data processing technologies needed for big data scenarios.
  • Recognize and explain how to design and architect a data processing solution.

Intended audience

This course is intended for students wanting to extend their knowledge of the data processing options available in AWS.


Prerequisites

While there are no formal prerequisites for this course, students will benefit from having a basic understanding of cloud computing services. If you would like to gain a solid foundation in compute fundamentals, then check out our Compute Fundamentals For AWS course.

This Course Includes

75 minutes of high-definition video.

What You'll Learn

  • Course Intro: What to expect from this course
  • Amazon Elastic MapReduce Overview: In this lesson, we discuss how EMR allows you to store and process data.
  • Amazon Elastic MapReduce Architecture: In this lesson, you’ll learn about EMR’s clustered architecture.
  • Amazon Elastic MapReduce in Detail: In this lesson, we’ll dig deeper into EMR storage options, resource management, and processing options.
  • Amazon Elastic MapReduce Reference Architecture: Best practices for using EMR.
  • AWS Lambda Introduction: This lesson will kick off our discussion of Lambda and how it’s used in Big Data scenarios.
  • AWS Lambda Overview: This lesson discusses how Lambda allows you to run code for virtually any type of application or backend service with no administration.
  • AWS Lambda Architecture: In this lesson, we’ll discuss generic Lambda architecture and Amazon’s serverless service.
  • AWS Lambda in Detail: In this lesson, we’ll dig into Events and Service Limits.
  • AWS Lambda Reference Architecture: In this lesson, we'll look at a real-life scenario of how Lambda can be used.

Hello and welcome to another Big Data on AWS course from Cloud Academy. In this course, we focus on Amazon Big Data services which are designed to process data. This course is part of a larger learning path that covers the broad range of Big Data services available from AWS.

This course assumes you have a good understanding of cloud computing and AWS, and that you are proficient with provisioning and using services within AWS. Ideally, you will also have some background understanding of Big Data. There are a large number of AWS Big Data services available, and this course is designed to provide the initial core concepts required for each of these services and to assist people in passing the AWS Big Data Specialty exam.

A little bit about me. My name is Shane Gibson and I've worked in the area of data and business intelligence for over 20 years. For the last three years, I've been focusing on how we can use Agile processes and cloud computing technologies to accelerate the delivery of data and content to our users. I was born and I still live in New Zealand. I love craft beer and good coffee. You can learn more about me by following me on either Twitter or LinkedIn.

At the end of this course, you will be able to describe in detail how Amazon Big Data services can be used to process data within a Big Data solution. In this Big Data on AWS learning path, we cover many AWS Big Data services that can be used to collect, store, process, analyze, visualize and secure Big Data. In this course, we provide two modules which cover the Big Data processing services of Amazon EMR and AWS Lambda.

Each of these two Big Data processing services can be used on its own or in combination with the other to provide processing capability for your Big Data solution. Each of these processing services has specific strengths that make it more suitable for processing different types and volumes of data, and we discuss these as we progress through the course. In each of the modules, we cover which processing and storage patterns the service fits within, the architecture of the service, and the core concepts that help you understand that service in detail.

We also cover the service limits for each service where applicable. At the end of the two modules, we will wrap up with a quick overview of a reference architecture which uses these services. So let's begin and find out how we can process Big Data using Amazon Web Services capabilities.

About the Author

Shane has been immersed in the world of data, analytics and business intelligence for over 20 years, and for the last few years he has been focusing on how Agile processes and cloud computing technologies can be used to accelerate the delivery of data and content to users.

He is an avid user of the AWS cloud platform to help deliver this capability with increased speed and decreased costs. In fact, it's often hard to shut him up when he is talking about the innovative solutions that AWS can help you to create, or how cool the latest AWS feature is.

Shane hails from the far end of the earth, Wellington, New Zealand, a place famous for Hobbits and Kiwifruit. However, you're more likely to see him partake of a good long black or an even better craft beer.