Introduction to EMR
3h 46m

Domain One of the AWS Solutions Architect Associate exam guide (SAA-C03) requires us to be able to design a multi-tier architecture solution, so that is our topic for this section.
We cover the need-to-know aspects of designing multi-tier solutions using AWS services.


Learning Objectives

  • Learn some of the essential services for creating multi-tier architectures on AWS, including the Simple Queue Service (SQS) and the Simple Notification Service (SNS)
  • Understand data streaming and how Amazon Kinesis can be used to stream data
  • Learn how to design a multi-tier solution on AWS, and the important aspects to take into consideration when doing so
  • Learn how to design cost-optimized AWS architectures
  • Understand how to leverage AWS services to migrate applications and databases to the AWS Cloud

Hello and welcome to this lecture covering Elastic MapReduce, known as EMR.

Amazon Elastic MapReduce is a managed service designed to process and analyze vast amounts of data. It does this through jobs that can be short-running, with per-second costs, or long-running, in which case you can build high availability into your architecture.

EMR is based on the popular and solid Apache Hadoop framework, an open-source distributed processing framework intended for big data processing. Organizations and companies can benefit greatly from using Amazon EMR because it abstracts and reduces the complexity of the infrastructure layer used with traditional MapReduce frameworks.

Implementing a healthy Hadoop cluster is not a trivial effort. So AWS encapsulated all the infrastructure of the Hadoop framework into an integrated environment, letting you launch a cluster in minutes and focus on the really important part, which is not managing infrastructure but getting your data processed according to your needs.

Amazon EMR securely and reliably handles your data analytics use cases, including log analysis, web indexing, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics. Amazon EMR takes advantage of Amazon EC2 instances that are configured with the Hadoop framework to deliver petabyte-scale processing power.

Amazon EMR also supports a number of other frameworks used within the field of big data and data analytics, including Spark, Presto, and HBase. Using the AWS Management Console or the AWS CLI, you can quickly and easily create clusters for each of these frameworks.
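To make the idea of creating a cluster for one of these frameworks more concrete, here is a minimal sketch in Python. It is not part of the lecture. It assumes the boto3 SDK and its EMR client's run_job_flow call as an alternative to the Console or CLI. The helper names build_emr_cluster_request and launch_cluster are hypothetical, and the release label, instance type, region, and default role names are assumptions you would replace with values valid for your own account.

```python
from typing import Any, Dict, List


def build_emr_cluster_request(
    name: str,
    applications: List[str],
    release_label: str = "emr-6.15.0",  # assumption: pick a release available to you
    instance_type: str = "m5.xlarge",   # assumption: any EMR-supported instance type
    core_count: int = 2,
    keep_alive: bool = False,
) -> Dict[str, Any]:
    """Build the keyword arguments for the EMR run_job_flow API call.

    keep_alive=False describes a short-running cluster that terminates once
    its steps finish; keep_alive=True describes a long-running cluster.
    """
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        # Frameworks to install on the cluster, e.g. Spark, Presto, HBase.
        "Applications": [{"Name": app} for app in applications],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "Primary",
                    "InstanceRole": "MASTER",
                    "InstanceType": instance_type,
                    "InstanceCount": 1,
                },
                {
                    "Name": "Core",
                    "InstanceRole": "CORE",
                    "InstanceType": instance_type,
                    "InstanceCount": core_count,
                },
            ],
            "KeepJobFlowAliveWhenNoSteps": keep_alive,
        },
        # Assumption: the default EMR roles already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch_cluster(request: Dict[str, Any], region: str = "us-east-1") -> str:
    """Send the request to AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so building a request needs no AWS setup

    client = boto3.client("emr", region_name=region)
    response = client.run_job_flow(**request)
    return response["JobFlowId"]


if __name__ == "__main__":
    # Build (but do not send) a request for a short-running Spark cluster.
    request = build_emr_cluster_request("spark-demo", ["Spark"])
    print(request["Applications"])
```

Building the request separately from sending it means the configuration can be inspected or reviewed before any billable resources are created. Calling launch_cluster(request) would start a real cluster and incur charges.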

About the Author

Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.