AWS Data Pipeline Reference Architecture

This course is designed to show you how to use the AWS Data Pipeline service for data collection requirements. 

Learning Objectives

  • Recognize and explain the operational characteristics of a data collection system.
  • Recognize and explain how a collection system can be designed to handle the frequency of data change and the type of data being ingested.
  • Recognize and identify properties that may need to be enforced by a collection system.
  • Recognize and explain AWS Data Pipeline core concepts.

Intended Audience

This course is intended for students looking to increase their knowledge of data collection methods and techniques with big data solutions.

What You'll Learn

  • Introduction to Data Pipeline: In this lesson, we'll discuss the basics of Data Pipeline.
  • AWS Data Pipeline Architecture: In this lesson, we'll go into more detail about the architecture that underpins the AWS Data Pipeline Big Data Service.
  • AWS Data Pipeline Core Concepts: In this lesson, we'll discuss how we define data nodes, access, activities, schedules, and resources.
  • AWS Data Pipeline Reference Architecture: In this lesson, we'll look at a real-life scenario of how AWS Data Pipeline can be used.

Okay, as we come to the end of this module on AWS Data Pipeline, let's have a quick look at an example of a reference architecture from AWS where AWS Data Pipeline can be used. In this scenario, sensor data is streamed from devices such as power meters or cell phones through Amazon Simple Queue Service (SQS) and into a DynamoDB database. From there, AWS Data Pipeline is used to move the data from DynamoDB to Amazon EMR and trigger an EMR job to execute some polling processes on that data. The data can then be moved from EMR into Redshift so that our standard business intelligence tools can use it for reporting purposes. That brings us to the end of the AWS Data Pipeline module. I look forward to speaking with you again.
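To make the flow in this reference architecture a little more concrete, here is a minimal Python sketch of how the DynamoDB to EMR to Redshift stages might be written down as pipeline objects. It only builds and checks an in-memory structure and does not call AWS. The object type names (DynamoDBDataNode, EmrCluster, EmrActivity, S3DataNode, RedshiftDataNode, RedshiftCopyActivity) are Data Pipeline object types, but every id, table name, bucket path, and step value is a hypothetical placeholder. The S3 staging node between EMR and Redshift is also an assumption that the lesson does not mention. A working definition would need more than this, such as a schedule, IAM roles, a Redshift database object, and a resource to run the copy on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Marks a field value as a reference to another pipeline object's id."""
    target: str


def pipeline_object(obj_id, obj_type, **fields):
    """Build one object as id/name plus a list of key/value fields."""
    out = [{"key": "type", "stringValue": obj_type}]
    for key, value in fields.items():
        if isinstance(value, Ref):
            out.append({"key": key, "refValue": value.target})
        else:
            out.append({"key": key, "stringValue": str(value)})
    return {"id": obj_id, "name": obj_id, "fields": out}


def build_reference_pipeline():
    """Sketch of the stages from the lesson. All names are placeholders."""
    return [
        # Source: the DynamoDB table that receives the sensor data.
        pipeline_object("SensorTable", "DynamoDBDataNode",
                        tableName="sensor_readings"),
        # Compute: the EMR cluster that runs the processing job.
        pipeline_object("ProcessingCluster", "EmrCluster"),
        # Assumed staging location for the EMR output (not in the lesson).
        pipeline_object("StagedOutput", "S3DataNode",
                        directoryPath="s3://example-bucket/processed/"),
        # Activity: run the EMR job against the DynamoDB data.
        pipeline_object("ProcessSensorData", "EmrActivity",
                        input=Ref("SensorTable"),
                        output=Ref("StagedOutput"),
                        runsOn=Ref("ProcessingCluster"),
                        step="s3://example-bucket/jobs/process.jar"),
        # Destination: the Redshift table used by the BI tools.
        pipeline_object("ReportingTable", "RedshiftDataNode",
                        tableName="sensor_summary"),
        # Activity: load the processed data into Redshift after the EMR job.
        pipeline_object("LoadToRedshift", "RedshiftCopyActivity",
                        input=Ref("StagedOutput"),
                        output=Ref("ReportingTable"),
                        insertMode="TRUNCATE",
                        dependsOn=Ref("ProcessSensorData")),
    ]


def dangling_refs(objects):
    """Return every referenced id that no object in the list defines."""
    ids = {o["id"] for o in objects}
    return [f["refValue"]
            for o in objects
            for f in o["fields"]
            if "refValue" in f and f["refValue"] not in ids]


if __name__ == "__main__":
    objects = build_reference_pipeline()
    for o in objects:
        print(o["id"], "->", o["fields"][0]["stringValue"])
    print("dangling references:", dangling_refs(objects))
```

The dangling_refs check is just a convenience for this sketch: it confirms that each activity points at a data node or resource that is actually defined, which mirrors the way the lesson chains the stages together.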

About the Author

Shane has been immersed in the world of data, analytics and business intelligence for over 20 years, and for the last few years he has been focusing on how Agile processes and cloud computing technologies can be used to accelerate the delivery of data and content to users.

He is an avid user of the AWS cloud platform to help deliver this capability with increased speed and decreased costs. In fact, it's often hard to shut him up when he is talking about the innovative solutions that AWS can help you to create, or how cool the latest AWS feature is.

Shane hails from the far end of the earth, Wellington, New Zealand, a place famous for Hobbits and Kiwifruit. However, you're more likely to see him partake of a good long black or an even better craft beer.