Developing Serverless ETL with AWS Glue
Introduction
Difficulty
Beginner
Duration
32m
Students
4503
Ratings
4.2/5
Description

If you have data that needs to be analyzed, you will likely need to put that data through an extract, transform and load (ETL) process. AWS Glue is a fully managed service designed to do just this. Through a series of simple configurable options, you can select the source data to be processed by AWS Glue, which turns it into cataloged, searchable, and queryable data. This course will take you through the fundamentals of AWS Glue to get you started with this service.
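As a rough illustration of the three ETL stages mentioned above, here is a minimal sketch in plain Python. It does not use AWS Glue or any AWS API; the sample CSV data, the `orders` table, and the function names are all assumptions invented for this example, with an in-memory SQLite database standing in for a real destination.

```python
import csv
import io
import sqlite3

# Hypothetical source data: a small CSV with one malformed amount.
SOURCE_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,3.25
3,carol,not-a-number
"""


def extract(text):
    """Extract: read raw rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, cast types, and skip rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount is not numeric
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(records, conn):
    """Load: write the cleaned records into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order and return (row count, total amount)."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()
```

Calling `run_etl(SOURCE_CSV, sqlite3.connect(":memory:"))` loads the two valid rows and skips the malformed one. A managed service such as AWS Glue takes over the infrastructure behind these stages, but the extract, transform, and load sequence stays the same.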

Learning Objectives

The objectives of this course are to provide you with an understanding of:

  • Serverless ETL
  • The architecture of a typical ETL project
  • The prerequisite setup of AWS components needed to use AWS Glue for ETL
  • How to use AWS Glue to perform serverless ETL
  • How to edit ETL processes created from AWS Glue

Intended Audience

This course is ideal for:

  • Data warehouse engineers who are looking to learn more about serverless ETL and AWS Glue
  • Developers who want to learn more about ETL work using AWS Glue
  • Developer leads who want to learn more about the serverless ETL process
  • Project managers and owners who want to learn about data preparation

Prerequisites

As a prerequisite to this course you should have familiarity with:

  • One or more of the data storage destinations offered by AWS
  • Data warehousing principles
  • Serverless computing
  • Object-oriented programming (Python)

Feedback

We welcome all feedback and suggestions. Please contact us at support@cloudacademy.com if you are unsure about where to start or if you would like help getting started.

Transcript

Hello and welcome to this course where I shall discuss developing for serverless extract, transform and load operations using AWS Glue. First I will focus on the difference between serverless ETL and traditional ETL and provide some background for why AWS Glue is a great tool for a data engineer's arsenal. I will then show some diagrams for how AWS Glue is used and offer a demo of moving data from fictional source data to a couple of AWS destinations using AWS Glue. Lastly, I will discuss the benefits and tradeoffs for developing serverless ETL and using AWS Glue. 

This course has been written and created by Norm Warren and recorded by myself, Stuart Scott. Now, Norm Warren specializes in data development, data visualizations and data analysis using a variety of ETL tools in cloud and on-premise databases and more recently AWS and Microsoft Azure, Tableau and Microsoft BI. Feel free to connect with Norm with any questions using the details shown on the screen. Alternatively, you can always get in touch with us here at Cloud Academy by sending an email to support@cloudacademy.com. 

This segment is for data warehouse engineers who want to learn about serverless ETL and AWS Glue, developers who want to learn more about ETL work using AWS Glue, developer leads who want to learn about the serverless ETL process, and project managers and owners who want to learn more about data preparation.

This course is divided into five sections, and in these videos I will cover the following points: an overview of traditional ETL in comparison to AWS Glue, an overview of AWS Glue itself, a demonstration in which I create an ETL solution using AWS Glue, some of the use cases for AWS Glue, and finally a summary that highlights some of the key points from this course.

This course will provide you with the following: an understanding of serverless ETL, which means the extract, transform, and load of data; knowledge of the architecture of a typical ETL project between source data and destination databases, data warehouses or big data destinations; an understanding of the prerequisite setup of AWS parts to use AWS Glue for ETL; knowledge of how to use AWS Glue to perform serverless ETL; and how to edit ETL processes created from Glue.

There are a number of prerequisites to this course. These include an understanding of one or more of the data destinations offered by AWS and an awareness of data warehousing principles. It's also helpful if you have an understanding of serverless computing; if you'd like to know more about serverless, you can see our existing course, What Is Serverless Computing. Finally, you should have an understanding of object-oriented programming in a language such as Python.

Feedback on our courses here at Cloud Academy is valuable both to us as trainers and to any students looking to take the same course in the future. If you have any feedback, positive or negative, it would be greatly appreciated if you could send an email to support@cloudacademy.com.

About the Author
Students
4504
Courses
1

Move a metric, change products or behaviors... with data. That is what excites me. I am passionate about data and have worked to architect and develop data solutions using cloud and on-premise ETL and visualization tools. I am an evangelist for self-service data transformation, insights, and analytics. I love to be agile.

I extend my understanding to the community by giving presentations at Big Data Conferences, Code Camp, and other venues. I also write useful content in the form of white papers, two books on business intelligence, and blog posts.