hands-on lab

Implementing a Searchable Amazon S3 Data Lake

Beginner
1h 30m
425
3.6/5
Lab description

AWS Glue is a service that data analytics professionals can use to catalog, transform, and integrate data from different sources. Because these integration capabilities live in one centralized service, you can discover, cleanse, catalog, and transform data in a single place.

Learning how to use AWS Glue to work with data will help you become more effective at creating and using data lakes in the public AWS cloud.

In this lab, you will implement an AWS Lambda function that processes order data as it is uploaded to Amazon S3, and you will see how to configure AWS Glue to make searching the data more efficient.
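
As a rough illustration of the normalization step, the sketch below flattens a nested order document into a single-level record. It is not the lab's actual code: the order fields, the function names, and the `OUTPUT_BUCKET` value are hypothetical, and the handler assumes it is invoked with an EventBridge "Object Created" event, where the bucket name and object key appear under `detail`.

```python
import json

# Hypothetical destination bucket. Writing to a separate bucket avoids
# re-triggering the function with its own output.
OUTPUT_BUCKET = "example-normalized-orders"


def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into one level, joining key names with `sep`."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def normalize_order(raw_json):
    """Parse one order document and return a flat record with lower-case keys."""
    order = json.loads(raw_json)
    return {key.lower(): value for key, value in flatten(order).items()}


def lambda_handler(event, context):
    # Assumption: the event is an S3 "Object Created" event from EventBridge.
    bucket = event["detail"]["bucket"]["name"]
    key = event["detail"]["object"]["key"]

    import boto3  # imported here so the helpers above run without AWS access

    s3 = boto3.client("s3")
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    record = normalize_order(body)
    s3.put_object(Bucket=OUTPUT_BUCKET, Key=key, Body=json.dumps(record))
    return record
```

Keeping the parsing logic in `normalize_order` means it can be exercised locally with a JSON string, without deploying the function or touching Amazon S3.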

Learning Objectives

Upon completion of this beginner-level lab, you will be able to:

  • Use an AWS Lambda function to normalize JSON data
  • Use Amazon EventBridge to invoke an AWS Lambda function in response to an event
  • Configure an AWS Glue table to use a partition index
  • Search data stored in Amazon S3 with Amazon Athena
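
To make the EventBridge objective more concrete, here is a minimal sketch of an event pattern that matches Amazon S3 "Object Created" events for one bucket. The helper name, bucket name, and key prefix are hypothetical; the lab may use a different pattern. Note that a bucket only sends these events once Amazon EventBridge notifications are enabled on it.

```python
import json


def s3_object_created_pattern(bucket_name, key_prefix=None):
    """Build an EventBridge event pattern for S3 "Object Created" events."""
    detail = {"bucket": {"name": [bucket_name]}}
    if key_prefix is not None:
        # Prefix matching limits the rule to objects under one key prefix.
        detail["object"] = {"key": [{"prefix": key_prefix}]}
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": detail,
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, printed as JSON for pasting into a rule.
    pattern = s3_object_created_pattern("example-orders-bucket", "incoming/")
    print(json.dumps(pattern, indent=2))
```

The printed JSON can be used as the event pattern of a rule whose target is the Lambda function.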

Intended Audience

  • Candidates for the AWS Certified Data Analytics Specialty certification
  • Cloud Architects
  • Data Engineers
  • DevOps Engineers
  • Machine Learning Engineers
  • Software Engineers

Prerequisites

Familiarity with the following will be beneficial but is not required:

  • AWS Glue
  • Data Lakes
  • AWS Lambda
  • Amazon EventBridge
  • Amazon Athena

Updates

February 15th, 2023 - Updated the Lambda implementation step with a test event

About the author

  • Students: 66,402
  • Labs: 164
  • Courses: 2
  • Learning paths: 4

Andrew is a Labs Developer with previous experience in the Internet Service Provider, Audio Streaming, and CryptoCurrency industries. He has also been a DevOps Engineer and enjoys working with CI/CD and Kubernetes.

He holds multiple AWS certifications including Solutions Architect Associate and Professional.

Lab steps

  • Logging In to the Amazon Web Services Console
  • Implementing Event Processing for Amazon S3 with an AWS Lambda Function
  • Creating an Amazon EventBridge Rule
  • Adding a Partition Index to an AWS Glue Table
  • Searching Within Your Indexed Amazon S3 Data
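
The last two steps can be sketched in code as well. The example below builds the arguments for the AWS Glue `CreatePartitionIndex` API call and an Amazon Athena query that filters on the indexed partition keys. The database, table, index, and partition column names are hypothetical, and the lab itself may perform these steps in the AWS console rather than through an SDK.

```python
def partition_index_request(database, table, index_name, keys):
    """Build keyword arguments for the Glue CreatePartitionIndex API call."""
    return {
        "DatabaseName": database,
        "TableName": table,
        "PartitionIndex": {"IndexName": index_name, "Keys": list(keys)},
    }


def partition_query(table, filters):
    """Build a SELECT that filters on partition columns only.

    Values are interpolated directly, so pass trusted values only.
    """
    where = " AND ".join(f"{col} = '{val}'" for col, val in filters.items())
    return f"SELECT * FROM {table} WHERE {where}"


def create_index(request):
    """Send the request to AWS Glue (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the builders above run without AWS access

    boto3.client("glue").create_partition_index(**request)


if __name__ == "__main__":
    # Hypothetical names: an "orders" table partitioned by year and month.
    request = partition_index_request(
        "example_db", "orders", "year_month_idx", ["year", "month"]
    )
    print(request)
    print(partition_query("orders", {"year": "2023", "month": "02"}))
```

A query that restricts the indexed partition keys lets the catalog return only the matching partitions instead of listing every partition in the table.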