Setting Up a Simple Data Lake in AWS

Time Limit: 1h 55m


As organizations collect and analyze increasing amounts of data, traditional on-premises solutions for data storage, data management, and analytics struggle to keep pace. Data silos that aren’t built to work well together make it difficult to consolidate storage. This limits an organization’s capabilities in three ways:

• Organizations with data silos are less agile.

• Organizations with data silos derive fewer insights and get less value from their data.

• Organizations with data silos find it harder to adopt more sophisticated analytics tools and processes as needs evolve.

A data lake is designed to address these challenges. It’s a centralized, secure, and durable cloud-based storage platform that allows you to ingest and store structured and unstructured data, and transform these raw data assets as needed. The single platform combines storage, data governance, and analytics. You can use a complete portfolio of data exploration, reporting, analytics, machine learning, and visualization tools on the data.



AWS Glue is a fully managed extract, transform, and load (ETL) service that enables customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console.
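
The same create-and-run flow can also be scripted. The following is a minimal sketch, assuming the boto3 SDK is installed and AWS credentials are configured; the job name, IAM role ARN, script location, Glue version, and worker settings are all hypothetical placeholders rather than values from this lab.

```python
def build_glue_job_request(job_name, role_arn, script_s3_path):
    """Return keyword arguments for the Glue CreateJob API call."""
    return {
        "Name": job_name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",                 # Spark-based ETL job type
            "ScriptLocation": script_s3_path,  # ETL script stored in S3
            "PythonVersion": "3",
        },
        # Assumed settings; adjust to what your account and region support.
        "GlueVersion": "4.0",
        "WorkerType": "G.1X",
        "NumberOfWorkers": 2,
    }


def create_and_run_job(request):
    """Create the Glue job, start one run, and return the run ID."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_job(**request)
    run = glue.start_job_run(JobName=request["Name"])
    return run["JobRunId"]


# Hypothetical names, used only for illustration.
request = build_glue_job_request(
    job_name="example-datalake-etl",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    script_s3_path="s3://example-datalake-bucket/scripts/etl_job.py",
)
print(request["Name"], request["Command"]["ScriptLocation"])
```

Calling `create_and_run_job(request)` would submit the job to a real AWS account, so it is left out of the snippet above.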

Amazon Athena is an interactive query service that enables you to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.
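
As an illustration of querying S3 data with standard SQL, here is a minimal sketch that submits a query to Athena and polls until it finishes. It assumes boto3 and configured AWS credentials; the database, table, column, and results bucket names are hypothetical and not taken from this lab.

```python
import time


def build_athena_request(sql, database, output_s3_path):
    """Return keyword arguments for the Athena StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_s3_path},
    }


def run_query(request, poll_seconds=1.0):
    """Submit the query, wait for a terminal state, and return result rows."""
    import boto3  # imported lazily so the builder above works without boto3

    athena = boto3.client("athena")
    query_id = athena.start_query_execution(**request)["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Query {status['State']}: {reason}")
    return athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]


# Hypothetical table and columns, used only for illustration.
sql = (
    "SELECT product_category, COUNT(*) AS order_count "
    "FROM orders GROUP BY product_category LIMIT 10"
)
request = build_athena_request(
    sql,
    database="example_datalake_db",
    output_s3_path="s3://example-datalake-bucket/athena-results/",
)
print(request["QueryExecutionContext"]["Database"])
```

Because Athena bills per query, `run_query(request)` is deliberately not called here; invoke it only against an account and table you control.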

About the Author
Learning paths: 37

We are a world-leading tech and digital skills organization. We help many of the world’s top companies build their tech and digital capabilities through our range of training courses, reskilling bootcamps, work-based learning programs, and apprenticeships. We also create bespoke solutions, blending these elements to meet specific client needs.