What is the Difference between a Data Lake and a Data Warehouse?

Overview
Difficulty: Intermediate
Duration: 4h 11m
Students: 101
Ratings: 5/5
Description

This section of the AWS Certified Solutions Architect - Professional learning path introduces the AWS database services relevant to the SAP-C02 exam. It covers the service options available and how to select and apply AWS database services to meet the specific design scenarios that appear on the exam.

Learning Objectives

  • Understand the various database services that can be used when building cloud solutions on AWS
  • Learn how to build databases using Amazon RDS, DynamoDB, Redshift, DocumentDB, Keyspaces, and QLDB
  • Learn how to create ElastiCache and Neptune clusters
  • Understand which AWS database service to choose based on your requirements
  • Discover how to use automation to deploy databases in AWS
  • Learn about data lakes and how to build a data lake in AWS
Transcript

What is the difference between a data lake and a data warehouse?

When first getting into this space, there might be some confusion between data lakes and data warehouses. That is fairly common.

The main difference between a data lake and a data warehouse is specificity and structure. 

A data lake is a formless blob of information. It is a pool of knowledge where we try to capture any relevant data from our business so that we can perform analytics on it.

A data warehouse is a specialized tool that allows you to perform analysis on a portion of that data, so you can make meaningful decisions from it. Generally, it holds a subset of the data from the data lake, selected for a specialized purpose. Your data warehouse is an optimized database that deals with normalized, transformed, and cleaned-up versions of the data from the data lake.
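To make the distinction concrete, here is a minimal sketch in Python. It is not how AWS services implement either concept: the "lake" is just a list of raw strings, the "warehouse" is an in-memory SQLite table, and the record shapes and the names `raw_lake`, `load_orders`, and `build_warehouse` are all hypothetical, chosen only for illustration. It shows the idea described above: the lake keeps everything in whatever form it arrived, while the warehouse holds a cleaned, fixed-schema subset that is easy to query.

```python
import json
import sqlite3

# Hypothetical "data lake": raw records kept as-is, in mixed shapes and formats.
raw_lake = [
    '{"type": "order", "id": 1, "customer": " Alice ", "total": "19.99"}',
    '{"type": "order", "id": 2, "customer": "BOB", "total": "5.00"}',
    '{"type": "clickstream", "page": "/home", "ms": 120}',
    'free-text support note: customer called about invoice',
]


def load_orders(records):
    """Select only order records and clean them into a fixed schema."""
    rows = []
    for record in records:
        try:
            doc = json.loads(record)
        except json.JSONDecodeError:
            continue  # unstructured text stays in the lake only
        if not isinstance(doc, dict) or doc.get("type") != "order":
            continue  # other record types are outside this warehouse's purpose
        # Transform: trim and normalize names, convert totals to numbers.
        rows.append((doc["id"], doc["customer"].strip().title(), float(doc["total"])))
    return rows


def build_warehouse(records):
    """Load the cleaned subset into a structured, queryable table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, total REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", load_orders(records))
    conn.commit()
    return conn


warehouse = build_warehouse(raw_lake)
total_revenue = warehouse.execute("SELECT SUM(total) FROM orders").fetchone()[0]
```

All four raw records remain available in `raw_lake` for other kinds of analysis, but only the two order records reach the `orders` table, where a single SQL query can aggregate them.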

About the Author
Students: 36431
Courses: 26
Learning Paths: 20

Danny has over 20 years of IT experience as a software developer, cloud engineer, and technical trainer. After attending a conference on cloud computing in 2009, he knew he wanted to build his career around what was still a very new, emerging technology at the time, and to share this transformational knowledge with others. He has spoken to IT professional audiences at local, regional, and national user groups and conferences. He has delivered in-person classroom training, virtual training, and interactive webinars, and has authored video training courses covering many different technologies, including Amazon Web Services. He currently has six active AWS certifications, including certifications at the Professional and Specialty levels.