
Practical Example: Structure

Developed with
Calculated Systems
Difficulty: Intermediate
Duration: 42m

Description

This course discusses some of the fundamental concepts of data management and looks at the differences between spreadsheets and databases for managing data. We'll look at some specific examples to understand when spreadsheets make sense and when it makes sense to switch over to a database, which is often a better option for more complex datasets.

Specifically, this course aims to give students a practical hands-on introduction to database concepts. In addition, we'll gain an understanding of how to select the right database and we'll go through the basics of setting up an RDS instance on Amazon. This course includes a practical example of a company that is looking to choose a database, to give you an understanding of how databases work in the real world.

If you have any feedback relating to this course, please contact us at support@cloudacademy.com.

Learning Objectives

  • Understand the difference between spreadsheets and databases and when to use one or the other
  • Learn about the different types of database available and the various features and characteristics to consider
  • Learn how to choose the right database
  • Learn how to deploy an Amazon Aurora instance

Intended Audience

This course is designed for anyone who wants to improve their knowledge of databases and understand when it makes sense to use them as opposed to a spreadsheet.

Prerequisites

To get the most out of this course, you should already have a basic understanding of simple data structures such as comma-separated values, as well as an understanding of cloud concepts in general.

Transcript

Hopefully, you've now written down some of your use cases, so let's use this graphic to guide us through picking the right database at a practical level. The first component we need to understand is the structure of the data. We've briefly touched on data models and whether the data is flat, but now we need to decide: is this something that's representable in a table? Multiple tables? Is it maybe a set of key-value pairs? Or maybe it's a set of documents, each with a large amount of unstructured free text?
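To make those three shapes concrete, here is a minimal Python sketch of the same kind of record held as table rows, as key-value pairs, and as a document. Every name and value in it (the order IDs, product names, the `fits_table` helper) is invented for illustration and does not come from the course.

```python
import json

# All names and values below are hypothetical, for illustration only.

# 1. Tabular: a fixed set of columns, one row per record.
columns = ("order_id", "product", "score")
rows = [
    (1001, "Ethiopia Light Roast", 5),
    (1002, "Colombia Dark Roast", 3),
]

# 2. Key-value: a single key maps to an opaque value.
key_value = {
    "order:1001:score": "5",
    "order:1002:score": "3",
}

# 3. Document: nested fields that may differ between records,
#    often carrying free text.
document = {
    "order_id": 1001,
    "product": {"name": "Ethiopia Light Roast", "size_g": 250},
    "feedback": {"score": 5, "comment": "Arrived fresh, great aroma."},
}


def fits_table(records, column_names):
    """Return True if every record has exactly one value per column."""
    return all(len(record) == len(column_names) for record in records)


print(fits_table(rows, columns))
print(json.dumps(document, indent=2))
```

If every record passes a check like `fits_table`, that is a hint the data is tabular. Records whose fields vary from one to the next lean toward the document shape.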

It's important to think about this because it will help you decide whether you want a relational database or, if your data formats are inconsistent, maybe a NoSQL database. For the coffee bean subscription service that we're launching and running, we have a lot of information coming in from customers.

The information that we need, though, is pretty straightforward and pretty structured. Even though it may be coming in from email, from phone, or from direct in-person support channels, very importantly, it ties to a sale or an order, a product, and then a customer feedback score and comment. So we can structure it in table format, and for this coffee bean shop, we're gonna go with a SQL database.
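As a rough sketch of what that table format could look like, the Python example below uses the standard-library sqlite3 module with an in-memory database. The table names, column names, sample rows, and the 1 to 5 score range are all assumptions made for illustration; the course does not specify a schema, and SQLite stands in here for whichever SQL database you end up choosing.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical feedback schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript("""
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE products (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            product_id  INTEGER NOT NULL REFERENCES products(product_id)
        );
        -- Feedback arrives by different channels but always ties to an order.
        CREATE TABLE feedback (
            feedback_id INTEGER PRIMARY KEY,
            order_id    INTEGER NOT NULL REFERENCES orders(order_id),
            channel     TEXT NOT NULL
                        CHECK (channel IN ('email', 'phone', 'in_person')),
            score       INTEGER NOT NULL CHECK (score BETWEEN 1 AND 5),
            comment     TEXT
        );
    """)
    # Sample rows, invented for this sketch.
    conn.execute("INSERT INTO customers VALUES (1, 'Sample Customer')")
    conn.execute("INSERT INTO products VALUES (1, 'Ethiopia Light Roast')")
    conn.execute("INSERT INTO orders VALUES (1001, 1, 1)")
    conn.executemany(
        "INSERT INTO feedback (order_id, channel, score, comment) "
        "VALUES (?, ?, ?, ?)",
        [
            (1001, "email", 5, "Arrived fresh."),
            (1001, "phone", 4, "Would like a larger bag."),
        ],
    )
    conn.commit()
    return conn


def feedback_report(conn):
    """Join each feedback row back to its customer and product."""
    return conn.execute("""
        SELECT c.name, p.name, f.channel, f.score, f.comment
        FROM feedback AS f
        JOIN orders    AS o ON o.order_id = f.order_id
        JOIN customers AS c ON c.customer_id = o.customer_id
        JOIN products  AS p ON p.product_id = o.product_id
        ORDER BY f.feedback_id
    """).fetchall()


if __name__ == "__main__":
    for row in feedback_report(build_demo_db()):
        print(row)
```

Whatever channel the feedback came in on, each row still points at exactly one order, and through it one customer and one product. That consistency is what makes a relational layout a reasonable fit here.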

About the Author

Calculated Systems was founded by experts in Hadoop, Google Cloud and AWS. Calculated Systems enables code-free capture, mapping and transformation of data in the cloud based on Apache NiFi, an open source project originally developed within the NSA. Calculated Systems accelerates time to market for new innovations while maintaining data integrity. With cloud automation tools, deep industry expertise, and experience productionalizing workloads, development cycles are cut down to a fraction of their normal time. The ability to quickly develop large-scale data ingestion and processing decreases the risk companies face in long development cycles. Calculated Systems is one of the industry leaders in Big Data transformation and education of these complex technologies.
