
NoSQL Databases

Overview
Difficulty
Beginner
Duration
1h 43m
Students
1086
Ratings
4.7/5
Description

In this section of the Cloud Practitioner learning path, we introduce you to the various Database services currently available in AWS that are relevant to the CLF-C01 exam.

Learning Objectives

  • Identify and describe the various Database services available in AWS
  • Understand the differences between relational and NoSQL databases
  • Describe AWS-managed relational and NoSQL database services

Prerequisites

This course is designed for anyone who is new to cloud computing, so no prior experience with AWS is necessary. While it may be helpful to have a basic understanding of AWS and its services, as well as some exposure to AWS Cloud design, implementation, and operations, this is not required as all of the concepts we will introduce in this course will be explained and reinforced from the ground up.

Transcript

Relational databases are highly structured repositories of data. They use a schema to define how information is organized, and that schema must exist before the database can even be created.

This fixed nature of data structures makes relational databases sub-optimal for analytical processes where data is semi-structured or unstructured.

While relational databases are highly structured repositories of information, non-relational databases do not use a fixed table structure. They are schema-less.

Since it doesn’t use a predefined schema that is enforced by a database engine, a non-relational database can use structured, semi-structured, and unstructured data without difficulty.

NoSQL is a general term rather than the name of one particular database model. It encompasses a wide variety of different models that don’t fit into the relational model.

Non-relational NoSQL-type databases have been around since the 1960s, but it wasn’t until the early 2000s that the NoSQL approach started to have broad appeal and a new generation of NoSQL systems began to hit the market.

Today, the term NoSQL describes a family of schema-less, non-relational, distributed data stores.

NoSQL databases are popular with developers because they do not require an upfront schema design; they are able to build code without waiting for a database to be designed and built.

It’s this flexibility--a dynamic approach to organizing data--that has been popular with companies needing to store unstructured or rapidly changing data.

The term NoSQL has two meanings. In the beginning, it described databases that used mechanisms other than SQL to manage data.  

There was “No SQL” used when accessing and manipulating data.

The definition has been expanded to mean, “Not Only SQL.”  Some systems use SQL along with other technologies and query languages.

Some people argue that the one thing all NoSQL databases have in common is that they’re non-relational and that a better name would be “NoREL.”

Personally, I don’t think I have enough free time to care that much about it.

NoSQL databases, in general, share a few basic characteristics.  

They are non-relational, open-source, schema-less, horizontally scalable, and do not adhere to ACID constraints.

Most NoSQL databases access data using their own Application Programming Interface, API.  However, some NoSQL databases use a subset of SQL for data management.

In many cases, the non-relational model is a good fit for an application’s requirements.  

The data might be unstructured or semi-structured.  The amount of data might be impractical for a relational database.  Or the data might be of a single type and not need the controls that come with a relational database.

Being open source is not a requirement of NoSQL databases.  It’s more of an observation about them.  There are many relational and non-relational databases that are open-source projects.  However, the developers of NoSQL databases lean towards providing open-source solutions.

Most NoSQL databases have no fixed schema.  

Relational databases require a schema to be designed before the database is created.  NoSQL databases don’t.  Instead, schemas can be created dynamically as data is accessed or embedded into the data itself.

NoSQL databases have a reputation for being more flexible with the data they can accept and for supporting agile and DevOps philosophies.

NoSQL databases are often run in clusters of computing nodes.

Data is partitioned across multiple computers so that each computer can perform a specific task independently of the others.

Each node performs its task without having to share CPU, memory, or storage with other nodes.

This is known as a shared-nothing architecture.
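As a rough illustration of partitioning in a shared-nothing cluster, the sketch below routes each key to one node by hashing it. The `Node` and `Cluster` classes, the node names, and the keys are all hypothetical, and the simple modulo placement is an assumption made for brevity; real systems often use more elaborate schemes such as consistent hashing.

```python
import hashlib


class Node:
    """One shared-nothing node: it owns its own private storage."""

    def __init__(self, name):
        self.name = name
        self.storage = {}  # not shared with any other node


class Cluster:
    """Hypothetical cluster that partitions keys across its nodes."""

    def __init__(self, node_names):
        self.nodes = [Node(name) for name in node_names]

    def node_for(self, key):
        # A stable hash means the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, value):
        self.node_for(key).storage[key] = value

    def get(self, key):
        return self.node_for(key).storage.get(key)


cluster = Cluster(["node-a", "node-b", "node-c"])
for user_id in ["user:1", "user:2", "user:3", "user:4"]:
    cluster.put(user_id, {"id": user_id})
```

Each key lives on exactly one node, so a read or write for that key touches only that node's storage.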

Most NoSQL databases relax ACID constraints found in relational databases.

NoSQL solutions were developed around the purpose of providing high availability and scalability in a distributed environment.

To do this, either consistency or durability has to be sacrificed.  By relaxing consistency, distributed systems can be highly available and durable.  

Using a NoSQL approach, inconsistent data is expected.  There’s no problem as long as it’s recognized and managed appropriately.

Currently, there is no standard query language that is supported by all NoSQL databases.  

Some NoSQL databases have their own query language.  Others use languages such as JavaScript, Java, Python, XQuery, and SPARQL.

NoSQL databases are a family of non-relational databases that include Key-Value Databases, Column Family Stores, Document Stores, and Graph Stores.

Key-Value databases are the simplest NoSQL data stores to use from an API perspective. Using a RESTful API, a client can get the value for the key, put a value for a key, or delete a key from the data store. 
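The three operations can be sketched with a small in-memory class. This is only an illustration of the get, put, and delete interface: the `KeyValueStore` name and the sample key are hypothetical, and there is no HTTP layer here, so it is not a RESTful API itself.

```python
class KeyValueStore:
    """Hypothetical in-memory store exposing get, put, and delete."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store or overwrite the value for this key.
        self._data[key] = value

    def get(self, key, default=None):
        # Return the value for the key, or the default if it is absent.
        return self._data.get(key, default)

    def delete(self, key):
        # Remove the key; report whether it was present.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("session:42", "alice")
value = store.get("session:42")
removed = store.delete("session:42")
after_delete = store.get("session:42")
```

The store treats each value as opaque: it can be looked up only by its key, not by anything inside it.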

A Document Store Database is a database that uses a document-oriented model to store information.  Each document contains semi-structured data that can be queried. Essentially, the schema for the data is built into the document, itself, and can change as needed. 

Here is an example of a simple document store.  It's written in JSON, JavaScript Object Notation.  What makes this different than a key-value store is that, for some of the values, there are nested key-value pairs that can be indexed and retrieved.
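The example shown in the lecture is not reproduced in this transcript. As a stand-in, here is a hypothetical document of the kind described, parsed with Python's standard `json` module. The field names and values are invented for illustration.

```python
import json

# A hypothetical customer document. "address" and "orders" hold
# nested key-value pairs rather than single opaque values.
document_json = """
{
  "customer_id": "c-1001",
  "name": "Ana Example",
  "address": {"city": "Seattle", "postal_code": "98101"},
  "orders": [
    {"order_id": "o-1", "total": 25.50},
    {"order_id": "o-2", "total": 12.00}
  ]
}
"""

doc = json.loads(document_json)

# Nested fields can be reached directly, which is what lets a
# document store index and query them.
city = doc["address"]["city"]
order_totals = [order["total"] for order in doc["orders"]]
```

A pure key-value store would return the whole value for `c-1001` as one blob; a document store can look inside it, for example to find customers by `address.city`.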

A Graph Store is a database that uses a graphical model to represent and store information.  It has two primary components, Vertices and Edges.
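A minimal sketch of the vertex-and-edge idea, assuming a simple in-memory structure with labeled, directed edges. The `Graph` class, the vertex names, and the `follows` label are hypothetical.

```python
from collections import defaultdict


class Graph:
    """Hypothetical graph with property-carrying vertices and labeled edges."""

    def __init__(self):
        self.vertices = {}              # vertex id -> properties
        self.edges = defaultdict(list)  # source id -> [(label, target id)]

    def add_vertex(self, vertex_id, **properties):
        self.vertices[vertex_id] = properties

    def add_edge(self, source, label, target):
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must be existing vertices")
        self.edges[source].append((label, target))

    def neighbors(self, source, label):
        # Follow only edges with the given label out of the source vertex.
        return [dst for (lbl, dst) in self.edges.get(source, []) if lbl == label]


g = Graph()
g.add_vertex("alice", kind="person")
g.add_vertex("bob", kind="person")
g.add_vertex("carol", kind="person")
g.add_edge("alice", "follows", "bob")
g.add_edge("alice", "follows", "carol")
followed = g.neighbors("alice", "follows")
```

Vertices hold the entities and edges hold the relationships, so a question like "who does alice follow?" becomes an edge traversal.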

Those are some of the types of NoSQL databases that are available, and I'll eventually cover them in more detail, but why use them? What advantages do NoSQL databases have over relational databases?

Scaling a NoSQL database is easier and less expensive than scaling a relational database because the scaling is horizontal instead of vertical. In general, for relational databases to scale, they must add memory, CPU, or storage.  This is vertical scaling.  However, NoSQL scaling is done by adding a compute or disk node. This is horizontal scaling.

NoSQL databases generally trade consistency for performance and scalability.

Relational databases have four properties that support reliability.  These properties, commonly referred to as ACID, are atomicity, consistency, isolation, and durability.

Consistency refers to the database's state.  In a relational database, a transaction takes a database from one valid state to another valid state. With most NoSQL databases, it's possible for data to be inconsistent; a query might return old or stale data.

You might hear this phenomenon described as being eventually consistent.  Over time, data that is spread across storage nodes will replicate and become consistent. What makes this behavior acceptable is that developers can anticipate this eventual consistency and allow for it. That said, some NoSQL databases do support strong consistency.  
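The behavior can be simulated with a toy two-copy store, where writes reach a primary copy immediately and a replica only when replication runs. Everything here is an assumption for illustration: the `ReplicatedStore` class, the explicit `replicate()` step, and the `consistent` flag do not correspond to any particular database's API.

```python
from collections import deque


class ReplicatedStore:
    """Toy model: one primary copy, one replica that lags behind it."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = deque()  # writes not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def read(self, key, consistent=False):
        # A default read hits the replica and may be stale;
        # a consistent read goes to the primary.
        source = self.primary if consistent else self.replica
        return source.get(key)

    def replicate(self):
        # Apply all outstanding writes to the replica, in order.
        while self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value


store = ReplicatedStore()
store.write("price", 10)
store.replicate()
store.write("price", 12)                        # replica not yet updated
stale = store.read("price")                     # old value from the replica
strong = store.read("price", consistent=True)   # latest value from the primary
store.replicate()
converged = store.read("price")                 # replica has caught up
```

Between the second write and the next replication pass, the default read returns the old value. After replication, both copies agree.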

To review, NoSQL is a general term that refers loosely to a particular type of database model, or database management system.

NoSQL databases generally share a number of characteristics.  They are non-relational, open-source, schema-less, and horizontally scalable.

Additionally, NoSQL databases do not generally adhere to the ACID principles found in relational databases and most do not use SQL to access data.

This is a good time to discuss the types of fully-managed NoSQL databases available from AWS.  Or, it would be, but this is the end of this lecture.

In the next lecture, I'm going to describe, in some detail, the types of managed NoSQL databases available on AWS.  It won't be overly technical.  It’s a discussion, really, about what’s possible and how to start thinking about your data.

Lectures

Course Introduction - The AWS Database Landscape - Relational Databases - Types of Managed NoSQL on AWS - Part 1 - Types of Managed NoSQL on AWS - Part 2 - Summary and Conclusion

About the Author
Students
208833
Labs
1
Courses
211
Learning Paths
164

Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.

To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.

Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.

He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.

In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.

Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.