In this course, we will introduce you to some common types of databases. Different data problems can be solved in a wide variety of ways, which is why so many different types of databases exist. After learning about a few different types of databases, you’ll be ready to explore specific implementations further.
Learning Objectives
During this course, I’ll introduce you to the following:
- SQL databases
- Key-value databases
- Document-databases
- Graph databases
Intended Audience
This course is for novice:
- Software engineers
- Data engineers
- DevOps engineers
- Site reliability engineers
Prerequisites
- You should have at least a conceptual understanding of programming and be comfortable with data structures, data types, etc.
Let’s dive a bit deeper and take a look at some examples of SQL code. We’re going to keep this fairly high level. The intent here is to prime your brain with a bit of the SQL syntax. Don’t feel the need to try and memorize any of this. If you choose to dive deeper into learning SQL then you’ll cover the syntax more in-depth then.
Relational databases contain different types of objects behind the scenes. Tables are one type of database object and they’re created using a CREATE statement. Here we have two create statements. This code creates a table called animal, and a related table called need. Each column defines a name, data type, and optional constraints.
Notice the Animal and Name columns include constraints which specify that data in these columns cannot be null. Null being a term which indicates the absence of data. There are different types of constraints such as default, not-null, unique, and check. Each establishing a data integrity rule which must be followed.
We’re not going to go into detail about these constraints in this lesson since that would take us a bit too far into the weeds. However, there are two constraints that I do want to spend some time covering. These are called primary and foreign keys. These two constraints are used to identify unique rows.
A primary key represents the unique data which represents a single entity. The primary keys in these tables establish a unique identifier for each row. In this example, the primary key is a single column however, SQL supports composite keys made up of multiple columns. These ID columns will be set to an auto-incrementing number allowing the database to mint its own unique row identifiers.
Where primary keys are used to establish an identifier for unique rows, foreign keys are used to uniquely identify a row in a related table. This row in the create statement for the need table constrains the AnimalID column to reference the ID column in the animal table. Which means this column must be set to an ID that exists in the animal table. Ensuring referential integrity.
Running this code against an actual database engine would result in these two tables being created based on the rules defined in the code. Once created these tables can be interacted with in different ways. For example, we could add some data using an INSERT statement.
Here are four statements. The first two add data to the Animal table. Notice how the columns specified on the top lines match the index of the values specified on bottom lines.
These next two statements add data to the need table. Notice this interesting bit of code in the VALUES section which leverages sub-queries to fetch some existing data. Sub-queries are queries nested inside other queries. And it’s a rather useful feature of SQL.
Running this code would result in the specified data being added to the database. Retrieving data requires the use of the SELECT statement. Select statements allow specific columns to be requested in addition to using an asterisk as a wildcard to mean all columns.
SQL queries can be filtered using the WHERE keyword. Where clauses define one or more conditions that must be met. There are many different operators used in a wide range of use cases. These include data comparison, logical operators, etc.
In order to retrieve data from multiple tables additional code is required. Tables can be related to each other in different ways. For example, there could be:
-
a one-to-one relationship
-
a one-to-many relationship
-
a many-to-one relationship
-
a many-to-many relationship
This is referred to as cardinality. And in this example, the animal table has a one-to-many relationship with the need table. Because an individual animal might have multiple needs. Select statements include a mechanism for joining the data in different tables depending on the cardinality of the relationship. I’m not going to go into any further detail on joins, other than to show how we might retrieve all animals and their needs.
Notice it starts out requesting data from the animal table. Then it joins together with the need table, based on the ID in the animal table matching the AnimalID column in the need table. Running this returns one record for each of the combined rows. Since there are two ‘need’ records for Kara, there are two records listed.
Once data exists you can update it using an UPDATE statement. Update statements are capable of updating a row or rows of data. Notice the use of the WHERE keyword limits which row to update. Running this would change the weight for the row where the Name is ‘Ada.’
Since a row is a unit of data in a relational database when deleting data it’s done at the row level. Similar to an update statement we can use a where clause to limit the deleted rows. After running this the ‘Ada’ record would no longer exist in the database. Because all of the data integrity rules are built into the database we’re largely incapable of breaking those rules.
In our example, the animal column of the Animal table is required. Trying to add a record which doesn’t include data for the animal column will result in an error. Codifying data integrity into a database schema is one of the core features of SQL databases. Because the database itself is responsible for its own integrity.
SQL databases are kind of like an Excel spreadsheet with superpowers in that they can use functions to aggregate data. They’re capable of efficiently working with massive amounts of data and across multiple tables. For example in less than a second, a query might return query results from a table with a million rows.
In certain use cases, a SQL database might be a tool used to help with data processing and or visualization. Different types of data analysts commonly interact directly with SQL databases to explore data. Perhaps using a command line or graphical user interface to run queries and view the results.
Other use cases might use SQL databases as an application storage mechanism. For example: Imagine a product catalog consisting of different categories of products. A customer facing web application can fetch and display products from the database. In this case, the web application code communicates with the database. Though, being a SQL database, developers can always issue ad hoc queries directly against the database.
There are many different possible use cases for a SQL database. After all, they’re data storage and retrieval systems which include data integrity rules built into the design. Being able to codify a set of rules and have them enforced at the database level is a generally useful concept.
There are many different implementations of SQL databases. Postgres, MySQL, MSSQL, and SQLite to name a few. These are outside the scope of this course. However, now you have some search terms should you want to explore further on your own.
Okay, this seems like a natural stopping point. Here are your key takeaways for this lesson:
-
SQL is pronounced as either:
-
Ess Queue Ell
-
Sequel
-
SQL databases store relational data
-
Which consists of tables, columns, and rows
-
Tables are used to model specific concepts.
-
Columns describe an attribute of the concept being modeled.
-
They include names, data types, and optional constraints.
-
Rows are collections of columns representing an instance of a concept.
-
The structure of a database is called a schema.
-
It defines the tables, columns, constraints, etc.
-
SQL databases are programmed using the SQL domain-specific language
-
Defined as an industry standard it can:
-
Create, update, and delete database objects such as tables
-
Insert, update, and delete rows in a table
-
Query and filter data
-
Control access to the database and its objects
-
Manage transactions
-
Etc
-
Vendor-specific language extensions exist such as
-
PostGIS which provides the Postgres database with the ability to work with GIS data.
-
SQL databases have a wide range of use cases. From:
-
Running ad hoc queries to explore data
-
Serving as an application storage layer
Okay, that's going to be all for this lesson. Thanks so much for watching. And I’ll see you in another lesson!
Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.