Databases are among the most used applications in the cloud (or anywhere else, for that matter). Managing data is exactly what computers were invented for, so it should come as no surprise that a great deal of attention is focused on the many different Database Management Systems and Data Management tools that are available.
This course covers AWS's database solutions. It is split into two parts. The first is dedicated to the two most important database services in the Amazon family: RDS, a managed relational database service supporting engines such as MySQL, PostgreSQL, Oracle, and Microsoft SQL Server; and DynamoDB, a powerful NoSQL database service built on the key-value data model. The second part covers the more advanced AWS database systems, such as ElastiCache, Redshift, and SimpleDB.
Who should take this course
This is a beginner course, so there are no special prerequisites. Nevertheless, some experience with databases and at least a basic knowledge of the related jargon might be helpful. If you are completely new to the cloud, you might benefit from our Introduction to Cloud Computing course. You might also find the AWS general introduction course interesting if you are not yet familiar with the AWS cloud platform.
If you want to test your knowledge of the basic topics covered by this course, we strongly suggest you take our quiz questions. Good follow-ups to this course are our RDS lab, where you can get your hands dirty with a real RDS instance in the cloud, and Databases on AWS - part 2.
Welcome to the Databases On AWS Course. In this course we will get an overview of the Database Services on Amazon Web Services.
You will learn about the general structure of the Amazon Relational Database Service, or RDS, Amazon DynamoDB, Amazon Redshift and Amazon SimpleDB. Then you will see the basic steps needed to work with these services. Amazon Web Services provides fully managed relational and NoSQL database services, as well as fully managed in-memory caching as a service and a fully managed petabyte-scale data warehouse service.
In addition, you can operate your own database in the cloud on Amazon EC2 and Amazon EBS. If you've ever hosted your database on-premises, you have probably experienced the difficulties of optimizing applications, scaling them, keeping them highly available, making backups, maintaining your servers and implementing infrastructure like racks, power, et cetera. You can host your database on AWS as either a self-managed or an AWS-managed database.
With the self-managed option, you are responsible for upgrading, making backups and ensuring the security of your database. In return, you have full control over the server's parameters, the operating system and the database.
On the other hand, if you choose an AWS-managed database, AWS is responsible for upgrades, backups, security, et cetera. Thanks to the rich variety of database services on AWS, there are several solutions available based on your needs. If you need a relational database service with minimal administration, Amazon RDS is a fully managed service that offers a choice of MySQL, Oracle, SQL Server or PostgreSQL database engines, scalable compute and storage, and Multi-AZ availability.
If you need a fast, highly scalable NoSQL database service, Amazon DynamoDB is a fully managed service that offers extremely fast performance, seamless scalability and reliability. If you need a fast, petabyte-scale data warehouse, Amazon Redshift is a fully managed service that makes it simple and cost-effective to efficiently analyse all your data using your existing business intelligence tools.
If you need a NoSQL database service for smaller data sets, Amazon SimpleDB is a fully managed service that provides a reliable, schemaless database. You can also manage a relational database on your own by choosing one of the relational database AMIs on Amazon EC2 and EBS, which provide scalable compute and storage and complete control over instances.
Finally, you can use ElastiCache, which is a web service that makes it easy to deploy, operate and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory caches instead of relying entirely on slower, disk-based databases.
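To make the caching idea concrete, here is a minimal sketch of a read path that checks an in-memory cache before falling back to a slower store. It does not use ElastiCache itself: the plain dictionary `cache`, the `SLOW_DB` dictionary and the `read_from_database` helper are all hypothetical stand-ins chosen for illustration, and the 10 ms delay is an assumed figure.

```python
import time

# Hypothetical stand-in for a slower, disk-based database.
SLOW_DB = {"user:1": "Alice", "user:2": "Bob"}
db_reads = 0  # counts how often the slow store is actually queried


def read_from_database(key):
    """Simulate a slow lookup in the backing database."""
    global db_reads
    db_reads += 1
    time.sleep(0.01)  # assumed latency, for illustration only
    return SLOW_DB.get(key)


# In-memory cache; a real deployment would talk to a cache cluster instead.
cache = {}


def get(key):
    """Return the value from the cache if present, else load and cache it."""
    if key in cache:
        return cache[key]
    value = read_from_database(key)
    if value is not None:
        cache[key] = value  # only cache keys that actually exist
    return value


first = get("user:1")   # cache miss: goes to the slow store
second = get("user:1")  # cache hit: served from memory
```

After the two calls, the slow store has been queried only once. A production cache would also need an expiry or invalidation policy, which this sketch leaves out.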
Before looking at the different services in AWS, it's good to have a short overview of databases.
A database is an organized collection of data.
Database Management Systems or DBMSs are specifically designed software applications that interact with the user, other applications and the database itself to capture and analyse data.
A general-purpose DBMS is a software system designed to allow the definition, creation, querying, update and administration of databases. Well-known DBMSs include MySQL, PostgreSQL, Oracle, et cetera.
Each database has a specific model. A Database Model is a type of data model that determines the logical structure of a database and fundamentally determines in which manner data can be stored, organized and manipulated. Several different models exist for databases but the most popular of them is the Relational Model which uses a Table-based Format.
A Relational Database is a collection of data items organized as a set of formally described tables, from which data can be accessed or re-assembled in many different ways without having to re-organize the database tables. The standard user and application program interface to a Relational Database is the Structured Query Language, or SQL. SQL statements are used both for interactive queries for information from a Relational Database and for gathering data for reports.
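As a small illustration of both uses of SQL, the sketch below builds two tables and then runs one interactive-style query and one report-style query. It uses SQLite through Python's standard `sqlite3` module purely because it needs no server; the `customers` and `orders` tables and their contents are made up for this example.

```python
import sqlite3

# In-memory SQLite database, so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Formally described tables: each column has a name and a type.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total REAL NOT NULL)"
)

conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob")],
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(1, 20.0), (1, 5.5), (2, 12.0)],
)

# Interactive-style query: look up the orders of one customer.
alice_orders = conn.execute(
    "SELECT total FROM orders WHERE customer_id = ? ORDER BY total", (1,)
).fetchall()

# Report-style query: re-assemble data from both tables with a join,
# without changing how the tables themselves are organized.
report = conn.execute(
    "SELECT c.name, COUNT(o.id), SUM(o.total)"
    " FROM customers AS c JOIN orders AS o ON o.customer_id = c.id"
    " GROUP BY c.id, c.name ORDER BY c.name"
).fetchall()

conn.close()
```

The same tables answer both questions; only the SQL statement changes.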
On the other hand, we have Non-Relational or NoSQL Database Models, which provide a mechanism for storage and retrieval of data that is modelled by means other than the tabular relations used in Relational Databases. This means that data can be inserted into a NoSQL database without first defining a rigid database schema. Motivations for this approach include simplicity of design, horizontal scaling and finer control over availability.
The NoSQL Database Model is used when the ability to store and retrieve great quantities of data is important, when the data is not structured or its structure changes over time, or when storing relationships between the elements is not important.
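The idea of inserting data without first defining a schema can be sketched with a toy key-value store. This is only an illustration built on a Python dictionary, not the API of DynamoDB or SimpleDB; `put_item`, `get_item` and the sample items are hypothetical names chosen for this example.

```python
# Toy key-value store: each key maps to an item with arbitrary attributes.
store = {}


def put_item(key, item):
    """Store an item under a key; no schema is declared beforehand."""
    store[key] = dict(item)


def get_item(key):
    """Fetch an item by key, or None if the key is unknown."""
    return store.get(key)


# Two items in the same store with different sets of attributes.
put_item("user:1", {"name": "Alice", "email": "alice@example.com"})
put_item("user:2", {"name": "Bob", "tags": ["admin", "beta"], "last_login": "2015-01-01"})

# The set of attributes is discovered from the data, not fixed up front.
attribute_names = sorted({attr for item in store.values() for attr in item})
```

In a relational table, adding `tags` or `last_login` would first require changing the table definition; here each item simply carries its own attributes.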
Generally speaking, the Relational Database Model offers more functionality at the cost of some performance, while the Non-Relational Database Model is the other way around: less functionality in exchange for more performance.