RDS: Relational Database Service
Going NoSQL with DynamoDB
Databases are among the most used applications in the cloud (or anywhere else, for that matter). Managing data is exactly what computers were invented for, so it should come as no surprise that a great deal of attention is focused on the many different Database Management Systems and Data Management tools that are available.
This course covers AWS's database solutions. It is split into two parts. The first is dedicated to the two most important database services in the Amazon family: RDS, a managed relational database service supporting engines like MySQL, PostgreSQL, Oracle, and Microsoft SQL Server; and DynamoDB, a powerful NoSQL database built on the key-value data model. The second part covers the more advanced AWS database systems: ElastiCache, Redshift, and SimpleDB.
Who should take this course
This beginner course has no special prerequisites. Nevertheless, some experience with databases and at least a basic knowledge of the related jargon will be helpful. If you are completely new to the cloud, you might benefit from our Introduction to Cloud Computing course. You might also find the AWS general introduction course interesting if you are not yet familiar with the AWS cloud platform.
If you want to test your knowledge of the basic topics covered by this course, we strongly suggest you take our quiz questions. Other good follow-ups to this course are our RDS lab, where you can get your hands dirty with a real RDS instance in the cloud, and Databases on AWS - part 2.
Amazon DynamoDB is a highly scalable, fully managed NoSQL database service. It automatically partitions data over a number of servers to meet your request capacity. In addition, DynamoDB automatically replicates your data synchronously across multiple Availability Zones within an AWS region to ensure high availability and data durability.
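The idea of spreading data over servers by key can be sketched in a few lines. This is only an illustration of hash partitioning in general: DynamoDB's real hash function and partition layout are internal to the service, and the names `partition_for` and `spread`, as well as the choice of MD5, are assumptions made for this example.

```python
import hashlib


def partition_for(hash_key: str, partition_count: int) -> int:
    """Map a key to a partition number in the range [0, partition_count)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # around the number of partitions.
    return int.from_bytes(digest[:8], "big") % partition_count


def spread(keys, partition_count):
    """Group keys by the partition they hash to."""
    partitions = {n: [] for n in range(partition_count)}
    for key in keys:
        partitions[partition_for(key, partition_count)].append(key)
    return partitions


# Hypothetical keys, spread over four hypothetical partitions.
layout = spread([f"user-{i}" for i in range(100)], partition_count=4)
```

The same key always lands on the same partition, which is why a lookup by primary key can go straight to the server that holds the item.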
Just like many NoSQL services, DynamoDB uses a table-based data model that doesn't require a fixed schema and enables data access mainly through primary keys. In addition, the service supports both eventually consistent reads (the default) and strongly consistent reads, and natively supports atomic counters, allowing you to atomically increment or decrement numerical attributes with a single API call.
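As a rough sketch of what those two features look like at the API level, the helpers below build the keyword arguments for a GetItem call that explicitly requests a strongly consistent read and for an UpdateItem call that acts as an atomic counter. The helper names, the `PageViews` table, and its attributes are hypothetical; the parameter names follow the low-level DynamoDB API as exposed by the boto3 client. Nothing is sent to AWS here.

```python
def strongly_consistent_get(table_name, key):
    """Keyword arguments for a GetItem call that asks for a strongly consistent read."""
    return {"TableName": table_name, "Key": key, "ConsistentRead": True}


def atomic_increment(table_name, key, attribute, amount=1):
    """Keyword arguments for an UpdateItem call that adds `amount` to a numeric attribute.

    A negative amount decrements the attribute.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "UpdateExpression": "ADD #counter :amount",
        "ExpressionAttributeNames": {"#counter": attribute},
        # The low-level API transmits numbers as strings tagged with "N".
        "ExpressionAttributeValues": {":amount": {"N": str(amount)}},
        "ReturnValues": "UPDATED_NEW",
    }


# Hypothetical table and key, used only to show the request shape.
request = atomic_increment("PageViews", {"PageId": {"S": "home"}}, "Views")
# With boto3 installed and AWS credentials configured, this could be sent as:
#   boto3.client("dynamodb").update_item(**request)
```

Because the increment happens inside the service, two clients updating the same counter at the same time do not overwrite each other's changes.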
AWS recommends Amazon DynamoDB for customers who need to build highly scalable applications that require extremely high throughput and low latencies for both reads and writes; who need to scale to extremely large datasets while maintaining predictable performance, even if starting with a small dataset; who mainly use primary keys to access their data and don't need complex query capabilities like transactions or joins; and who don't want the administrative burden of running their own highly available, distributed database cluster.
Amazon DynamoDB displays key operational metrics for your table in the AWS Management Console. The service also integrates with Amazon CloudWatch so you can see your request throughput and latency for each table.
Amazon DynamoDB uses strong cryptographic methods to authenticate users and prevent unauthorized data access. It also integrates with IAM for fine-grained access control for users within your organization.
Amazon Elastic MapReduce allows you to perform complex analytics of large datasets using a hosted Hadoop framework and archive the results in Amazon S3 while keeping the original dataset in DynamoDB intact.
Amazon Redshift also complements DynamoDB with advanced business intelligence capabilities and a powerful SQL-based interface.
Finally, you can use AWS Data Pipeline to automate data movement and transformation into and out of Amazon DynamoDB. The built-in scheduling capabilities of AWS Data Pipeline let you schedule and execute recurring jobs without having to write your own complex data transfer or transformation logic.
The DynamoDB Data Model concepts include tables, items and attributes.
A "Database" is a collection of "Tables". A "Table" is a collection of "Data Items". Each "Table" can have an unlimited number of "Data Items".
An "Item" is a collection of "Attributes". One of those "Attributes" is known as the "Primary Key" and it's mandatory. There is no explicit limit on the number of attributes associated with an individual item, but the aggregate size of an item, including all the attribute names and attribute values, cannot exceed 64 kB.
Each "Attribute" is composed of a name and a value or set of values. Individual attributes have no explicit size limit of their own; they are bounded only by that 64 kB total for the item.
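To make the size rule concrete, here is a minimal sketch that adds up the UTF-8 length of every attribute name and value in an item and compares the total with the 64 kB figure quoted above. It is an approximation for string and number attributes only; the exact accounting DynamoDB applies to each data type is not reproduced here, and the function names are made up for this example.

```python
ITEM_SIZE_LIMIT = 64 * 1024  # the 64 kB figure quoted above, in bytes


def value_size(value) -> int:
    """Rough byte size of one attribute value (a string, a number, or a set of either)."""
    if isinstance(value, (set, frozenset, list, tuple)):
        return sum(value_size(member) for member in value)
    return len(str(value).encode("utf-8"))


def item_size(item: dict) -> int:
    """Sum of the UTF-8 sizes of every attribute name and every attribute value."""
    return sum(len(name.encode("utf-8")) + value_size(value)
               for name, value in item.items())


def fits(item: dict) -> bool:
    """True if the estimated item size is within the limit."""
    return item_size(item) <= ITEM_SIZE_LIMIT


# A hypothetical item: "Id" is the primary key, "Tags" is a set of values.
item = {"Id": "101", "Title": "Book", "Tags": {"a", "b"}}
```

Note that attribute names count towards the total, so short attribute names leave more room for data.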
In a Relational Database, a table has a pre-defined schema such as the table name, primary key, list of its column names and their data types.
All records stored in the table must have the same set of columns. DynamoDB, by contrast, is a NoSQL database: except for the required primary key, a DynamoDB table is schema-less, meaning that items in the table don't need to have the same attributes or even the same number of attributes. When you create a table, in addition to the table name, you must specify the primary key of the table.
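The contrast with a fixed relational schema can be shown with a small in-memory model. This is a toy stand-in, not the DynamoDB API: the class `SchemalessTable`, the `ProductCatalog` table, and its attributes are all hypothetical. The only rule it enforces is the one described above, that every item must carry the primary key.

```python
class SchemalessTable:
    """Toy table: items are free-form dicts, but each must contain the primary key."""

    def __init__(self, name, primary_key):
        self.name = name
        self.primary_key = primary_key
        self.items = {}

    def put_item(self, item):
        if self.primary_key not in item:
            raise ValueError(f"item is missing primary key {self.primary_key!r}")
        # Store a copy, indexed by the value of the primary key attribute.
        self.items[item[self.primary_key]] = dict(item)

    def get_item(self, key_value):
        return self.items.get(key_value)


catalog = SchemalessTable("ProductCatalog", primary_key="Id")
# Two items in the same table with different attributes and attribute counts.
catalog.put_item({"Id": 101, "Title": "Book 101", "ISBN": "111-1111111111"})
catalog.put_item({"Id": 201, "Title": "Road Bike", "Color": ["Red", "Black"], "Brand": "Brand-A"})
```

Both items live in the same table even though one has an ISBN and the other has a color list and a brand; an item without an `Id` would be rejected.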
DynamoDB supports the following two types of primary keys.
"Hash Type Primary Key": in this case the primary key is made of one attribute, a "Hash" attribute. DynamoDB builds an unordered "Hash Index" on this primary key attribute.
"Hash & Range Type Primary Key": in this case the primary key is made of two attributes, a "Hash" attribute and a "Range" attribute. DynamoDB builds an unordered "Hash Index" on the "Hash" attribute and a sorted "Range Index" on the "Range" attribute.
When you create a table with a "Hash & Range Key", you can optionally define one or more secondary indexes on that table.
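How a hash and range key work together can be sketched with a dictionary for the unordered hash lookup and a sorted list per hash value for the range key. This is an in-memory illustration of the idea only, not how DynamoDB stores data; `HashRangeTable` and the forum-thread example data are assumptions made for the sketch.

```python
import bisect
from collections import defaultdict


class HashRangeTable:
    """Toy table keyed by (hash value, range value)."""

    def __init__(self):
        # hash value -> list of range values, kept sorted
        self._range_keys = defaultdict(list)
        # (hash value, range value) -> item
        self._items = {}

    def put_item(self, hash_value, range_value, item):
        key = (hash_value, range_value)
        if key not in self._items:
            # Insert the new range value at its sorted position.
            bisect.insort(self._range_keys[hash_value], range_value)
        self._items[key] = item

    def query(self, hash_value, range_from, range_to):
        """Items under hash_value with range_from <= range key <= range_to, in range-key order."""
        ranges = self._range_keys.get(hash_value, [])
        start = bisect.bisect_left(ranges, range_from)
        end = bisect.bisect_right(ranges, range_to)
        return [self._items[(hash_value, r)] for r in ranges[start:end]]


replies = HashRangeTable()
replies.put_item("thread-1", "2014-03-02", {"msg": "second"})
replies.put_item("thread-1", "2014-03-01", {"msg": "first"})
replies.put_item("thread-1", "2014-04-01", {"msg": "third"})
replies.put_item("thread-2", "2014-03-01", {"msg": "other"})

# All replies to thread-1 posted in March, oldest first.
march = replies.query("thread-1", "2014-03-01", "2014-03-31")
```

The hash value selects one group of items directly, and the sorted range values make it cheap to return a contiguous slice of that group in order.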
A "Secondary Index" lets you query the data in the table using an alternate key, in addition to queries against the primary key. DynamoDB supports two kinds of secondary indexes: a "Local Secondary Index", which is an index that has the same hash key as the table but a different range key, and a "Global Secondary Index", which is an index with a hash and range key that can be different from those on the table.
Every “Secondary Index” is automatically maintained by DynamoDB. When you add, modify or delete items in the table any index on the table is also updated to reflect these changes. You can define up to five local secondary indexes and five global secondary indexes per table.
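The difference between the two kinds of index shows up in how a table is declared. The sketch below assembles the keyword arguments for a CreateTable request, using the parameter names of the low-level DynamoDB API as exposed by the boto3 client. The helper functions, the `Thread` table, its attributes, and the throughput values are hypothetical, and the check against five indexes of each kind simply mirrors the limit stated above. Nothing is sent to AWS.

```python
MAX_INDEXES_PER_KIND = 5  # the per-table limit stated above


def key_schema(hash_attr, range_attr=None):
    """KeySchema list for a hash key or a hash and range key."""
    schema = [{"AttributeName": hash_attr, "KeyType": "HASH"}]
    if range_attr:
        schema.append({"AttributeName": range_attr, "KeyType": "RANGE"})
    return schema


def table_definition(name, hash_attr, range_attr, attribute_types,
                     local_indexes=None, global_indexes=None,
                     read_units=5, write_units=5):
    """Keyword arguments for a CreateTable call.

    local_indexes:  index name -> alternate range attribute (hash key is the table's)
    global_indexes: index name -> (hash attribute, range attribute)
    """
    local_indexes = local_indexes or {}
    global_indexes = global_indexes or {}
    if len(local_indexes) > MAX_INDEXES_PER_KIND or len(global_indexes) > MAX_INDEXES_PER_KIND:
        raise ValueError("too many secondary indexes of one kind")
    throughput = {"ReadCapacityUnits": read_units, "WriteCapacityUnits": write_units}
    definition = {
        "TableName": name,
        # Only attributes used in a key need to be declared up front.
        "AttributeDefinitions": [{"AttributeName": attr, "AttributeType": kind}
                                 for attr, kind in attribute_types.items()],
        "KeySchema": key_schema(hash_attr, range_attr),
        "ProvisionedThroughput": throughput,
    }
    if local_indexes:
        definition["LocalSecondaryIndexes"] = [
            {"IndexName": index_name,
             "KeySchema": key_schema(hash_attr, index_range),  # same hash key as the table
             "Projection": {"ProjectionType": "ALL"}}
            for index_name, index_range in local_indexes.items()]
    if global_indexes:
        definition["GlobalSecondaryIndexes"] = [
            {"IndexName": index_name,
             "KeySchema": key_schema(*index_keys),  # its own hash and range key
             "Projection": {"ProjectionType": "ALL"},
             "ProvisionedThroughput": dict(throughput)}
            for index_name, index_keys in global_indexes.items()]
    return definition


thread_table = table_definition(
    "Thread", "ForumName", "Subject",
    {"ForumName": "S", "Subject": "S", "LastPostDateTime": "S", "PostedBy": "S"},
    local_indexes={"LastPostIndex": "LastPostDateTime"},
    global_indexes={"PostedByIndex": ("PostedBy", "LastPostDateTime")},
)
# With boto3 installed and AWS credentials configured, this could be sent as:
#   boto3.client("dynamodb").create_table(**thread_table)
```

Once the indexes are declared, there is no code to write for keeping them current: as described above, the service updates them whenever items in the table change.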