AWS Data Management
AWS Solutions Architect Associate Level Certification Course - Part 3 of 3
Having completed parts one and two of our AWS certification series, you should now be familiar with basic AWS services and some of the workings of AWS networking. This final course in our three-part certification exam preparation series focuses on data management and application and services deployment.
Who should take this course?
This is an advanced course that's aimed at people who already have some experience with AWS and a familiarity with the general principles of architecting cloud solutions.
Where will you go from here?
The self-testing quizzes in the AWS Solutions Architect Associate Level prep material are a great follow-up to this series and a pretty good indicator of your readiness to take the AWS exam. Also, since you're studying for the AWS certification, check out the AWS Certifications Study Guide on our blog.
Amazon's DynamoDB is a NoSQL database service. Before we go any further, it's probably not a bad idea to explain what exactly a NoSQL database is, especially when contrasted with a relational or SQL database, like MySQL. SQL databases are designed with a strictly defined table structure built on schemas. That structure allows for particularly efficient complex queries using the Structured Query Language, and is well suited to highly transactional applications. Due to their architecture, SQL database instances scale best vertically, by adding more CPU, faster disk reads, and more memory. NoSQL databases like DynamoDB, on the other hand, are designed more loosely around documents, relying on key-value pairs for organization and excelling at hierarchical data solutions.
The greatest NoSQL power gains will likely be realized through horizontal scaling, meaning adding more servers. So choosing one over the other will largely depend on the specific needs of your project. DynamoDB is fully managed, and because providing large numbers of servers is just the kind of thing that Amazon does best, it's well positioned to take advantage of AWS resources. DynamoDB will work well for just about any amount of data and any level of traffic. Oh, and it's fast. Although you can completely manage DynamoDB using the AWS CLI and the SDKs for various programming languages, you can also perform some basic tasks from the dashboard. So that's where we'll start. First, a word about structure. A DynamoDB database is made up of tables.
A table is made up of items, and each item is made up of attributes. Here's a sample item from AWS's documentation. ID is the primary key, whose value is an integer, but each of the attributes (none of which, by the way, is required) is essentially a name-value pair. The value of the product category attribute, for example, is Book. Let's create a new table and give our table a name. We're right away faced with some very important choices. Everything that will happen to the table we're about to create will depend on this next step. We'll have to decide whether our primary key type will be just a hash or a hash and range. Since queries are impossible unless the primary key is of the hash and range type, we'll choose hash and range. We'll imagine that we're designing the database for a new bookstore, so our hash attribute name will be publisher, and our range attribute name will be book title.
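To make the table, item, and attribute structure a little more concrete, here is a minimal Python sketch that models the bookstore item as a plain dictionary. It doesn't talk to DynamoDB at all. The attribute names (Publisher, BookTitle, Author, Subject), the sample values, and both helper functions are assumptions made up for illustration.

```python
# Illustrative only: plain dictionaries standing in for DynamoDB items.
# Attribute names and values are assumptions, not taken from a real table.
HASH_KEY = "Publisher"   # hash attribute chosen in the lesson
RANGE_KEY = "BookTitle"  # range attribute chosen in the lesson

item = {
    "Publisher": "Cloud Academy",
    "BookTitle": "DynamoDB for Dummies",
    "Author": "Jane Doe",            # optional attribute (hypothetical)
    "Subject": "Cloud Computing",    # optional attribute (hypothetical)
}


def primary_key(item):
    """Return the (hash, range) pair that uniquely identifies an item."""
    return (item[HASH_KEY], item[RANGE_KEY])


def books_by_publisher(items, publisher):
    """Mimic a query: fix the hash value, return matches ordered by range value."""
    matches = [i for i in items if i[HASH_KEY] == publisher]
    return sorted(matches, key=lambda i: i[RANGE_KEY])
```

Only the two key attributes have to appear in every item. Everything else can vary from one item to the next, which is the "loosely designed" quality described earlier.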
Both will be defined as string. Click continue. Now we have the opportunity to add an index. As it currently stands, users will be able to query our database only for publishers or for books from a specific publisher. We might like to add some greater organization to allow more complex queries. So if we wanted to allow searches within the database by author and, within author, by subject, so that, for instance, you could display all the books your favorite author wrote on cloud computing, you might add a global secondary index whose index hash key was author and whose range key was subject. The index name attribute is automatically populated. Click add index to table. We can also add a local secondary index, whose index hash key is automatically publisher, but for which you can specify a new range key, say, publication date. Click add again. Let's click continue to set our provisioned throughput capacity.
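For readers who prefer to see the same choices written down, here is a rough Python sketch of a table definition shaped like the parameters of DynamoDB's CreateTable request. It only builds and sanity-checks a dictionary and never calls AWS. The table name, attribute names, index names, projection types, and throughput numbers are all placeholders assumed for this example, and the check_table_definition helper is hypothetical.

```python
# A dictionary shaped like a CreateTable request. All names and numbers
# below are placeholders chosen for illustration.
table_definition = {
    "TableName": "MyTable",
    "AttributeDefinitions": [
        {"AttributeName": "Publisher", "AttributeType": "S"},
        {"AttributeName": "BookTitle", "AttributeType": "S"},
        {"AttributeName": "Author", "AttributeType": "S"},
        {"AttributeName": "Subject", "AttributeType": "S"},
        {"AttributeName": "PublicationDate", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "Publisher", "KeyType": "HASH"},
        {"AttributeName": "BookTitle", "KeyType": "RANGE"},
    ],
    "GlobalSecondaryIndexes": [{
        "IndexName": "AuthorSubjectIndex",  # hypothetical name
        "KeySchema": [
            {"AttributeName": "Author", "KeyType": "HASH"},
            {"AttributeName": "Subject", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
        "ProvisionedThroughput": {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
    }],
    "LocalSecondaryIndexes": [{
        "IndexName": "PublisherDateIndex",  # hypothetical name
        "KeySchema": [
            {"AttributeName": "Publisher", "KeyType": "HASH"},
            {"AttributeName": "PublicationDate", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    "ProvisionedThroughput": {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
}


def key_attribute(schema, key_type):
    """Return the attribute name holding the given key type, or None."""
    for entry in schema:
        if entry["KeyType"] == key_type:
            return entry["AttributeName"]
    return None


def check_table_definition(defn):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    defined = {a["AttributeName"] for a in defn["AttributeDefinitions"]}
    indexes = defn.get("LocalSecondaryIndexes", []) + defn.get("GlobalSecondaryIndexes", [])
    for schema in [defn["KeySchema"]] + [i["KeySchema"] for i in indexes]:
        for entry in schema:
            if entry["AttributeName"] not in defined:
                problems.append("undefined key attribute: " + entry["AttributeName"])
    # A local secondary index must reuse the table's hash key.
    table_hash = key_attribute(defn["KeySchema"], "HASH")
    for index in defn.get("LocalSecondaryIndexes", []):
        if key_attribute(index["KeySchema"], "HASH") != table_hash:
            problems.append("local index hash key differs: " + index["IndexName"])
    return problems
```

The second check mirrors what the console does when it fills in publisher automatically: a local secondary index keeps the table's hash key and changes only the range key, while a global secondary index is free to use a different hash key.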
To keep costs down, you might want to limit how many queries you'll allow at a given time. According to Amazon, a single unit of read capacity represents one strongly consistent read per second for items as large as four kilobytes. A unit of write capacity represents one write per second for items as large as one kilobyte. Because we have added a global secondary index, we'll require at least two read capacity units and two write capacity units. But we could, and likely will, require more. The throughput calculator can help decide what we need. We'll assume that the size of our items will be less than one kilobyte, and that we'll likely experience ten reads per second and ten writes per second. If we require only eventually consistent results, then we're told we'll require 15 read and 30 write capacity units. Click continue.
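The arithmetic behind those capacity units can be sketched in a few lines of Python. The two function names are made up for this example. The rule that an eventually consistent read costs half as much as a strongly consistent one is standard DynamoDB behavior, though the lesson itself only quotes the strongly consistent figure.

```python
import math


def read_capacity_units(reads_per_second, item_size_kb, strongly_consistent=True):
    """One unit covers one strongly consistent read per second of up to 4 KB."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = reads_per_second * units_per_read
    if not strongly_consistent:
        # Eventually consistent reads consume half as much capacity.
        total = math.ceil(total / 2)
    return total


def write_capacity_units(writes_per_second, item_size_kb):
    """One unit covers one write per second of up to 1 KB."""
    return writes_per_second * math.ceil(item_size_kb)


# The lesson's scenario: items up to 1 KB, ten reads and ten writes per second.
table_reads = read_capacity_units(10, 1, strongly_consistent=False)
table_writes = write_capacity_units(10, 1)
```

For the base table alone this works out to 5 read units and 10 write units. Multiplying by three, for the table plus its two indexes, happens to give the 15 and 30 reported in the lesson, but whether the console calculator counts indexes that way is a guess, not something the lesson confirms.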
We could set an SNS alarm notification for when traffic exceeds, say, 80% of our provisioned throughput, but we'll skip that for now. We can now review our settings and create. Now, from a local console with the AWS SDK for PHP installed, let's add some data to our table. We've created a PHP file called Add Data in the /var/www/html directory on a local machine running Apache 2. The require line points to the absolute address, in our local system, of the SDK's aws-autoloader.php file. All other references will be relative to that location. We'll use the DynamoDB client and credentials classes exactly the way they came when we installed the SDK. Note the backslashes used in these two addresses. Ignoring this will cause you grief, and I can tell you all about it. Next, we'll populate the credentials variable with the access key ID and secret access key security credentials that we generated from the AWS dashboard.
Using the factory method, we'll let the DynamoDB client use our credentials values and set our region as us-east-1. Now comes the action. Using the putItem function, we define our table name as my table, the name we gave our table. We'll add the value Cloud Academy to the publisher key, and DynamoDB for Dummies to the book title key. All we've got to do now is make sure our syntax is correct. The /var/log/apache2/error.log file should be your best friend on this. And then, load the file using a local browser. Now let's head back to the AWS dashboard and click on explore table. Our new data is there.
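The lesson does this with the PHP SDK, and that script isn't reproduced here. As a rough equivalent, here is a Python sketch of the same put-item step using boto3, the AWS SDK for Python. The table name MyTable and the attribute names Publisher and BookTitle are assumptions about how the table was actually named, and both helper functions are hypothetical. The request builder runs on its own, while put_book needs boto3 installed and AWS credentials configured.

```python
def build_put_item_request(table_name, publisher, book_title):
    """Build PutItem parameters; "S" marks each value as a string."""
    return {
        "TableName": table_name,
        "Item": {
            "Publisher": {"S": publisher},   # hash attribute (assumed name)
            "BookTitle": {"S": book_title},  # range attribute (assumed name)
        },
    }


def put_book(request, region="us-east-1"):
    """Send the request to DynamoDB. Requires boto3 and valid credentials."""
    import boto3  # imported here so the builder above works without the SDK

    client = boto3.client("dynamodb", region_name=region)
    return client.put_item(**request)


request = build_put_item_request("MyTable", "Cloud Academy", "DynamoDB for Dummies")
# put_book(request)  # uncomment to write the item to a real table
```

boto3 picks up credentials from the environment or the shared credentials file, so there's no need to paste the access key ID and secret access key into the script.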
David taught high school for twenty years, worked as a Linux system administrator for five years, and has been writing since he could hold a crayon between his fingers. His childhood bedroom wall has since been repainted.
Having worked directly with all kinds of technology, David derives great pleasure from completing projects that draw on as many tools from his toolkit as possible.
Besides being a Linux system administrator with a strong focus on virtualization and security tools, David writes technical documentation and user guides, and creates technology training videos.
His favorite technology tool is the one that should be just about ready for release tomorrow. Or Thursday.