Databases on AWS - part 2
SimpleDB as an easy alternative to DynamoDB
Difficulty: Beginner
Duration: 47m
Students: 460
Description

Databases are among the most used applications in the cloud (or anywhere else, for that matter). Managing data is exactly what computers were invented for, so it should come as no surprise that a great deal of attention is focused on the many different Database Management Systems and Data Management tools that are available.

This second part of our mini-series covering AWS databases is about some of Amazon's advanced solutions. You will learn more about Redshift, the petabyte-scale data warehouse solution, ElastiCache, the Redis- and Memcached-based solution for in-memory caching, and SimpleDB, an easy alternative for NoSQL data storage.

Who should take this course

For this beginner course, you'll require no special prerequisites. Nevertheless, some experience with Databases and at least a basic knowledge of the related jargon might be helpful. If you are completely new to the cloud, you might benefit from our introduction to cloud computing course. You might also find the AWS general introduction course interesting if you are not yet that familiar with the AWS cloud platform.

If you want to test your knowledge of the basic topics covered by this course, we strongly suggest you take our quiz questions. And of course, the first part of this course is a must if you want to learn more about the two major DB services on AWS: RDS and DynamoDB.

Transcript

Amazon SimpleDB is a highly available, scalable, and flexible non-relational data store that enables you to store and query data items using web service requests.

Amazon SimpleDB provides a simple web services interface to create and store multiple data sets, query your data easily, and return the results. Your data is automatically indexed, making it easy to quickly find the information that you need. Because SimpleDB is a NoSQL database, there's no need to predefine a schema or change a schema if new data is added later. Amazon SimpleDB automatically creates multiple geographically distributed copies of each data item you store. This provides high availability and durability: in the unlikely event that one replica fails, Amazon SimpleDB can fail over to another replica in the system.

As your business changes or application evolves, you can easily reflect these changes in Amazon SimpleDB without worrying about breaking a rigid schema or needing to refactor code. Simply add another attribute to your Amazon SimpleDB data set when needed.

Amazon SimpleDB is designed to integrate easily with other AWS services, such as Amazon S3 and EC2, providing the infrastructure for creating web-scale applications.

Amazon SimpleDB provides an HTTPS endpoint to ensure secure, encrypted communication between your application or client and your domain. In addition, through integration with AWS Identity and Access Management, you can establish user- or group-level control over access to specific SimpleDB domains and operations.

The data model used by Amazon SimpleDB makes it easy to store, manage, and query your structured data. Developers organize their data set into domains and can run queries across all the data stored in a particular domain. Domains are collections of items that are described by attribute-value pairs. For a better understanding, consider this spreadsheet model, in which the following components correspond to the parts of a spreadsheet.

Domains are represented by the worksheet tabs at the bottom of the spreadsheet. Domains are similar to tables that contain similar data. You can execute queries against a domain, but you cannot execute queries across different domains without programming at the application level.

Items are represented by the spreadsheet rows. Items represent individual objects that contain one or more attribute name-value pairs.

Attributes are represented by the spreadsheet columns. Attributes represent categories of data assigned to items.

And finally, values are represented by the spreadsheet cells. Values represent the instances of attributes for items. Unlike a spreadsheet, multiple values can be associated with a cell. Note that Amazon SimpleDB does not require the presence of specific attributes, so you can create a single domain that contains dissimilar item types.
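To make the spreadsheet analogy a little more concrete, here is a minimal in-memory sketch of the same layout in Python. It is only a toy model, not the service or its API, and the domain, item, and attribute names (MyStore, item_01, Category, Color, Author) are made up for illustration.

```python
from collections import defaultdict

# Toy in-memory model of the layout described above (not the service itself):
# domain -> item name -> attribute name -> set of values.
domains = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))


def put(domain, item, attribute, value):
    """Add a value to an attribute; one attribute may hold several values."""
    domains[domain][item][attribute].add(value)


# Hypothetical items in a hypothetical "MyStore" domain.
put("MyStore", "item_01", "Category", "Clothes")
put("MyStore", "item_01", "Color", "Blue")
put("MyStore", "item_01", "Color", "Red")       # second value in the same "cell"
put("MyStore", "item_02", "Author", "J. Doe")   # dissimilar item, no Category at all

print(sorted(domains["MyStore"]["item_01"]["Color"]))  # ['Blue', 'Red']
```

Notice that item_02 lives in the same domain without sharing any attribute with item_01, which is the "dissimilar item types" point from the transcript.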

Amazon SimpleDB stores multiple geographically distributed copies of each domain to enable high availability and data durability. A successful write means that all copies of the domain will durably persist. Amazon SimpleDB supports two read consistency options. The first is eventually consistent reads, which is the default. This option maximizes your read performance in terms of low latency and high throughput.

However, an eventually consistent read might not reflect the results of a recently completed write. Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data.

The second option is consistent reads. In addition to eventual consistency, Amazon SimpleDB also gives you the flexibility and control to request a consistent read if your application or an element of your application requires it. A consistent read returns a result that reflects all writes that received a successful response prior to the read. Amazon SimpleDB is not a relational database and sacrifices complex transactions and relations in order to provide unique functionality and performance characteristics.
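The difference between the two read options can be illustrated with a small toy model. This is a sketch under loud assumptions: the class, its "primary" and "replica" dictionaries, and the replicate step are all invented for illustration and say nothing about how the service is built internally. In the real API, the choice is made with the ConsistentRead parameter on GetAttributes and Select.

```python
class ToyReplicatedStore:
    """Toy model: one up-to-date copy plus one copy that lags behind."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # acknowledged writes not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate(self):
        """Apply pending writes to the lagging copy."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def read(self, key, consistent_read=False):
        # A consistent read reflects every acknowledged write; an eventually
        # consistent read may be answered from a copy that is still stale.
        source = self.primary if consistent_read else self.replica
        return source.get(key)


store = ToyReplicatedStore()
store.write("item_01.Color", "Blue")

stale = store.read("item_01.Color")                        # not yet visible
fresh = store.read("item_01.Color", consistent_read=True)  # sees the write
store.replicate()
later = store.read("item_01.Color")                        # now visible
```

Repeating the eventually consistent read after replication returns the updated value, which matches the "repeat the read after a short time" advice.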

However, Amazon SimpleDB does offer transactional semantics, such as conditional puts and deletes, which enable you to insert, replace, or delete values for one or more attributes of an item if the existing value of an attribute matches the value you specify. If the value does not match or is not present, the update is rejected. Conditional puts and deletes are useful for preventing lost updates when different sources write concurrently to the same item. Unlike Amazon S3, Amazon SimpleDB does not store raw data. Rather, it takes your data as input and expands it to create multiple indexes, thereby enabling you to quickly query that data.
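Going back to conditional puts for a moment, here is a minimal sketch of how an expected-value check prevents a lost update. The ToyItemStore class, the ConditionFailed exception, and the Stock and Version attributes are all hypothetical. The sketch only mimics the behavior described above and is not the SimpleDB API.

```python
class ConditionFailed(Exception):
    """Raised when the stored value does not match the expected value."""


class ToyItemStore:
    def __init__(self):
        self.items = {}

    def put(self, item, attributes, expected=None):
        current = self.items.get(item, {})
        if expected is not None:
            name, value = expected
            # Reject the write if the attribute is missing or has another value.
            if current.get(name) != value:
                raise ConditionFailed(
                    f"{name!r} is {current.get(name)!r}, expected {value!r}"
                )
        self.items[item] = {**current, **attributes}


store = ToyItemStore()
store.put("item_01", {"Stock": "10", "Version": "1"})

# Writer A updates the item only if Version is still "1".
store.put("item_01", {"Stock": "9", "Version": "2"}, expected=("Version", "1"))

# Writer B read the item before A's update and still expects Version "1".
try:
    store.put("item_01", {"Stock": "9", "Version": "2"}, expected=("Version", "1"))
    b_succeeded = True
except ConditionFailed:
    b_succeeded = False  # B has to re-read the item and try again
```

Without the expected value, writer B would silently overwrite writer A's change. With it, B's stale write is rejected.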

Additionally, Amazon S3 and Amazon SimpleDB use different types of physical storage. Amazon S3 uses dense storage drives that are optimized for storing larger objects inexpensively. Amazon SimpleDB stores small bits of data and uses less dense drives that are optimized for data access speed. In order to optimize your costs across AWS services, large objects or files should be stored in Amazon S3, while smaller data elements or file pointers, possibly to Amazon S3 objects, are best saved in Amazon SimpleDB.
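The "file pointer" idea can be sketched as follows. This is only an illustration: the plan_item helper, the bucket name example-bucket, the objects/ key prefix, and the attribute names are all assumptions. The 1024-byte check reflects the documented limit on the size of a single SimpleDB attribute value. Nothing is uploaded or stored here, because the function only prepares the two pieces of data.

```python
MAX_VALUE_BYTES = 1024  # documented limit for one SimpleDB attribute value


def plan_item(item_name, title, payload, bucket="example-bucket"):
    """Split a record into small attributes plus a large object for S3.

    Returns (attributes, s3_upload); both are plain dictionaries.
    """
    key = f"objects/{item_name}"  # hypothetical key layout
    attributes = {
        "Title": title,
        "S3Bucket": bucket,   # pointer to where the large object lives
        "S3Key": key,
        "SizeBytes": str(len(payload)),
    }
    for name, value in attributes.items():
        if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            raise ValueError(f"attribute {name!r} is too large for one value")
    s3_upload = {"Bucket": bucket, "Key": key, "Body": payload}
    return attributes, s3_upload


# A 5,000-byte payload goes to S3; only the small pointer fields are attributes.
attributes, s3_upload = plan_item("item_01", "Blue shirt photo", b"\x00" * 5000)
```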

Amazon SimpleDB currently enables individual domains to grow up to 10 gigabytes each. If your data set is larger than 10 gigabytes, simply take advantage of Amazon SimpleDB's scale-out architecture and spread your data over multiple domains. Since Amazon SimpleDB is designed with parallelism in mind, spreading your data over more domains will also increase your write and read throughput potential. You are initially allocated a maximum of 250 domains.

To use Amazon SimpleDB, you first build your data set: choose a region for your domain or domains to optimize for latency, minimize costs, or address regulatory requirements, create and manage query domains, and then create and manage the data set within each query domain. You then retrieve your data by using GetAttributes to retrieve a specific item, or Select to query your data for items that meet specific criteria.

To get set up, you need to have an AWS account, get your AWS access key ID and secret access key, and install the Amazon SimpleDB Scratchpad. Amazon SimpleDB is not available in the AWS console, so you need to download and use this simple HTML and JavaScript application, which allows you to explore the Amazon SimpleDB API without writing any code.
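Returning to the point about spreading a data set over multiple domains, one common way to do this is to hash the item name and use the result to pick a domain. The sketch below is one possible approach, not something prescribed by the service. The domain_for helper, the MyStore base name, and the shard count of 4 are assumptions.

```python
import hashlib


def domain_for(item_name, base="MyStore", shards=4):
    """Map an item name to one of `shards` domains, e.g. MyStore_0 .. MyStore_3."""
    digest = hashlib.sha256(item_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % shards
    return f"{base}_{index}"


# The same item name always lands in the same domain, so reads know where to look.
placement = {name: domain_for(name) for name in ("item_01", "item_02", "item_03")}
print(placement)
```

Because the mapping is deterministic, a later GetAttributes for item_01 can compute the domain again instead of searching every domain.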

Instead of this web application, you can also use other tools, like SDB Explorer and SDB Tool, which are SimpleDB plugins for Firefox, or SDB Navigator, which is a plugin for Google Chrome. After downloading and extracting the Scratchpad files, navigate to the web app folder within the folder where you extracted the Scratchpad and open the index.html file with a web browser. As you can see, the welcome page appears. Now you need to copy and paste your AWS access key ID and AWS secret access key into the specified boxes.

The first step in storing data within Amazon SimpleDB is to create one or more domains. As mentioned before, domains are similar to database tables, except that you can't perform functions across multiple domains, such as querying multiple domains or using foreign keys. As a consequence, you should plan an Amazon SimpleDB data architecture that will meet the needs of your project. But note that although the SimpleDB API does not perform queries across multiple domains, you can design your applications to perform queries across multiple domains.
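As a rough sketch of what "querying across domains in the application" can look like, the function below runs the same query once per domain and merges the results. The select_across helper and the run_select callable are hypothetical. Here run_select is replaced by a fake that reads from a dictionary, so that the example runs without an AWS account. In a real application it would wrap a Select call.

```python
def select_across(domains, where, run_select):
    """Run the same query against each domain and merge the results."""
    results = []
    for domain in domains:
        expression = f"select * from `{domain}` where {where}"
        results.extend(run_select(expression))
    return results


# Fake data and a fake query function, standing in for the real service.
fake_data = {
    "MyStore_0": [{"Name": "item_01", "Category": "Clothes"}],
    "MyStore_1": [{"Name": "item_07", "Category": "Clothes"}],
}


def fake_run_select(expression):
    # Picks the domain name out of the backticks and ignores the where clause.
    domain = expression.split("`")[1]
    return fake_data.get(domain, [])


merged = select_across(
    ["MyStore_0", "MyStore_1"], "`Category` = 'Clothes'", fake_run_select
)
print([row["Name"] for row in merged])  # ['item_01', 'item_07']
```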

To create and verify a domain, select CreateDomain from the Scratchpad explorer API list box. As you can see, the CreateDomain page appears. Enter your preferred name for your domain in the domain name box, like MyStore, and then click on the invoke request button. As you can see, Amazon SimpleDB returns a response. To verify the domain was created successfully, select ListDomains from the Scratchpad explorer API list box. As you can see, the ListDomains page appears.

After creating a domain, you're ready to start putting data into it. For this step, select PutAttributes from the Scratchpad explorer API list box. As you can see, the PutAttributes page appears. Now, in the domain name box, write the name of the domain you created before. As you can see, we write MyStore. And then in the item name box, type the name of the item, like item_01. Okay. Now you can add different attributes to this item in the attributes section. Write a name for the first attribute in the name box and specify its value in the value box.
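If you would rather do these steps from code than from the Scratchpad, the sketch below prepares the same two requests as plain Python dictionaries. It assumes the boto3 library's "sdb" client, whose create_domain and put_attributes methods accept these keyword arguments. The helper functions and the attribute names (Category, Color) are made up for illustration. Nothing is sent to AWS here, and the naming check assumes the documented rule for domain names (3 to 255 characters from a-z, A-Z, 0-9, underscore, hyphen, and period).

```python
import re

# Assumed naming rule for domains: 3-255 characters, letters, digits, _ - .
DOMAIN_NAME = re.compile(r"[A-Za-z0-9_.-]{3,255}")


def create_domain_request(domain):
    if not DOMAIN_NAME.fullmatch(domain):
        raise ValueError(f"invalid domain name: {domain!r}")
    return {"DomainName": domain}


def put_attributes_request(domain, item, attributes, replace=False):
    """Build keyword arguments for a PutAttributes call."""
    return {
        "DomainName": domain,
        "ItemName": item,
        "Attributes": [
            {"Name": name, "Value": value, "Replace": replace}
            for name, value in attributes
        ],
    }


create_request = create_domain_request("MyStore")
put_request = put_attributes_request(
    "MyStore", "item_01", [("Category", "Clothes"), ("Color", "Blue")]
)

# With boto3 installed and credentials configured, these could be sent as:
#   client = boto3.client("sdb")
#   client.create_domain(**create_request)
#   client.put_attributes(**put_request)
```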

Now you can add more attributes by clicking on the plus. And finally, click on invoke request. As you can see, we add some items and attributes for them in our domain.

After putting data into the domain, we can run queries against the domain to find items that match our criteria. To query the domain for items, based on the data we put in it, select Select from the Scratchpad explorer API list box. In the select expression box, write a query in this format, then click on invoke request.
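The query shown on screen is not reproduced in the transcript, so here is a hedged sketch of what a simple select expression can look like and how to build one safely. The build_select helper and the Category attribute are assumptions. The quoting follows the documented SimpleDB rules: names go in backticks, values go in single quotes, and an embedded quote character is escaped by doubling it.

```python
def quote_name(name):
    """Quote a domain or attribute name with backticks."""
    return "`" + name.replace("`", "``") + "`"


def quote_value(value):
    """Quote an attribute value with single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_select(domain, attribute, value):
    return (
        f"select * from {quote_name(domain)} "
        f"where {quote_name(attribute)} = {quote_value(value)}"
    )


expression = build_select("MyStore", "Category", "Clothes")
print(expression)  # select * from `MyStore` where `Category` = 'Clothes'
```

The resulting string is what you would paste into the select expression box, or pass as the select expression if you call the API from code.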

Amazon SimpleDB returns items according to the entered data. Note that, at any time, you can modify items, delete attributes of items, or just delete items.

But if you delete the whole domain, all the data in that domain is permanently deleted and cannot be recovered.