In this course, we outline the key storage options for big data solutions. We examine data access and retrieval patterns and some of the use cases that suit particular patterns, such as evaluating mechanisms for the capture, update, and retrieval of catalog entries. We learn how to determine appropriate data structures and storage formats, and how to determine and optimize the operational characteristics of a Big Data storage solution.
Learning Objectives
- Recognize and explain big data access and retrieval patterns.
- Recognize and explain appropriate data structure and storage formats.
- Recognize and explain the operational characteristics of a Big Data storage solution.
Intended Audience
This course is intended for students looking to increase their knowledge of the AWS storage options available for Big Data solutions.
Prerequisites
While there are no formal prerequisites for this course, students will benefit from having a basic understanding of cloud storage solutions. Our courses on AWS Storage Fundamentals and AWS Database Fundamentals will give you a solid foundation for this course.
Updates
Amazon Aurora is now MySQL and PostgreSQL-compatible.
Just before we close out this module on Amazon DynamoDB, let's have a quick look at an example architecture from AWS. In this scenario, we're seeing a model where we can use DynamoDB as part of the Big Data services to process time series data. We have data being streamed from sensors, such as power meters, industrial meters, or even satellites. The data is streamed in using Simple Queue Service, one of a number of Amazon services, and effectively lands in the DynamoDB database.
At the same time, we could be loading data in from applications such as SCADA, maybe as a flow of samples to be joined and processed. From there, we can move the data through into other services such as History, Amazon MapReduce, and on into Redshift. So you can see here how DynamoDB is kind of a midpoint, right in the middle, receiving that data before streaming it on to other services that can use it. So that brings us to the end of the Amazon DynamoDB module.
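As a rough illustration of the time series pattern described above, the sketch below shapes a single sensor reading as an item before it lands in DynamoDB. The table name `SensorReadings`, the key names `sensor_id` and `reading_time`, and the sample values are assumptions made for this example; they do not come from the course or from the AWS reference architecture.

```python
from datetime import datetime, timezone
from decimal import Decimal


def make_reading_item(sensor_id: str, taken_at: datetime, value: float) -> dict:
    """Build one time series item for a hypothetical 'SensorReadings' table.

    Assumed key schema (not from the course): partition key 'sensor_id',
    sort key 'reading_time' holding an ISO 8601 UTC timestamp.
    """
    utc = taken_at.astimezone(timezone.utc)
    return {
        "sensor_id": sensor_id,
        # Fixed-width ISO 8601 strings sort lexicographically in time order,
        # so a range condition on the sort key selects a time window.
        "reading_time": utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Decimal built from str() avoids binary floating-point artifacts.
        "value": Decimal(str(value)),
    }


# Two sample readings from the same (made-up) meter, deliberately out of order.
readings = [
    make_reading_item("meter-42", datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc), 3.7),
    make_reading_item("meter-42", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), 3.5),
]

# Sorting on the sort key string recovers chronological order.
ordered = sorted(readings, key=lambda item: item["reading_time"])
```

With boto3, a dict like this could be passed to `Table.put_item(Item=...)`; the resource-level API expects `Decimal` rather than `float` for numeric attributes, which is why the value is converted here. Keeping the timestamp as the sort key is one common design choice for per-sensor time windows, not the only one.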
As we've seen, Amazon DynamoDB is a NoSQL database in the cloud, and it's suitable for anybody who needs a reliable, fully managed NoSQL solution. The DynamoDB service is designed to provide automated storage scaling and low latency, and it is particularly useful when your application must read and store massive amounts of data and you need the speed and reliability behind that. So that's the end of the DynamoDB module. I look forward to talking to you soon.
Shane has been immersed in the world of data, analytics and business intelligence for over 20 years, and for the last few years he has been focusing on how Agile processes and cloud computing technologies can be used to accelerate the delivery of data and content to users.
He is an avid user of the AWS cloud platform to help deliver this capability with increased speed and decreased costs. In fact, it's often hard to shut him up when he is talking about the innovative solutions that AWS can help you to create, or how cool the latest AWS feature is.
Shane hails from the far end of the earth, Wellington, New Zealand, a place famous for Hobbits and Kiwifruit. However, you're more likely to see him partake of a good long black or an even better craft beer.