Storage and Databases
Services at a glance
In this course we learn to recognize and explain AWS compute and storage fundamentals, and the family of AWS services relevant to the Certified Developer exam. The course provides a snapshot of each service, covering just what you need to know, so it gives you a good high-level starting point for exam preparation. It includes coverage of:
Amazon Simple Queue Service (SQS)
Amazon Simple Notification Service (SNS)
Amazon Simple Workflow Service (SWF)
Amazon Simple Email Service (SES)
Amazon API Gateway
AWS Data Pipeline
AWS Elastic Beanstalk
Storage and databases
Amazon Simple Storage Service (S3)
Amazon Elastic Block Store (EBS)
Amazon Relational Database Service (RDS)
Other Database Services
Elastic Compute Cloud (EC2)
Elastic Load Balancing (ELB)
If you have thoughts or suggestions for this course, please contact Cloud Academy at firstname.lastname@example.org.
DynamoDB is a NoSQL key-value store. Table scanning is made possible using secondary indexes based on your application's search parameters. You can also enable update streams, which allow you to hook into item-level changes. Consider DynamoDB when your application model is schemaless and nonrelational. It can also serve as a persistent session storage mechanism for applications, taking state away from the service itself.
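To make the key-value model concrete, here is a minimal pure-Python sketch of primary-key access plus a secondary-index-style lookup. Plain dictionaries stand in for a DynamoDB table and its index, and the table and attribute names (sessions, session_id, user_id) are invented for the example:

```python
# A DynamoDB table is essentially a key-value store: items are fetched
# by partition key. Plain dicts stand in for the table and a secondary index.
sessions = {}       # stand-in for a "sessions" table keyed by session_id
index_by_user = {}  # stand-in for a secondary index on user_id

def put_item(item):
    """Store an item by its partition key and maintain the secondary index."""
    sessions[item["session_id"]] = item
    index_by_user.setdefault(item["user_id"], []).append(item["session_id"])

def get_item(session_id):
    """Primary-key lookup: direct and fast."""
    return sessions.get(session_id)

def query_by_user(user_id):
    """Secondary-index lookup: find items by a non-key attribute."""
    return [sessions[sid] for sid in index_by_user.get(user_id, [])]

put_item({"session_id": "s1", "user_id": "u42", "cart": ["book"]})
put_item({"session_id": "s2", "user_id": "u42", "cart": []})

print(get_item("s1")["cart"])     # ['book']
print(len(query_by_user("u42")))  # 2
```

The second lookup is exactly the session-store use case from the paragraph above: all of a user's session state lives in the table, so the application servers themselves can stay stateless.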
ElastiCache is a managed in-memory cache service for fast, reliable data access. The underlying engines behind ElastiCache are Memcached and Redis. With the Redis engine you can take advantage of multiple Availability Zones for high availability and scale out with read replicas. ElastiCache will automatically detect failed nodes and replace them without manual intervention. A typical use case for ElastiCache is low-latency access to frequently retrieved data: think caching database results that change infrequently, for use in a heavily utilized web application. It can also serve as temporary storage for compute-intensive workloads, or for storing the results of I/O-intensive queries or calculations.
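The caching pattern described above is usually called cache-aside: check the cache first, and only fall through to the database on a miss. A minimal sketch, with a plain dict standing in for the ElastiCache node and an invented query function standing in for the database:

```python
import time

cache = {}         # stand-in for an ElastiCache (Redis/Memcached) node
TTL_SECONDS = 300  # entries expire after five minutes
db_reads = 0       # counts how often we fall through to the database

def slow_database_query(key):
    """Stand-in for an expensive query whose results change infrequently."""
    global db_reads
    db_reads += 1
    return f"result-for-{key}"

def get(key):
    """Cache-aside: serve from cache on a hit, populate it on a miss."""
    entry = cache.get(key)
    if entry and entry["expires"] > time.time():
        return entry["value"]             # cache hit: no database round trip
    value = slow_database_query(key)      # cache miss: query, then cache
    cache[key] = {"value": value, "expires": time.time() + TTL_SECONDS}
    return value

get("top-products")  # miss: hits the database and fills the cache
get("top-products")  # hit: served from memory
print(db_reads)      # 1
```

The TTL is the knob that trades freshness for load: the longer the expiry, the fewer database reads, at the cost of potentially staler results.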
RedShift is a fully managed, petabyte-scale data warehouse optimized for fast query performance on large data sets. Using CloudHSM or the AWS Key Management Service (KMS), you can encrypt your data at rest. RedShift is compliant with a variety of standards, including SOC 1, SOC 2, SOC 3, and PCI DSS Level 1. You can query your data using standard SQL commands through ODBC or JDBC connections, and RedShift integrates with other services, including AWS Data Pipeline and Kinesis. You can use RedShift to archive large amounts of infrequently used data, and it is a really good choice when you want to execute analytical queries on large data sets.
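Because RedShift speaks standard SQL over ODBC/JDBC, its analytical queries look like ordinary aggregates over a large fact table. As a runnable illustration, Python's built-in sqlite3 stands in for the warehouse connection here, and the table and column names are invented:

```python
import sqlite3

# sqlite3 stands in for an ODBC/JDBC connection to a RedShift cluster.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (page TEXT, region TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [("home", "us", 120), ("home", "eu", 80), ("pricing", "us", 40)],
)

# A typical warehouse-style analytical query: aggregate across the fact table.
rows = conn.execute(
    """
    SELECT page, SUM(views) AS total_views
    FROM page_views
    GROUP BY page
    ORDER BY total_views DESC
    """
).fetchall()

print(rows)  # [('home', 200), ('pricing', 40)]
```

The point is that no proprietary query language is needed: any tool that can issue SQL over ODBC or JDBC can query the warehouse.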
RedShift is also an ideal target for Elastic MapReduce jobs that convert unstructured data into structured data. Elastic MapReduce is a managed Hadoop framework designed for quickly processing large amounts of data in a really cost-effective way. It can dynamically scale across EC2 instances based on how much you want to spend, and EMR offers self-healing, fault-tolerant processing of your data. A common use case for Elastic MapReduce is processing user behavior data that you might have collected from multiple sources.
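The MapReduce model that EMR runs at scale can be sketched in a few lines of plain Python: a map phase emits key-value pairs from raw records, and a reduce phase aggregates the values per key. The user-behavior log below is invented for illustration:

```python
from collections import defaultdict

# Raw, unstructured user-behavior records collected from multiple sources.
log_lines = [
    "u1 click home",
    "u2 click pricing",
    "u1 click home",
    "u1 purchase pricing",
]

def map_phase(line):
    """Map: parse a raw line and emit a ((user, action), 1) pair."""
    user, action, _page = line.split()
    yield (user, action), 1

def reduce_phase(pairs):
    """Reduce: sum the counts for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

intermediate = [pair for line in log_lines for pair in map_phase(line)]
result = reduce_phase(intermediate)
print(result[("u1", "click")])  # 2
```

On EMR the same two phases run in parallel across a cluster of EC2 instances, with Hadoop handling the shuffle of intermediate pairs between the map and reduce stages.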
If you're already using Hadoop on premises, then migrating to EMR can offer improved cost and processing with less administration. Every request made to the EMR API is authenticated, so only authenticated users can create, look up, or terminate job flows. When launching customer job flows, Amazon EMR sets up an Amazon EC2 security group for the master node that only allows external access via SSH. To protect customer input and output data sets, Amazon Elastic MapReduce transfers data to and from S3 using SSL.
Amazon Kinesis is a fully managed service for processing real-time data streams. It can capture terabytes of data per hour from over 100,000 different sources.
Output from Kinesis can be saved to storage such as S3, DynamoDB, or RedShift, or ported to services such as EMR or Lambda, among others. There's a Kinesis client library that can be used for integration with your other applications.
The Amazon Kinesis Client Library (KCL) helps you consume and process data from an Amazon Kinesis stream. The Kinesis client library acts as an intermediary between your record processing logic and Streams. When you start a KCL application, it calls the KCL to instantiate a worker. This call provides the KCL with configuration information for the application, such as the stream name and AWS credentials.
This type of application is also referred to as a consumer. The Kinesis client library is different from the Streams API which you get in the AWS SDKs. The Streams API helps you manage Streams (including creating streams, resharding, and putting and getting records), while the KCL provides a layer of abstraction specifically for processing data in a consumer role.
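The division of labor described above, where the KCL worker fetches records from a shard, hands them to your record-processing logic, and checkpoints progress, can be sketched as follows. The shard, its records, and the checkpoint store are all simulated in memory for illustration:

```python
# In-memory stand-ins for a Kinesis shard and the KCL's checkpoint state.
shard = [
    {"sequence": 1, "data": "click:home"},
    {"sequence": 2, "data": "click:pricing"},
    {"sequence": 3, "data": "purchase:pricing"},
]
checkpoint = {"last_sequence": 0}
processed = []

def process_records(records):
    """Your record-processing logic: the part you write as a KCL consumer."""
    for record in records:
        processed.append(record["data"])

def worker_loop(batch_size=2):
    """Simulated worker: fetch a batch, process it, then checkpoint."""
    while checkpoint["last_sequence"] < shard[-1]["sequence"]:
        batch = [r for r in shard
                 if r["sequence"] > checkpoint["last_sequence"]][:batch_size]
        process_records(batch)
        checkpoint["last_sequence"] = batch[-1]["sequence"]  # commit progress

worker_loop()
print(len(processed))               # 3
print(checkpoint["last_sequence"])  # 3
```

The checkpoint is what makes the consumer resumable: if the worker fails mid-stream, a replacement picks up from the last committed sequence number rather than reprocessing the whole shard.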
So if you need a dashboard that shows updates in real time, Kinesis is a perfect solution, since you can combine data from many sources, including social media.
About the Author
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data centre and network infrastructure design, to cloud architecture and implementation.
To date Stuart has created over 40 courses relating to Cloud, most within the AWS category, with a heavy focus on security and compliance.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.