AWS Data Services
To prepare you for the AWS Certified Cloud Practitioner Exam, this course will enable you to describe the Amazon Simple Storage Service (S3), Amazon Glacier, Amazon Elastic Block Store (EBS), and Amazon CloudFront storage solutions, and help you identify when to apply AWS solutions to common business scenarios.
This course covers a range of different services, including:
- Amazon Simple Storage Service (S3)
- Amazon Elastic Block Store (EBS)
- Amazon Glacier
- Amazon RDS
- Amazon DynamoDB, ElastiCache, and Redshift
- Amazon CloudFront
- AWS Import/Export Disk
- AWS Import/Export Snowball
- AWS Storage Gateway
By the end of this course, you should be able to:
- Describe the basic functions that each storage service performs within a cloud solution
- Recognize basic components and features of each storage service
- Identify which storage service would be most appropriate for a given use case
- Understand how each service utilizes the benefits of cloud computing, such as scalability or elasticity
This course is designed for:
- Anyone preparing for the AWS Certified Cloud Practitioner
- Managers, sales professionals, and other non-technical roles
Before taking this course, you should have a general understanding of basic cloud computing concepts.
If you have thoughts or suggestions for this course, please contact Cloud Academy at email@example.com.
DynamoDB is a NoSQL key-value store. Secondary indexes, built around your application's search parameters, make efficient table queries possible. DynamoDB Streams allow you to hook into item-level changes in a table. Consider DynamoDB when your application model is schemaless and nonrelational. It can also serve as a persistent session store for applications, taking session state off your servers.
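A minimal sketch of the session-store pattern described above. The key schema and attribute names here are hypothetical, and a plain dict stands in for the DynamoDB table; a real application would make `put_item`/`get_item` calls against a provisioned table instead.

```python
import time

# A plain dict standing in for a DynamoDB table keyed on session_id.
sessions = {}

def put_session(session_id, user_id, ttl_seconds=3600):
    # Store the session item under its partition key, with an expiry
    # timestamp (DynamoDB can expire such items automatically via TTL).
    sessions[session_id] = {
        "session_id": session_id,
        "user_id": user_id,
        "expires_at": time.time() + ttl_seconds,
    }

def get_session(session_id):
    # Key-value lookup by partition key; drop the item if it has expired.
    item = sessions.get(session_id)
    if item and item["expires_at"] < time.time():
        del sessions[session_id]
        return None
    return item
```

Because session state lives in the table rather than on any one web server, any server can handle any request, which is what taking state off your servers buys you.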
ElastiCache is a managed in-memory cache service for fast, reliable data access. The underlying engines behind ElastiCache are Memcached and Redis. With the Redis engine you can take advantage of multiple Availability Zones for high availability and scale out with read replicas. ElastiCache will automatically detect failed nodes and replace them without manual intervention. A typical use case for ElastiCache is low-latency access to frequently retrieved data: think caching database query results that change infrequently, for use in a heavily utilized web application. It can also serve as temporary storage for compute-intensive workloads, or for storing the results of I/O-intensive queries or calculations.
Redshift is a fully managed, petabyte-scale data warehouse optimized for fast query performance on large data sets. Using CloudHSM or the AWS Key Management Service, you can encrypt your data at rest. Redshift is compliant with a variety of standards, including SOC 1, SOC 2, SOC 3, and PCI DSS Level 1. You can query your data using standard SQL commands through ODBC or JDBC connections, and Redshift integrates with other services, including AWS Data Pipeline and Kinesis. You can use Redshift to archive large amounts of infrequently used data, and it is a very good fit when you want to execute analytical queries against large data sets.
Redshift is also an ideal target for Elastic MapReduce jobs that convert unstructured data into structured data. Elastic MapReduce (EMR) is a managed Hadoop framework designed for quickly processing large amounts of data in a cost-effective way. It can dynamically scale across EC2 instances based on how much you want to spend, and EMR offers self-healing, fault-tolerant processing of your data. A common use case for Elastic MapReduce is processing user behavior data collected from multiple sources.
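The MapReduce model that Hadoop (and therefore EMR) implements can be illustrated with the classic word count. A minimal single-process sketch of the map, shuffle, and reduce phases (no Hadoop involved; on EMR the framework distributes each phase across EC2 instances):

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in the input.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

counts = reduce_phase(shuffle(map_phase(["the quick fox", "the fox"])))
# counts == {"the": 2, "quick": 1, "fox": 2}
```

Because each phase works on independent chunks of data, the same program scales from one process to a whole cluster, which is why the model suits EMR's pay-for-what-you-use scaling.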
If you're already using Hadoop on premises, migrating to EMR can offer improved cost and processing with less administration. Every request made to the EMR API is authenticated, so only authenticated users can create, look up, or terminate job flows. When launching customer job flows, Amazon EMR sets up an Amazon EC2 security group for the Master Node that only allows external access via SSH. To protect customer input and output data sets, Amazon Elastic MapReduce transfers data to and from S3 using SSL. Amazon Kinesis is a fully managed service for processing real-time data streams. It can capture terabytes of data per hour from over 100,000 different sources.
Output from Kinesis can be saved to storage such as S3, DynamoDB, or Redshift, or ported to services such as EMR or Lambda, among others. There's a Kinesis client library that can be used for integration with your other applications.
The Amazon Kinesis Client Library (KCL) helps you consume and process data from an Amazon Kinesis stream. The Kinesis client library acts as an intermediary between your record processing logic and Streams. When you start a KCL application, it calls the KCL to instantiate a worker. This call provides the KCL with configuration information for the application, such as the stream name and AWS credentials.
This type of application is also referred to as a consumer. The Kinesis client library is different from the Streams API which you get in the AWS SDKs. The Streams API helps you manage Streams (including creating streams, resharding, and putting and getting records), while the KCL provides a layer of abstraction specifically for processing data in a consumer role.
So if you need a dashboard that shows updates in real time, Kinesis is a perfect solution, since you can combine data sources from all over, including social media.
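A minimal sketch of that producer/consumer flow. A list stands in for a Kinesis shard, and a plain function plays the role a KCL record processor would fill; the record shape and source names are hypothetical, and a real producer would call the Streams API (e.g. `PutRecord`) instead of appending to a list.

```python
import json

stream = []  # stands in for a shard of a Kinesis stream

def put_record(data, partition_key):
    # Producer side: each record carries a partition key, which
    # Kinesis uses to route the record to a shard.
    stream.append({"PartitionKey": partition_key,
                   "Data": json.dumps(data)})

def process_records(records, dashboard):
    # Consumer side: the KCL hands batches of records to processing
    # logic like this; here we aggregate event counts by source,
    # the kind of rollup a real-time dashboard would display.
    for record in records:
        event = json.loads(record["Data"])
        source = event["source"]
        dashboard[source] = dashboard.get(source, 0) + 1

put_record({"source": "twitter", "text": "hello"}, "user-1")
put_record({"source": "twitter", "text": "again"}, "user-2")
put_record({"source": "web", "path": "/home"}, "user-3")

dashboard = {}
process_records(stream, dashboard)
# dashboard == {"twitter": 2, "web": 1}
```

The split mirrors the distinction drawn above: the Streams API manages the stream and puts records, while KCL-style code focuses purely on the consumer role.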
Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.