Solutions Architect Professional Level Certification for AWS (1 of 3)

Analytics

Contents

Introduction
1. Series Overview (preview, 1m 30s)
2. Terminology (preview, 18m 9s)

AWS Basic Services
3. Compute (13m 2s)
4. Storage (9m 45s)
5. Database (6m 33s)
6.

AWS Administration Services
8. Analytics (4m 35s)

Conclusion
Overview

Difficulty: Advanced
Duration: 1h 33m
Students: 2,044

Description

With the AWS Solutions Architect Professional level certification, Amazon sought to identify administrators who are true platform experts. Unless you have significant experience with AWS deployments and solid familiarity with the full range of Amazon services, you are unlikely to pass.

However, with this series of Cloud Academy courses, cloud expert Kevin Felichko is committed to making your learning process as smooth and productive as possible. Kevin's guide to the Solutions Architect Professional level exam will lead you through the theoretical and practical skills you'll need to master this material.

This first course covers the key conceptual terminology and services that form the base for Amazon cloud architecting. The second course will guide you through fully-realized practical deployments, from start to finish. And the final course (late August, 2015) will focus on exam preparation and, in particular, strategies for confronting the particularly complicated style of question you'll face.

Do you have questions on this course? Contact our cloud experts in our community forum.

Transcript

With the advent of the Cloud, Big Data analytics has become far more accessible and much cheaper.

AWS has led the charge in offering services to make it easier than it has ever been to collect and crunch the data that is important to you. Let's examine how this is possible.

Elastic MapReduce (EMR) is a managed Hadoop framework designed to process large amounts of data quickly and cost-effectively. It can dynamically scale across EC2 instances based on how much you want to spend, and it offers self-healing, fault-tolerant processing of your data.

A common use case for Elastic MapReduce is processing user behavior data collected from multiple sources. If you are already running Hadoop on-premises, migrating to EMR can reduce both cost and administration while improving processing.

EMR pricing is a combination of EC2 pricing plus a per-hour EMR charge. You can use EC2 pricing models such as On-Demand, Reserved, and Spot, or a combination of them, to help manage your costs.
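To see how the two charges combine, here is a minimal sketch of the cost model described above. The instance count and both hourly rates are hypothetical placeholders for illustration, not current AWS prices:

```python
def emr_hourly_cost(num_instances, ec2_rate, emr_rate):
    """Hourly cluster cost: every node pays its EC2 rate plus the EMR surcharge."""
    return num_instances * (ec2_rate + emr_rate)

# Hypothetical rates: 10 nodes at $0.266/hr for EC2 plus $0.067/hr for EMR.
print(f"${emr_hourly_cost(10, 0.266, 0.067):.2f}/hr")  # -> $3.33/hr
```

Swapping the `ec2_rate` for a Spot or Reserved price while the EMR surcharge stays fixed is exactly the kind of cost comparison the exam expects you to reason through.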

Kinesis is a fully managed service for processing real-time data streams. It can capture terabytes of data per hour from over 100,000 different sources. Output from Kinesis can be saved to S3, Redshift, EMR, Lambda, and more. The Kinesis Client Library can be used to integrate Kinesis into your applications. If you need a dashboard that shows updates in real time, Kinesis is a perfect fit, since you can combine data sources from all over, including social media.

A social media stream into Kinesis can help you gauge customer satisfaction. You can also use leading indicators to help you pre-emptively deal with issues before they become too big to handle.

Kinesis charges per shard, a shard being a throughput unit that handles 1 MB per second of data input and 2 MB per second of data output. A single shard can handle up to 1,000 PUT records per second, and you define the number of shards for each data stream. On top of the shard charge, you are billed per one million PUT requests; prices vary by region.
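Since you must pick the shard count yourself, sizing a stream comes down to dividing your expected load by the per-shard limits quoted above and taking whichever limit binds first. A small sketch of that calculation (the traffic figures in the example are made up):

```python
import math

def shards_needed(in_mb_per_s, out_mb_per_s, puts_per_s):
    """Minimum shard count for a stream: each shard handles 1 MB/s in,
    2 MB/s out, and 1,000 PUT records/s, so the tightest limit wins."""
    return max(
        math.ceil(in_mb_per_s / 1.0),
        math.ceil(out_mb_per_s / 2.0),
        math.ceil(puts_per_s / 1000.0),
    )

# A stream ingesting 5 MB/s, read back at 8 MB/s, with 4,500 PUTs/s:
print(shards_needed(5, 8, 4500))  # -> 5
```

Here input (5 shards) and PUT rate (5 shards) both bind before output (4 shards), so the stream needs 5 shards, and the per-shard charge scales accordingly.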

AWS Data Pipeline is a service for reliably processing and moving data between compute and storage services. You can use Data Pipeline to move data from an on-premises data source to a cloud data source, and it executes in a fault-tolerant way. Templates are available for common transformation tasks.

Data Pipeline is useful, for example, when you need to move data from an on-premises system to RDS on a regular basis, or when you want to run DynamoDB data through EMR nightly to calculate billing under complex rules. It is a critical service for any scenario where you need to move large batches of data.

Data Pipeline costs are based on frequency and location. Activities that run every hour or every 12 hours are charged high-frequency rates; once per day or less often counts as low frequency. On-premises activities are charged at higher rates than AWS-based activities, and prices vary by region.
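The frequency rule reduces to a one-line check, which a sketch makes concrete (the tier names and the once-per-day threshold follow the rule stated in this lesson):

```python
def pipeline_rate_tier(runs_per_day):
    """Billing tier per the lesson's rule: activities running more than
    once per day are high frequency; once per day or less is low frequency."""
    return "high frequency" if runs_per_day > 1 else "low frequency"

print(pipeline_rate_tier(24))  # hourly -> high frequency
print(pipeline_rate_tier(1))   # daily  -> low frequency
```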

AWS analytics services play a growing role in why companies move portions of their infrastructure to the Cloud.

For the exam, it is important to know the purpose of each of these services and the reasons to use it. Knowing the best pricing options will be critical to your success as well; for example, knowing that EMR can use a mixture of EC2 pricing models can help you identify the most cost-effective option among four very similar solutions.

In our next lesson, we will take a look at Application Services, which can bring fault tolerance and high availability to your application with ease.

 

About the Author

Kevin is a seasoned technologist with 15+ years of experience, mostly in software development. Recently, he has led several migrations from traditional data centers to AWS, resulting in over $100K a year in savings. His new projects take advantage of cloud computing from the start, enabling a faster time to market.

He enjoys sharing his experience and knowledge with others while constantly learning new things. He has been building elegant, high-performing software across many industries since high school. He currently writes apps in Node.js and iOS apps in Objective-C, and designs complex architectures for AWS deployments.

Kevin currently serves as Chief Technology Officer for PropertyRoom.com, where he leads a small, agile team.