This Learning Path prepares you for the AWS Big Data Specialty Certification. The AWS Certified Big Data - Specialty Exam validates technical skills and experience in designing and implementing AWS services to derive value from data. We cover the six domains of the Big Data Specialty exam outline with Courses, Labs and Quizzes. We start with an introduction to analytics and database fundamentals. We then cover Big Data collection, storage, processing, analysis, visualization and security in more detail.
This Big Data Learning Path (Specialty Certification) lasts almost 22 hours and is made up of 14 Courses, 3 Quizzes and 2 Labs.
For domain one we explain the various data collection methods and the techniques for determining the operational characteristics of a collection system. We explore how to define a collection system able to handle the frequency of data change and the type of data being ingested. We identify how to enforce data properties such as order, data structure, and metadata, and how to ensure the durability and availability of our collection approach.
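As one illustration of enforcing order in a collection system, the sketch below mimics how a sharded stream such as Amazon Kinesis Data Streams keeps records with the same partition key together: the key is MD5-hashed to a 128-bit integer, and that integer selects a shard. The evenly split hash ranges, the function names `shard_for` and `route`, and the sensor keys are assumptions made for illustration; this is not an AWS API.

```python
import hashlib

HASH_SPACE = 2 ** 128  # an MD5 digest read as an integer falls in [0, 2**128)


def shard_for(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


def route(records, shard_count):
    """Append each (key, payload) record to its shard in arrival order."""
    shards = {index: [] for index in range(shard_count)}
    for key, payload in records:
        shards[shard_for(key, shard_count)].append((key, payload))
    return shards


# Hypothetical readings: order matters per sensor, not across sensors.
records = [("sensor-a", 1), ("sensor-b", 1), ("sensor-a", 2), ("sensor-a", 3)]
shards = route(records, shard_count=4)
```

Because every record for a given key lands on the same shard, a consumer reading that shard sees the key's records in the order they were written. Ordering across different keys is not guaranteed by this scheme.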
Domain two of the Big Data Specialty learning path focuses on storage. In this group of courses, we outline the key storage options for big data solutions. We determine data access and retrieval patterns, and look at some of the use cases that suit particular data patterns, such as evaluating mechanisms for the capture, update, and retrieval of catalog entries. We learn how to determine appropriate data structures and storage formats, and how to determine and optimize the operational characteristics of a Big Data storage solution.
Data Processing Technologies
In domain three of the Big Data Specialty Learning Path we learn how to identify the appropriate data processing technologies needed for big data scenarios. We explore how to design and architect a data processing solution, and explore and define the operational characteristics of big data processing. We delve into the various processing services available, focusing on Amazon Kinesis, Amazon Elastic MapReduce (EMR) and Amazon Rekognition.
For domain four of the Big Data Specialty Learning Path we learn how to determine the tools and techniques required for data analysis. We explore how to design and architect an analytical solution, and how to optimize the operational characteristics of the analysis system using tools such as Amazon Athena and Amazon Kinesis.
In domain five we learn how to determine the appropriate techniques for delivering the results or output of a query or analysis. We examine how to design and create a visualization platform using AWS services, and how to optimize visualization services to present results in an effective and accessible manner using Amazon QuickSight.
Data security comprises 20% of the certification curriculum, so it is important that students have a thorough understanding of security best practices for Big Data solutions. In this course we examine how to determine encryption requirements and how to implement encryption services. We examine how to choose the appropriate technology to enforce data governance, and identify how to ensure data integrity while working with Big Data solutions.
Before beginning this Learning Path, we would recommend:
- Having a current AWS Certified Cloud Practitioner certification, or any AWS Associate-level certification
- At least a couple of years' experience in data analytics
- An understanding of Big Data and its core principles and best practices.
What is big data?
In basic terms, Big Data is made up of larger, more complex data sets, particularly from new sources. Because these sets are so sizeable, they cannot be managed by traditional data processing software. The upside of such large volumes of data is that they can be used to tackle business problems that could not otherwise be addressed.
Is Google Analytics big data?
Google Analytics is considered part of the big data umbrella because it processes the large volume of data generated by, for example, a website and produces simplified reports on metrics such as views, bounce rate and visitors. Google Analytics can be considered one of the pioneers of the big data space and it remains a strong player in the field.
What are data solutions?
Data solutions help a business manage, store and make use of its valuable information. They range from computer programmes to personnel staffing and include distribution systems, modelling software, and business intelligence.
What is an AWS data lake?
In essence, a ‘lake’ holds a huge amount of raw, unformatted data for as long as necessary until it is needed. It stores data using a flat architecture, which is a different approach from a hierarchical data warehouse that stores data in files and folders.
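A minimal sketch of the flat layout described above, assuming an object store in the style of Amazon S3, where every object lives under a single key in one namespace and a "folder" is only a shared key prefix. `FlatObjectStore`, its methods, and the example keys are hypothetical names chosen for illustration; they are not part of any AWS SDK.

```python
from dataclasses import dataclass, field


@dataclass
class FlatObjectStore:
    """Toy flat store: one dictionary of key -> (raw data, metadata)."""

    objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes, **metadata) -> None:
        # Data is kept raw; any structure is applied later, when it is read.
        self.objects[key] = (data, metadata)

    def list_keys(self, prefix: str = "") -> list:
        # A "folder" listing is just a filter on the key prefix.
        return sorted(k for k in self.objects if k.startswith(prefix))


lake = FlatObjectStore()
lake.put("clickstream/2019/01/events.json", b'{"page": "/home"}', source="web")
lake.put("clickstream/2019/02/events.json", b'{"page": "/cart"}', source="web")
lake.put("sensors/raw.csv", b"id,temp\n1,21.5\n", source="iot")

january_and_february = lake.list_keys("clickstream/")
```

The slashes in the keys carry no structural meaning to the store itself, which is what lets a lake accept data in any format without a schema being defined up front.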
Learning Path Steps
This learning path prepares you for the 3-hour AWS Big Data Specialty Certification exam. This learning path gives you an in-depth understanding of the AWS big data services available and how to use those AWS services together to create Big Data solutions. ...
In this course, we will explore the Analytics tools provided by AWS, including Elastic MapReduce (EMR), Data Pipeline, Elasticsearch, Kinesis, Amazon Machine Learning and QuickSight, which is still in preview mode. We will start with an overview of Data Sc...
Overview: This course will provide you with an introduction to the cloud database services offered by AWS. In this course we first explore the fundamentals of cloud databases, then outline the cloud databases provided by AWS before exploring how to get st...
Knowledge Check: Database Fundamentals for AWS
Course Description: In course one of the AWS Big Data Specialty Data Collection learning path we explain the various data collection methods and techniques for determining the operational characteristics of a collection system. We explore how to define a c...
Update: Amazon Aurora is now MySQL and PostgreSQL-compatible. Course Description: Course two of the Big Data Specialty learning path focuses on storage. In this course we outline the key storage options for big data solutions. We determine data access an...
Learn how to use the Amazon Redshift service. Create a cluster with a database, copy data from S3, query data using SQL, and resize the cluster.
Course Description: This course provides an introduction to working with Amazon DynamoDB, a fully-managed NoSQL database service provided by Amazon Web Services. We begin with a description of DynamoDB and compare it to other database platforms. The course ...
Big Data Specialty - Collection and Storage
Course Description: In this course for the Big Data Specialty Certification, we learn how to identify the appropriate data processing technologies needed for big data scenarios. We explore how to design and architect a data processing solution, and explore...
Course Description: This course will provide you with a good foundation to better understand Amazon Kinesis, along with helping you to get started with building streamed solutions. In this course we'll put a heavier emphasis on hands-on demos along with br...
Big Data Specialty - Processing
Broad introductory lab on Amazon Elastic MapReduce (EMR). Get started today!
Introduction to the Principles and Practice of Amazon Machine Learning
When we saw how incredibly popular our blog post on Amazon Machine Learning was, we asked data and code guru James Counts to create this fantastic in-depth introduction to the principles and practice of Amazon Machine Learning so we could completely satisfy...
In this course, we will perform an in-depth review of the Amazon Athena service. We will review and explain fundamental Amazon Athena storage and querying concepts. We will highlight suitable use cases in which Athena can be applied effectively. You will be in...
In this Kinesis Analytics course, we will perform an in-depth review of the Amazon Kinesis Analytics service. We review where and when to use this service to best effect. You will be introduced to the key features and core components of the Kinesis Analytic...
Learn how to implement object detection on every new image uploaded to Amazon S3.
Big Data Specialty - Analysis
In this course we learn how to determine the appropriate techniques for delivering the results/output of a query or analysis. We examine how to design and create a visualization using AWS services, and how to optimize visualization services to present resu...
Resources mentioned throughout this course:
Cloud Academy Courses:
- Amazon Web Services: Key Management Services (KMS)
- Working with Amazon Kinesis
- Getting started with AWS CloudHSM
AWS Resources:
- Configuring HDFS Transparent Encryption in Amazon ...
In this lab, you'll learn how to use AWS Key Management Service (KMS) to encrypt S3 and EBS data at an intermediate level. Get started today!
Amazon S3 Security Features
This learning path has enabled you to recognize and explain the AWS big data services that are available and how to use those AWS services together to create Big Data solutions. We covered the six domains of the big data specialty exam outline with courses...
Removed the "Amazon Machine Learning for Human Activity Recognition" Lab due to AWS phasing out support for the Amazon Machine Learning service
About the Author
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession" as everything AWS starts with the customer. Outside of work, his passions are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start-ups.