This Learning Path prepares you for the AWS Big Data Specialty Certification. The AWS Certified Big Data - Specialty Exam validates technical skills and experience in designing and implementing AWS services to derive value from data. We cover the six domains of the Big Data Specialty exam outline with Courses, Labs and Quizzes. We start with an introduction to analytics and database fundamentals. We then cover Big Data collection, storage, processing, analysis, visualization and security in more detail.
This Big Data Learning Path (Specialty Certification) lasts almost 22 hours, and is made up of 14 Courses, 3 Quizzes and 2 Laboratories.
For domain one, we explain the various data collection methods and the techniques for determining the operational characteristics of a collection system. We explore how to define a collection system able to handle the frequency of data change and the type of data being ingested. We identify how to enforce data properties such as order, data structure, and metadata, and how to ensure the durability and availability of our collection approach.
Domain two of the Big Data Specialty learning path focuses on storage. In this group of courses, we outline the key storage options for big data solutions. We determine data access and retrieval patterns, and some of the use cases that suit particular data patterns such as evaluating mechanisms for capture, update, and retrieval of catalog entries. We learn how to determine appropriate data structure and storage formats, and how to determine and optimize the operational characteristics of a Big Data storage solution.
Data Processing Technologies
In domain three of the Big Data Specialty Learning Path, we learn how to identify the appropriate data processing technologies needed for big data scenarios. We explore how to design and architect a data processing solution, and how to define the operational characteristics of big data processing. We delve into the various processing services available, focusing on Amazon Kinesis, Amazon Elastic MapReduce (EMR) and Amazon Rekognition.
For domain four of the Big Data Specialty Learning Path we learn how to determine the tools and techniques required for data analysis. We explore how to design and architect an analytical solution, and how to optimize the operational characteristics of the Analysis System using tools such as Amazon Athena and Kinesis.
In domain five, we learn how to determine the appropriate techniques for delivering the results or output of a query or analysis. We examine how to design and create a visualization platform using AWS services, and how to optimize visualization services to present results in an effective and accessible manner using Amazon QuickSight.
Data security comprises 20% of the certification curriculum, so it is important that students have a thorough understanding of security best practices for Big Data solutions. In this course, we examine how to determine encryption requirements and how to implement encryption services. We examine how to choose the appropriate technology to enforce data governance, and identify how to ensure data integrity while working with Big Data solutions.
Before beginning this Learning Path, we would recommend:
- Holding a current AWS Certified Cloud Practitioner certification, or any AWS Associate-level certification
- At least a couple of years' experience in data analytics
- An understanding of Big Data and its core principles and best practices
What is big data?
In basic terms, Big Data is made up of larger, more complex data sets, particularly from new data sources. Because these data sets are so sizeable, they cannot be managed by traditional data processing software. The upside of such large volumes of data is that they can be used to tackle business problems that could not otherwise be addressed.
Is Google Analytics big data?
Google Analytics is considered part of the big data umbrella, as it processes the large volume of data generated by, for example, a website and creates simplified reports on areas such as views, bounce rate and visitors. Google Analytics can be considered one of the pioneers of the big data space, and it remains a strong player in the field.
What are data solutions?
Data solutions help a business to manage, store and make use of its valuable information. They range from computer programmes to personnel staffing, and include distribution systems, modelling software, and business intelligence.
What is AWS data lake?
In essence, a ‘lake’ holds a huge amount of raw and unformatted data until it is needed. A data lake uses a flat architecture to store data, which is a different approach from a hierarchical data warehouse, which stores data in files and folders.
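As a rough illustration of what a flat architecture can mean in practice, the sketch below models an object store in which every object is addressed by one full key, and "folders" are nothing more than shared key prefixes. It is a minimal in-memory stand-in written for this explanation: the `FlatObjectStore` class, its methods and the example keys are all hypothetical and are not part of any AWS API.

```python
from typing import Dict, List


class FlatObjectStore:
    """Hypothetical in-memory object store with a single flat key namespace."""

    def __init__(self) -> None:
        # One dictionary holds every object; there is no directory tree.
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # Raw data is stored as-is under its full key.
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> List[str]:
        # "Folders" are only a naming convention: listing is a prefix filter.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put("raw/clickstream/2024-01-01.json", b'{"page": "/home"}')
store.put("raw/clickstream/2024-01-02.json", b'{"page": "/pricing"}')
store.put("raw/sensors/device-7.csv", b"ts,temp\n1,21.5\n")

# Everything under a "folder" is found by matching the key prefix.
clickstream_keys = store.list_keys("raw/clickstream/")
```

Here the data keeps its original format (JSON and CSV side by side), and any structure is applied later, when the data is read for analysis.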
Learning Path Steps
This course introduces the AWS Big Data – Specialty Certification Preparation learning path.
In this course, we will explore the analytics tools provided by AWS, including Elastic MapReduce (EMR), Data Pipeline, Elasticsearch, Kinesis, Amazon Machine Learning, and QuickSight.
This course provides you with an introduction to the cloud database services offered by AWS and how to use them.
Knowledge Check: Database Fundamentals for AWS
This course will help you to increase your knowledge of data collection methods and techniques with big data solutions in AWS.
In this course, we outline the key storage options for big data solutions in AWS.
Learn how to use the Amazon Redshift service. Create a cluster with a database, copy data from S3, query data using SQL, and resize the cluster.
In this course, you'll learn the fundamentals of Amazon DynamoDB, including table design, reading, writing, and working with large tables.
In this course, you'll learn how to identify the appropriate data processing technologies needed for big data scenarios.
Broad introductory lab on Amazon Elastic MapReduce (EMR). Get started today!
Introduction to the Principles and Practice of Amazon Machine Learning
This course provides an in-depth introduction to the principles and practice of Amazon Machine Learning.
This course explores the Amazon Athena service, reviewing fundamental Athena storage and querying concepts.
In this introductory course, you will learn to recognize and explain the core components of Amazon Kinesis and where those services can be applied.
In this course, you'll learn about the key features and core components of Kinesis Analytics, and what an end-to-end real-time data streaming example looks like.
Learn how to implement object detection on every new image uploaded on Amazon S3.
In this course, you'll learn how to recognize the best techniques for delivering the results of a query, and how to create data visualizations.
This course looks at how to secure your big data within AWS by implementing different data encryption options.
Using Amazon Key Management Service to Encrypt S3 and EBS Data
In this intermediate-level lab, you'll learn how to use Amazon Key Management Service to encrypt S3 and EBS data. Get started today!
This course concludes the AWS Big Data – Specialty Certification Preparation learning path.
About the Author
Head of Content
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession", as everything at AWS starts with the customer. Outside of work, his passions are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start-ups.