This Learning Path prepares you for the AWS Big Data Specialty Certification. The AWS Certified Big Data - Specialty exam validates technical skills and experience in designing and implementing AWS services to derive value from data. We cover the six domains of the Big Data Specialty exam outline with Courses, Labs and Quizzes. We start with an introduction to analytics and database fundamentals, then cover Big Data collection, storage, processing, analysis, visualization and security in more detail.
This Big Data Learning Path (Specialty Certification) lasts almost 22 hours, and is made up of 14 Courses, 3 Quizzes and 2 Laboratories.
For domain one we explain the various data collection methods and the techniques for determining the operational characteristics of a collection system. We explore how to define a collection system able to handle the frequency of data change and the type of data being ingested. We identify how to enforce data properties such as order, data structure, and metadata, and how to ensure the durability and availability of our collection approach.
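One common way to preserve order in a collection system is to route related records through the same partition. The sketch below assumes a Kinesis-style scheme in which an MD5 hash of a partition key selects a shard. The shard count, the helper names `shard_for` and `route`, and the sample records are hypothetical and for illustration only; this is a local simulation, not a call to any AWS API.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4          # hypothetical shard count, for illustration only
HASH_SPACE = 2 ** 128   # an MD5 digest is a 128-bit integer


def shard_for(partition_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a partition key to a shard by splitting the hash space evenly."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * num_shards // HASH_SPACE


def route(records):
    """Append each (key, payload) record to its shard in arrival order."""
    shards = defaultdict(list)
    for seq, (key, payload) in enumerate(records):
        shards[shard_for(key)].append((seq, key, payload))
    return shards


# Records that share a key always land on the same shard, so their
# relative order is preserved within that shard.
records = [("device-1", "a"), ("device-2", "b"), ("device-1", "c")]
shards = route(records)
```

Choosing the partition key is the design decision here: ordering is only guaranteed among records that share a key, so the key should match the entity whose events must stay in sequence.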
Domain two of the Big Data Specialty learning path focuses on storage. In this group of courses, we outline the key storage options for big data solutions. We determine data access and retrieval patterns, and some of the use cases that suit particular data patterns such as evaluating mechanisms for capture, update, and retrieval of catalog entries. We learn how to determine appropriate data structure and storage formats, and how to determine and optimize the operational characteristics of a Big Data storage solution.
Data Processing Technologies
In domain three of the Big Data Specialty Learning Path we learn how to identify the appropriate data processing technologies needed for big data scenarios. We explore how to design and architect a data processing solution, and how to define the operational characteristics of big data processing. We delve into the various processing services available, focusing on Amazon Kinesis, Amazon Elastic MapReduce (EMR) and Amazon Rekognition.
For domain four of the Big Data Specialty Learning Path we learn how to determine the tools and techniques required for data analysis. We explore how to design and architect an analytical solution, and how to optimize the operational characteristics of the analysis system using tools such as Amazon Athena and Amazon Kinesis.
In domain five we learn how to determine the appropriate techniques for delivering the results of a query or analysis. We examine how to design and create a visualization platform using AWS services, and how to optimize visualization services to present results in an effective and accessible manner using Amazon QuickSight.
Data security comprises 20% of the certification curriculum, so it is important that students have a thorough understanding of security best practices for Big Data solutions. In this course we examine how to determine encryption requirements and how to implement encryption services. We examine how to choose the appropriate technology to enforce data governance, and identify how to ensure data integrity while working with Big Data solutions.
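One simple way to check data integrity is to record a checksum when data is written and verify it when the data is read back. The sketch below uses only Python's standard `hashlib` and `hmac` modules. The function names `checksum` and `verify` and the sample payload are hypothetical, and this is a general illustration rather than a specific AWS feature.

```python
import hashlib
import hmac


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected: str) -> bool:
    """Recompute the digest and compare it with the recorded one."""
    return hmac.compare_digest(checksum(data), expected)


original = b"order_id,amount\n1001,25.00\n"   # hypothetical sample payload
recorded = checksum(original)                  # stored alongside the data

intact = verify(original, recorded)
tampered = verify(original + b"1002,99.00\n", recorded)
```

Here `intact` is true and `tampered` is false, because any change to the bytes produces a different digest.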
Before beginning this Learning Path, we would recommend:
- Holding a current AWS Certified Cloud Practitioner certification or any AWS Associate-level certification
- At least a couple of years' experience in data analytics
- An understanding of Big Data and its core principles and best practices.
What is big data?
In basic terms, Big Data is made up of larger, more complex data sets, especially from new sources. Because these data sets are so large, they cannot be managed by traditional data processing software. The upside of such large volumes of data is that they can be used to tackle business problems that could not otherwise be addressed.
Is Google Analytics big data?
Google Analytics is considered part of the big data umbrella because it processes the large volumes of data generated by, for example, a website, and creates simplified reports on areas such as views, bounce rate and visitors. Google Analytics can be considered one of the pioneers of the big data space, and it remains a strong player in the field.
What are data solutions?
Data solutions can help a business to manage and store its valuable information. Data solutions range from computer programmes to personnel staffing, and include distribution systems, modelling software, and business intelligence.
What is an AWS data lake?
In essence, a ‘lake’ holds a huge amount of raw, unformatted data until it is needed. Because it stores data in a flat architecture, it takes a different approach from a hierarchical data warehouse, which stores data in files and folders.
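The flat layout described above can be illustrated with a small in-memory stand-in for an object store. In this sketch every object lives in one namespace under its full key, and what look like folders are only shared key prefixes. The keys, the payloads, and the helper name `list_prefix` are hypothetical; this is not the S3 API.

```python
# A flat store: one dictionary, one level, keyed by the full object name.
lake = {
    "raw/clickstream/2019-01-01.json": b'{"page": "/home"}',
    "raw/clickstream/2019-01-02.json": b'{"page": "/pricing"}',
    "raw/sensors/device-1.csv": b"ts,temp\n1,20.5\n",
}


def list_prefix(store: dict, prefix: str) -> list:
    """Return the keys that start with a prefix, sorted by name."""
    return sorted(key for key in store if key.startswith(prefix))


# "Browsing a folder" is just filtering keys by prefix.
clickstream_keys = list_prefix(lake, "raw/clickstream/")
```

Because the store has no real directory tree, data can be written in its raw form first and organised later simply by choosing key names.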
Learning Path Steps
Knowledge Check: Database Fundamentals for AWS
Learn how to use the Amazon Redshift service. Create a cluster with a database, copy data from S3, query data using SQL, and resize the cluster.
Big Data Speciality - Collection and Storage
Big Data Speciality - Processing
Broad introductory lab on Amazon Elastic MapReduce (EMR). Get started today!
Introduction to the Principles and Practice of Amazon Machine Learning
Learn how to implement object detection on every new image uploaded to Amazon S3.
Big Data Speciality - Analysis
In this intermediate-level lab, you'll learn how to use AWS Key Management Service (KMS) to encrypt S3 and EBS data. Get started today!
Amazon S3 Security Features
About the Author
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession", as everything AWS does starts with the customer. His passions outside work are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start-ups.