Kinesis Data Firehose


The second type of consumer of a Kinesis data stream can be an Amazon Kinesis Data Firehose delivery stream. As the name suggests, a Firehose delivery stream can pick up large data sets, optionally transform them, and load them into destinations such as Amazon S3, Amazon Redshift, Amazon OpenSearch Service, Splunk, Datadog, New Relic, Dynatrace, Sumo Logic, LogicMonitor, MongoDB, and generic HTTP endpoints.

Kinesis Data Firehose manages all the infrastructure, storage, networking, and configuration required to ingest your data and store it in a destination. It is fully managed, which means you do not have to provision, deploy, or maintain hardware or software, or write any application to manage the process. It scales automatically and, like many other AWS storage services, it replicates data across three facilities in a Region.

Kinesis Data Firehose buffers the incoming stream up to a predefined size or for a predefined time, whichever is reached first, before loading it to destinations. The buffer size is set in MB and ranges from 1 MB to 128 MB for S3, from 1 MB to 100 MB for OpenSearch, and from 0.2 MB to 3 MB for Lambda functions.

The buffer interval is set in seconds and ranges from 60 to 900 seconds.
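The buffering rules above can be sketched in a few lines of Python. This is only an illustration, not Firehose's implementation: the destination labels (`"s3"`, `"opensearch"`, `"lambda"`), the `BufferingHints` class, and both function names are hypothetical, and the sketch assumes that a flush happens as soon as either the size or the time threshold is reached.

```python
from dataclasses import dataclass

# Buffer size ranges in MB as quoted in the text, keyed by hypothetical
# destination labels (these are not AWS identifiers).
SIZE_RANGES_MB = {
    "s3": (1, 128),
    "opensearch": (1, 100),
    "lambda": (0.2, 3),
}
INTERVAL_RANGE_S = (60, 900)  # buffer interval range quoted in the text


@dataclass
class BufferingHints:
    size_mb: float
    interval_s: int


def validate_hints(destination: str, hints: BufferingHints) -> None:
    """Raise ValueError if the hints fall outside the quoted ranges."""
    lo, hi = SIZE_RANGES_MB[destination]
    if not lo <= hints.size_mb <= hi:
        raise ValueError(f"size {hints.size_mb} MB outside {lo}-{hi} MB")
    lo_s, hi_s = INTERVAL_RANGE_S
    if not lo_s <= hints.interval_s <= hi_s:
        raise ValueError(f"interval {hints.interval_s}s outside {lo_s}-{hi_s}s")


def should_flush(buffered_bytes: int, elapsed_s: float, hints: BufferingHints) -> bool:
    """Assumed rule: flush when either threshold is reached, whichever is first."""
    size_reached = buffered_bytes >= hints.size_mb * 1024 * 1024
    time_reached = elapsed_s >= hints.interval_s
    return size_reached or time_reached
```

With hints of 5 MB and 300 seconds, for example, a quiet stream that has buffered only a few kilobytes would still be delivered once 300 seconds have passed, while a busy stream would be delivered as soon as 5 MB accumulate.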

If the delivery destination is unavailable, Kinesis Data Firehose stores data for up to 24 hours. The exception is when the source is a Kinesis data stream, in which case data is retained according to the data stream's configuration, not Firehose's.

When delivering data to Amazon Redshift, Kinesis Data Firehose first stages the data in Amazon S3 and then loads it into your Redshift cluster.

Kinesis Data Firehose does not use shards and is fully automated in terms of scalability. It can also compress and encrypt data before delivering it to storage destinations.

For Amazon S3, OpenSearch, and Splunk destinations, if the data is transformed, you can optionally back up the source data to another S3 bucket.
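Transformation is typically done by a Lambda function that receives a batch of base64-encoded records and returns each one with a `recordId`, a `result` status, and the new `data`. The handler below is a minimal sketch of that contract. It assumes the records are JSON objects and upper-cases a hypothetical `message` field purely for illustration; real transformation logic will differ.

```python
import base64
import json


def lambda_handler(event, context):
    """Upper-case a hypothetical 'message' field in each JSON record."""
    output = []
    for record in event["records"]:
        try:
            # Incoming record data is base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            payload["message"] = str(payload.get("message", "")).upper()
            # Append a newline so delivered records are line-delimited.
            data = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8"))
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": data.decode("utf-8"),
            })
        except (ValueError, AttributeError, TypeError):
            # Not valid JSON, or not a JSON object: hand the record back
            # unchanged and mark it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Because the transformed records replace the originals at the destination, enabling the source-record backup is what lets you reprocess the raw data later if the transformation turns out to be wrong.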

Firehose operates fast, but not in real time. You should expect a latency of 60 seconds or more when using Kinesis Data Firehose to store data in destinations. Also, with Kinesis Data Firehose you pay for the amount of data going through it.

Kinesis Data Firehose is usually the delivery service used to get Kinesis data stream records into AWS storage services. However, producers for Kinesis Data Firehose are not limited to Kinesis data streams: any application can produce messages for Firehose to deliver to AWS storage services. The Kinesis Agent is a prebuilt Java application that, once installed and configured, collects data and sends it to your delivery stream. You can install the Kinesis Agent on Linux systems such as web servers, log servers, and database servers. The agent is also available on GitHub. The Amazon Linux, Red Hat, and Microsoft Windows operating systems are supported.
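As an illustration of an application acting as a producer, the sketch below groups events into batches and sends them with the `PutRecordBatch` operation, which accepts up to 500 records and 4 MiB per call. The helper names `to_batches` and `send` are hypothetical, and the client is passed in so the batching logic can be exercised without AWS access. In practice it would be something like `boto3.client("firehose")` pointed at a delivery stream that accepts direct puts.

```python
import json
from typing import Any, Iterable, Iterator, List

MAX_RECORDS_PER_BATCH = 500            # PutRecordBatch record-count limit
MAX_BYTES_PER_BATCH = 4 * 1024 * 1024  # PutRecordBatch request-size limit


def to_batches(events: Iterable[dict]) -> Iterator[List[dict]]:
    """Group events into batches that respect both PutRecordBatch limits."""
    batch, batch_bytes = [], 0
    for event in events:
        # Newline-delimited JSON keeps records separable at the destination.
        data = (json.dumps(event) + "\n").encode("utf-8")
        if batch and (len(batch) >= MAX_RECORDS_PER_BATCH
                      or batch_bytes + len(data) > MAX_BYTES_PER_BATCH):
            yield batch
            batch, batch_bytes = [], 0
        batch.append({"Data": data})
        batch_bytes += len(data)
    if batch:
        yield batch


def send(client: Any, stream_name: str, events: Iterable[dict]) -> int:
    """Send events with PutRecordBatch; return the number of failed records."""
    failed = 0
    for batch in to_batches(events):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name, Records=batch
        )
        # Individual records can fail even when the call itself succeeds.
        failed += response.get("FailedPutCount", 0)
    return failed
```

A production producer would also inspect the per-record responses and retry the failed records, which this sketch leaves out for brevity.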

Both Kinesis Data Streams and Kinesis Data Firehose are part of the Kinesis streaming data platform, which includes Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics, and Kinesis Video Streams.

This section of the AWS Certified Solutions Architect - Professional learning path introduces common AWS solution architectures relevant to the AWS Certified Solutions Architect - Professional exam and the services that support them. These services form a core component of running resilient and performant architectures. 

Learning Objectives

  • Learn how to utilize managed services and serverless architectures to minimize cost
  • Understand how to use AWS services to process streaming data
  • Discover AWS services that support mobile app development
  • Understand when to utilize serverless services within your AWS solutions
  • Learn which AWS services to use when building a decoupled architecture

About the Author

Danny has over 20 years of IT experience as a software developer, cloud engineer, and technical trainer. After attending a conference on cloud computing in 2009, he knew he wanted to build his career around what was still a very new, emerging technology at the time — and share this transformational knowledge with others. He has spoken to IT professional audiences at local, regional, and national user groups and conferences. He has delivered in-person classroom and virtual training, interactive webinars, and authored video training courses covering many different technologies, including Amazon Web Services. He currently has six active AWS certifications, including certifications at the Professional and Specialty level.