Utilizing Managed Services and Serverless Architectures to Minimize Cost
This section of the AWS Certified Solutions Architect - Professional learning path introduces common AWS solution architectures relevant to the AWS Certified Solutions Architect - Professional exam and the services that support them. These services form a core component of running resilient and performant architectures.
- Learn how to utilize managed services and serverless architectures to minimize cost
- Understand how to use AWS services to process streaming data
- Discover AWS services that support mobile app development
- Understand when to utilize serverless services within your AWS solutions
- Learn which AWS services to use when building a decoupled architecture
Hello! I'm Stephen Cole, a trainer here at Cloud Academy, and I'd like to welcome you to this lecture about Shard Capacity and Scaling with Amazon Kinesis Data Streams. I want to start with a review of some of Kinesis Data Streams' limits.
For writes, Kinesis Data Streams has a hard limit. Per shard, it supports a write rate of up to 1,000 records per second up to a maximum of 1 megabyte per second. When using the Standard Consumer for reads, each shard can support up to 5 transactions per second, up to a maximum read rate of 2 megabytes per second.
If either of these limits is exceeded on a shard, a ProvisionedThroughputExceededException will be returned and stream throughput will be temporarily throttled. Hot shards have a large number of reads and writes per second while cold shards are the opposite; they have a low number of reads and writes per second.
A hot shard can cause an entire stream to be throttled. The problem with cold shards is that they are wasting money. How does a shard become hot? Recall that, every second, each shard can support up to 1,000 writes up to a maximum of 1 megabyte.
This means that, if there are--on average--1,000 writes per second per shard, the Data Record size can be no greater than one kilobyte. Any larger, the 1 megabyte per second limit will be exceeded. When writing to a stream, if the average Data Record size is 2 kilobytes, to avoid becoming a hot shard and risk throttling, there can be no more than 500 writes per shard per second.
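To make that arithmetic concrete, here is a minimal Python sketch of the per-shard write limits. The function names are hypothetical, and it assumes the same rounding used above, where 1 megabyte is treated as 1,000 kilobytes.

```python
# Per-shard write limits quoted in this lecture.
MAX_RECORDS_PER_SECOND = 1_000
MAX_KB_PER_SECOND = 1_000  # 1 megabyte, treated as 1,000 KB as in the lecture


def max_writes_per_second(avg_record_kb: float) -> int:
    """Highest sustained write rate one shard can take for a given record size."""
    if avg_record_kb <= 0:
        raise ValueError("avg_record_kb must be positive")
    limited_by_bandwidth = int(MAX_KB_PER_SECOND // avg_record_kb)
    return min(MAX_RECORDS_PER_SECOND, limited_by_bandwidth)


def exceeds_write_limits(records_per_second: float, avg_record_kb: float) -> bool:
    """True when the write load on one shard breaks either per-shard limit."""
    too_many_records = records_per_second > MAX_RECORDS_PER_SECOND
    too_much_data = records_per_second * avg_record_kb > MAX_KB_PER_SECOND
    return too_many_records or too_much_data


print(max_writes_per_second(1))      # 1000
print(max_writes_per_second(2))      # 500
print(exceeds_write_limits(800, 2))  # True
```

With 2 kilobyte records, 800 writes per second is under the record limit but over the 1 megabyte limit, so the shard would be at risk of throttling.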
The total throughput of a stream is the sum of the capacities of its shards. It's an additive process. Increasing the number of shards results in higher processing speed and capacity of the data stream. Likewise, removing shards will lower the processing speed and the capacity of a stream.
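As a rough illustration of that additive relationship, the sketch below multiplies the per-shard limits from earlier in this lecture by the shard count. The `StreamCapacity` class and `stream_capacity` function are hypothetical names, and the read figures assume the Standard Consumer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamCapacity:
    write_records_per_second: int
    write_mb_per_second: int
    read_transactions_per_second: int
    read_mb_per_second: int


def stream_capacity(shard_count: int) -> StreamCapacity:
    """Total stream capacity as the sum of identical per-shard capacities."""
    if shard_count < 1:
        raise ValueError("a stream needs at least one shard")
    return StreamCapacity(
        write_records_per_second=shard_count * 1_000,
        write_mb_per_second=shard_count * 1,
        read_transactions_per_second=shard_count * 5,
        read_mb_per_second=shard_count * 2,
    )


print(stream_capacity(4))
```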
If data throughput increases to a point that it moves past the available shard capacity for either reads or writes, requests will be throttled. Kinesis Data Streams can expand as needed and shrink to avoid wasting money, but it is not automatic nor is it managed by AWS. There is no Auto Scaling available like there is for Amazon EC2 instances for example. Scaling a Kinesis Data Stream up or down involves a process called resharding.
To add more throughput to a Kinesis Data Stream, add one or more shards. This is also referred to as Shard Splitting. It will increase the stream's capacity by 1 megabyte per second per shard. Shard Splitting can be used to divide a hot shard.
When a shard is split, it is closed and the existing Data Records will be deleted when they expire. Closing a shard prevents Producers from writing data to it but the Data Records remain available to Consumers until they expire.
This is what the process looks like. Here are three shards. They are numbered 1, 2, and 3. Shard 2 gets hot and needs to be split. The split operation will close shard 2, the Parent Shard, and create Shards 4 and 5, as Child Shards. Shard 4 and Shard 5 take the place of Shard 2. Even though the Parent Shard, Shard 2, is closed for writing, Consumers can access data in it until it expires. After a resharding operation, data records that were flowing to the Parent shard are rerouted to the Child shards.
The opposite of splitting shards is called Merging Shards. Merging Shards will decrease the number of available shards and remove capacity from a Kinesis Data Stream.
Shards that have low utilization are referred to as being cold. Merge cold shards to reduce costs. As with shard splitting, when Merging Shards there are Parent Shards and a Child Shard. The Parent Shards have their status changed to Closed and Write operations that would have gone to the Parent Shards are rerouted to the Child Shard.
Data Records are available for reading from the Parent Shards until they expire. After the shards expire, they will be deleted. I'm going to illustrate the process.
After my shard splitting operation earlier, my stream has 4 shards. In order, they are 1, 4, 5, and 3. Shards 5 and 3 have gone cold and I want to merge them to save costs. When I do this, a new shard is created, Shard 6.
Shard 5 and Shard 3--the Parent Shards--are closed for write operations and will be deleted when the Data Records contained inside them expire. Shard 6--the Child Shard--will receive Data Records intended for the Parent Shards as well as its own Data Records.
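The split and merge walkthroughs can be replayed with a small in-memory model. This is only a toy sketch to show the parent and child bookkeeping; `ToyStream` and `Shard` are hypothetical classes, not part of any AWS SDK, and the model ignores data records and expiry entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    shard_id: int
    state: str = "OPEN"   # a real shard moves from OPEN to CLOSED to EXPIRED
    parents: tuple = ()   # ids of the parent shard(s), empty for original shards


@dataclass
class ToyStream:
    shards: dict = field(default_factory=dict)
    next_id: int = 1

    def add_shard(self, parents=()):
        shard = Shard(self.next_id, parents=tuple(parents))
        self.shards[shard.shard_id] = shard
        self.next_id += 1
        return shard.shard_id

    def _require_open(self, shard_id):
        if self.shards[shard_id].state != "OPEN":
            raise ValueError(f"shard {shard_id} is not open")

    def split(self, shard_id):
        """Close one parent and create two children."""
        self._require_open(shard_id)
        self.shards[shard_id].state = "CLOSED"
        return self.add_shard([shard_id]), self.add_shard([shard_id])

    def merge(self, first_id, second_id):
        """Close two parents and create one child."""
        if first_id == second_id:
            raise ValueError("cannot merge a shard with itself")
        self._require_open(first_id)
        self._require_open(second_id)
        self.shards[first_id].state = "CLOSED"
        self.shards[second_id].state = "CLOSED"
        return self.add_shard([first_id, second_id])

    def open_ids(self):
        return sorted(s.shard_id for s in self.shards.values() if s.state == "OPEN")


stream = ToyStream()
for _ in range(3):
    stream.add_shard()        # shards 1, 2, and 3
print(stream.split(2))        # (4, 5)
print(stream.merge(5, 3))     # 6
print(stream.open_ids())      # [1, 4, 6]
```

The closed shards stay in the model, which mirrors how Parent Shards remain readable until their Data Records expire.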
Resharding is a process that is managed programmatically. The API calls to change the number of shards are UpdateShardCount, SplitShard, and MergeShards. UpdateShardCount is a stream-level API call and will update the shard count of a specified stream to a chosen number of shards.
Updating the shard count is an asynchronous operation that can happen while a stream is being actively used. Kinesis Data Streams sets the status of the stream to UPDATING and, once the update is complete, Kinesis Data Streams sets the status of the stream back to ACTIVE.
Data Records can be written to and read from a Data Stream when it is in an UPDATING state. To update the shard count, Kinesis Data Streams performs splits or merges on individual shards as requested. This can cause temporary shards to be created. These temporary shards count towards the total shard limit for an account.
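Here is a hedged sketch of that request-and-wait flow. It assumes `client` behaves like a boto3 Kinesis client, using its `update_shard_count` and `describe_stream_summary` calls; the function name, polling interval, and timeout are illustrative choices rather than AWS recommendations.

```python
import time


def scale_stream(client, stream_name, target_shard_count,
                 poll_seconds=5.0, timeout_seconds=600.0):
    """Request a new shard count, then poll until the stream is ACTIVE again.

    Returns the number of open shards reported once the update completes.
    """
    client.update_shard_count(
        StreamName=stream_name,
        TargetShardCount=target_shard_count,
        ScalingType="UNIFORM_SCALING",
    )
    deadline = time.monotonic() + timeout_seconds
    while True:
        summary = client.describe_stream_summary(StreamName=stream_name)
        description = summary["StreamDescriptionSummary"]
        if description["StreamStatus"] == "ACTIVE":
            return description["OpenShardCount"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{stream_name} did not return to ACTIVE in time")
        time.sleep(poll_seconds)
```

With boto3 installed and credentials configured, a call might look like `scale_stream(boto3.client("kinesis"), "example-stream", 4)`, where `example-stream` is a placeholder name. Producers and Consumers can keep using the stream while this loop waits.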
When using UpdateShardCount, the recommendation from AWS is to specify a target shard count that is a multiple of 25%. With a target percentage such as 25, 50, 75, or 100 percent, the resharding process will complete faster than with other target percentages.
Using UpdateShardCount has some limitations. You cannot reshard a stream more than ten times in a 24-hour period. When scaling up, the maximum number of shards that can be requested is double what's currently in the stream. If you think about it, this makes sense. A shard-splitting operation turns one shard into two.
Similarly, when removing shards, you cannot scale below half of the current shard count in a stream. Also, it is not possible to scale past the shard limit of an account. The account shard limit is a soft limit and can be raised by contacting AWS Support.
Since one of the current limits of UpdateShardCount is a maximum total of 10,000 shards, I have inferred that this is AWS's hard limit. However, this has not been explicitly stated in the documentation. That said, the AWS online documentation lists another limitation stating that a stream with more than 10,000 shards can only be scaled if the resulting number of shards is less than 10,000.
I'm confident that 10,000 shards is a Kinesis Data Streams hard limit. Think about that limit for a moment: 10,000 shards is a significant amount of streaming throughput.
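The UpdateShardCount limits described above can be checked before making a request. The sketch below is a simplified pre-flight check under stated assumptions: the function name is hypothetical, `account_shard_limit` is whatever limit applies to your account, and the ten-requests-per-24-hours rule is not modeled.

```python
UPDATE_SHARD_COUNT_MAX = 10_000  # maximum total quoted in this lecture


def check_target_shard_count(current, target, account_shard_limit):
    """Return the reasons a target would be rejected (empty list if none)."""
    problems = []
    if target < 1:
        problems.append("target must be at least 1 shard")
    if target > current * 2:
        problems.append("cannot scale up past double the current shard count")
    if target * 2 < current:
        problems.append("cannot scale down below half the current shard count")
    if target > account_shard_limit:
        problems.append("target exceeds the account shard limit")
    if target > UPDATE_SHARD_COUNT_MAX:
        problems.append("target exceeds 10,000 shards")
    return problems


print(check_target_shard_count(current=8, target=16, account_shard_limit=200))  # []
print(check_target_shard_count(current=8, target=3, account_shard_limit=200))
```

A stream with 8 shards can go to 16 in one request, but going to 3 fails because 3 is less than half of 8.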
I want to go back to scaling operations. If a Kinesis Data Stream needs to be scaled up or down because there are either too many or too few shards, the stream needs to be resharded. I've mentioned that, after a resharding operation, data intended for a Parent Shard is rerouted to the Child Shard. I want to take a closer look at that.
Since Kinesis Data Streams is a real-time data streaming service, applications should assume that data is flowing continuously through the shards in a stream. Any Data Records that were in the parent shards before the reshard operation remain in those shards and can be read by Consumer Applications.
While resharding, a parent shard transitions from an OPEN state, to a CLOSED state, and then to an EXPIRED state. Before a reshard operation, a parent shard is in the OPEN state. Data records can be both added to and retrieved from the shard.
After a reshard operation, the parent shard transitions to a CLOSED state and data records are no longer added to the shard. However, Data Records are still available until they expire. Data records that would have been added to the parent shard are now added to a child shard instead.
Once the stream's retention period has passed, the Data Records contained in the parent shard are no longer accessible. At this point, the shard itself transitions to an EXPIRED state. After the reshard has occurred it is possible to read data from the child shards immediately. However, the parent shards that remain after resharding could contain data that hasn't been processed by a Consumer yet.
If data is read from a child shard before the parent shard has been completely consumed, data could be out of order. This is something that is not managed by AWS and must be considered when creating Consumer Applications.
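One way a Consumer Application could plan its reads is to sort shards so that every parent comes before its children. The sketch below assumes shard descriptions shaped like the entries the ListShards API returns, with `ShardId`, `ParentShardId`, and `AdjacentParentShardId` keys; the function name and the shortened shard ids are made up for illustration. It only produces an ordering and does not read any records.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later


def parent_first_order(shards):
    """Order shard ids so that each parent appears before its children."""
    known_ids = {shard["ShardId"] for shard in shards}
    graph = {}
    for shard in shards:
        parents = {shard.get("ParentShardId"), shard.get("AdjacentParentShardId")}
        # Ignore missing parents and parents that are no longer listed.
        graph[shard["ShardId"]] = {p for p in parents if p in known_ids}
    return list(TopologicalSorter(graph).static_order())


# The stream from the walkthrough: shard 2 was split, then 5 and 3 were merged.
shards = [
    {"ShardId": "shard-6", "ParentShardId": "shard-5", "AdjacentParentShardId": "shard-3"},
    {"ShardId": "shard-4", "ParentShardId": "shard-2"},
    {"ShardId": "shard-5", "ParentShardId": "shard-2"},
    {"ShardId": "shard-1"},
    {"ShardId": "shard-2"},
    {"ShardId": "shard-3"},
]
print(parent_first_order(shards))
```

A Consumer that fully drains each shard in this order before moving to the next would not read a child shard ahead of its parent.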
To reiterate, if the order of data is important, be sure to completely consume any parent shards before reading from the child shards. It is possible to split and merge shards in an active stream. However, there are a number of limitations to be aware of when doing it.
Only one split or merge operation can happen at a time. It is not possible to run multiple resharding operations concurrently. Each split or merge request takes some time to complete.
According to the Amazon documentation, if a Kinesis Data Stream has 1,000 shards and capacity needs to be doubled to a total of 2,000 shards, it will take 30,000 seconds to complete. This is more than 8 hours. The lesson here is that, if a large amount of capacity is going to be required, provision it in advance.
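Those figures imply roughly 30 seconds per split, since 1,000 sequential splits take 30,000 seconds. The sketch below turns that into a rough estimator. The 30 second default is inferred from the numbers above and is an assumption; real operations may be faster or slower.

```python
def estimate_reshard_seconds(current_shards, target_shards, seconds_per_operation=30.0):
    """Rough duration of a reshard when operations run one at a time.

    Assumes one split per shard added or one merge per shard removed.
    """
    operations = abs(target_shards - current_shards)
    return operations * seconds_per_operation


seconds = estimate_reshard_seconds(1_000, 2_000)
print(seconds)                    # 30000.0
print(round(seconds / 3600, 1))   # 8.3
```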
Auto Scaling is not a feature of Kinesis Data Streams. It can be implemented, programmatically, using AWS Lambda in conjunction with Amazon CloudWatch and AWS Application Auto Scaling.
Before creating a stream, you will need to determine how many shards to provision. There are a number of variables to consider but, in general, there are two values that determine how many shards need to be provisioned.
Those numbers are, in megabytes, the amount of data being ingested and the amount of data being consumed. Of those numbers, the larger one will determine the number of shards. To determine these values, there is other data that needs to be collected.
I'll explain and walk through the process at the same time. First, estimate the average size of the Data Record that will be written to the data stream. This value is in kilobytes and rounded up to the nearest kilobyte. Looking at my data, the average size is about 500 bytes. Rounding up to the nearest kilobyte makes the value 1,024 bytes or 1 kilobyte. This is the average size of data in kilobytes.
Next, estimate the number of records written to the data stream per second. For my example stream, this is 5,000. This is the number of records per second. Then, decide on how many Kinesis Consumer Applications will process data from the stream. These Consumers are accessing the stream simultaneously and independently. For my application, I have 10. This is the number of consumers.
To calculate the bandwidth that will be ingested by a stream, multiply the number of records per second by the average size of data in kilobytes. 5,000 times 1 is 5,000. This means the incoming write bandwidth in kilobytes is 5,000. This is the amount of data being ingested every second.
To calculate the amount of data being consumed from the stream, multiply the incoming write bandwidth in kilobytes by the number of consumers. 5,000 times 10 is 50,000. The outgoing read bandwidth in kilobytes is 50,000. This is how much data is being consumed.
To determine the number of shards needed, divide the incoming write bandwidth in kilobytes by 1,000 and divide the outgoing read bandwidth in kilobytes by 2,000. 5,000 divided by 1,000 is 5, and 50,000 divided by 2,000 is 25. The larger of these two numbers is 25. So, the total number of shards required is 25.
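The whole sizing walkthrough fits in a few lines of Python. This is a sketch of the steps as described in this lecture, with a hypothetical function name; it uses the same divisors of 1,000 and 2,000 kilobytes per shard.

```python
import math


def required_shards(avg_record_bytes, records_per_second, consumers):
    """Shard count from record size, write rate, and number of consumers."""
    # Round the average record size up to the nearest kilobyte.
    record_kb = math.ceil(avg_record_bytes / 1024)
    write_kb_per_second = records_per_second * record_kb
    read_kb_per_second = write_kb_per_second * consumers
    shards_for_writes = write_kb_per_second / 1_000
    shards_for_reads = read_kb_per_second / 2_000
    # The larger requirement wins, rounded up to a whole shard.
    return max(1, math.ceil(max(shards_for_writes, shards_for_reads)))


print(required_shards(avg_record_bytes=500, records_per_second=5_000, consumers=10))  # 25
```

With a single consumer the write side dominates instead, and the same stream would need 5 shards.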
Thankfully, AWS has a tool that can do this calculation for you. It also provides extra information about costs. You can use your favorite search engine and enter the words kinesis data streams pricing calculator. Currently, the link is https://aws.amazon.com/kinesis/data-streams/pricing/
Entering the same information I used earlier returns the information needed plus how it is calculated. Please note that, at the time this content was written, the course information was accurate. Prices can--and do--change over time. Currently, the pricing calculator shows that I need 24.4 shards and this rounds up to a total of 25 shards.
For billing purposes, there are 730 hours in a month. 25 shards times 730 hours in a month results in 18,250 shard hours. At a cost of one and a half cents per shard hour, 18,250 times $0.015 comes out to $273.75 in US dollars. The PUT payload is calculated using 5,000 records per second that are 1 kilobyte in size.
I'm not going to do the calculations out loud, but I'll show them to you here. The cost of putting data into my stream is $183.96 in US dollars. Ingesting 5,000 records per second that are 1 kilobyte in size requires a Kinesis Data Stream that has 25 shards and will cost approximately $457.71 in US dollars per month.
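For anyone who wants to reproduce those numbers, here is a sketch of the monthly cost arithmetic. The shard-hour price comes from this lecture. The PUT price of $0.014 per million PUT payload units and the 25 kilobyte size of a payload unit are not stated above; they are assumptions that happen to reproduce the $183.96 figure, and prices change over time.

```python
import math

HOURS_PER_MONTH = 730
SHARD_HOUR_USD = 0.015
PUT_USD_PER_MILLION_UNITS = 0.014  # assumption, consistent with the $183.96 figure
PUT_UNIT_KB = 25                   # assumption: one PUT payload unit covers up to 25 KB


def monthly_cost(shards, records_per_second, record_kb):
    """Return (shard cost, PUT cost, total) in US dollars for one month."""
    shard_cost = shards * HOURS_PER_MONTH * SHARD_HOUR_USD
    units_per_record = math.ceil(record_kb / PUT_UNIT_KB)
    seconds_per_month = HOURS_PER_MONTH * 3600
    put_units = records_per_second * units_per_record * seconds_per_month
    put_cost = put_units / 1_000_000 * PUT_USD_PER_MILLION_UNITS
    return round(shard_cost, 2), round(put_cost, 2), round(shard_cost + put_cost, 2)


shard_cost, put_cost, total = monthly_cost(shards=25, records_per_second=5_000, record_kb=1)
print(shard_cost, put_cost, total)
```

For the example stream this works out to $273.75 for shard hours plus $183.96 for PUT payload units, for a total of $457.71.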
Remember, storing data in a Kinesis Data Stream for more than 7 days is possible and will add to these charges. That's what I wanted to cover in this lecture. Here's a quick review of the material:
- Kinesis Data Streams throughput is based on the number of shards it has.
- For writes per shard, the limit is 1,000 records per second up to a maximum of 1 megabyte per second.
- When using the Standard Consumer, each shard can support up to 5 transactions per second up to a maximum read rate of 2 megabytes per second.
- Hot shards have a large number of reads and writes per second while cold shards are the opposite; they have a low number of reads and writes per second.
- A hot shard can cause throttling.
- Cold shards waste money.
- The total available throughput of a stream is the sum of the capacities of its shards.
- Kinesis Data Streams can scale but not in real time.
- The scaling process is called resharding.
- Adding a shard is done using a process called Shard Splitting.
- Removing a shard is Shard Merging.
- When a shard is split or merged there are Parent Shards and Child Shards. The Parent Shards are closed for writing but are available for reading until they expire.
- When a Parent Shard is closed, write operations intended for it are rerouted to the Child Shard.
- Increasing or decreasing the overall number of shards in a stream is done using the UpdateShardCount API call.
- It is possible to split or merge individual shards. Those API calls are SplitShard and MergeShards, respectively.
- Shards can be in one of three states, Open, Closed, and Expired.
- Shard splitting can be done on a stream that is active.
- However, only one split or merge operation can happen at a time and each operation takes time to complete.
- Auto Scaling is not a feature of Kinesis Data Streams.
- Costs are based on the number of shards in a stream and the amount of data PUT into a stream.
That's it for now. I'm Stephen Cole with Cloud Academy and thank you for watching!
Danny has over 20 years of IT experience as a software developer, cloud engineer, and technical trainer. After attending a conference on cloud computing in 2009, he knew he wanted to build his career around what was still a very new, emerging technology at the time — and share this transformational knowledge with others. He has spoken to IT professional audiences at local, regional, and national user groups and conferences. He has delivered in-person classroom and virtual training, interactive webinars, and authored video training courses covering many different technologies, including Amazon Web Services. He currently has six active AWS certifications, including certifications at the Professional and Specialty level.