Determine how to secure application tiers - AWS Service Encryption
Domain 3: Specify Secure Applications and Architectures
Hello, and welcome to this lecture where I'm going to be looking at how Amazon Kinesis utilizes encryption mechanisms. I will be looking at both Kinesis Firehose and Kinesis Streams.
If you are new to Amazon Kinesis, you may find it useful to take our existing course covering AWS Kinesis found here.
Let me start by providing a high-level overview of the differences between each of these services. Amazon Kinesis Firehose. This service is used to deliver real-time streaming data to different services and destinations within AWS, many of which can be used for big data, such as S3, Redshift, and Amazon Elasticsearch.
The service is fully managed by AWS, taking a lot of the administration and maintenance out of your hands. Firehose receives data from your data producers and then automatically delivers it to your chosen destination. Amazon Kinesis Streams. This service essentially collects and processes huge amounts of data in real time and makes it available for consumption.
This data can come from a variety of different sources, for example log data from your infrastructure, social media, web clickstream feeds, market data, and so on. Now that we have a high-level overview of each of these, we need to understand how they implement encryption of any data processed and stored, should it be required.
When clients send data to Kinesis, the data can be sent in transit over HTTPS, which is HTTP with SSL/TLS encryption. However, once it enters the Kinesis service, it is unencrypted by default. Using both Kinesis Streams and Firehose encryption, you can ensure your streams remain encrypted up until the data is sent to its final destination.
As we know, Amazon Kinesis Firehose is used to send data to a final destination. If Amazon S3 is used as the destination, Firehose can implement encryption using SSE-KMS on S3. Access to this key and the desired S3 bucket can be given to Firehose via an IAM role to enable this data encryption to take place. Once this role has been created, the relevant permissions must be assigned, which must include the following KMS actions against the CMK used: kms:Decrypt and kms:GenerateDataKey.
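As a minimal sketch of the role just described, the policy documents can be built as plain Python dictionaries. The function names, the account ID, and the key ARN are hypothetical placeholders. The trust relationship follows the pattern in the AWS Firehose access-control documentation as I understand it (the Firehose service principal plus an external ID set to your account ID), so treat it as an assumption to check against the current docs. The S3 bucket permissions the role also needs are left out.

```python
import json


def firehose_permissions_policy(cmk_arn):
    """Permissions policy granting the two KMS actions named in the lecture.

    S3 bucket permissions are deliberately omitted from this sketch.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
                "Resource": [cmk_arn],
            }
        ],
    }


def firehose_trust_policy(account_id):
    """Trust relationship letting the Firehose service assume the role.

    Assumption: the sts:ExternalId condition carries your own account ID.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "firehose.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": account_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real resources.
    account_id = "111122223333"
    cmk_arn = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
    print(json.dumps(firehose_trust_policy(account_id), indent=2))
    print(json.dumps(firehose_permissions_policy(cmk_arn), indent=2))
```

Keeping the two documents separate mirrors how IAM treats them: the trust policy says who may assume the role, and the permissions policy says what the role may do.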
You can apply a trust policy to the role itself that names Firehose as the trusted entity, ensuring you specify your own account ID, which gives Kinesis Firehose the relevant access. If you have configured Kinesis Firehose to use Redshift as a destination, then Firehose still copies the data to S3 first as an intermediary location.
In this instance, the same KMS permissions mentioned previously should be implemented to enforce encryption of the data at rest before it is sent to your Redshift cluster from S3, plus the relevant permissions required for Redshift. Similarly, with Elasticsearch as a destination, S3 can also be used to back up all of the data sent to Elasticsearch.
And so again, it would need the same KMS permissions plus the relevant permissions for Elasticsearch. Let's now take a look at the encryption for Amazon Kinesis Streams.
Since July 2017, Amazon Kinesis Streams has been able to implement server-side encryption (SSE) using KMS to encrypt data as it enters the stream directly from the producers.
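As a rough illustration of turning this on for an existing stream, the sketch below wraps the Kinesis `StartStreamEncryption` operation, which boto3 exposes as `start_stream_encryption`. The helper name, the stream name, and the `RecordingClient` stand-in are hypothetical. The stand-in only records the request so the example runs without AWS credentials. The default key alias `alias/aws/kinesis` is assumed here to be the AWS-managed key for Kinesis.

```python
def enable_stream_encryption(kinesis_client, stream_name, key_id="alias/aws/kinesis"):
    """Ask Kinesis to start SSE with KMS on an existing stream.

    kinesis_client is expected to behave like boto3.client("kinesis").
    Returns the request parameters that were sent.
    """
    params = {
        "StreamName": stream_name,
        "EncryptionType": "KMS",
        "KeyId": key_id,
    }
    kinesis_client.start_stream_encryption(**params)
    return params


class RecordingClient:
    """Hypothetical stand-in for a Kinesis client; it only records calls."""

    def __init__(self):
        self.calls = []

    def start_stream_encryption(self, **kwargs):
        self.calls.append(kwargs)


if __name__ == "__main__":
    client = RecordingClient()
    sent = enable_stream_encryption(client, "example-stream")
    print(sent)
```

With boto3 installed and credentials configured, the same helper could be passed `boto3.client("kinesis")` in place of the recording stand-in.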
As a part of this process, it's important to ensure that both producer and consumer applications have permissions to use the KMS key. Otherwise, encryption and decryption will not be possible, and you will receive an unauthorized KMS master key permission error.
Put simply, a producer is something that adds data to a Kinesis stream, such as a web service sending log data. Encryption happens at the producer level.
The consumer is usually a Kinesis application that processes data from within the Kinesis stream. Decryption happens at the consumer level.
Your producers must have the following permission against the CMK used: kms:GenerateDataKey, and the following against the Kinesis stream: kinesis:PutRecord and kinesis:PutRecords.
Your consumers, on the other hand, will require the following against the CMK: kms:Decrypt, and the following against the Kinesis stream: kinesis:GetRecords and kinesis:DescribeStream. Utilizing SSE with KMS for Kinesis Streams essentially encrypts data entering a stream before it is saved to the Kinesis Streams storage layer, and the data is then decrypted after it's accessed from the storage layer, giving full at-rest encryption within the stream.
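The producer and consumer permissions listed above can be sketched as two IAM policy documents. This is only an illustration: the function names and the ARNs are hypothetical placeholders, and a real application may need further actions beyond those named in the lecture.

```python
import json


def _policy(kms_actions, kinesis_actions, cmk_arn, stream_arn):
    """Build a policy with one statement for the CMK and one for the stream."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": kms_actions, "Resource": cmk_arn},
            {"Effect": "Allow", "Action": kinesis_actions, "Resource": stream_arn},
        ],
    }


def producer_policy(cmk_arn, stream_arn):
    """Producers generate data keys and write records."""
    return _policy(
        ["kms:GenerateDataKey"],
        ["kinesis:PutRecord", "kinesis:PutRecords"],
        cmk_arn,
        stream_arn,
    )


def consumer_policy(cmk_arn, stream_arn):
    """Consumers decrypt and read records."""
    return _policy(
        ["kms:Decrypt"],
        ["kinesis:GetRecords", "kinesis:DescribeStream"],
        cmk_arn,
        stream_arn,
    )


if __name__ == "__main__":
    # Placeholder ARNs, not real resources.
    cmk = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
    stream = "arn:aws:kinesis:us-east-1:111122223333:stream/example-stream"
    print(json.dumps(producer_policy(cmk, stream), indent=2))
    print(json.dumps(consumer_policy(cmk, stream), indent=2))
```

Splitting the policies this way keeps each side to the minimum it needs: a producer never receives kms:Decrypt, and a consumer never receives the put actions.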
Kinesis SSE will typically call upon KMS to generate a new data key every five minutes. So, if you had your stream running for a month or more, thousands of data keys would be generated within this time frame. You may be wondering whether applying this encryption at the producers and then decrypting the data at the consumers adds any latency. The simple answer is yes. It does add a small overhead, which impacts the performance of PutRecord, PutRecords, and GetRecords by less than a hundred microseconds.
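The "thousands of data keys" figure can be checked with quick arithmetic. This sketch assumes a fixed five-minute interval and a single sequence of keys for the stream, which simplifies the "typically every five minutes" behavior described above. The function name is hypothetical.

```python
ROTATION_MINUTES = 5  # assumed fixed interval between new data keys


def data_keys_generated(days, rotation_minutes=ROTATION_MINUTES):
    """Approximate number of data keys requested from KMS over a period."""
    minutes = days * 24 * 60
    return minutes // rotation_minutes


if __name__ == "__main__":
    # 30 days * 24 hours * 12 keys per hour
    print(data_keys_generated(30))
```

Under these assumptions, a stream running for 30 days requests 8,640 data keys, which matches the order of magnitude given in the lecture.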
Before we finish this lecture, I just want to mention that AWS has released a blog post that shows how to implement encryption from client to destination by building a real-time streaming application using Kinesis, in which your records are encrypted while at rest and in transit, which you may want to take a look at here.
That now brings us to the end of this lecture. Coming up next, I shall be looking at encryption when using the Amazon Redshift service.
About the Author
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession" as everything AWS starts with the customer. Passions around work are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start ups.