In this course, you'll gain a solid understanding of the key concepts for Domain Five of the AWS Solutions Architect Professional certification: Data Storage. We will explore AWS storage services and how to implement them in the most effective and efficient manner.
By the end of this course, you'll have the tools and knowledge you need to meet the requirements for this domain, including:
- Demonstrate ability to make architectural trade-off decisions involving storage options.
- Demonstrate ability to make architectural trade-off decisions involving database options.
- Demonstrate ability to implement the most appropriate data storage architecture.
- Determine use of synchronous versus asynchronous replication.
This course is intended for students seeking to acquire the AWS Solutions Architect Professional certification. It is necessary to have acquired the Associate level of this certification. You should also have at least two years of real-world experience developing AWS architectures.
As stated previously, you will need to have completed the AWS Solutions Architect Associate certification, and we recommend reviewing the relevant learning path in order to be well-prepared for the material in this one.
This Course Includes
- Expert-led instruction and exploration of important concepts.
- Complete coverage of critical Domain Five concepts for the AWS Solutions Architect - Professional certification exam.
What You Will Learn
- The various data storage services available within the AWS ecosystem.
- Domain Five relevant skills and knowledge for passing the exam.
- Synchronous vs asynchronous replication.
Let's start with our core services. Amazon Simple Storage Service, or S3, provides 11 nines of durability and four nines of availability. You can put pretty much any object you want into Amazon S3. It's object storage, and it scales automatically. The maximum size of an object you can upload to Amazon S3 is five terabytes. Objects are stored in buckets. There are three storage classes. The first is the Standard storage class, which offers the highest availability and lowest latency. The second is the Standard Infrequent Access class. The third is the Amazon S3 Reduced Redundancy class, which provides the same 99.99% availability but less durability, so only four nines of durability over a given year.
Points to remember. Each bucket name has to be globally unique. Five terabytes is the maximum object size. You can't change the region or the S3 part of the access point name. Buckets can't be renamed. You can delete a bucket and then reuse the name after a period of time. By default, you can create up to 100 buckets per account. Bucket ownership is not transferable, and lifecycle configuration on MFA-enabled buckets is not supported.
Elastic Block Store. EBS volumes are replicated within an availability zone, not throughout a region as S3 is. EBS snapshots are stored in Amazon S3, so point-in-time snapshots increase durability by protecting against hardware failure or loss of service in one availability zone. EBS is persistent storage rather than ephemeral storage.
Amazon Glacier is low-cost object storage with an annual average durability of 11 nines for an archive. It redundantly stores data in multiple facilities and on multiple devices within each facility. Glacier stores objects in vaults. There's no maximum or minimum limit to the total amount of data that can be stored in Amazon Glacier, and your individual archives can be up to 40 terabytes. A common use case is to define Amazon S3 lifecycle rules that automatically archive sets of Amazon S3 objects to Amazon Glacier to reduce storage costs.
DynamoDB is a NoSQL key-value data store. ElastiCache is a managed in-memory cache that gives you fast, reliable data access, and the underlying engines behind ElastiCache are Memcached and Redis. Amazon Redshift is a fully managed, petabyte-scale data warehouse. Elastic MapReduce is a managed Hadoop framework. Amazon Kinesis is a fully managed service for processing real-time data streams. You can output from Kinesis to Amazon S3, Amazon Redshift, Amazon EMR and also to Lambda.
Okay, imagine you have a customer who has an analysis application that analyzes uploaded sports images taken from various social media sources. This application analyzes each uploaded file, looking to pattern match faces and brand logos in the content. Pretty clever app. The application creates a log and an output for each of the images it analyzes, plus any rendered images. The analysis service writes the logs and outputs to a disk store. The number of files uploaded is high and bursty, as it tends to spike during large sporting events where you have a lot of people watching and uploading files. At present, the customer has a server on EC2 with a large EBS volume to host these input files, logs and outputs. The problem the customer has is that it can take up to 15 hours a day to complete the analysis process, and you've been asked to recommend a better solution.
So what services could be used to reduce the processing time and improve the availability of this solution? We workshop some options. One clear requirement that stands out is that we need a way to enable the processing queue to scale up to meet demand, so we agreed quickly that implementing Amazon Simple Queue Service should be considered, as SQS enables us to decouple our services and so better manage how we deal with burst activity.
We are thinking the processing app could request processing jobs from an SQS queue, and autoscaling could dynamically scale the EC2 instances running the analysis app up or down based on the size of the queue. The upload service can add uploaded files to the store and create a work request message in the queue. The processing app can pull items from that queue, working in parallel. The number of EC2 worker instances can be dynamically scaled up or down via autoscaling based on the size of the Amazon SQS queue. So the benefit of using SQS is that EC2 workers will be able to process requests in parallel, thus improving the horizontal scaling ability of the application. We'll need to enable Amazon CloudWatch to monitor the number of job requests, or queued messages, and create an autoscaling group to add or remove EC2 worker instances automatically based on the parameters we set in our CloudWatch alarms. This will enable us to improve the effectiveness of the solution by coordinating the number of EC2 workers with the number of requests.
So what else can be done to speed up this processing? It turns out the customer chose to use an EBS volume to gain the most IO speed for the app, which makes sense. One team member suggests moving the output files to Amazon S3 to increase redundancy, but someone else points out that Amazon S3 is not going to provide a fast enough response for this type of use case, so we do need to look at EBS options that can reduce this processing time. Now remember, we've got two options with EBS. We've got our SSD-backed volumes, optimized for transactional workloads involving frequent read and write operations with small IO size, and then we've got our HDD-backed volumes, which are optimized for large streaming workloads where throughput is a better performance measure than IOPS. We need high IO performance for this application, so do we opt for the SSD-backed provisioned IOPS or the throughput-optimized HDD?
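The queue-driven scaling idea described above can be sketched as a simple sizing rule. This is only an illustration: the function name `desired_workers`, the `msgs_per_worker` figure and the worker limits are assumptions made for the example, not values from the course, and a real deployment would express this through CloudWatch alarms and autoscaling policies rather than application code.

```python
import math


def desired_workers(queue_depth, msgs_per_worker=100, min_workers=1, max_workers=20):
    """Return how many worker instances to run for a given queue backlog.

    All parameters are hypothetical tuning values chosen for illustration.
    """
    if queue_depth < 0:
        raise ValueError("queue_depth cannot be negative")
    # One worker per `msgs_per_worker` queued messages, rounded up.
    needed = math.ceil(queue_depth / msgs_per_worker)
    # Clamp to the bounds an autoscaling group would enforce.
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    for depth in (0, 250, 10_000):
        print(depth, "queued ->", desired_workers(depth), "workers")
```

The clamp mirrors the minimum and maximum size of an autoscaling group: a quiet queue still keeps one worker alive, and a spike during a big sporting event cannot scale past the ceiling.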
Now, provisioned IOPS SSD volumes are designed to meet the needs of IO-intensive workloads that are sensitive to storage performance and consistency. A provisioned IOPS SSD volume allows you to specify a consistent IOPS rate when you create the volume. An io1 volume can range in size from four gigabytes to 16 terabytes, and you can provision up to 20,000 IOPS per volume, which is pretty cool. The maximum ratio of provisioned IOPS to requested volume size in gigabytes is 50 to 1. For example, a 100 gigabyte volume can be provisioned with up to 5,000 IOPS. Any volume 400 gigabytes in size or greater allows provisioning of up to the 20,000 IOPS maximum. Throughput-optimized HDD volumes provide low-cost magnetic storage that defines performance in terms of throughput rather than IOPS. This volume type is a great fit for large sequential workloads such as EMR, data warehouses or log processing. So I think our best option for this use case would be to recommend shifting up to a provisioned IOPS EBS volume.
We need to keep in mind too that bandwidth really matters when we're dealing with performance in EBS. If we selected a c3.2xlarge, we could expect to get throughput of around 125 megabytes a second, which we could use to push data to and from our EBS volume for this solution. The consideration, however, is that this bandwidth is going to be shared. It's not just EBS that will be using that bandwidth. The instance is also talking to other EC2 instances, S3 and other inbound and outbound traffic, so this is a shared network bandwidth allocation for all the things that the instance might be talking to. If we went up to a c3.8xlarge, that gives us a 10 gigabit per second instance type. Now we've got a pipe for that instance of 10 gigabits, so we get a lot more bandwidth. However, it's still sharing that bandwidth with other services, and this is where we should consider using EBS-optimized instances.
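The 50-to-1 ratio and the 20,000 IOPS ceiling quoted above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures stated in this lecture; the constant and function names are made up for the example, and current AWS limits may differ from these numbers.

```python
# Figures as quoted in the lecture; treat them as assumptions, not current limits.
MAX_IOPS_PER_GB = 50
MAX_IOPS_PER_VOLUME = 20_000


def max_provisioned_iops(size_gb):
    """Return the highest IOPS that can be provisioned for a volume of size_gb."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    # The ratio limit applies first, then the per-volume ceiling.
    return min(size_gb * MAX_IOPS_PER_GB, MAX_IOPS_PER_VOLUME)


if __name__ == "__main__":
    for size in (100, 400, 2000):
        print(f"{size} GB -> up to {max_provisioned_iops(size)} IOPS")
```

A 100 gigabyte volume is limited by the ratio, while anything at or above 400 gigabytes is limited by the per-volume maximum.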
So EBS optimized is an option that gives your instance dedicated network bandwidth that goes directly to EBS. This feature is enabled by default on most of the newer instance families, and for the older instances you actually need to enable the EBS-optimized checkbox in the console. So, okay, if we enable EBS optimization, we get another dedicated pipe of one gigabit per second, which is great, so now we've basically just doubled our bandwidth. We've got 125 megabytes a second for everything else except EBS, and now we've got a dedicated gigabit pipe for EBS itself.
Okay, so what if these IO files take up quite a lot of room and we need up to a two terabyte volume? If we create a two terabyte gp2 volume, that's going to give us around 6,000 IOPS with a max volume throughput of around 160 megabytes a second. If we attach that to a c3.xlarge EBS-optimized instance, that instance has dedicated bandwidth of 500 megabits a second. However, that is only going to provide us with 62.5 megabytes a second and a maximum of 4,000 IOPS at 16K. Hmm. So this means we'll end up with a difference between what the volume can actually take and what the instance can actually push across the bandwidth, which is not a great fit for our solution. If we scale up our instance size to, say, a c3.2xlarge, we would then have dedicated bandwidth of one gigabit per second, so now we could support up to 125 megabytes per second. That instance would perhaps be a better fit for that volume size if we think we need to be able to push the full throughput to that volume, as we'll now have the bandwidth to support it. So we need to ensure we choose the right instance type for provisioned IOPS, and if maximum IOPS performance is an issue, we really need to make sure we recommend the right instance. It might mean going up a size to gain as much from our dedicated throughput as possible.
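The mismatch between what a volume can take and what an instance can push comes down to a unit conversion and a minimum. The sketch below assumes the figures quoted in this lecture (160 megabytes a second for the volume, 500 megabits and one gigabit for the instances); the helper names are hypothetical.

```python
def megabits_to_megabytes(megabits_per_second):
    """Convert a bandwidth in megabits/s to megabytes/s (8 bits per byte)."""
    return megabits_per_second / 8


def effective_throughput(volume_mb_per_s, instance_ebs_megabits):
    """Return the usable throughput in MB/s: the smaller of volume and instance limits."""
    return min(volume_mb_per_s, megabits_to_megabytes(instance_ebs_megabits))


if __name__ == "__main__":
    volume_limit = 160  # MB/s, the volume figure quoted in the lecture
    for name, megabits in (("500 Mbps instance", 500), ("1 Gbps instance", 1000)):
        usable = effective_throughput(volume_limit, megabits)
        print(f"{name}: {usable} MB/s usable of {volume_limit} MB/s")
```

With 500 megabits of dedicated bandwidth the instance is the bottleneck at 62.5 megabytes a second; at one gigabit it reaches 125 megabytes a second, which is closer to what the volume can sustain.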
So what about if we take an m4.10xlarge, which has a full 10 gigabits of dedicated network bandwidth? When you enable EBS optimized on that, you get a second pipe of four gigabits, so we have significant bandwidth available if we went for a large instance of that size. With four gigabits dedicated, we would get around 500 megabytes a second of bandwidth to EBS. So if we attach an eight terabyte st1 volume and dial around 1K of provisioned IOPS, we'd be getting 320 megabytes per second of throughput, plus whatever burst throughput support is currently provided by the volume of course, and around 16K of IOPS performance. That's fantastic, but of course it still leaves some headroom, as our dedicated bandwidth can support up to 32,000 IOPS thanks to that EBS optimization.
So another performance option here is to consider striping volumes as RAID 0. That means taking two volumes so that we combine their throughput, meaning we could now push up to 32K. You need to use volumes of the same size if you're doing this. Striping EBS is a little different from how we might do things in a data center, where we might consider RAID to increase redundancy, performance and available storage. With cloud storage, it's only worth considering RAID where we need additional IOPS performance or need to create a volume of more than 16 terabytes. Why? Well, EBS is a managed service, and so it's already providing us with way better durability within its AZ than we might expect if we managed our own disks. So going for RAID 4 or 5 will increase the volume of reads and writes you'll be pushing across your bandwidth without really creating a lot of additional value.
So if we go back to our recommendation, it's to introduce Amazon Simple Queue Service, autoscaling and CloudWatch so that the number of EC2 workers is coordinated with the number of requests.
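The RAID 0 reasoning above, where two striped volumes combine their IOPS up to what the instance can carry, can be sketched as follows. The 16K per-volume and 32,000 instance figures are the ones quoted in this lecture; the function name and its parameters are assumptions for illustration, and real striped performance also depends on IO size and stripe configuration.

```python
def striped_iops(per_volume_iops, volume_count, instance_iops_cap):
    """Estimate aggregate IOPS for equal-sized volumes striped as RAID 0.

    The aggregate is the sum of the volumes' IOPS, capped by what the
    instance's dedicated EBS bandwidth can support.
    """
    if volume_count < 1:
        raise ValueError("volume_count must be at least 1")
    return min(per_volume_iops * volume_count, instance_iops_cap)


if __name__ == "__main__":
    cap = 32_000  # instance ceiling quoted in the lecture
    for count in (1, 2, 4):
        print(f"{count} volume(s): {striped_iops(16_000, count, cap)} IOPS")
```

Adding a third or fourth volume gains nothing in this example because the instance ceiling, not the volumes, becomes the limit.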
We're also going to recommend migrating our EBS volume from what we expect is a standard EBS volume to an SSD-backed provisioned IOPS EBS volume, and perhaps recommend an instance to match. We'd determine that based on the expected IOPS, if we were given that information. A key benefit of choosing provisioned IOPS is that we get CloudWatch monitoring at one-minute intervals by default. With our basic CloudWatch monitoring for EBS, data is available automatically in five-minute periods at no charge, and that includes data for the root device volumes of EBS-backed instances. The minute we enable provisioned IOPS, we automatically get one-minute metrics sent to CloudWatch.
So what we look for from CloudWatch are metrics to identify where any bottlenecks or performance issues are. We look at our IO latency, and if it is higher than we were expecting, we check our volume queue length to make sure that our application is not trying to drive more IOPS than we've provisioned. Any CloudWatch metric that starts with Volume is generally an EBS metric. The other metrics that we would look to use are VolumeReadBytes and VolumeWriteBytes. They provide information on the IO operations in a specified period of time. The Sum statistic reports the total number of bytes transferred during the period, and the Average statistic gives you the average size of each IO operation during the period. The other one we might look at is VolumeReadOps, and that's the total number of IO operations in a specified period of time. To calculate the average IO operations per second, or the IOPS, for that period, we divide the total operations for the period by the number of seconds in that period, and the unit is a count. So those are the types of metrics that we look to use with provisioned IOPS EBS. We would have that one-minute interval by default, which is very useful.
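The IOPS calculation described above is a single division, shown here with made-up sample numbers. The function names and the sample values are assumptions for illustration; the inputs stand in for the Sum statistics of VolumeReadOps and VolumeReadBytes over one period, and no AWS API is called.

```python
def average_iops(total_ops, period_seconds):
    """Average IO operations per second: total operations divided by period length."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return total_ops / period_seconds


def average_io_size_bytes(total_bytes, total_ops):
    """Average size of one IO operation; returns 0.0 when the volume was idle."""
    return total_bytes / total_ops if total_ops else 0.0


if __name__ == "__main__":
    # Hypothetical one-minute sample: 60,000 read operations moving 983,040,000 bytes.
    ops, byte_count, period = 60_000, 983_040_000, 60
    print("IOPS:", average_iops(ops, period))
    print("Average IO size (bytes):", average_io_size_bytes(byte_count, ops))
```

Comparing the resulting IOPS figure against the provisioned rate, alongside the volume queue length, is one way to see whether the application is trying to drive more IOPS than the volume was provisioned for.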
About the Author
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession" as everything AWS starts with the customer. Passions around work are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start ups.