This section of the AWS Certified Solutions Architect - Professional learning path introduces you to the core storage concepts and services relevant to the SAP-C02 exam. We start with an introduction to AWS storage services, understand the options available, and learn how to select and apply AWS storage services to meet specific requirements.
- Obtain an in-depth understanding of Amazon S3 - Simple Storage Service
- Learn how to improve your security posture in S3
- Get both a theoretical and practical understanding of EFS
- Learn how to create an EFS file system, manage EFS security, and import data in EFS
- Learn about EC2 storage and Elastic Block Store
- Learn about the different performance factors associated with AWS storage services
Hello, and welcome to this lecture, where I will discuss performance factors with AWS block storage. In AWS, we use block-level storage devices like EBS volumes for our low latency and high-performance workloads, including file systems and structured database storage. But what exactly do we mean by block-level storage? Well, you can think of a block as a fixed-size chunk of data whose size is based on how much data your block device can read or write in a single I/O request.
Now if you need to store some data on your EBS volume that’s larger than this block size, that data will be divided into blocks of equal size and stored on the underlying physical volume within those blocks. And this is all done in a way that’s optimized for both fast access and retrieval, with each data block being assigned a unique identifier. Now, these blocks don’t need to be contiguous or in any meaningful sequence on the underlying volume, as the block storage system leverages a lookup table that gets updated with the locations of these unique identifiers whenever a block is written. This allows data to be quickly retrieved and reassembled whenever it is requested.
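To make the idea of blocks, unique identifiers, and a lookup table a little more concrete, here is a minimal Python sketch. It is only a toy model and not how EBS is actually implemented: the `ToyBlockStore` class, the 4-byte block size, and the use of random UUIDs as block identifiers are all assumptions made for illustration.

```python
import uuid


class ToyBlockStore:
    """Toy model of block storage (hypothetical, not the real EBS design)."""

    def __init__(self, block_size=4):
        self.block_size = block_size  # assumed tiny block size, in bytes
        self.blocks = {}              # block id -> bytes, stored in no particular order
        self.lookup = {}              # data name -> ordered list of block ids

    def write(self, name, data):
        """Split data into fixed-size blocks and record where each one went."""
        block_ids = []
        for start in range(0, len(data), self.block_size):
            block_id = uuid.uuid4().hex  # unique identifier for this block
            # In this toy, the final block may be shorter than block_size.
            self.blocks[block_id] = data[start:start + self.block_size]
            block_ids.append(block_id)
        self.lookup[name] = block_ids    # lookup table updated on every write
        return block_ids

    def read(self, name):
        """Use the lookup table to fetch the blocks and join them in order."""
        return b"".join(self.blocks[block_id] for block_id in self.lookup[name])


store = ToyBlockStore()
ids = store.write("greeting", b"hello world")
print(len(ids), store.read("greeting"))
```

The point of the sketch is that the read path never depends on where the blocks physically sit. It only needs the ordered list of identifiers from the lookup table.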
Now block storage volumes can be integrated with many different operating systems, but first, they need to be formatted in a way that can be understood by that particular OS. One advantage of block storage is that whenever data is updated, such as a minor update to an existing file, only the new or changed blocks within that file actually need to be rewritten. Unchanged blocks can be left as-is, which helps to further enhance performance and speed.
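As a rough illustration of why a small update only touches a few blocks, the following Python sketch compares an old and a new version of some data, block by block, and reports which block positions would need to be rewritten. The function names and the 4-byte block size are assumptions for the example, and a real file system tracks changes differently.

```python
def split_blocks(data, block_size):
    """Cut data into consecutive chunks of block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def changed_block_indexes(old, new, block_size=4):
    """Return the positions of blocks in `new` that are new or differ from `old`."""
    old_blocks = split_blocks(old, block_size)
    new_blocks = split_blocks(new, block_size)
    changed = []
    for index, block in enumerate(new_blocks):
        # A block needs rewriting if it did not exist before or its bytes differ.
        if index >= len(old_blocks) or old_blocks[index] != block:
            changed.append(index)
    return changed


# Only the middle block differs, so only index 1 is reported.
print(changed_block_indexes(b"aaaabbbbcccc", b"aaaaXbbbcccc"))
```

In this example, two of the three blocks are left as-is, which is the behavior the lecture describes.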
Now there is one other type of block storage in AWS, which is kind of a special case, and that’s the instance store volumes that come with our EC2 instances. So these are the disks that are physically attached to the real machines in the AWS data centers that host our EC2 instances. And because of this, they perform exceptionally fast. But keep in mind that any data on an instance store volume is lost whenever your EC2 instance stops, hibernates, or terminates. So because of that, it’s really only useful as fast temporary storage, such as a scratch volume or a cache.
Now I want to talk a little more specifically about EBS volume performance and all of the different factors that can impact it. So first and foremost, you’ll want to make sure that you’re using the appropriate EBS volume type for your workloads. And as a quick refresher, EBS offers throughput-optimized (st1) and cold (sc1) magnetic hard disk drives, along with general purpose (gp2 or gp3) and provisioned IOPS (io1 or io2) SSD-backed block storage. If you’re interested in a quick refresher on these volume types and the best use cases for them, I encourage you to check out this course. But as a general rule, your SSD-backed storage will perform better for small, random I/O operations and magnetic hard disk drives will perform better for large, sequential operations.
Now, remember that your EBS volumes are not physically attached to EC2 instances. Instead, they’re being accessed over the network. And because of this, there could be occasions when the I/O traffic for your EBS volumes might actually compete with your other network traffic and cause additional latency. To alleviate this, you should use what’s called an EBS-optimized instance type. EBS-optimized instance types have been configured by AWS to separate your I/O traffic by providing dedicated network capacity exclusively for I/O operations. This helps your gp2 and gp3 SSD-backed volumes achieve at least 90% of their provisioned IOPS performance 99% of the time in a given year, and it helps your st1 and sc1 magnetic hard disk drives achieve at least 90% of their expected throughput performance 99% of the time in a given year. And this number increases to 99.9% of the time in a given year for io1 and io2 SSD-backed volume types. To learn more about EBS-optimized instances, including which instance types are supported as well as how to enable EBS optimization at instance launch or for an already existing instance, please check out the AWS documentation here.
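To put those percentages into perspective, here is a small Python sketch of the arithmetic. It converts "99% of the time" and "99.9% of the time" into the number of hours per year that could fall outside the target, and computes the 90% floor for a given provisioned IOPS figure. The function names and the 16,000 IOPS example value are assumptions chosen for illustration, and the year is assumed to have 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def hours_outside_target(time_percent):
    """Hours per year not covered by a 'time_percent of the time' target."""
    return HOURS_PER_YEAR * (100 - time_percent) / 100


def performance_floor(provisioned_iops, fraction=0.90):
    """IOPS level the volume is designed to meet or exceed within the target time."""
    return provisioned_iops * fraction


for percent in (99, 99.9):
    print(percent, round(hours_outside_target(percent), 2))

print(performance_floor(16000))
```

So the 99% target leaves roughly 87.6 hours a year outside the target, while the 99.9% target leaves roughly 8.76 hours.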
So just to wrap up our discussion on block storage and EBS performance, I want to mention a few other factors that can also affect the performance of your EBS volumes. This includes the impact of creating snapshots, as well as initializing new volumes from snapshots. So when you create a snapshot of an st1 or sc1 volume while it’s in use, the performance of that volume will be degraded while the snapshot is in progress. Now for any volume type, whenever you initialize a new volume that’s based on a snapshot, you’re going to take a bit of a performance hit until each block on the volume has been accessed at least once. And this is because your EBS snapshots are stored as objects in S3.
So whenever you create a new volume that’s based on a snapshot, the volume is available right away, but each storage block has to be downloaded from S3 and written to the new volume before you can access it for the first time. And this can add a significant amount of latency the first time each block is accessed. Now you can mitigate this in one of two ways: first, by using a process called initialization, which on a Linux machine involves running either the dd or fio utilities to force a read of all blocks on the device, or second, by enabling what’s known as EBS fast snapshot restore, which can be done on a per-snapshot basis within a specific availability zone. To learn more about fast snapshot restore, please check out the AWS documentation here.
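The initialization approach boils down to reading every block on the device once and throwing the data away. In practice you would use dd or fio as mentioned above, but the following Python sketch shows the same idea under some assumptions: the `read_every_block` name and the 1 MiB read size are hypothetical, and the device path in the usage comment is only a placeholder for whatever name your volume actually has.

```python
def read_every_block(path, block_size=1024 * 1024):
    """Read a file or block device from start to end, discarding the data.

    Returns the total number of bytes read.
    """
    total = 0
    # buffering=0 opens the file unbuffered so each read goes to the OS directly.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(block_size)
            if not chunk:  # an empty read means the end of the device
                break
            total += len(chunk)
    return total


# Hypothetical usage on a Linux instance (needs root, and the device
# name is an assumption):
#   read_every_block("/dev/xvdf")
```

A sequential pass like this can take a long time on a large volume, which is one reason fast snapshot restore exists as the alternative.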
Now if you’re really interested in increasing your I/O throughput, you can join multiple EBS volumes together in a RAID 0 configuration. This allows them to work together simultaneously as a single logical drive and also has the advantage of massively increasing read and write performance. But using RAID 0 does carry a risk of data loss, as there’s no mirroring or redundancy of data across the volumes. If one of your volumes fails, you could lose data.
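As a simplified picture of how RAID 0 spreads data out, the Python sketch below stripes a sequence of logical blocks across several volumes in round-robin order and then reads them back. It is a toy model with made-up function names, using a stripe size of one block, and it ignores everything a real RAID driver does beyond the block-to-volume mapping.

```python
def stripe_location(logical_block, volume_count):
    """Map a logical block number to (volume index, offset on that volume)."""
    return logical_block % volume_count, logical_block // volume_count


def stripe(blocks, volume_count):
    """Distribute blocks round-robin across volume_count volumes."""
    volumes = [[] for _ in range(volume_count)]
    for logical_block, block in enumerate(blocks):
        volume_index, _ = stripe_location(logical_block, volume_count)
        volumes[volume_index].append(block)
    return volumes


def read_back(volumes, total_blocks):
    """Reassemble the original block order from the striped volumes."""
    result = []
    for logical_block in range(total_blocks):
        volume_index, offset = stripe_location(logical_block, len(volumes))
        result.append(volumes[volume_index][offset])
    return result


volumes = stripe(list("ABCDEF"), 2)
print(volumes)                # [['A', 'C', 'E'], ['B', 'D', 'F']]
print(read_back(volumes, 6))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Because consecutive blocks live on different volumes, they can be read or written at the same time, which is where the performance gain comes from. It also shows the risk: each volume holds every other block, so losing either volume leaves the data incomplete.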
And finally, you can test and validate the performance of your EBS volumes by benchmarking them. This involves simulating I/O workloads by using a benchmarking tool like fio on Linux to help you determine things like the optimum queue length for your volumes. For more information about how you can benchmark your EBS volumes, check out the AWS documentation here. So that wraps up our discussion of different performance factors with AWS block storage.
Danny has over 20 years of IT experience as a software developer, cloud engineer, and technical trainer. After attending a conference on cloud computing in 2009, he knew he wanted to build his career around what was still a very new, emerging technology at the time — and share this transformational knowledge with others. He has spoken to IT professional audiences at local, regional, and national user groups and conferences. He has delivered in-person classroom and virtual training, interactive webinars, and authored video training courses covering many different technologies, including Amazon Web Services. He currently has six active AWS certifications, including certifications at the Professional and Specialty level.