Storage and Databases
Services at a glance
In this course we learn to recognize and explain AWS compute and storage fundamentals, and the family of AWS services relevant to the certified developer exam. The course provides snapshots of each service, covers just what you need to know, and gives you a good, high-level starting point for exam preparation. It includes coverage of:
Amazon Simple Queue Service (SQS)
Amazon Simple Notification Service (SNS)
Amazon Simple Workflow Service (SWF)
Amazon Simple Email Service (SES)
Amazon API Gateway
AWS Data Pipeline
AWS Elastic Beanstalk
Storage and database
Amazon Simple Storage Service (S3)
Amazon Elastic Block Store (EBS)
Amazon Relational Database Service (RDS)
Other Database Services
Amazon Elastic Compute Cloud (EC2)
Elastic Load Balancing (ELB)
If you have thoughts or suggestions for this course, please contact Cloud Academy at email@example.com.
Amazon Simple Storage Service, or S3, is an object store which is provided as a service. What that means is Amazon manages the sizing, infrastructure and durability of the Amazon S3 service, which you can then use and pay for on a pay-as-you-go model. You don't have to buy any hardware upfront to store objects in S3. There is no setup cost, and there is no minimum usage fee.
Amazon S3 is highly durable, with the standard storage class of Amazon S3 providing 11 nines of durability for objects. That's 99.999999999%. It also provides four nines, or 99.99%, availability for objects over a given year. S3 objects reside in the region of your choice. Objects you store in Amazon S3 are replicated across multiple availability zones within that region by AWS to provide this high level of availability and durability.
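To get a feel for what 11 nines means, here is a minimal sketch of the arithmetic. It assumes durability can be read as a per-object, per-year survival probability, and the figure of 10 million objects is just an illustrative number, not something from this lecture.

```python
def annual_loss_probability(nines: int) -> float:
    """Chance that a single object is lost in a year, for N nines of durability."""
    return 10.0 ** -nines


def expected_objects_lost_per_year(object_count: int, nines: int = 11) -> float:
    """Average number of objects lost per year across the whole data set."""
    return object_count * annual_loss_probability(nines)


def years_per_lost_object(object_count: int, nines: int = 11) -> float:
    """Average number of years between single-object losses."""
    return 1.0 / expected_objects_lost_per_year(object_count, nines)


if __name__ == "__main__":
    objects = 10_000_000  # hypothetical data set size
    print(f"Expected losses per year: {expected_objects_lost_per_year(objects):.6f}")
    print(f"Years per lost object:    {years_per_lost_object(objects):,.0f}")
```

Under these assumptions, storing 10 million objects works out to losing one object roughly every 10,000 years on average.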
Amazon S3 organizes content in buckets. Individual Amazon S3 objects can range in size from a minimum of zero bytes to a maximum of five terabytes, and the largest object that can be uploaded in a single put is five gigabytes. By default, you can have up to 100 buckets per AWS account. More buckets can be added if required by logging a ticket with AWS support. You can store a virtually unlimited number of objects in an S3 bucket. Video files, images, and backup archives are all common use cases for Amazon S3. Amazon S3 is a managed service, so buckets expand dynamically depending on how many objects you add or remove. You don't need to request more space or change your plan when you add or remove objects.
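The size limits above can be turned into a small planning helper. This is a sketch only. It assumes the limits are binary units (5 GiB and 5 TiB), and it relies on the multipart upload API, which this lecture does not cover, for anything larger than a single put. The 10,000-part ceiling is the documented multipart limit. The function name and the default part size are made up for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_SINGLE_PUT = 5 * GIB    # largest object that fits in one put
MAX_OBJECT_SIZE = 5 * TIB   # largest object S3 will store
MAX_PARTS = 10_000          # multipart upload part limit


def plan_upload(size_bytes: int, part_size: int = 100 * MIB) -> str:
    """Describe how an object of the given size would have to be uploaded."""
    if size_bytes < 0:
        raise ValueError("size cannot be negative")
    if size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the five terabyte maximum")
    if size_bytes <= MAX_SINGLE_PUT:
        return "single PUT"
    parts = -(-size_bytes // part_size)  # integer ceiling division
    if parts > MAX_PARTS:
        raise ValueError("part size too small: more than 10,000 parts needed")
    return f"multipart upload in {parts} parts"


if __name__ == "__main__":
    for size in (0, 5 * GIB, 5 * GIB + 1, 200 * GIB):
        print(size, "->", plan_upload(size, part_size=GIB))
```

Note that a zero-byte object is valid, and that very large objects need a larger part size to stay under the part limit.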
The link to Amazon S3 can be found under Storage and Content Delivery in the AWS web console. Here we are in the Amazon S3 console. We can create a new bucket by simply clicking on the Create Bucket button. Amazon S3 bucket names need to be globally unique, so we need to enter a name for our bucket that hasn't already been used elsewhere in Amazon S3. Bucket names need to be unique because the bucket name forms part of the uniform resource locator, or URL, that we will use to locate and access objects stored in this bucket. You cannot change the name of a bucket once it has been created, so you do need to think carefully when you name your buckets. The bucket name can be between three and 63 characters long. It can contain only lowercase characters, numbers, periods, and dashes. The bucket name cannot contain underscores, end with a dash or a period, have consecutive periods, or use dashes adjacent to periods. And the bucket name cannot be formatted as an IP address.
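The naming rules just listed can be checked before you ever open the console. The following is a rough sketch that encodes only the rules mentioned above. The function name is hypothetical, the AWS documentation may list further restrictions, and global uniqueness can only be confirmed by actually trying to create the bucket.

```python
import re
from typing import List

_ALLOWED_CHARS = re.compile(r"[a-z0-9.-]+")
_IP_ADDRESS_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def bucket_name_problems(name: str) -> List[str]:
    """Return the naming rules (from the list above) that a bucket name breaks."""
    problems = []
    if not 3 <= len(name) <= 63:
        problems.append("must be between 3 and 63 characters long")
    if not _ALLOWED_CHARS.fullmatch(name):
        problems.append("only lowercase letters, numbers, periods and dashes are allowed")
    if name.endswith("-") or name.endswith("."):
        problems.append("must not end with a dash or a period")
    if ".." in name:
        problems.append("must not contain consecutive periods")
    if "-." in name or ".-" in name:
        problems.append("must not place a dash next to a period")
    if _IP_ADDRESS_LIKE.fullmatch(name):
        problems.append("must not be formatted as an IP address")
    return problems


if __name__ == "__main__":
    for candidate in ("my-course-bucket", "My_Bucket", "192.168.5.4", "logs-.example"):
        print(candidate, "->", bucket_name_problems(candidate) or "looks fine")
```

An empty list means the name passes these particular checks, not that the name is available.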
AWS has multiple regions and you can select the region in which you'd like your bucket to reside. You can choose any region. AWS has multiple availability zones within each region. AWS replicates S3 objects automatically for you across availability zones in your chosen region. AWS does not replicate objects across regions unless you explicitly enable cross-region replication.
So let's quickly review some of the Amazon S3 features. Amazon S3 is elastic object storage which means it resizes itself based on what you upload or remove. Amazon S3 standard storage class provides 11 nines of durability and four nines of availability over a given year for objects. Objects in Amazon S3 are replicated by AWS to multiple availability zones within a region. Objects are not replicated to other regions unless explicitly set up by you to do so. You can add as many objects as you want to an S3 bucket.
Bucket names need to be unique and cannot be changed once created. Individual Amazon S3 objects can range in size from a minimum of zero bytes to a maximum of five terabytes, and the largest object that can be uploaded in a single put is five gigabytes.
Let's upload a file from here in the AWS web console. First, click on the name to open our new bucket. Click on Actions and then Upload. Click Add Files. Select a file from your local system. You can select multiple files for upload. You can upload objects of up to five terabytes. The Set Details button allows us to set the storage class. This is where you can select the Amazon S3 reduced redundancy or the Amazon S3 standard infrequent access storage class. We'll discuss these soon. Click Start Upload. The progress bar shows the status. When the upload is complete, the files will appear in our bucket list on the left.
You can create folders inside buckets to help organize your objects inside Amazon S3. To create a folder, select Create Folder. Choose a unique name for your folder. Folders can have specific policies within a bucket.
Your object has been uploaded and is now stored in the Amazon S3 bucket. Objects stored in Amazon S3 are secure by default. Only bucket and object owners can access objects stored in Amazon S3. We can view the security policy for a bucket from the Properties panel to the right. Amazon S3 provides four different access control mechanisms: Identity and Access Management (IAM) policies, access control lists (ACLs), bucket policies, and query string authentication. IAM enables you to create and manage multiple users under a single AWS account. With IAM policies, you can grant or restrict a user's access to your Amazon S3 bucket or to the objects in it. You can use ACLs to selectively add permissions on individual objects.
Amazon S3 bucket policies can be used to add or deny permissions across some or all of the objects within a single bucket. With query string authentication, you have the ability to share Amazon S3 objects through URLs that are valid for a specific period of time.
You can tightly control access to individual buckets using permissions. Let's look at the permissions associated with our new bucket. The owner, that's me, is currently allowed to list, update, delete, and view and edit permissions. Let's look at the Properties panel. You can make your file available to more people using the Permissions option. Here we can select and set IAM user rights. We could create a new permissions rule by clicking Add More Permissions and selecting a user from the drop-down grantee box. Let's select Everyone and then click Upload and Delete. Anyone will now have control over all the contents of this bucket.
You can add a custom bucket policy. When opening this option, you are presented with a blank screen, which may be a little daunting. I recommend just opening the sample bucket policies link to see example policies. For now, let's make this bucket publicly available so we can use our files for a web service.
A common use case for Amazon S3 is static hosting of website content. The Enable Website Hosting option makes a bucket available as an HTTP endpoint. Objects stored in this Amazon S3 bucket, which are made public, can then be accessed from a web browser. You don't need a web server to present static HTML or image files, which is really great. The address specified here is the public DNS name for the asset, which you would specify in your webpage tag or code. Amazon S3 URLs are unique and cannot be changed, so always keep that in mind when you are naming your bucket. Now expand Static Website Hosting and select Enable Website Hosting. Enter the name of our new index HTML file for our index document and click Save. Now note the endpoint above.
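If the blank bucket policy screen mentioned above feels daunting, it can help to see what a simple policy looks like. Here is a minimal sketch that builds a public-read policy of the kind commonly used for static website hosting. The helper function and the bucket name are hypothetical. The policy grants read access to everyone, so treat it as an illustration rather than something to paste into a production bucket.

```python
import json


def public_read_policy(bucket_name: str) -> str:
    """Build a bucket policy document that lets anyone read objects in the bucket."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",                 # everyone, including anonymous users
                "Action": "s3:GetObject",         # read objects only, no upload or delete
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # "example-static-site" is a placeholder bucket name.
    print(public_read_policy("example-static-site"))
```

The resulting JSON is what you would paste into the bucket policy editor. Unlike the Everyone grant with Upload and Delete shown in the walkthrough, it allows reads only.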
Until we route traffic from a custom domain to this endpoint, it will be the way users can access our website page. Click on Actions and select Make Public. Notice that if we don't do this first, the website page is not displayed. Now click on the Properties tab to the right and note the link address, which contains the URL endpoint users can use to access your file. This link will get the object if requested by a browser or application. Let's now visit the page once we've made it public. It's working fine.
You can turn on server-side encryption to further protect objects stored in Amazon S3. Encryption is set per object or folder. When server-side encryption is enabled, Amazon encrypts the object when writing it. Server-side encryption does not change how Amazon S3 requests are made or served. The same URL will work for encrypted and non-encrypted content. There are three options for how you encrypt objects.
The first is server-side encryption using Amazon S3-managed keys (SSE-S3). Amazon S3 server-side encryption uses one of the strongest block ciphers available, the 256-bit Advanced Encryption Standard, or AES-256, to encrypt your data.
Another option is server-side encryption using KMS-managed keys (SSE-KMS). KMS-managed keys also provide you with an audit trail of when your key was used and by whom. Additionally, you have the option to create and manage encryption keys yourself, or to use a default key that is unique to you, the service you're using, and the region you're working in.
The third option is server-side encryption with customer-provided keys (SSE-C). You manage the encryption keys, and Amazon S3 manages the encryption as it writes to disk and the decryption when you access your objects.
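Outside the console, the three encryption options are selected with request headers on the upload. The sketch below only builds those headers and does not sign or send a request. The helper names and the example key ID are hypothetical. The header names follow the S3 REST API as I understand it, so check them against the current documentation before relying on them.

```python
import base64
import hashlib
import os
from typing import Dict, Optional


def sse_s3_headers() -> Dict[str, str]:
    """Amazon S3-managed keys: S3 encrypts with AES-256 and manages the key."""
    return {"x-amz-server-side-encryption": "AES256"}


def sse_kms_headers(key_id: Optional[str] = None) -> Dict[str, str]:
    """KMS-managed keys: omit key_id to use the default key for S3 in the region."""
    headers = {"x-amz-server-side-encryption": "aws:kms"}
    if key_id:
        headers["x-amz-server-side-encryption-aws-kms-key-id"] = key_id
    return headers


def sse_c_headers(key: bytes) -> Dict[str, str]:
    """Customer-provided key: you supply the 256-bit key with every request."""
    if len(key) != 32:
        raise ValueError("an AES-256 key must be exactly 32 bytes")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        # The MD5 digest lets S3 verify the key arrived intact.
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode("ascii"),
    }


if __name__ == "__main__":
    print(sse_s3_headers())
    print(sse_kms_headers("example-key-id"))  # placeholder key ID
    print(sorted(sse_c_headers(os.urandom(32))))
```

With customer-provided keys, the same key has to be sent again on every read, because Amazon S3 does not store it.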
Amazon S3 currently provides three levels of durability for objects stored in Amazon S3, standard, infrequent access, and reduced redundancy storage classes. The storage class options can be viewed and configured by selecting the Amazon S3 object or folder properties. Then selecting the details sub tab. Object storage type is set per object or folder rather than per bucket. So objects in a bucket can have a blend of storage types. We also saw earlier when we were uploading our file, we could select a storage class for our object at that point.
The Amazon S3 standard storage class offers the lowest latency and highest throughput of the Amazon S3 storage classes. Amazon S3 standard storage offers 99.99% availability for objects over a given year. It provides 11 nines of durability on objects. The Amazon S3 standard infrequent access storage class provides the same 11 nines of durability as the standard storage class, but is provisioned for objects that are accessed infrequently yet still need to be returned quickly when they are requested, so it has a lower availability of 99.9%. The standard infrequent access class has a 128 kilobyte minimum object size for billing and a minimum storage duration of 30 days.
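The difference between 99.99% and 99.9% is easier to judge as time. The following sketch reads availability naively as the fraction of a 365-day year in which objects can be retrieved. That is a simplifying assumption, and the actual AWS service level agreement is measured differently.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def max_annual_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year that fall outside the stated availability figure."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for storage_class, availability in (("Standard", 99.99), ("Standard IA", 99.9)):
        minutes = max_annual_downtime_minutes(availability)
        print(f"{storage_class}: about {minutes:.1f} minutes ({minutes / 60:.2f} hours) per year")
```

Under that reading, standard storage allows for a little under an hour of unavailability a year, and standard infrequent access for a little under nine hours.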
Standard IA is designed for larger objects, and it has a minimum object size of 128 kilobytes as a guideline. You can upload smaller objects than that, but objects that are smaller than 128 kilobytes in size will incur storage charges as if the object were 128 kilobytes. So, for example, if you have a 6 kilobyte object in S3 standard IA, you'll incur standard IA storage charges for 6 kilobytes, and an additional minimum object size fee equivalent to 122 kilobytes at the S3 standard IA storage price. It's best to check the AWS Simple Monthly Calculator or the Amazon S3 pricing page for the latest pricing on S3 storage classes.
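The 6 kilobyte example above can be reproduced with a couple of lines. This sketch works purely in kilobytes and deliberately leaves prices out, since those change. The function names are made up for illustration.

```python
MIN_BILLABLE_KB = 128  # minimum billable object size in Standard IA


def billable_kb(object_kb: float) -> float:
    """Size that Standard IA storage charges are based on."""
    return max(object_kb, MIN_BILLABLE_KB)


def minimum_size_fee_kb(object_kb: float) -> float:
    """Extra kilobytes charged on top of the object's real size."""
    return billable_kb(object_kb) - object_kb


if __name__ == "__main__":
    for size in (6, 128, 500):
        print(f"{size} KB object: billed as {billable_kb(size)} KB "
              f"({minimum_size_fee_kb(size)} KB minimum-size fee)")
```

Objects at or above 128 kilobytes carry no minimum-size fee, which is why standard IA suits larger objects.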
The main customer benefit of Amazon S3 standard infrequent access class is cost. Amazon S3 standard infrequent access class is ideal for older backups, archives, or data sets for disaster recovery perhaps. Amazon S3 standard infrequent access class provides life cycle management and supports SSL encryption.
Amazon S3 also offers a reduced redundancy storage class for objects stored in Amazon S3. Amazon S3 reduced redundancy storage provides the same 99.99% availability. However, the reduced redundancy storage class enables customers to reduce their costs by storing non-critical, reproducible data at lower levels of redundancy than the Amazon S3 standard storage class offers. The reduced redundancy storage class stores objects on multiple devices across multiple facilities, providing significantly more durability than any typical disk archive, for example, but it does not replicate objects as many times as the Amazon S3 standard storage class does. So the reduced redundancy storage class is around 30% cheaper than the more durable standard storage class, a perfect scenario for that non-critical data archive. Reduced redundancy storage is ideal for storing things like preview images, transcoded media, or other processed data that can be easily reproduced or that may be stored elsewhere as well.
Amazon S3 enables versioning on bucket items. This feature allows you to save, retrieve, and restore all versions of an object stored in a bucket. Clicking the Enable Versioning button will enable versioning on the bucket, meaning every new version of an object will be archived. Once enabled, object versions can be viewed by clicking the Versions Hide or Show option on the menu bar. Once enabled, versioning can only be suspended. It cannot be disabled. This function is useful for items that change often or that may need to be restored often.
In the All Buckets page, select the bucket you'd like to work with and then, if it isn't already selected, click on the Properties tab on the right, expand Versioning, read the AWS warning, and click on Enable Versioning. The Enable Versioning button has been replaced by a Suspend Versioning button. Suspending versioning will not affect existing objects but will prevent further duplication. We'll enter our bucket and upload a file from my local computer. Now I've edited the local file and saved it to create a new version. Now let's upload the updated file, which has exactly the same file name as before. Notice the two version tabs at the top. We'll click on the Show tab, and both versions of the file are now visible and available for any purpose.
Amazon S3 provides an archive automation feature called lifecycle. With Amazon S3 lifecycle, you can set how Amazon S3 manages objects over a period of time. Lifecycle rules can help you manage objects as the number of items increases. Let's now expand the Lifecycle tab in the Properties window. Click on Add Rule. We could apply this rule to all the objects in this bucket, or to all or some of the objects in a particular folder. If we were to target specific files, we would have to specify a prefix, which means a text string that would allow S3 to find only those files we want to be subject to this policy.
Therefore we might want only files in the videos folder, or perhaps only those files that begin with a date. For now we'll go for the whole bucket. Click on Configure Rule. Buckets and their contents can have multiple lifecycle rules to create a highly customized storage environment. This type of lifecycle is very useful for handling large bodies of data like log files or older archives. You might have log files or older versions of images, for example, and after three months prefer to have those versions moved to Amazon S3 reduced redundancy storage or out to Amazon Glacier. When moving from the standard or reduced redundancy storage classes to Amazon S3 standard IA, objects need to be larger than 128 kilobytes and they have to have been stored for more than 30 days. To configure a lifecycle rule, click the Lifecycle option. Select whether the policy applies to the whole bucket, or enter the prefix that identifies which items this rule applies to.
Keeping track of access activity involving your Amazon S3 buckets can be an effective security and performance tool. The Amazon S3 access logs are built for this purpose. To activate logging for a particular bucket, click once on the Edit icon next to your bucket, and then on the Properties tab at the right, if it's not already selected. Expand the Logging item and select Enable. Choose a target bucket, that is, the bucket into which you'd like your logs to be saved. This doesn't have to be the same bucket you're monitoring. Now any user or API request to your bucket, whether successful or not, will be logged in your target bucket.
Besides access logging, Amazon S3 events can also be configured to trigger notifications to Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), or a Lambda function, to programmatically alert users of processes within your AWS project. This allows Amazon S3 events to trigger activity within your AWS workflows.
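An event notification like this boils down to a small configuration document attached to the bucket. Here is a hedged sketch of one, using the reduced redundancy object-lost event and an SNS topic as the example. The topic ARN, rule ID, and helper name are placeholders. The dictionary shape follows what I understand the S3 notification configuration API to accept, so verify it against the API reference before use.

```python
import json
from typing import Any, Dict


def rrs_lost_notification(topic_arn: str, rule_id: str = "rrs-object-lost") -> Dict[str, Any]:
    """Notification configuration that publishes RRS object-loss events to an SNS topic."""
    return {
        "TopicConfigurations": [
            {
                "Id": rule_id,
                "TopicArn": topic_arn,
                "Events": ["s3:ReducedRedundancyLostObject"],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder ARN: the account number and topic name are made up.
    config = rrs_lost_notification("arn:aws:sns:us-east-1:123456789012:example-topic")
    print(json.dumps(config, indent=2))
```

Sending to an SQS queue or a Lambda function uses the same idea with a different target entry in the configuration.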
Expand Events in the properties frame of your bucket window and then type a name for your new event. Click once in the Events box to display a menu of choices. We'll choose RRS Object Lost, an event in the reduced redundancy storage class. Notifications can be sent to an SNS topic, an SQS queue, or a Lambda function. We'll select SNS topic.
We'll take just a quick detour to create a new SNS topic. From the main AWS dashboard, click on SNS, Simple Notification Service, and then on Create Topic. Enter a descriptive name for your topic and a display name. They can be the same. In order for this topic to be useful, we'll have to create a subscription. You can have the topic notification sent to HTTP, email, SQS, or to an application endpoint. We'll use an email address, which we'll need to confirm once the confirmation message arrives at our email client. We now have an active topic. Back on the Amazon S3 events page, we can now select our topic. Select Add SNS Topic ARN (ARN stands for Amazon Resource Name) from the SNS topic box. Notice of any RRS objects lost will now be automatically sent to our SNS subscription.
Cross-region replication is a bucket-level feature that enables automatic, asynchronous copying of objects across buckets in different AWS regions. Replicating objects across regions can help organizations meet compliance requirements, minimize latency, or be part of a disaster recovery design. For cross-region replication to work, both buckets need to have versioning enabled. They must be in separate regions, and Amazon S3 must have permission to replicate objects from the source bucket to the destination bucket.
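The three cross-region replication requirements make a handy pre-flight checklist. The sketch below models them with a hypothetical `Bucket` record and a boolean standing in for the replication permission. It does not call AWS, so the values would need to come from your own inventory of bucket settings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bucket:
    """Minimal, hypothetical description of a bucket for this check."""
    name: str
    region: str
    versioning_enabled: bool


def replication_blockers(source: Bucket, destination: Bucket,
                         s3_can_replicate: bool) -> List[str]:
    """List the cross-region replication requirements that are not yet met."""
    blockers = []
    if not source.versioning_enabled:
        blockers.append(f"enable versioning on source bucket {source.name}")
    if not destination.versioning_enabled:
        blockers.append(f"enable versioning on destination bucket {destination.name}")
    if source.region == destination.region:
        blockers.append("source and destination must be in different regions")
    if not s3_can_replicate:
        blockers.append("grant Amazon S3 permission to replicate from source to destination")
    return blockers


if __name__ == "__main__":
    src = Bucket("example-source", "us-east-1", versioning_enabled=True)
    dst = Bucket("example-replica", "eu-west-1", versioning_enabled=False)
    print(replication_blockers(src, dst, s3_can_replicate=True))
```

An empty list means all three conditions described above are satisfied.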
Amazon S3 transfer acceleration speeds up Amazon S3 data transfers by making use of optimized network protocols and the AWS edge infrastructure. Acceleration improvements are typically in the range of 50 to 500% for cross-country transfer of large objects. The function uses the AWS network infrastructure. When you enable this feature, the system creates an S3 accelerate endpoint. You can first compare your data transfer speed by region using the Amazon S3 speed comparison tool. This is a great way to test whether there's value in enabling the transfer acceleration feature on your account.
The requester pays function allows you to make the requester responsible for any data costs for their file request. Requester pays can reduce your costs when you have a large number of partners or agents, perhaps, who regularly request to download objects from your S3 bucket. Requester pays works only for authenticated requests, because the requester has to be identified in order to be billed.
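Both features in this part show up in how a request is addressed. The sketch below builds the URL and the extra header for a download, and nothing more: it does not sign or send the request, and a real request would still need AWS authentication. The function name and bucket are hypothetical, and the endpoint and header formats are given as I understand them from the S3 documentation.

```python
from typing import Dict, Tuple
from urllib.parse import quote


def build_get_request(bucket: str, key: str, accelerate: bool = False,
                      requester_pays: bool = False) -> Tuple[str, Dict[str, str]]:
    """Return the URL and extra headers for fetching an object."""
    if accelerate:
        if "." in bucket:
            raise ValueError("transfer acceleration needs a bucket name without periods")
        host = f"{bucket}.s3-accelerate.amazonaws.com"  # accelerate endpoint
    else:
        host = f"{bucket}.s3.amazonaws.com"             # regular endpoint
    headers = {}
    if requester_pays:
        # The requester confirms they accept the charges for this request.
        headers["x-amz-request-payer"] = "requester"
    return f"https://{host}/{quote(key)}", headers


if __name__ == "__main__":
    url, headers = build_get_request("example-bucket", "videos/intro.mp4",
                                     accelerate=True, requester_pays=True)
    print(url)
    print(headers)
```

The check on periods reflects the advice to avoid periods in bucket names if you plan to use transfer acceleration.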
Okay, things to remember with Amazon S3. Each Amazon S3 bucket name has to be unique. This name will be part of the public DNS name if you use Amazon S3 to host images or static content, so names need to be DNS compliant in all regions. You can't change the region or S3 part of the name. The name needs to be at least three and no more than 63 characters long. It can contain lowercase letters, numbers, and hyphens. It cannot be formatted as an IP address. If you plan to use SSL or transfer acceleration, do not use periods in your bucket names. Individual Amazon S3 objects can range in size from a minimum of zero bytes to a maximum of five terabytes, and the largest object that can be uploaded in a single put is five gigabytes. The number of objects per bucket is unlimited, and the number of objects you have in a bucket doesn't impact performance. By default you can create up to 100 buckets per AWS account. Buckets cannot be renamed. If a bucket is empty, you can delete it. After the bucket is deleted, you can then reuse the name, but bucket ownership is not transferable. You can't create a bucket within a bucket. Use folders instead. Take note of the Amazon S3 website endpoint if you're using static hosting, because it may differ from the Amazon S3 endpoint. Ensure that you make your files public when you use the static hosting option. You can add lifecycle configuration to non-versioned buckets and versioning-enabled buckets. Lifecycle configuration on MFA-enabled buckets is not supported.
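As a recap of the lifecycle rules covered in this lecture, here is a rough sketch of a rule that moves objects under a prefix to standard IA and later to Amazon Glacier. The helper name, rule ID, prefix, and day counts are illustrative assumptions. The dictionary follows the shape I understand the S3 lifecycle configuration API to expect, so check it against the API reference before applying it.

```python
import json
from typing import Any, Dict


def archive_rule(prefix: str, ia_after_days: int = 30,
                 glacier_after_days: int = 90,
                 rule_id: str = "archive-old-objects") -> Dict[str, Any]:
    """Lifecycle configuration: Standard -> Standard IA -> Glacier for one prefix."""
    if ia_after_days < 30:
        raise ValueError("objects must be stored for 30 days before moving to Standard IA")
    if glacier_after_days <= ia_after_days:
        raise ValueError("the Glacier transition must come after the Standard IA transition")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},  # empty string targets the whole bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # "logs/" is a placeholder prefix, similar to the videos folder example.
    print(json.dumps(archive_rule("logs/"), indent=2))
```

A bucket can carry several such rules, each with its own prefix, which is how you build the customized storage environment described earlier.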
About the Author
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 80+ courses relating to Cloud, mostly within the AWS category and with a heavy focus on security and compliance.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.