AWS Data Services
To be prepared for the AWS Certified Cloud Practitioner Exam, this course will enable you to demonstrate Amazon Simple Storage Service (S3), Amazon Glacier, Amazon Elastic Block Store (EBS) and Amazon CloudFront storage solutions, and help you identify when to apply AWS solutions to common business scenarios.
This course covers a range of different services, including:
- Amazon Simple Storage Service (S3)
- Amazon Elastic Block Store (EBS)
- Amazon Glacier
- Amazon RDS
- Amazon DynamoDB, ElastiCache, and Redshift
- Amazon CloudFront
- AWS Import/Export Disk
- AWS Import/Export Snowball
- AWS Storage Gateway
By the end of this course, you should be able to:
- Describe the basic functions that each storage service performs within a cloud solution
- Recognize basic components and features of each storage service
- Identify which storage service would be most appropriate to a general use case
- Understand how each service utilizes the benefits of cloud computing, such as scalability or elasticity
This course is designed for:
- Anyone preparing for the AWS Certified Cloud Practitioner
- Managers, sales professionals, and other non-technical roles
Before taking this course, you should have a general understanding of basic cloud computing concepts.
If you have thoughts or suggestions for this course, please contact Cloud Academy at email@example.com.
Hello and welcome to this final lecture of the course. Within this lecture, I want to highlight at a high level some of the main points from each of the storage services that I have introduced, starting with Amazon S3.
In that lecture, I explained that Amazon S3 is a fully managed, object-based storage service that is highly available, highly durable, very cost effective and widely accessible. It has almost unlimited storage capacity; the smallest file size that it supports is zero bytes, and the largest file size is five terabytes. Data is uploaded to S3 within a specific region and duplicated across multiple Availability Zones automatically. Objects have a durability of eleven nines and an availability of four nines. And objects must be stored within buckets or folders within a bucket. S3 has three storage classes: Standard, Standard Infrequent Access, and Reduced Redundancy. Security features of S3 include bucket policies, access control lists, and data encryption, both server-side and client-side, and SSL is supported for data in transit to S3. Data management features include versioning and lifecycle rules, and S3 is often used for data backup, static content for websites and large data sets. But it can be used for a wide variety of solutions as you see fit. S3 offers integration with other services such as EBS for snapshots, CloudTrail to store logs, or as an origin for a CloudFront distribution. Pricing for S3 is primarily based on the amount of storage used, plus request and data transfer costs.
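To put the "eleven nines" and "four nines" figures into everyday terms, here is a minimal sketch of the arithmetic. It assumes a simplified reading of the numbers: availability as the fraction of a year the service is reachable, and durability as the yearly chance that any one object survives. The function names are made up for this illustration and are not part of any AWS tool.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def expected_downtime_minutes_per_year(availability_nines: int) -> float:
    """Minutes per year a service could be unavailable at N nines of availability."""
    unavailability = 10 ** (-availability_nines)  # e.g. 4 nines -> 0.0001
    return unavailability * MINUTES_PER_YEAR


def expected_objects_lost_per_year(object_count: int, durability_nines: int) -> float:
    """Average number of objects lost per year at N nines of durability.

    Assumption: each object has an independent yearly loss chance of 10^-N.
    """
    return object_count * 10 ** (-durability_nines)


if __name__ == "__main__":
    print(expected_downtime_minutes_per_year(4))
    print(expected_objects_lost_per_year(10_000_000, 11))
```

Under these assumptions, four nines of availability allows for a little under an hour of downtime per year, and storing ten million objects at eleven nines of durability works out to losing about one object every ten thousand years on average.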
Following on from Amazon S3, I then covered Amazon Glacier, which works closely with S3 but provides a very different function. The key points taken from this lecture were that Amazon Glacier is an extremely low cost, long term, durable storage solution which is often referred to as Cold Storage, ideally suited for long term backup and archival requirements. It has 11 nines of durability, making it just as durable as Amazon S3, but it's much cheaper than S3. It doesn't, however, provide instant access to your data. The data structure is centered around Vaults and Archives, and a Glacier Vault simply acts as a container for Glacier Archives. Within Vaults, data is stored as an archive, and you can have unlimited archives within your Glacier Vaults. The dashboard within the console only allows you to create Vaults, and if you want to move data into or out of Glacier, you have to use the Glacier web service API, or one of the AWS SDKs. There are three different retrieval options: Expedited, Standard, and Bulk. And data is encrypted by default using the AES-256 encryption algorithm. Access control can be governed through IAM Policies, Vault Access policies, and Vault Lock policies. And there is a flat pricing structure for data stored in Glacier regardless of the amount of storage used. However, there are still request, data transfer, and additional costs relating to the amount of data retrievals made. And Glacier is designed to archive data for extended periods of time in Cold Storage, for a very small cost.
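Because data moves in and out of Glacier through the API or an SDK rather than the console, a short sketch of what that can look like with the AWS SDK for Python may help. It assumes a Glacier client created with boto3 (for example `boto3.client("glacier")`) and an existing vault; the helper names `archive_file` and `request_retrieval` and the vault name are hypothetical, and only a small single-request upload is shown.

```python
from typing import Any, BinaryIO

# The three retrieval options named in the lecture.
RETRIEVAL_TIERS = ("Expedited", "Standard", "Bulk")


def archive_file(glacier: Any, vault_name: str, file_obj: BinaryIO, description: str) -> str:
    """Upload one archive to a vault and return the archive ID Glacier assigns."""
    response = glacier.upload_archive(
        vaultName=vault_name,
        archiveDescription=description,
        body=file_obj,
    )
    return response["archiveId"]


def request_retrieval(glacier: Any, vault_name: str, archive_id: str, tier: str = "Standard") -> str:
    """Start an asynchronous retrieval job and return its job ID."""
    if tier not in RETRIEVAL_TIERS:
        raise ValueError(f"tier must be one of {RETRIEVAL_TIERS}")
    response = glacier.initiate_job(
        vaultName=vault_name,
        jobParameters={
            "Type": "archive-retrieval",
            "ArchiveId": archive_id,
            "Tier": tier,
        },
    )
    return response["jobId"]
```

A caller would pass in a real client and an open file, for example `archive_file(boto3.client("glacier"), "my-example-vault", open("backup.zip", "rb"), "monthly backup")`. The retrieval is a job rather than an immediate download, which reflects the point that Glacier does not provide instant access.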
I then looked at block storage that offers persistent storage. This is in the form of Elastic Block Store, or EBS, volumes. EBS also provides block storage to your EC2 instances, but unlike instance store volumes, EBS offers persistent and durable data storage. EBS volumes can be attached and detached from your instances, and they're primarily used for data that is rapidly changing. A single EBS volume can only ever be attached to a single EC2 instance. However, multiple EBS volumes can be attached to a single instance. EBS snapshots offer an incremental point in time backup of the entire volume and are then stored on Amazon S3. It's also possible to create a new volume from an existing snapshot. All writes are replicated multiple times within a single Availability Zone, and EBS volumes are only available in a single Availability Zone. There are four types of EBS volumes available, two of which are SSD backed, and two of which are HDD backed. The cost depends on the volume type. You are charged for the storage provisioned to you per month and billed on a per second basis. An EBS snapshot stored in S3 will also incur S3 storage costs. EBS encrypts data both at rest and in transit, if required, and encrypted volumes will also produce encrypted snapshots.
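The idea of an incremental snapshot can be shown with a small toy model. This is only an illustration of the concept, not how EBS is implemented: a volume is modelled as a dictionary of numbered blocks, and the function names are invented for this sketch.

```python
from typing import Dict, List

Volume = Dict[int, bytes]  # block number -> block contents


def take_incremental_snapshot(volume: Volume, previous_state: Volume) -> Volume:
    """Store only the blocks that differ from the previously captured state."""
    return {
        block: data
        for block, data in volume.items()
        if previous_state.get(block) != data
    }


def restore_volume(snapshots: List[Volume]) -> Volume:
    """Rebuild a full volume by replaying snapshots from oldest to newest."""
    restored: Volume = {}
    for snapshot in snapshots:
        restored.update(snapshot)
    return restored


if __name__ == "__main__":
    volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}
    first = take_incremental_snapshot(volume, {})  # first snapshot holds every block
    volume[1] = b"data-v2"                         # one block changes
    second = take_incremental_snapshot(volume, restore_volume([first]))
    print(len(first), len(second))
    print(restore_volume([first, second]) == volume)
```

In this model the first snapshot contains every block, while the second contains only the one block that changed, yet replaying both still gives back the whole volume. That is the sense in which a snapshot is incremental but still represents the entire volume at a point in time.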
Next up was the Elastic File System, EFS. In this lecture, we learned that EFS provides a file level storage service which is a fully managed, highly available and durable service that allows you to create shared file systems. It is highly scalable, capable of meeting demands by thousands of EC2 instances concurrently, and it has a seemingly limitless capacity, similar to S3. There is no need to provision a set size of data storage like you need to with EBS, and this makes it an ideal storage option for applications that scale across multiple instances, allowing for parallel access to data. EFS is a regional service, and it's been designed to maintain a high level of throughput and very low latency access. Mount targets allow you to connect to the EFS file system from your EC2 instances, using a configured mount target IP address. But this is only compatible with NFS version four and version four point one. EFS does not currently support the Windows operating system. You must ensure that your Linux EC2 instance has the NFS client installed for the mounting process, and NFS client version four point one is recommended for this procedure. EFS can be configured to run in two different performance modes: General Purpose, which is the default, and Max I/O. Encryption at rest can be enabled through the use of KMS, but encryption in transit is not currently supported by the service. The File Sync feature can be used to migrate data to EFS via an agent, and pricing is charged per gigabyte-month, and there is no charge for data transfer or requests.
Following EFS, I introduced Amazon CloudFront. In this lecture, I explained that Amazon CloudFront is a content delivery network service, which provides a means of distributing the source data of your web traffic closer to the end user requesting the content, via AWS Edge locations as cached data. As a result, it doesn't provide durability of your data. AWS Edge locations are sites deployed in highly populated areas across the globe to cache data and reduce latency for end user access. CloudFront uses distributions to control which source data it needs to distribute and to where. And there are two delivery methods that exist to distribute data: Web Distribution or RTMP Distribution. A CloudFront distribution requires an origin containing your source data, such as S3. And data can be distributed using the following edge location options: US, Canada, and Europe; US, Canada, Europe, and Asia; or All Edge locations. CloudFront can interact with the Web Application Firewall Service for additional security and web application protection. Additional encryption security can also be added by specifying an SSL certificate that must be used within the distribution. And pricing is primarily based on data transfer costs and HTTP requests.
Next up was the first lecture covering how we can migrate data into and out of AWS storage services. And here I looked at the AWS Storage Gateway. Storage Gateway allows you to provide a gateway between your own data center storage systems, such as your SAN, NAS, or DAS, and Amazon S3 or Glacier on AWS. The Storage Gateway is a software appliance, downloaded as a VM, and installed within your own data center. The Storage Gateway offers File, Volume, and Tape Gateway configurations. So File Gateways allow you to securely store your files as objects within S3, and you can then mount or map drives to an S3 bucket, as if the share was held locally on your own corporate network. A local on-premises cache is also provisioned for accessing your most recently accessed files. Volume Gateways are configured as either a Stored Volume Gateway or a Cached Volume Gateway. Stored Volume Gateways are used as a way to back up your local storage volumes to Amazon S3, while ensuring your entire data library also remains locally on premises for very low latency data access. And they're also presented as iSCSI volumes. With Cached Volume Gateways, the primary data storage is actually Amazon S3 rather than your own local storage solution. And Cached Volume Gateways utilize your local data storage as a buffer and a cache for recently accessed data. And these are also presented as iSCSI volumes. Lastly, Virtual Tape Libraries allow you to back up data to S3 from your own corporate data center and leverage Amazon Glacier for data archiving. The Virtual Tape Library is essentially a cloud-based tape backup solution, and the pricing for this service is based upon storage usage, requests, and data transfer.
Our final lecture looking at storage was based on AWS Snowball. This lecture explained the following points. The service is used to securely transfer large amounts of data in and out of AWS using a physical appliance, known as a Snowball. The Snowball appliance comes as either a 50 terabyte or 80 terabyte storage device, and is fully dust, water, and tamper resistant. It's been designed to allow for high speed data transfers. By default, all data transferred to the Snowball appliance is automatically encrypted. It also features end-to-end tracking using an E Ink shipping label. AWS Snowball is HIPAA compliant, allowing you to transfer protected health information. And it's the responsibility of AWS to ensure the data held on the Snowball appliance is deleted and removed when finished with. Snowball appliances can be aggregated together, and as a general rule, if your data retrieval will take longer than a week using your existing connection method, then you should consider using AWS Snowball. Pricing is based on normal Amazon S3 data charges, plus additional costs for the data transfer job and shipping.
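The one-week rule of thumb can be turned into a quick back-of-the-envelope calculation. The sketch below is only an estimate under stated assumptions: decimal units (1 terabyte = 10^12 bytes), a steady link speed, and a default of 80% usable bandwidth, which is an illustrative figure rather than anything AWS specifies. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def transfer_days(data_terabytes: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Estimate how many days a network transfer would take.

    Assumes decimal units and that only `utilization` of the link is usable.
    """
    total_bits = data_terabytes * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * utilization
    return total_bits / effective_bits_per_second / SECONDS_PER_DAY


def should_consider_snowball(
    data_terabytes: float,
    link_mbps: float,
    threshold_days: float = 7.0,
    utilization: float = 0.8,
) -> bool:
    """Apply the rule of thumb: longer than a week over the wire suggests Snowball."""
    return transfer_days(data_terabytes, link_mbps, utilization) > threshold_days


if __name__ == "__main__":
    print(round(transfer_days(50, 100), 1))
    print(should_consider_snowball(50, 100))
    print(should_consider_snowball(1, 1000))
```

With these assumptions, 50 terabytes over a 100 Mbps connection would take close to two months, so Snowball is worth considering, whereas 1 terabyte over a 1 Gbps connection finishes in a few hours and is better sent over the network.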
That now brings me to the end of this lecture and to the end of this course. You should now have a greater understanding of the range of storage services offered by AWS, the differences between them, and when to use them depending on your use case. If you have any feedback on this course, positive or negative, please do contact us at support@cloudacademy.com. Your feedback is greatly appreciated. Thank you for your time, and good luck with your continued learning of cloud computing. Thank you.
About the Author
Head of Content
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession" as everything AWS starts with the customer. Passions around work are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start ups.