Getting the tools ready
Data management automation
Data management is a key part of the infrastructure of most organizations, especially those dealing with large data stores. For example, imagine a team involved in scientific data analysis: they probably require a system to store the raw data, another to analyze chunks of data quickly and cost-efficiently, and long-term archival storage to keep both the raw data and the results of their computation. In cases like that, it's important to deploy an automated system that can move data efficiently and back it up automatically.
In this course, experienced system administrator and cloud expert David Clinton explains how to implement such a data management and backup system using EBS, S3, and Glacier, taking advantage of the S3 Lifecycle feature and of Data Pipeline to automate data transfers among the various pieces of the infrastructure. This system can be enabled easily and cheaply, as shown in the last lecture of the course.
Who should take this course
This is a beginner-to-intermediate course, so some basic knowledge of AWS is expected. A basic knowledge of programming is also needed to follow along with the Glacier lecture. Even so, complete newcomers to these topics should be able to grasp at least the key concepts.
If you want to learn more about the AWS solutions discussed in this course, you might want to check our other AWS courses. Also, if you want to test your knowledge of the basic topics covered in this course, we strongly suggest taking our AWS questions. You will learn more about every service mentioned in this course.
If you have thoughts or suggestions for this course, please contact Cloud Academy at email@example.com.
Hi and welcome to cloudacademy.com's video series on data management. In this series, we will introduce you to three unique tools that Amazon offers for handling your data, specifically EBS, S3 and Glacier.
Now, of course, your EC2 instance has its own space, and you can store and access data using that space as much as you like.
But it may not be the most cost-effective or efficient way to manage data in every scenario. Let's imagine that we've got large volumes of scientific data to analyze. You may want to introduce your raw data to your system using EBS. That is Amazon's Elastic Block Store service, which is accessible through the EC2 item on the dashboard. This storage is cheaper and more flexible than your EC2 instance's own storage. In fact, you can think of it as a USB device because it has many of the same qualities and features that you might find on a USB device attached to your computer.
After you process the data, however, you may like to move it and store it, at least temporarily, on S3. S3 stands for Amazon's Simple Storage Service. It's not quite as easy to access the storage from your instance, and it's a little slower. However, that's offset by the fact that it can be associated with multiple instances. In fact, the data can be accessed from just about any web-connected device.
However, once the data's been on your system for a while, you may want to make room and save money by shifting that data to Glacier. Glacier is a data storage service that is extraordinarily cheap. But, of course, the trade-off is very high latency. It can take hours to retrieve your data.
Nonetheless, for the right use scenario, Glacier could well be the service you want. We'll look at implementing and managing all three of these services in the videos.
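As a rough illustration of the last step described above (shifting older S3 data to Glacier), here is a minimal Python sketch that builds an S3 Lifecycle rule. It is not taken from the course itself. The function name, the `results/` prefix, the 30-day threshold, and the bucket name in the comments are all hypothetical, and the dictionary is assumed to follow the shape that boto3's `put_bucket_lifecycle_configuration` call accepts.

```python
import json


def glacier_transition_rule(prefix: str, days: int) -> dict:
    """Build a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class once they are `days` days old.

    Hypothetical helper for illustration; it only builds a dictionary
    and does not contact AWS.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                # Rule ID derived from the prefix, e.g. "archive-results-after-30d"
                "ID": f"archive-{prefix.strip('/') or 'all'}-after-{days}d",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


if __name__ == "__main__":
    config = glacier_transition_rule("results/", 30)
    print(json.dumps(config, indent=2))
    # Applying the rule would look roughly like this (assumes boto3 is
    # installed, credentials are configured, and the bucket name is made up):
    #   import boto3
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="example-analysis-bucket",
    #       LifecycleConfiguration=config,
    #   )
```

Keeping the rule as plain data makes it easy to inspect or adjust the threshold before anything is sent to AWS.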
About the Author
David taught high school for twenty years, worked as a Linux system administrator for five years, and has been writing since he could hold a crayon between his fingers. His childhood bedroom wall has since been repainted.
Having worked directly with all kinds of technology, David derives great pleasure from completing projects that draw on as many tools from his toolkit as possible.
Besides being a Linux system administrator with a strong focus on virtualization and security tools, David writes technical documentation and user guides, and creates technology training videos.
His favorite technology tool is the one that should be just about ready for release tomorrow. Or Thursday.