Getting the tools ready
Data management automation
Data management is a key part of the infrastructure of most organizations, especially those dealing with large data stores. For example, imagine a team involved in scientific data analysis: they probably need one system to store the raw data, another to analyze chunks of data quickly and cost-efficiently, and long-term archival storage to keep both the raw data and the results of their computations. In cases like that, it's important to deploy an automated system that can move data efficiently and has integrated automatic backups.
In this course, experienced System Administrator and Cloud Expert David Clinton will talk about implementing such a data management and backup system using EBS, S3, and Glacier, taking advantage of the S3 Lifecycle feature and of Data Pipeline to automate data transfers among the various pieces of the infrastructure. This system can be enabled easily and cheaply, as is shown in the last lecture of the course.
Who should take this course
This is a beginner-to-intermediate course, so some basic knowledge of AWS is expected. A basic knowledge of programming is also needed to follow along with the Glacier lecture. In any case, even complete newcomers to these topics should be able to grasp at least the key concepts.
If you want to learn more about the AWS solutions discussed in this course, you might want to check our other AWS courses. Also, if you want to test your knowledge of the basic topics covered in this course, we strongly suggest taking our AWS questions. You will learn more about every service mentioned in this course.
If you have thoughts or suggestions for this course, please contact Cloud Academy at email@example.com.
Hi, and welcome to Cloudacademy.com's video series on data management in the cloud and, specifically, on AWS, Amazon Web Services, instances. In this video, we're gonna discuss EBS devices, Elastic Block Store devices. These are, as we mentioned in the introduction, devices that work pretty much the same way a USB drive might: you can plug one into a particular computer and access its data, save data right to the device, copy from the device, and have the data available even when the computer happens to be off or unavailable.
Let's create an EBS. We'll click on EC2. Before we create an EBS volume, let's take a quick look at our existing instance, just to make a note of the instance ID. We can either see it here or down below. Get the whole ID. We'll need that ID a little later, when we come to associate our device with a particular instance.
Now, let's move to the Elastic Block Store and click on "Volumes." There's an existing volume here already. However, you don't want to play with this. This is the hard drive, effectively, of your running EC2 instance. Let's create a new volume.
You have a choice between General Purpose SSD and Provisioned IOPS SSD. Provisioned IOPS provides a higher guaranteed level of input/output operations per second. For certain purposes, that's necessary. We'll just stick with General Purpose for now.
You can set the size, in gigabytes, of your device. We'll leave it at 100 for now. Or why not? Let's make it a little smaller. We don't need anything big.
We'll drop it down to 10 gigabytes. We're within the range of 30 to 3000 IOPS. We can't control exactly how fast things are gonna operate at a given time, but that's part of the fun of General Purpose SSD.
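The "30 to 3000" range shown for a 10-gigabyte volume is consistent with a baseline of 3 IOPS per gigabyte and a burst figure of 3000. The following is a minimal sketch of that arithmetic, assuming that rule; the function name and its parameters are made up for illustration, and AWS has since adjusted the limits for this volume type, so treat the numbers as a reading of what the console shows in the video rather than current pricing facts.

```python
def gp2_iops_range(size_gib, per_gib=3, burst=3000):
    """Return (baseline, peak) IOPS for a General Purpose SSD volume.

    Assumes the 3-IOPS-per-GiB baseline implied by the console in the video.
    """
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    baseline = per_gib * size_gib
    # Large volumes have a baseline above the burst figure,
    # so the peak is whichever of the two is higher.
    return baseline, max(baseline, burst)


print(gp2_iops_range(10))  # prints (30, 3000)
```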
We are in the availability zone ap-northeast-1a. We could change that if we had to. If we wanted this volume to adopt the format and contents of another volume elsewhere, we could search for a snapshot ID. If we'd like to encrypt the data on this volume, we could do that also.
Meantime, we'll leave it the way it is and create it. It's now available. Let's take away the highlight of the existing hard drive and just focus on this new volume we've created. Let's click on "Actions" and "Attach Volume." We're going to attach the volume to our instance. Let's give this device a name. Let's say /dev/sdf.
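The same create-and-attach steps can also be done from the AWS command line interface instead of the console. This sketch only builds and prints the two commands; it does not run them. The volume and instance IDs are placeholders, not values from the video, and the helper names are invented for this example.

```python
import shlex


def create_volume_cmd(size_gib, zone, volume_type="gp2"):
    """Build the AWS CLI call that creates an EBS volume."""
    return ["aws", "ec2", "create-volume",
            "--size", str(size_gib),
            "--volume-type", volume_type,
            "--availability-zone", zone]


def attach_volume_cmd(volume_id, instance_id, device="/dev/sdf"):
    """Build the AWS CLI call that attaches a volume to an instance."""
    return ["aws", "ec2", "attach-volume",
            "--volume-id", volume_id,
            "--instance-id", instance_id,
            "--device", device]


# Placeholder IDs: substitute the real volume ID and the instance ID noted earlier.
print(shlex.join(create_volume_cmd(10, "ap-northeast-1a")))
print(shlex.join(attach_volume_cmd("vol-EXAMPLE", "i-EXAMPLE")))
```

Printing the commands first makes it easy to check the zone and device name before anything is created.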
Now, why do I choose /dev/sdf? /dev is a directory on my instance, or on any Linux-based system. It's a virtual directory containing references to all the devices that are associated with the system. sd is the prefix Linux uses for SCSI and SATA disk devices, and the letter that follows identifies the individual device.
f is the lowest letter you can give to a device in the Amazon EBS system. Let's see if that works.
However, when we actually come to the command line interface in our instance, we might see that this device is not recognized as /dev/sdf.
It might have another designation, and we will deal with that when we get to it. So now that we are in our EC2 instance, let's see if the device we've just created, this EBS volume we've created, is actually recognized and associated with the system from the inside. We're gonna type lsblk, which stands for list block devices.
We see xvda, which is, effectively, the hard drive of our system. The third entry is xvdf. That's how Ubuntu describes what we called /dev/sdf: it calls it /dev/xvdf. So in fact, the volume now exists and is associated with our EC2 instance.
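The renaming from the sd form to the xvd form can be expressed as a small helper. This is only a sketch of the pattern seen in the video; the function name is invented, and other instance types may expose attached volumes under different names, so it is worth confirming with lsblk rather than relying on the mapping.

```python
import re


def xen_device_name(requested):
    """Map a name such as /dev/sdf to the /dev/xvdf form seen inside the instance."""
    match = re.fullmatch(r"/dev/sd([a-z]+\d*)", requested)
    if match is None:
        # Not an sd-style name, so leave it unchanged.
        return requested
    return "/dev/xvd" + match.group(1)


print(xen_device_name("/dev/sdf"))  # prints /dev/xvdf
```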
Now let's see what kind of a volume it is. Is it formatted? Does it have a file system associated with it? We'll use the file command, which displays details about a file or a file system. We will point it to this device, the xvdf device in the /dev directory.
We see it's just data. It's raw, unformatted, and still needs a fair amount of work. Let's now format this device: sudo mkfs, which means, "Let's make a file system," then -t, which means a type, then ext4.
We're going to create this file system as ext4, which is a very secure, very robust file system that journals. That is, some logging details will have been written to the file system itself, which makes recovery in the event of a crash a lot easier.
We will apply this format to the device, /dev/xvdf. I will note, before we pull the trigger on this, that if there was any data on this volume until now, it will cease to exist once we format it. So be very careful before you use the mkfs command.
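Because formatting destroys existing data, one cautious approach is to look at what the file command reports before building the mkfs command. The sketch below assumes the output was captured with something like sudo file -s /dev/xvdf (the -s flag tells file to read block devices); the two sample strings are illustrative, not captured from the video, and the helper names are made up.

```python
def looks_unformatted(file_output):
    """True when `file` describes the device as plain "data"."""
    _, _, description = file_output.partition(":")
    return description.strip() == "data"


def mkfs_cmd(device, fs_type="ext4"):
    """Build the format command described in the transcript."""
    return ["sudo", "mkfs", "-t", fs_type, device]


# Illustrative outputs: a raw volume and one that already holds ext4.
samples = [
    "/dev/xvdf: data",
    "/dev/xvdf: Linux rev 1.0 ext4 filesystem data",
]
for sample in samples:
    if looks_unformatted(sample):
        print("safe to format:", " ".join(mkfs_cmd("/dev/xvdf")))
    else:
        print("refusing to format, found:", sample)
```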
Looks like it's done. We now have a 10-gigabyte volume, which effectively is no different than a USB device plugged into our computer, but we still can't really access it. We can't write to it or copy from it.
So we have to mount it. Mounting sounds a lot more complicated than it actually is, if you haven't yet done it yourself. Mounting requires that we tell the computer to associate this device with a particular location on the computer. So let's create a directory, using mkdir, in the /media directory. You don't have to use this directory, but it's where I usually place my USB devices.
I will create a new directory called drive in the /media directory. It already exists. It exists because I created it earlier, when I was playing around to make sure that everything worked. So we don't have to create the directory drive in /media.
Now that we know, with complete clarity, that the /media/drive directory exists, let's mount our device, using sudo and the command "mount," which is effectively saying to the computer, "Associate this device with that location": the device at /dev/xvdf with the directory /media/drive.
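The mount step can be sketched as two commands plus a check of the kernel's mount table. Using mkdir -p avoids the "already exists" complaint seen above. The sample mount table text is illustrative; on a real instance it would be read from /proc/mounts. The helper names are invented for this example, and nothing here is executed against the system.

```python
def mount_cmds(device, mount_point):
    """Build the commands that create the mount point and mount the device."""
    return [["sudo", "mkdir", "-p", mount_point],
            ["sudo", "mount", device, mount_point]]


def mount_point_of(device, mounts_text):
    """Return where the device is mounted, or None, given /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device:
            return fields[1]
    return None


# Illustrative mount table, in the format used by /proc/mounts.
sample = ("/dev/xvda1 / ext4 rw,relatime 0 0\n"
          "/dev/xvdf /media/drive ext4 rw,relatime 0 0\n")

for cmd in mount_cmds("/dev/xvdf", "/media/drive"):
    print(" ".join(cmd))
print(mount_point_of("/dev/xvdf", sample))  # prints /media/drive
```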
Now we have a file system saved to this device, and the device is mounted at /media/drive. Let's go there.
Typing it correctly, of course, let's see what's there. Right now, there's a standard lost+found directory, which is where an ext-formatted device keeps file fragments recovered by a file system check, but it's got no files of our own.
Let's create a file, using the command touch, which will just create a file and not actually enter it or edit it.
Let's type ls to list the contents of this directory again. We see now, there's a file called stuff. So the device exists. It's been formatted and mounted, and we're able to write to it and copy from it.
The only thing we might still want to do is to ensure that this device is mounted each time we restart the instance or, in computer terms, each time we restart the computer. That we do by editing the fstab file. That we do with sudo nano. Nano is the text editor which I prefer to use.
We want to edit the file in the /etc directory called fstab. There's one entry there already. That's our hard drive, or what passes for a hard drive in this virtual world. We want to add this device, /dev/xvdf, followed by a space or tab, and mount it to the directory /media/drive, again followed by a space or tab. We want to mount that as ext4, and we might also like to add defaults,nofail 0 2, which is a series of options that one could choose for mounting a device on boot.
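Put together, the fields dictated above form a single line of /etc/fstab. This sketch assembles that line so the field order is easy to see: device, mount point, file system type, mount options, dump flag, and file system check order. The function name is invented for this example.

```python
def fstab_entry(device, mount_point, fs_type,
                options=("defaults", "nofail"), dump=0, fsck_pass=2):
    """Assemble one /etc/fstab line from its six fields."""
    fields = [device, mount_point, fs_type,
              ",".join(options), str(dump), str(fsck_pass)]
    return " ".join(fields)


print(fstab_entry("/dev/xvdf", "/media/drive", "ext4"))
# prints /dev/xvdf /media/drive ext4 defaults,nofail 0 2
```

The nofail option lets the instance finish booting even if the volume is not attached.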
We'll exit and save, typing Y to save fstab over the original copy. It's not a bad idea, just to make sure that it works before you reboot, to run sudo mount -a, which will mount all the devices mentioned in fstab, just to make sure that everything is running properly.
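Before running sudo mount -a, a rough sanity check of the edited file can catch an entry with missing fields. The sketch below is only a field-count check, not a full validator; it assumes each entry has between four and six whitespace-separated fields, and the sample text, including the root entry, is illustrative.

```python
def fstab_problems(text):
    """Return the line numbers of entries that do not have 4 to 6 fields."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if not 4 <= len(stripped.split()) <= 6:
            problems.append(number)
    return problems


# Illustrative file contents: line 3 is missing its type and options.
sample = ("# /etc/fstab\n"
          "LABEL=rootfs / ext4 defaults 0 0\n"
          "/dev/xvdf /media/drive\n")
print(fstab_problems(sample))  # prints [3]
```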
David taught high school for twenty years, worked as a Linux system administrator for five years, and has been writing since he could hold a crayon between his fingers. His childhood bedroom wall has since been repainted.
Having worked directly with all kinds of technology, David derives great pleasure from completing projects that draw on as many tools from his toolkit as possible.
Besides being a Linux system administrator with a strong focus on virtualization and security tools, David writes technical documentation and user guides, and creates technology training videos.
His favorite technology tool is the one that should be just about ready for release tomorrow. Or Thursday.