Searching in Files and Using Pipes

Overview
Difficulty
Intermediate
Duration
2h 10m
Students
13
Description

As the title suggests, this course looks at intermediate-level skills for those who already know a bit about Linux but want to enhance that knowledge. In this course, we build upon some of the topics covered in our Linux Fundamentals course, including files and shell scripting, but also introduce new concepts such as wildcards, job control, switching users, and installing software. 

This course is part of the Linux Administration Bootcamp learning path, designed to get you up and running with Linux.

Learning Objectives

  • Learn what wildcards are and how and when to use them
  • Understand input, output, and redirection
  • Work with files and shell scripting
  • Implement processes and job control, and switch between users
  • Install software using RPM- and Debian-based systems

Intended Audience

  • Anyone with basic knowledge of Linux who wants to learn more
  • Professionals who want to learn more about Linux to enhance their career prospects

Prerequisites

This is an intermediate-level course so some knowledge of Linux is expected. If you're just starting out, then try our Linux Fundamentals course first.

Transcript

In this lesson, we'll cover how to search the contents of files, as well as what pipes are and how to use them. To look for text within a file, use the grep command. The typical format of the command is grep, space, a search pattern, space, the file that you're searching for the pattern in. Here are some options that you can use with the grep command. -i performs a case-insensitive search. -c displays the number of matching lines that it finds in a file, -n precedes each line of output with its line number, and -v does an invert match, which returns the lines that do not match the search pattern.
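Here's a minimal sketch of those options in action. The sample file and its contents are made up for illustration and are not the file used in the lesson.

```shell
# Hypothetical sample file; names and contents are invented for this sketch.
sample=$(mktemp)
printf '%s\n' 'user: jason' 'User: Bob' 'site: example' > "$sample"

grep user "$sample"      # case-sensitive: only the line with lowercase "user"
grep -i user "$sample"   # ignore case: also matches "User"
grep -c user "$sample"   # print the number of matching lines instead of the lines
grep -n user "$sample"   # prefix each matching line with its line number
grep -v user "$sample"   # invert: lines that do NOT contain "user"
```

Note that -c counts matching lines, so a line that contains the pattern twice is still counted once.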

Let's look at the contents of this file. Let's search in this file with the grep command. You can see that it returns every line that matches the pattern. So, o is contained in Facebook and o is also contained in Jason, so grep returns those two lines. If we want to do an inverse match, we'll use -v to invert the match, and that displays all the lines that do not contain the letter o. You can see that grep is matching case. To ignore case, use -i. So now it will return the line with lowercase user.

You can also combine options. We'll use -c for count together with -i to ignore case. And grep says there's only one line that matches user. And using -n, we'll display which line number the match occurs on.

There are some clues as to what a file might contain. For instance, some files have an extension. If a file ends in .txt, chances are it's a text file. If a file has executable permissions set, it might be an executable program.

An easy way to determine the type of a file is to run the file command against that file. If it's a text file, it will say that it's text in the output of the file command. A binary file is a file that is in a machine-readable format and not a human-readable format. If you run grep against a binary file, it will simply report whether or not the pattern was found in the file, but it will not display the matching text.

To look at the textual data within a binary file, use the strings command.

The vertical bar is called a pipe. You can chain two commands together with a pipe. The pipe takes the standard output from the preceding command and passes it as the standard input to the following command. If the first command displays error messages, those will not be passed to the second command by default.
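As a rough sketch of which stream travels through a pipe: the helper function emit below is hypothetical and exists only to write one line to standard output and one to standard error. The 2>&1 redirection is the usual shell syntax for merging standard error into standard output before the pipe.

```shell
# Hypothetical helper: writes one line to each output stream.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

# Only standard output enters the pipe, so grep counts one line.
# The standard error line bypasses the pipe and goes straight to the terminal.
emit | grep -c "to"

# With 2>&1, standard error is merged into standard output first,
# so grep now sees and counts both lines.
emit 2>&1 | grep -c "to"
```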

Error messages are displayed on standard error. If you want standard error to be passed in as input to the command following the pipe, you can redirect standard error to standard output. How to do that was covered in the IO redirection lesson.

Here's a common pattern that you'll see often in Linux. There are several commands that allow you to specify a file on the command line that is to be used for input. With grep, for example, you can run grep followed by the pattern, then followed by the file to use as input. If you don't supply a file name, then the command will use standard input. grep pattern file is the same as cat file | grep pattern. That is, cat displays the contents of the file on standard output. The pipe takes the standard output from the cat command and passes it in as the standard input to the grep command. Since no file name was specified for grep to act on, it takes standard input as its input.
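To make that equivalence concrete, here's a small sketch. The file contents and the variable names are assumptions made up for the example.

```shell
# Hypothetical input file for the comparison.
notes=$(mktemp)
printf '%s\n' 'alpha pattern' 'beta' 'gamma pattern' > "$notes"

# Form 1: grep reads the file named on the command line.
direct=$(grep pattern "$notes")

# Form 2: no file name given, so grep reads standard input from the pipe.
piped=$(cat "$notes" | grep pattern)

# Both forms select the same lines.
[ "$direct" = "$piped" ] && echo "same output"
```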

The cut command allows you to select portions of a file, or to cut out pieces of a file. Like grep, if the file is omitted, cut will use standard input. The -d option for cut allows you to specify a field delimiter. -f tells cut which field to display.

Let's look for the string John in this file. We can see that grep tells us two pieces of information: that it does find that string in the file, and that the file is a binary file. The file command reports that it's an audio file and doesn't say anything about text. So it's not a text file.

You can see that there are a lot of strings that are just not human readable. To display just the strings that are human readable, we'll use the strings command. And then we can run our search on the output of strings for the giant steps file. And now we can see the actual data that the grep command initially found. The output of the strings command is sent to the input of the grep command because we're using a pipe to chain those two together.
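A rough sketch of that sequence follows. It assumes the strings command (typically part of binutils) is installed, and it builds a tiny stand-in binary file with printf rather than using the audio file from the lesson.

```shell
# Hypothetical stand-in for a binary file: readable text mixed with
# NUL and other non-printable bytes.
bin=$(mktemp)
printf 'John Coltrane\0\001\002Giant Steps\0' > "$bin"

# grep finds the pattern, but because the file is binary it only
# reports that there is a match rather than printing the text.
grep John "$bin"

# strings extracts the printable runs; grep then searches that text.
strings "$bin" | grep John
```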

Pipes aren't limited to two commands. You can chain as many commands together as you want. So let's take the output that is formed by those two commands and then pipe it into, let's say, head -1 to display the first line of output from grep. I'm just going to hit the Up Arrow key to repeat that, and now let's use the cut command. Let's use a space as the delimiter and print the second field. And that will just give us Coltrane.
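Here's a sketch of a four-stage pipeline along those lines. The printf input is made-up data standing in for the output of strings, and head -n 1 is used as the portable spelling of head -1.

```shell
# Hypothetical input lines in place of the strings output from the demo.
first_surname=$(printf '%s\n' 'John Coltrane' 'John Doe' 'Miles Davis' \
  | grep John \
  | head -n 1 \
  | cut -d ' ' -f 2)   # space as delimiter, keep the second field

echo "$first_surname"   # prints: Coltrane
```

Each stage only sees the output of the stage before it, so the order matters: grep narrows the lines, head keeps the first one, and cut trims it to a single field.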

Let's say we want to find all users named Bob in the /etc/passwd file. The /etc/passwd file contains a list of all the accounts on the Linux system. We want to print each account name and their real name, but no other data. We want to print them in alphabetical order by account name. And we want to display them in a tabular format.

Let's look for Bob in /etc/passwd. What we want to do is print the account name and real name. However, there's other information in that file, so we can use the cut command to get the information we want. It looks like the fields are separated by a colon, so we can use -d with the colon as a field delimiter. The first field and the fifth field contain the information that we want, so we'll use -f 1,5.

We also want to sort them in alphabetical order, so the sort command can take care of that for us. And we want to get rid of the colon character, so we can use a command called tr that translates characters. We'll translate the colon character into a space character. And we want to display this information in a table format, so we can use the column command with -t for table. And that gets us the result we desire.
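The whole pipeline might look like the sketch below. To keep it repeatable, it runs against a small made-up file in /etc/passwd format instead of the real /etc/passwd, and the account names are hypothetical. It also assumes the column command is installed.

```shell
# Hypothetical sample in /etc/passwd format (seven colon-separated fields).
passwd_sample=$(mktemp)
cat > "$passwd_sample" <<'EOF'
bsmith:x:1002:1002:Bob Smith:/home/bsmith:/bin/bash
jason:x:1000:1000:Jason Cannon:/home/jason:/bin/bash
bjones:x:1001:1001:Bob Jones:/home/bjones:/bin/bash
EOF

# Find Bob, keep the account name and real name, sort, swap ":" for " ".
rows=$(grep Bob "$passwd_sample" | cut -d: -f1,5 | sort | tr ':' ' ')

# Align the result into columns.
printf '%s\n' "$rows" | column -t
```

Against a real system, the first stage would simply be grep Bob /etc/passwd.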

Another common use of pipes is to control how output is displayed on your screen. If a command produces a significant amount of output, it can scroll off your screen before you have a chance to view it. To control the output, use a pager utility such as more or less. You've already used these commands on files directly, but keep in mind that they can take piped input as well.

Let's look at /etc/passwd. The contents of that file scroll past our screen, so we can use the less command to page through that file, and q to quit. So cat /etc/passwd | less is really the same as less /etc/passwd, and that's how we've been using it up to this point. However, if you do something like this, let's look for bin in /etc/passwd, the output scrolls past our screen, so we can pipe that to less. This is a situation where you couldn't use less directly on a file. However, using the standard output of a previous command as the standard input for the less command is helpful.

In this lesson, we learned how to search the contents of files using the grep command, how to determine a file's type using the file command, how to cut out pieces of a file using the cut command, how to translate characters with the tr command, and how to format output into columns with the column command. We also talked about the commands more and less, which are pagers. The important thing to note here is that there are many small commands that each do one thing well, and that you can chain them together using pipes to do something very powerful and useful.

About the Author
Jason Cannon
Founder, Linux Training Academy
Students
210
Courses
40
Learning Paths
4

Jason is the founder of the Linux Training Academy as well as the author of "Linux for Beginners" and "Command Line Kung Fu." He has over 20 years of professional Linux experience, having worked for industry leaders such as Hewlett-Packard, Xerox, UPS, FireEye, and Amazon.com. Nothing gives him more satisfaction than knowing he has helped thousands of IT professionals level up their careers through his many books and courses.
