Linux Shell Scripting
In this course, we'll cover a range of topics designed to help you enhance your Linux scripts. We'll start off by looking at case statements, which are used to make a decision based on the value of a given variable. We'll then cover functions before moving on to how to process command-line options using the shell built-in getopts.
In the second part of the course, we'll look at managing users including how to disable, delete, and archive users on a Linux system. We'll then do a walkthrough exercise showing you how to delete a user, which you can follow along with.
This course is part of the Linux Shell Scripting learning path. To follow along with this course, you can download all the necessary resources here.
- Learn about case statements and functions to make your scripts more efficient
- Process command line options using getopts
- Manage users in Linux
- Anyone who wants to learn Linux shell scripting
- Linux system administrators, developers, or programmers
To get the most out of this course, you should have a basic understanding of the Linux command line.
In this lesson, you'll learn how to use functions in your scripts. First off, a function is simply a group of commands that you call using a single name in your script. You can think of functions as being little scripts inside of your main script. You can call or execute a function just like you would any other script or command on a Linux system. There are a couple of reasons why you would want to use functions. The first reason is that you don't want to have duplicate code in multiple places in your script. Using functions allows your code to remain DRY. DRY is a computer programming and application development principle that stands for Don't Repeat Yourself. Functions allow you to write a block of code once and use it many times. Instead of repeating several lines of code each time you need to perform a particular task, simply call the function that contains that code. This helps reduce the length of your scripts and also gives you a single place to change, test, troubleshoot, and document a given task. All of this makes your scripts easier to maintain. By the way, there is a counter principle called WET, which stands for Write Everything Twice, or We Enjoy Typing, or Waste Everyone's Time. If you catch yourself typing out the same bit of code, or you find yourself copying and pasting existing code in your script, it's time to DRY up that WET code and use a function. The second reason to use functions is to break up large tasks into a series of smaller tasks. This also makes your code easier to maintain. You can zoom into a function that only does one small thing and make your changes to that small thing there in that one place. For example, if you are using a case statement to make a decision, then instead of writing a whole long block of code inside the case statement, you could call a function that actually does the real work. All right, enough philosophizing. Let's get to work and start creating some functions.
So first off here, as always, I have my terminal open. I'm going to change into our class folder, shellclass. We're still working on the localusers Vagrant project, and now I'm going to bring up this VM with vagrant up. One way to create a function is to use the function keyword. Here you start with the word function, give your function a name, then surround a series of commands with curly braces. The other way you can create a function is to give your function a name, follow it with a set of parentheses, then surround a series of commands with curly braces. Now let's jump into a script and try it out. Let's create a function called log that will end up logging messages to the system log and optionally displaying those messages to the user executing the script. We'll iterate over this bit of code and keep growing it as we go, but we'll start out here. We'll create a function by providing the name, then following that name with parentheses, and then we're gonna create a block of code surrounded by curly braces here. And we'll just do this for now: echo 'You called the log function!' And end it with a closing curly brace. Now to call that, we'll just use the word log, and that will execute it just like any other command on a Linux system. Now remember that scripts get read and executed from the top down. That means you have to define a function before you can use it. That's why we define the log function at the top of our script here. When a function is defined, it is read into memory and it's then available for use. The code in the function is just remembered initially, and it's only executed when it gets called. Notice that when we called the function, we didn't use the parentheses. You may have seen this type of syntax and style in other programming languages, where you actually use the parentheses in the call, but that doesn't work in shell scripts. Simply place the name of a function on a line and it will execute that function. So let's try it out. And sure enough, it works.
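Here is a minimal sketch of the two definition styles described above. The name log_keyword is a hypothetical name, used only so both styles can sit side by side in one script. In the lesson, the same log function is simply rewritten from one style to the other.

```bash
#!/bin/bash

# Style 1: the function name followed by parentheses.
log() {
  echo 'You called the log function!'
}

# Style 2: the function keyword followed by the name.
# log_keyword is a hypothetical name used only for this comparison.
function log_keyword {
  echo 'You called the log_keyword function!'
}

# Functions are called by name, without parentheses,
# and only after they have been defined.
log
log_keyword
```

Both calls behave like any other command, so which style you use is mostly a matter of preference.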
We get the output that says you called the log function. Okay, that's not very interesting yet, but we'll get there soon enough. So let's get back into our script here. I wanna show you the other way you can define a function. How you do that is to use the keyword function, follow that with your function name, and then you don't need the parentheses here. We'll save our changes and test this as well. As you can see, it also works. So either method works. I'm gonna go back and use this particular style here. Now let's make our log function display to the screen whatever is passed to it. So I'll type out the code here and then we'll talk through it in just a second. So what we're gonna do here is this. I've introduced a new shell built-in called local. What it does is make a variable local in scope to the function. That means the value of that variable is only accessible inside that function. So in this example, if you tried to display the contents of the message variable before or after the function, it would be blank because it is unset. It only exists inside the function. By the way, the local command can only be used inside a function. That's the only place it makes sense. And the reason why is that until now we've been using global variables. That means that the variable and its contents, or its value, can be accessed anywhere in the script, including in any function. Now, we haven't been using functions until now, so it really didn't matter until this point. By the way, if you define a global variable in a function by omitting the local keyword in front of it, that global variable is not available outside that function until the function is called and executed. It's not a best practice to do that anyway, so it shouldn't really be an issue. However, when you are using functions, just be aware of the scope of each variable and know that it can be a place that deserves some attention when troubleshooting your scripts.
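The scoping rules just described can be seen in a short sketch. The variable name MESSAGE is an assumption about how the message variable is spelled on screen, and COUNTER is a hypothetical global added only to show what happens when the local keyword is omitted.

```bash
#!/bin/bash

log() {
  # MESSAGE is local, so it exists only while log is running.
  local MESSAGE="${@}"

  # COUNTER (hypothetical) has no local keyword, so it becomes a
  # global variable, but only once log has actually been called.
  COUNTER='1'

  echo "${MESSAGE}"
}

# Before the call, neither variable is set.
echo "Before the call: MESSAGE='${MESSAGE}' COUNTER='${COUNTER}'"

log 'Hello!'

# After the call, MESSAGE is still unset out here, but COUNTER is visible.
echo "After the call: MESSAGE='${MESSAGE}' COUNTER='${COUNTER}'"
```

This is one reason to prefer local inside functions: a name like COUNTER leaking into the rest of the script is exactly the kind of thing that is hard to spot later.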
It's a best practice to use local variables inside your functions. That way you won't accidentally reuse the same variable name in another function and end up with a tricky bug. However, you'll find plenty of shell scripts that do not adhere to this recommendation. Most of the time, it will not cause a problem. Just be sure to use unique variable names throughout your script and you'll be fine. Now let's talk about the value that is assigned to the local variable named message. It's our old friend dollar sign at sign ($@) from our positional parameters lesson. As you recall, dollar sign at sign expands to all the positional parameters, starting from one. Functions act like a script within your script, and they can accept arguments, which can then be accessed as positional parameters inside the function. For example, the first argument passed to a function is dollar sign one ($1), the second is dollar sign two ($2), and so on. The only difference between a function and a script here is that dollar sign zero ($0) is still the name of the shell script itself and not the name of the function, even if you're using dollar sign zero in the function. And that's really okay, because you'd never really want to use that functionality anyway. Okay. Let's try this bit of code. When we execute our script, the function is called the first time with hello, and that gets displayed on the screen. The second time, the function is called with this is fun, and that also gets displayed to the screen. Now let's update our function to only display the message that was passed to it if a global variable named VERBOSE is set to true. So we'll check to see if VERBOSE is true, and if it is, then we'll echo that message to the screen, and the fi there closes out our if statement. So let's do this: let's call our function log with hello, and then let's go ahead and set VERBOSE equal to true before the second call. So the first time we call our function, VERBOSE is not set. The second time we call it, it is set to true.
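A sketch of the log function at this stage might look like the following. The exact message strings and the spelling MESSAGE are assumptions based on what is said aloud in the lesson.

```bash
#!/bin/bash

log() {
  # Collect every argument passed to the function into one message.
  local MESSAGE="${@}"

  # Only display the message when the global VERBOSE is set to true.
  if [[ "${VERBOSE}" = 'true' ]]
  then
    echo "${MESSAGE}"
  fi
}

# VERBOSE is not set yet, so nothing is displayed here.
log 'Hello!'

# From this point on, messages are displayed.
VERBOSE='true'
log 'This is fun!'
```

The test of VERBOSE now lives in one place instead of being repeated before every echo in the script.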
Now this function is actually starting to become useful. If we didn't have this function, then we would have to test the value of the VERBOSE variable every single time we needed to decide whether to display something to the screen. This means we would end up with a lot of duplicate code throughout our script. In any case, what should happen is that the text hello will not be displayed to the screen, because VERBOSE has not been set to true at that point. And the text this is fun will be displayed because, as I just explained, VERBOSE is set to true before log is called with that particular text passed into it. So let's try that out. Okay. Sure enough, hello was not displayed to the screen, but this is fun was, just as we expected. If you've ever had any formal training in computer programming, you've probably heard that global variables are evil and you should avoid them at all costs. Really, what you should try to avoid at all costs are the problems that can arise from using global variables, such as different functions changing the value of a global variable and so on. Now, I'm a pretty pragmatic person, so I like to do what is best for a given situation. In that vein, let me give you an example of why avoiding global variables can be less than ideal, especially with our little use case here. We could change this function such that it doesn't rely on the VERBOSE global variable and instead make it an argument that has to be passed into the function. So what we can do is this. We can add local VERBOSE and set that to the first value that is passed into the function, and then we'll use shift here. At that point, everything else stays the same. So what this does is assign the first value passed into the function to the local variable named VERBOSE.
Then the shift command shifts all the positional parameters down by one, which removes the value stored in dollar sign one, makes dollar sign two dollar sign one, makes dollar sign three dollar sign two, and so on. So more or less we chop off the value we assigned to VERBOSE, and what we're left with is the message that was passed into the function. Now we have to update how we call this function. So here we're going to do log true to tell it that we're going to be verbose, and we can get rid of this piece of code here. And we're also gonna do this: we'll say true here as well. Now, every time we use the log function, we need to tell it whether we want VERBOSE to be true or false. Of course, we could set a variable and use that, like this. We could say VERBOSITY is true and then use it in our calls. Okay? In practice, the verbosity is going to remain the same for our entire script. Because of that, I could argue that passing it into the function every single time is causing us to repeat ourselves without any real practical gain. In this example, I think using a global variable is fine. However, if you disagree with me, that's also fine. In my opinion, both methods are correct so long as they get the job done successfully. Now let's just show that this little piece of code works. So we'll save our changes and execute our script. Sure enough, it executes the function and displays what we passed to it on the screen. If we were doing something like differentiating informational messages from warning messages from critical messages, then passing in that log level would make sense, because it can change throughout our script. So again, use what works and what makes sense to you. Just make sure that it actually works. Test your scripts. Now let's revert our code and handle the main concern with using global variables within functions, which is that a function might actually change the value of that global variable. So we'll do this.
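For reference, the argument-passing variant discussed above could be sketched roughly like this. The message strings are assumptions. VERBOSITY is the variable name mentioned in the lesson.

```bash
#!/bin/bash

log() {
  # The first argument says whether to display the message.
  local VERBOSE="${1}"

  # Drop the first argument so that only the message remains in $@.
  shift

  local MESSAGE="${@}"
  if [[ "${VERBOSE}" = 'true' ]]
  then
    echo "${MESSAGE}"
  fi
}

# Every call now has to say whether it should be verbose.
log 'true' 'Hello!'

# Alternatively, keep the setting in a variable and pass that in.
VERBOSITY='true'
log "${VERBOSITY}" 'This is fun!'
```

Passing the setting on every call is the repetition the lesson argues against for a value that never changes during the script. The same pattern fits better for something like a log level that does vary from call to call.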
The readonly shell built-in makes a variable unchangeable, or unmodifiable. This is the shell's version of a constant, which is a term you might be familiar with if you've programmed in other languages. There are two ways to use the readonly shell built-in. One is just like we have it here on the screen, which is to use the word readonly followed by the variable name and include the variable assignment right after it. The other way is to perform the variable assignment first and then use readonly followed by just the variable name. There's no need to include the assignment because it's already been performed at that point. This way you can separate setting the variable from marking it as read-only. Now that the VERBOSE variable is read-only, it cannot be changed for the duration of the script. That means it cannot be changed inside or outside of a function, which eliminates the concern about a function changing the value of a global variable. Okay, we've taken care of the first bit of our function, which controls whether or not we display a message to standard output. Now let's write the last little piece, which will control sending messages to the system log. To do that, we'll use a command-line utility named logger. Let's actually take a look at that on the command line. logger is an executable, so we can run man against it to get some information about that particular command. Mainly, you just supply logger a message and that message is recorded in the system log. By default, on a CentOS or RHEL system, those messages go to the /var/log/messages file. You can read all the details here if you'd like, but I'm just gonna point out the one bit of information we're going to use most immediately, and that is the -t option, which allows you to tag your syslog message. So I'll move down here to -t. This tag is typically set to the name of your program, or the program that is writing to the system log file. So let's try that out.
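The two readonly forms described above can be sketched as follows. LOG_TAG is a hypothetical variable used only to show the second form. The lesson itself only marks VERBOSE as read-only.

```bash
#!/bin/bash

# Form 1: assign the value and mark the variable read-only in one step.
readonly VERBOSE='true'

# Form 2: assign first, then mark the existing variable read-only.
# LOG_TAG is a hypothetical name used only for this illustration.
LOG_TAG='my-script'
readonly LOG_TAG

# Any later attempt to change a read-only variable fails. The attempt
# runs in a subshell here so the error does not affect this script.
( VERBOSE='false' ) 2> /dev/null || echo 'VERBOSE cannot be changed.'

echo "VERBOSE is still ${VERBOSE}."
```

Because the assignment fails with a non-zero status, a function that tries to overwrite VERBOSE can no longer silently change the behavior of the rest of the script.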
Now we'll hit q to get out of this help, and we'll do this: logger 'Hello from the command line!' All right. Now we want to look at the /var/log/messages file. That file happens to have permissions such that we need root to read it. And as you can see there, the last line of that log file is what we sent to it with the logger command. The format of these syslog messages is a timestamp, followed by the computer name, followed by the tag, which is again most often the name of the program writing to syslog (in this case it's our username, since we didn't specify a tag), and finally the message itself. So this time let's add a tag. We'll do logger -t my-script 'Tagging on.' This time, you can see that the tag reads my-script, and then we see the message that we passed in. By the way, syslog can be configured to send messages off the server to a centralized syslog location, for example. I cover the details of exactly how to do this in my Linux in the Real World course. Anyway, that is a good security measure, because if someone were to have root access to one system, any logs that they generated are not only stored locally, where they could potentially change or delete them, but they're also stored on a remote system, where hopefully that attacker hasn't broken in as well. It's just something to keep in mind. It would be a reason to use the standard syslog system built into Linux instead of, for example, creating your own logging system from scratch. Okay, let's add this logger functionality to our script. So logger -t, give it the name of our script, and then we'll send the message to the log file. Now, regardless of whether the message is displayed on standard out, it will get recorded in the system log. So let's try it out here. Okay, we execute our script and our messages are displayed to the screen. Now let's see if they made it to the log file. And sure enough, they have.
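With logger added, the log function might look roughly like this. The tag my-script is borrowed from the command-line example above and stands in for the actual script name, which the transcript does not spell out.

```bash
#!/bin/bash

readonly VERBOSE='true'

log() {
  # This function sends a message to syslog and to standard output
  # if VERBOSE is true.
  local MESSAGE="${@}"
  if [[ "${VERBOSE}" = 'true' ]]
  then
    echo "${MESSAGE}"
  fi

  # Always record the message in the system log, tagged with the
  # program name. The tag my-script is a stand-in for the script name.
  logger -t my-script "${MESSAGE}"
}

log 'Hello!'
log 'This is fun!'
```

On a CentOS or RHEL system the messages should then show up at the end of /var/log/messages, which typically needs root to read, for example with sudo tail /var/log/messages.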
Now let's turn VERBOSE mode off and see what that's like. Okay, let's change this from true to false and execute our script. So there's no output to standard output. There's nothing on our screen. Let's see if it made its way to the log file, however. Okay. It says hello, and this is fun. So let's do that again here. And yep, it keeps adding to the log file. So that is working. Now let's go ahead and turn VERBOSE mode back on. Before we write our next function, let's add a quick comment to the top of this function explaining what it does. We do this for our scripts, and our functions are like little scripts within a script, so let's do that here too. So I'll go here and say: This function sends a message to syslog and to standard output if VERBOSE is true. Okay? If you wanted to add more information, you can do that. Some people like to include things like any global variables that are used by the function, any arguments that the function accepts, and anything that the function might return. Now, we haven't covered returns yet, but we'll get to that soon enough. Moving on to our next function. As a reminder, functions have to be defined before they are used, so a good practice is to put all your functions at the top of your script. The only things I would consider putting before the functions are constant global variables and sourcing script libraries. Again, do not call a function before it's defined. Let's say you have a script that is going to update several files. You've been alive long enough to know that sometimes things can go wrong. So you want to make a backup of every single file before you modify it, or more accurately, before your script modifies it. That way, if you have an error in your script or something unexpected happens, you have a good copy of each file to go back to. So let's call this particular function backup_file. We'll leave a comment here. All right, we'll get to this return statement here in just a second.
We're going to use a variable in our function, so we'll make that local. We'll do a quick check to see if the file that got passed in exists. Again, we're setting a variable, so we're going to use the local keyword here. Looks like I missed my closing parenthesis here. Let me fix that. You're probably already familiar with the /tmp directory. However, here I'm using /var/tmp because the files in /var/tmp will survive a reboot, while those in /tmp are not guaranteed to survive a reboot. Typically the files in /tmp are cleared on boot and are also cleared more often on a running system. I think the default for CentOS 7 is that files in /tmp are deleted every 10 days while the files in /var/tmp are deleted every 30 days. That said, I wanna make sure that if I break a file and then do something like reboot the system, I don't lose that backup, hence the reason for me choosing the location of /var/tmp. All right, continuing with our variable here. As you know, the basename command removes the path to the file and just leaves us with the file name. We don't care about its original path because we're just gonna put a copy of it in /var/tmp anyway. Also here, we're using command substitution with the date command. You've used the date command with a couple of different formats before, but now I'm introducing the capital F format. You can think of capital F as standing for full date, and it displays the year, month, and day using numbers. I love using this format because it makes sorting by date very easy. Once we call this function a couple of times, you'll see exactly what I mean. Also, I attempted to account for multiple backups of the same file on the same day by appending the nanoseconds at the moment the date command was called. That's represented by the percent sign followed by N. We could have used a shell built-in variable here instead, something like $RANDOM, $BASHPID, or even $$, which expands to the PID, or process ID, of the currently running script.
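The backup file name described above can be tried on its own, outside of any function. This is a small sketch under assumptions: the variable names FILE and BACKUP_FILE are not confirmed by the transcript, and the %N format relies on GNU date, which is what you would normally have on CentOS or RHEL.

```bash
#!/bin/bash

# The file we pretend to back up. Nothing is copied in this sketch.
FILE='/etc/passwd'

# basename strips the directory part, leaving just "passwd".
# date +%F gives the full date as YYYY-MM-DD, and %N appends nanoseconds
# so that two backups made on the same day get different names.
BACKUP_FILE="/var/tmp/$(basename "${FILE}").$(date +%F-%N)"

echo "${BACKUP_FILE}"
```

Because the date part is written year first, an ordinary ls of /var/tmp lists the backups of a given file in chronological order.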
So now in this function, let's call our other function, log. After we use the log function to record what we're going to do, or communicate our intentions, we're going to use the cp command to actually perform the copy. Now, the -p option to the cp command is short for preserve, and it preserves the file's mode, ownership, and timestamps. If you don't use the -p option, then the copy of the original file will have the current timestamp. I personally like to preserve the original timestamp on the copy, just in case I need to put that copy back in place. Because it's a backup, the contents of that file haven't changed, so in my opinion it makes sense to keep its timestamps the way they were instead of updating the timestamp to today. In the grand scheme of things, this is really minor, but it's just something to think about. Each function, like each command, returns an exit status. By default, the function returns the exit status of the last command it executes. In the case where the file that was passed to the function actually exists, we'll just let cp return whatever exit status is appropriate and use that as the exit status for our function. If for some reason the cp command fails, it will return a non-zero exit status, and thus our backup_file function will return that same non-zero exit status. Here, if the file doesn't exist, that means we obviously can't make a backup copy of it. So our function more or less fails, and that's why we're going to return a non-zero exit status. Now let's go ahead and call this function in our code here: backup_file. Let's back up the /etc/passwd file and see it in action. Okay, we have the output displayed on our screen because we have verbose mode on. And then we also see the output that was generated inside the backup_file function, which in turn called the log function to tell us about the backup that it made. So let's see if these messages made it to the log file. And sure enough, they have.
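Putting the pieces from this part of the lesson together, the script might look roughly like the sketch below. The variable names, the wording of the log messages, and the my-script tag are assumptions, since the transcript describes the code without spelling it out.

```bash
#!/bin/bash

readonly VERBOSE='true'

log() {
  # This function sends a message to syslog and to standard output
  # if VERBOSE is true.
  local MESSAGE="${@}"
  if [[ "${VERBOSE}" = 'true' ]]
  then
    echo "${MESSAGE}"
  fi
  logger -t my-script "${MESSAGE}"
}

backup_file() {
  # This function creates a backup of a file in /var/tmp.
  # It returns a non-zero status on error.
  local FILE="${1}"

  # Make sure the file exists.
  if [[ -f "${FILE}" ]]
  then
    local BACKUP_FILE="/var/tmp/$(basename "${FILE}").$(date +%F-%N)"
    log "Backing up ${FILE} to ${BACKUP_FILE}."

    # cp is the last command here, so its exit status becomes the
    # exit status of the function. -p preserves mode, ownership,
    # and timestamps.
    cp -p "${FILE}" "${BACKUP_FILE}"
  else
    # The file does not exist, so return a non-zero exit status.
    return 1
  fi
}

backup_file '/etc/passwd'
```

Running ls -l /var/tmp afterwards should show the new copy carrying the original file's timestamp, thanks to the -p option.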
Now we have a record of exactly what happened during that run of the script, exactly when it happened, and all that good information there. Now let's look at the file we copied into /var/tmp. Okay, you can see that the file that was in the syslog message is this file here. Obviously I was playing around with this script yesterday, doing some tests, and that's why we have these other passwd files there. Now let's look at /etc/passwd. Now let's see what happens when we execute the cp command without the -p option. So we'll do cp /etc/passwd and we'll copy that into /var/tmp. And we'll look at the contents of /var/tmp. Now, right now, the date is January 12th, and the date of that file is also January 12th. But if you look at /etc/passwd, it's really January 7th. That's the original timestamp on that file. So again, that is the difference the -p option makes. Okay, we're getting really close to the end. Just one more little block of code to go. Let's jump back into our script. As we discussed earlier, a function has a return code, or exit status, just like any other command on a Linux system. Now let's check that status. Let's do it in the positive. If the return code is zero, that means it's a success. What we can do then is say, hey, the file backup succeeded. And if it was not zero, then we'll say file backup failed, and we'll exit the entire script there. So again, if the backup function succeeds, we're gonna get that log message that says file backup succeeded. If it fails, we'll log file backup failed and exit the entire script itself with a non-zero exit status. Now, by the way, inside a function it's super important to use return and not exit. If the exit command gets executed, it exits the entire script, no matter whether that exit command is inside a function or not. So remember, exit is for the entire script and return is just for the function. So now let's try this little extra code out.
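The exit status check described above could be sketched like this. To keep the example short and self-contained, backup_file is a trimmed-down stand-in and the messages are printed with echo rather than the lesson's log function. Both simplifications are assumptions made for illustration.

```bash
#!/bin/bash

backup_file() {
  # Trimmed-down stand-in for the lesson's backup_file function.
  local FILE="${1}"
  if [[ -f "${FILE}" ]]
  then
    cp -p "${FILE}" "/var/tmp/$(basename "${FILE}").$(date +%F-%N)"
  else
    # return leaves only the function. Using exit here would end
    # the entire script instead.
    return 1
  fi
}

backup_file '/etc/passwd'

# ${?} holds the exit status of the function that just ran.
if [[ "${?}" -eq '0' ]]
then
  echo 'File backup succeeded!'
else
  echo 'File backup failed!'
  # At the top level of the script, exit is the right tool.
  exit 1
fi
```

The split is the point of the example: the function reports failure with return, and the main body of the script decides whether that failure is serious enough to exit.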
Right, it looks like our backup succeeded. Let's look at the messages file. Okay, our file backup succeeded message ended up in the messages file too. If we were to pass the backup_file function a file that didn't exist, we would trigger an error. Or if the disk was full, or if the file system was in a read-only state when the backup was attempted, it would also trigger an error state and exit the script. By the way, this bit of code was just to show you that you can check the exit status of a function just like you can any other command. However, in this particular case, you would want to check whether the backup succeeded every single time you performed a backup. That means in practice you would most likely add this particular check inside the backup_file function itself. If you wanted to over-engineer this thing, you could totally create three functions to do the same bit of work. One function would perform the backup. One would be used to check if the backup actually succeeded. And a third function would be the one that you would actually call in your script, which would in turn call the other two functions. It would be like a wrapper around those functions, and it would look something like this. So we'll just declare a function here, backup_file. And let's say we'll call our other function that actually does the backup, and we'll call that perform_backup. And we'll just pass that right along here. Then we'd want it to check the backup. This is actually going to fail when I call it, because those other functions don't exist, right? We just kind of made this up on the fly. So I'll do backup_file /etc/passwd, for example. And sure enough, when we call it, it executes perform_backup, but that command isn't found, 'cause again, it doesn't exist. And then it calls check_backup, and again, it doesn't exist. So anyway, as you might have noticed, you can define a function at the command line like I did right here.
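In the lesson, perform_backup and check_backup are never actually written, so the wrapper fails with command not found. Purely as a hypothetical illustration of how the three-function split could be filled in, here is one possible sketch. The bodies of perform_backup and check_backup, and the choice to pass the exit status as an argument, are assumptions.

```bash
#!/bin/bash

perform_backup() {
  # Hypothetical helper: copies the file into /var/tmp.
  # The exit status of cp becomes the exit status of the function.
  cp -p "${1}" "/var/tmp/$(basename "${1}").$(date +%F-%N)"
}

check_backup() {
  # Hypothetical helper: reports on the exit status it is given.
  if [[ "${1}" -eq 0 ]]
  then
    echo 'File backup succeeded!'
  else
    echo 'File backup failed!' >&2
    return 1
  fi
}

backup_file() {
  # Wrapper: the only function the rest of the script needs to call.
  perform_backup "${1}"
  check_backup "${?}"
}

backup_file '/etc/passwd'
```

Whether this much structure is worth it for a simple copy is debatable, which is the lesson's point about over-engineering.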
And you really already know this by now, right? A shell script is just automating what you can do at the command line anyway. So let's look at this: type -a backup_file. It tells us it is a function, and it displays the code that is contained in that function definition. This way you can write functions and have them right inside your current running shell. I feel like we've covered a lot about functions. Don't you? Before we wrap this up, let's do a little refresher of the main points about working with functions. In this lesson, you learned about the concept of DRY, which stands for Don't Repeat Yourself. If you find yourself writing the same bit of code in multiple places in your script, it's a good sign that you should use a function. You also learned that by default all variables are global in scope, and you learned how to use the local keyword to define local variables that are only available within the function they are defined in. You also learned how to use positional parameters within your functions. And finally, we covered how to use exit statuses with your functions by using the return command.
Jason is the founder of the Linux Training Academy as well as the author of "Linux for Beginners" and "Command Line Kung Fu." He has over 20 years of professional Linux experience, having worked for industry leaders such as Hewlett-Packard, Xerox, UPS, FireEye, and Amazon.com. Nothing gives him more satisfaction than knowing he has helped thousands of IT professionals level up their careers through his many books and courses.