Jan 31, 2009

BEST BASH tips and tricks, Tutorials...

The bash shell is just amazing. There are so many tasks that can be simplified using its handy features. This tutorial covers some of those features, explains exactly what they do and teaches you how to use them.

Difficulty: Basic - Medium

Running a command from your history

Sometimes you know that you ran a command a while ago and you want to run it again. You know a bit of the command, but you don't remember all the options, or when you executed it. Of course, you could just keep pressing the Up Arrow until you encounter the command again, but there is a better way. You can search the bash history interactively by pressing Ctrl + r (reverse search). This puts bash in history search mode, allowing you to type a part of the command you're looking for. While you type, it shows the most recent command that contains the string you've typed so far. If the command it shows is too recent, you can go further back in history by pressing Ctrl + r again and again. Once you've found the command you were looking for, press Enter to run it. If you can't find what you're looking for and you want to try again, or if you want to get out of history search mode for another reason, just press Ctrl + c. By the way, Ctrl + c can be used in many other cases to cancel the current operation and/or start with a fresh new line.

Repeating an argument

You can repeat the last argument of the previous command in multiple ways. Have a look at this example:

[Narayan@localhost ~]$ mkdir /path/to/exampledir
[Narayan@localhost ~]$ cd !$

The second command might look a little strange, but it will just cd to /path/to/exampledir. The "!$" syntax repeats the last argument of the previous command. You can also insert the last argument of the previous command on the fly, which enables you to edit it before executing the command. The keyboard shortcut for this functionality is Esc + . (a period). You can also repeatedly press these keys to get the last argument of commands before the previous one.
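Inside a script, history expansion such as "!$" is normally switched off, but bash's special parameter "$_" holds the last argument of the previous command and gives a similar effect. Here is a small sketch of that idea. The scratch directory made with mktemp and the variable names are only there for illustration.

```shell
# Create a throwaway parent directory (illustrative only).
workdir=$(mktemp -d)

# Same idea as "mkdir ...; cd !$", but usable in a script:
mkdir -p "$workdir/path/to/exampledir"
cd "$_"            # $_ expands to the last argument of the previous command

landed=$PWD        # remember where we ended up
echo "Now in: $landed"
```

Note that "$_" changes after every command, so use it right away, just like "!$".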

Some keyboard shortcuts for editing

There are some pretty useful keyboard shortcuts for editing in bash. They might appear familiar to Emacs users:

Ctrl + a => Return to the start of the command you're typing
Ctrl + e => Go to the end of the command you're typing
Ctrl + u => Cut everything before the cursor to a special clipboard
Ctrl + k => Cut everything after the cursor to a special clipboard
Ctrl + y => Paste from the special clipboard that Ctrl + u and Ctrl + k save their data to
Ctrl + t => Swap the character before the cursor with the one under it; at the end of the line, it swaps the last two characters (you can actually use this to transport a character from the left to the right, try it!)
Ctrl + w => Cut the word / argument left of the cursor to the special clipboard
Ctrl + l => Clear the screen

Dealing with jobs

If you've just started a huge process (like backing up a lot of files) using an ssh terminal and you suddenly remember that you need to do something else on the same server, you might want to move the huge process to the background. You can do this by pressing Ctrl + z, which will suspend the process, and then executing the bg command:

[Narayan@localhost ~]$ bg
[1]+ hugeprocess &

This will make the huge process continue happily in the background, allowing you to do what you need to do. If you want to background another process with the huge one still running, just use the same steps. And if you want to get a process back to the foreground again, execute fg:

[Narayan@localhost ~]$ fg
hugeprocess

But what if you want to foreground an older process that's still running? In a case like that, use the jobs command to see which processes bash is managing:

[Narayan@localhost ~]$ jobs
[1]- Running hugeprocess &
[2]+ Running anotherprocess &

Note: A "+" after the job id means that that job is the 'current job', the one that will be affected if bg or fg is executed without any arguments. A "-" after the job id means that that job is the 'previous job'. You can refer to the previous job with "%-".

Use the job id (the number on the left), preceded by a "%", to specify which process to foreground / background, like this:

[Narayan@localhost ~]$ fg %3

And:
[Narayan@localhost ~]$ bg %7

The above snippets would foreground job [3] and background job [7].
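Job ids can also be used in a script, where Ctrl + z and fg are not available but "&", jobs, kill and wait are. The following is only a sketch under that assumption: the two "sleep 30" commands stand in for real long-running processes, and the variable name job_pids is made up for the example.

```shell
# Start two placeholder processes in the background.
sleep 30 &          # becomes job [1]
sleep 30 &          # becomes job [2]

jobs                # show what bash is managing right now
job_pids=$(jobs -p) # the process ids behind those jobs

# Address each job by its job id, just like "fg %1" or "bg %2".
kill %1
kill %2
wait                # collect both jobs so nothing is left behind
```

The "%1" and "%2" here are the same job ids that the jobs command prints between square brackets.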

Using several ways of substitution

There are multiple ways to embed a command in another one. You could use the following way (which is called command substitution):

[Narayan@localhost ~]$ du -h -a -c $(find . -name '*.conf' 2>&-)

The above command is quite a mouthful of options and syntax, so I'll explain it.

The du command calculates the actual size of files. The -h option makes du print the sizes in human-readable format, the -a tells du to calculate the size of all files, and the -c option tells du to produce a grand total. So, "du -h -a -c" will show the sizes of all files passed to it in a human-readable form and it will produce a grand total.
As you might have guessed, the part starting with "$(find" takes care of giving du some files to calculate the sizes of. This part is wrapped between "$(" and ")" to tell bash that it should run the command and substitute the command's output (in this case as arguments for du). The find command searches for files whose names end in .conf in the current directory and all accessible subdirectories. The "." indicates the current directory, the -name option lets you specify the name of the file to search for, and "*.conf" is a pattern that matches any string ending with the character sequence ".conf". It is safest to put the pattern between quotes, so that bash passes it to find untouched instead of expanding it against the files in the current directory.
The only thing left to explain is the "2>&-". This part of the syntax closes the stderr stream of find, so its error messages (about unreadable directories, for example) don't clutter the terminal. Command substitution only captures stdout, so du would not receive those messages as input either way. There is a huge amount of explanation about this syntax near the end of the tutorial (look for "2>&1" and further).
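To try command substitution without touching real configuration files, something like the following sketch could be used. The scratch directory, the file names and the variable conf_files are invented for the example, and it uses "2>/dev/null" rather than "2>&-" to hide find's errors.

```shell
# Build a small playground with two .conf files and one other file.
demo=$(mktemp -d)
touch "$demo/a.conf" "$demo/b.conf" "$demo/notes.txt"
cd "$demo"

# Capture the output of find in a variable...
conf_files=$(find . -name '*.conf' 2>/dev/null | sort)
echo "$conf_files"

# ...or hand it straight to another command as arguments.
# The substitution is left unquoted on purpose, so that every
# file name becomes a separate argument for du.
du -h -a -c $(find . -name '*.conf' 2>/dev/null)
```

Because bash splits the substituted output on whitespace, this approach breaks on file names that contain spaces.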
And there's another way to substitute, called process substitution:

[Narayan@localhost ~]$ diff <(ps axo comm) <(ssh user@host ps axo comm)

The command in the snippet above will compare the running processes on the local system and a remote system with an ssh server. Let's have a closer look at it:

First of all, diff. The diff command can be used to compare two files. I won't say much about it here, as there is an extensive tutorial about diff and patch on this site.
Next, the "<(" and ")". These strings indicate that bash should substitute the command between them as a process. Bash runs the command and connects its output to a pipe that has a filename (usually an entry in /dev/fd; on systems without /dev/fd, a named pipe), and that filename is, in our case, given to diff as a file to compare.
Now the "ps axo comm". The ps command is used to list processes currently running on the system. The "a" option tells ps to list all processes with a tty, the "x" tells ps to list processes without a tty, too, and "o comm" tells ps to list the commands only ("o" indicates the start of a user-defined output declaration, and "comm" indicates that ps should print the COMMAND column).
The "ssh user@host ps axo comm" will run "ps axo comm" on a remote system with an ssh server. For more detailed information about ssh, see this site's tutorial about ssh and scp (my favorite).
Let's have a look at the whole snippet now:

After interpreting the line, bash will run "ps axo comm" and redirect the output to a named pipe,
then it will execute "ssh user@host ps axo comm" and redirect the output to another named pipe,
and then it will execute diff with the filenames of the named pipes as argument.
The diff command will read the output from the pipes and compare them, and return the differences to the terminal so you can quickly see what differences there are in running processes (if you're familiar with diff's output, that is).
This way, you have done in one line what would normally require at least two: comparing the outputs of two processes.
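If you don't have a remote host at hand, the same mechanism can be tried locally. This is only a sketch: the two printf commands stand in for the two "ps axo comm" calls and the process names are made up. The "|| true" is there because diff exits with a non-zero status when it finds differences.

```shell
# Two made-up process lists take the place of the local and remote ps output.
differences=$(diff <(printf 'bash\ninit\nsshd\n') \
                   <(printf 'bash\ncron\ninit\n') || true)

# Lines starting with "<" only occur in the first list,
# lines starting with ">" only occur in the second one.
echo "$differences"

# To see what diff actually receives as "filenames":
echo <(true) <(true)
```

The last line prints the names bash generated for the two substitutions.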

And there is even another way, using xargs. This command can feed arguments, usually received through a pipe, to a command. See the next chapter for more information about pipes. We'll now focus on xargs itself. Have a look at this example:

[Narayan@localhost ~]$ find . -name '*.conf' -print0 | xargs -0 grep -l -Z mem_limit | xargs -0 -i cp {} {}.bak

Note: the "-l" after grep is an L, not an i.

The command in the snippet above will make a backup of all .conf files in the current directory and accessible subdirectories that contain the string "mem_limit".

The find command is used to find all files in the current directory (the ".") and accessible subdirectories with a filename (the "-name" option) that ends with ".conf" ("*.conf" matches any name ending in ".conf"). It returns a list of them, with null characters as separators ("-print0" tells find to do so).
The output of find is piped (the "|" operator, see the next chapter for more information) to xargs. The "-0" option tells xargs that the names are separated by null characters, and "grep -l -Z mem_limit" is the command that the list of files will be fed to as arguments. The grep command will search the files it gets from xargs for the string "mem_limit", returning a list of files (the -l option tells grep not to return the contents of the files, but just the filenames), again separated by null characters (the "-Z" option causes grep to do this).
The output of grep is also piped, to "xargs -0 -i cp {} {}.bak". We know what xargs does, except for the "-i" option. The "-i" option tells xargs to replace every occurrence of the specified string with the argument it gets through the pipe. If no string is specified (as in our case), xargs will assume that it should replace the string "{}". Newer versions of GNU xargs mark "-i" as deprecated in favor of "-I {}", which does the same thing. Next, the "cp {} {}.bak". The "{}" will be replaced by xargs with the argument, so, if xargs got the file "sample.conf" through the pipe, cp will copy the file "sample.conf" to the file "sample.conf.bak", effectively making a backup of it.
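The whole chain can be tried safely in a scratch directory. The sketch below assumes GNU find, grep and xargs (for "-print0", "-Z" and "-0"), uses "-I {}" instead of "-i", and works on two invented files, one of which has a space in its name to show why the null separators matter.

```shell
# Playground: one file that mentions mem_limit, one that doesn't.
demo=$(mktemp -d)
printf 'mem_limit = 64M\n' > "$demo/with limit.conf"
printf 'timeout = 30\n'    > "$demo/plain.conf"

# find lists the .conf files, grep keeps the ones containing mem_limit,
# and the second xargs copies each of those to <name>.bak.
find "$demo" -name '*.conf' -print0 \
  | xargs -0 grep -l -Z mem_limit \
  | xargs -0 -I {} cp {} {}.bak

ls "$demo"
```

Only "with limit.conf" should get a ".bak" copy, and the space in its name causes no trouble because every step separates names with null characters.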
These substitutions can, once mastered, provide short and quick solutions for complicated problems.

Piping data through commands

One of the most powerful features is the ability to pipe data through commands. You could see this as letting bash take the output of a command, then feed it to another command, take the output of that, feed it to another and so on. This is a simple example of using a pipe:

[Narayan@localhost ~]$ ps aux | grep init

If you don't know the commands yet: "ps aux" lists all processes that are currently running on your system (the "a" means that processes of other users than the current user should also be listed, the "u" selects the user-oriented output format, which adds columns such as the owner of each process and its CPU and memory usage, and the "x" means that background processes (without a tty) should also be listed). The "grep init" searches the output of "ps aux" for the string "init". It does so because bash pipes the output of "ps aux" to "grep init", and bash does that because of the "|" operator.

The "|" operator makes bash redirect all data that the command left of it returns to the stdout (more about that later) to the stdin of the command right of it. There are a lot of commands that support taking data from the stdin, and almost every program supports returning data using the stdout.

The stdin and stdout are part of the standard streams; they were introduced with UNIX and are channels over which data can be transported. There are three standard streams (the third one is stderr, which should be used to report errors). The stdin channel can be used by other programs to feed data to a running process, and the stdout channel can be used by a program to export data. Usually, stdout output (and stderr output, too) goes to the terminal in which the program is running, so you see it on the terminal screen. Now that we pipe stdout to another command, however, we are not shown that data.

Please note that, as in a pipe only the stdout of the command on the left is passed on to the next one, the stderr output will still go to the terminal. I will explain how to alter this further on in this tutorial.

If you want to see the data that's passed on between programs in a pipe, you can insert the "tee" command between them. This program receives data from the stdin and then writes it to a file, while also passing it on again through the stdout. This way, if something is going wrong in a pipe sequence, you can see what data was passing through the pipes. The "tee" command is used like this:

[Narayan@localhost ~]$ ps aux | tee filename | grep init

The "grep" command will still receive the output of "ps aux", as tee just passes the data on, but you will be able to read the output of "ps aux" in the file after the commands have been executed. Note that "tee" tries to replace the file if you specify the command like this. If you don't want "tee" to replace the file but to append the data to it instead, use the -a option, like this:

[Narayan@localhost ~]$ ps aux | tee -a filename | grep init
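To see the effect of tee without depending on what happens to be running on your system, here is a small sketch in which printf plays the role of "ps aux". The three process names and the variable names are made up for the example.

```shell
# A temporary file that tee will write its copy to.
log=$(mktemp)

# printf stands in for "ps aux"; tee records everything, grep filters.
matches=$(printf 'init\nbash\nsshd\n' | tee "$log" | grep init)

echo "grep saw:     $matches"
echo "tee recorded:"
cat "$log"
```

The grep command only lets the matching line through, while the file written by tee contains all three lines.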

As you have been able to see in the above command, you can place a lot of commands with pipes after each other. This is not infinite, though. There is a maximum command-line length, which is usually determined by the kernel. However, this value usually is so big that you are very unlikely to hit the limit. If you do, you can always save the stdout output to a file somewhere in between and then use that file to continue operation. And that introduces the next subject: saving the stdout output to a file.

Saving the stdout output to a file

You can save the stdout output of a command to a file like this:

[Narayan@localhost ~]$ ps aux > filename

The above syntax will make bash write the stdout output of "ps aux" to the file filename. If filename already exists, bash will try to overwrite it. If you don't want bash to do so, but to append the output of "ps aux" to filename, you could do that this way:

[Narayan@localhost ~]$ ps aux >> filename
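The difference between ">" and ">>" is easy to check for yourself. In this sketch, echo takes the place of "ps aux" and the temporary file is created with mktemp; both are just choices for the example.

```shell
out=$(mktemp)

echo "first"  > "$out"    # creates (or overwrites) the file
echo "second" > "$out"    # overwrites again: "first" is gone
echo "third" >> "$out"    # appends: "second" stays

cat "$out"
```

After these three commands the file holds the lines "second" and "third".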

You can use this feature of bash to split a long line of pipes into multiple lines:

[Narayan@localhost ~]$ command1 | command2 | ... | commandN > tempfile1

[Narayan@localhost ~]$ cat tempfile1 | command1 | command2 | ... | commandN > tempfile2

And so on. Note that the above use of cat is, in most cases, a useless one. In many cases, you can let command1 in the second snippet read the file, like this:

[Narayan@localhost ~]$ command1 tempfile1 | command2 | ... | commandN > tempfile2

And in other cases, you can use a redirect to feed a file to command1:

[Narayan@localhost ~]$ command1 < tempfile1 | command2 | ... | commandN > tempfile2

To be honest, I mainly included this to avoid getting the Useless Use of Cat Award =).
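The three ways of getting a file into the first command of a pipe give the same result, as this sketch shows. Here sort plays the role of command1, and the file contents and variable names are invented for the example.

```shell
list=$(mktemp)
printf 'pear\napple\nfig\n' > "$list"

via_cat=$(cat "$list" | sort)    # the "useless" cat
via_arg=$(sort "$list")          # let the command open the file itself
via_redirect=$(sort < "$list")   # let bash connect the file to stdin

echo "$via_redirect"
```

The redirect variant is handy for commands that only read from stdin and don't accept a filename as an argument.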

Anyway, you can also use bash's ability to write streams to file for logging the output of script commands, for example. By the way, did you know that bash can also write the stderr output to a file, or both the stdout and the stderr streams?

Playing with standard streams: redirecting and combining

The bash shell allows us to redirect streams to other streams or to files. This is quite a complicated feature, so I'll try to explain it as clearly as possible. Redirecting a stream is done like this:

[Narayan@localhost ~]$ ps aux 2>&1 | grep init

In the snippet above, "grep init" will not only search the stdout output of "ps aux", but also the stderr output. The stderr and the stdout streams are combined. This is caused by that strange "2>&1" after "ps aux". Let's have a closer look at that.

First, the "2". As said, there are three standard streams (stdin, stdout and stderr). These standard streams also have default numbers:

0: stdin
1: stdout
2: stderr

As you can see, "2" is the stream number of stderr. And ">", we already know that from making bash write to a file. The actual meaning of this symbol is "redirect the stream on the left to the stream on the right". If there is no stream on the left, bash will assume you're trying to redirect stdout. If there's a filename on the right, bash will redirect the stream on the left to that file, so that everything passing through the stream is written to the file.

Note: the ">" symbol is used with and without a space behind it in this tutorial. This is only to keep it clear whether we're redirecting to a file or to a stream: in reality, when dealing with streams, it doesn't matter whether a space is behind it or not. When substituting processes, you shouldn't use any spaces.

Back to our "2>&1". As explained, "2" is the stream number of stderr, ">" redirects the stream somewhere, but what is "&1"? You might have guessed, as the "grep init" command mentioned above searches both the stdout and stderr stream, that "&1" is the stdout stream. The "&" in front of it tells bash that you don't mean a file with filename "1". The streams are sent to the same destination, and to the command receiving them it will seem like they are combined.

If you'd want to write to a file with the name "&1", you'd have to escape the "&", like this:

[Narayan@localhost ~]$ ps aux > \&1

Or you could put "&1" between single quotes, like this:

[Narayan@localhost ~]$ ps aux > '&1'

Wrapping a filename containing problematic characters between single quotes generally is a good way to stop bash from messing with it (unless there are single quotes in the string: a backslash has no special meaning between single quotes, so you'd have to close the quotes, write the single quote as \' and then open the quotes again).
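Both ways of writing to a file that is really called "&1" can be tried in a scratch directory, so no odd files end up in your home directory. This is a sketch with echo in place of "ps aux"; the directory and variable names are made up.

```shell
demo=$(mktemp -d)

(
  cd "$demo" || exit 1
  echo "escaped" > \&1     # backslash: "&1" is a filename, not stream 1
  echo "quoted" >> '&1'    # single quotes do the same job
)

ls "$demo"                 # shows a file named &1
content=$(cat "$demo/&1")
echo "$content"
```

The parentheses run the cd in a subshell, so your own working directory stays the same.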

Back again to the "2>&1". Now that we know what it means, we can also apply it in other ways, like this:

[Narayan@localhost ~]$ ps aux > filename 2>&1

The stdout output of ps aux will be sent to the file filename, and the stderr output, too. Now, this might seem illogical. If bash interprets it from left to right (and it does), you might think that it should be like:

[Narayan@localhost ~]$ ps aux 2>&1 > filename

Well, it shouldn't. If you'd execute the above syntax, the stderr output would just be echoed to the terminal. Why? Because bash does not redirect to a stream, but to the current final destination of the stream. Let me explain it:

First, we're telling bash to run the command "ps" with "aux" as an argument.
Then, we're telling bash to redirect stderr to stdout. At the moment, stdout is still going to the terminal, so the stderr output of "ps aux" is sent to the terminal.
After that, we're telling bash to redirect the stdout output to the file filename. The stdout output of "ps aux" is sent to this file indeed, but the stderr output isn't: it is not affected by stream 1.

If we put the redirections the other way around ("> filename" first), it does work. I'll explain that, too:

First, we're telling bash to run the command "ps" with "aux" as an argument (again).
Then, we're redirecting the stdout to the file filename. This causes the stdout output of "ps aux" to be written to that file.
After that, we're redirecting the stderr stream to the stdout stream. The stdout stream is still pointing to the file filename because of the former statement. Therefore, stderr output is also written to the file.

Get it? The redirects cause a stream to go to the same final destination as the specified one. It does not actually merge the streams, however.
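Since "ps aux" rarely reports errors, the effect of the order is easier to see with a command that writes to both streams. The sketch below uses a made-up function called emit for that. In the second case a command substitution takes the place of the terminal, so we can capture what would otherwise have been printed on the screen.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

both=$(mktemp)
only_out=$(mktemp)

# "> file" first, then "2>&1": both streams end up in the file.
emit > "$both" 2>&1

# "2>&1" first: stderr follows the OLD destination of stdout
# (here the command substitution), and only stdout reaches the file.
leaked=$(emit 2>&1 > "$only_out")

echo "in both:     $(cat "$both" | tr '\n' ' ')"
echo "in only_out: $(cat "$only_out")"
echo "leaked:      $leaked"
```

In the second case the "to stderr" line never reaches the file, which matches the step-by-step explanation above.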

Now that we know how to redirect, we can use it in many ways. For example, we could pipe the stderr output instead of the stdout output:

[Narayan@localhost ~]$ ps aux 2>&1 > /dev/null | grep init

The syntax in this snippet will send the stderr output of "ps aux" to "grep init", while the stdout output is sent to /dev/null and therefore discarded. Note that "grep init" will probably not find anything in this case as "ps aux" is unlikely to report any errors.

When looking more closely at the snippet above, a problem arises. As bash reads the command statements from the left to the right, nothing should go through the pipe, you might say. At the moment that "2>&1" is specified, stdout should still point to the terminal, shouldn't it? Well, here's a thing you should remember: bash reads command statements from the left to the right, but, before that, determines if there are multiple command statements and in which way they are separated. Therefore, bash has already read and applied the "|" pipe symbol and stdout is already pointing to the pipe. Note that this also means that stream redirections must be specified before the pipe operator. If you, for example, would move "2>&1" to the end of the command, after "grep init", it would not affect ps aux anymore.
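To convince yourself that only stderr goes through the pipe, you can use a command that is guaranteed to write to both streams. In this sketch the function emit is made up for that purpose, and tr simply turns whatever comes through the pipe into capitals so it is easy to recognize.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

# stderr is pointed at the pipe first, then stdout is thrown away.
piped=$(emit 2>&1 > /dev/null | tr 'a-z' 'A-Z')

echo "$piped"
```

Only the stderr line comes out in capitals; the stdout line has disappeared into /dev/null.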

We can also swap the stdout and the stderr stream. This lets the stderr stream pass through a pipe while the stdout is printed to the terminal. This will require a 3rd stream. Let's have a look at this example:

[Narayan@localhost ~]$ ps aux 3>&1 1>&2 2>&3 | grep init

That stuff seems to be quite complicated, right? Let's analyze what we're doing here:

"3>&1" => We're redirecting stream 3 to the same final destination as stream 1 (stdout). Stream 3 is a non-standard stream, but it is pretty much always available in bash. This way, we're effectively making a backup of the destination of stdout, which is, in this case, the pipe.
"1>&2" => We're redirecting stream 1 (stdout) to the same final destination as stream 2 (stderr). This destination is the terminal.
"2>&3" => We're redirecting stream 2 (stderr) to the final destination of stream 3. In the first step of these three ones, we set stream 3 to the same final destination as stream 1 (stdout), which was the pipe at that moment, and after that, we redirected stream 1 (stdout) to the final destination of stream 2 at that moment, the terminal. If we wouldn't have made a backup of stream 1's final destination in the beginning, we would not be able to refer to it now.
So, by using a backup stream, we can swap the stdout and stderr stream. This backup stream does not belong to the standard streams, but it is pretty much always available in bash. If you're using it in a script, make sure you aren't breaking an earlier command by playing with the 3rd stream. You can also use stream 4, 5, 6, 7 and so on if you need more streams. The highest stream number usually is 1023 (there are 1024 streams, but the first stream is stream 0, stdin). The actual limit is the one reported by "ulimit -n", so it may be different on other Linux systems. Your mileage may vary. If you try to use a non-existing stream, you will get an error like this:

bash: 1: Bad file descriptor

If you want to close a non-standard stream again once you no longer need it, redirect it to "&-", like this:

[Narayan@localhost ~]$ ps aux 3>&1 1>&2 2>&3 3>&- | grep init

Note that stream redirections only last for the command they are attached to; after that command, the streams are back in their initial state. You'll only need to close or reset a stream manually if you made a redirect using, for example, exec, as redirects made this way will last until changed manually.
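The swap can be checked in the same way as the earlier redirects. This is a sketch with a made-up function emit that writes to both streams. The curly braces with "2> file" around the pipe replace the terminal, so the line that would normally show up on the screen lands in a file we can inspect.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

side_log=$(mktemp)

# Inside the braces, stderr points to side_log instead of the terminal.
# emit swaps its streams, so its stderr goes through the pipe to tr
# and its stdout goes to where stderr used to point.
through_pipe=$( { emit 3>&1 1>&2 2>&3 3>&- | tr 'a-z' 'A-Z'; } 2> "$side_log" )

echo "through the pipe: $through_pipe"
echo "not through pipe: $(cat "$side_log")"
```

The stderr line arrives in capitals because it went through tr, while the stdout line skipped the pipe entirely.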
