As we learned in chapter 6 of TLCL, I/O redirection is one of the most useful and powerful features of the shell. With redirection, our commands can send and receive streams of data to and from files and devices, as well as allow us to connect different programs together into pipelines.
In this adventure, we will look at redirection in a little more depth to see how it works and to discover some additional features and useful redirection techniques.
What's Really Going On
Whenever a new program is run on the system, the kernel creates a table of file descriptors for the program to use. A file descriptor is a small integer that indexes this table, and each entry in the table refers to an open file. By convention, the first 3 entries in the table (descriptors 0, 1, and 2) are used as standard input (stdin), standard output (stdout), and standard error (stderr). Initially, all three descriptors point to the terminal device (which the system treats as a read/write file), so that standard input comes from the keyboard and standard output and standard error go to the terminal display.
When a program is started as a child process of another (for instance, when we run an executable program in the shell), the newly launched program inherits a copy of the parent's file descriptor table. Redirection is the process of manipulating the file descriptors so that input and output can be routed from/to different files.
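To see inheritance at work, here is a minimal sketch. It assumes a scratch directory created with mktemp (the file name out.txt is just an illustration). The redirection is applied to a command group, and the child sh started inside the group never mentions the file, yet its output lands there because it inherited descriptor 1 from its parent.

```shell
# Scratch directory for the demonstration (hypothetical location).
tmpdir=$(mktemp -d)

# Descriptor 1 of the group points at out.txt. The child "sh" inherits
# a copy of that descriptor table, so its echo writes to the file.
{
    sh -c 'echo "written by the child process"'
} > "$tmpdir/out.txt"

inherited=$(cat "$tmpdir/out.txt")
echo "$inherited"

rm -rf "$tmpdir"
```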
The shell hides the presence of file descriptors in common redirections such as:
command > file
where we redirect standard output to a file, but the full syntax of the redirection operator includes an optional file descriptor. We could write the above statement this way and it would have exactly the same effect:
command 1> file
As a convenience, the shell assumes we want to redirect standard output if the file descriptor is omitted. Likewise, the following two statements are equivalent when referring to standard input:
command < file
command 0< file
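We can check these equivalences ourselves. The following is a small sketch, assuming a scratch directory from mktemp and illustrative file names a.txt and b.txt. It writes the same text with and without the explicit descriptor, then reads a file back both ways.

```shell
tmpdir=$(mktemp -d)

echo "hello" > "$tmpdir/a.txt"     # descriptor omitted, stdout (1) assumed
echo "hello" 1> "$tmpdir/b.txt"    # descriptor 1 written explicitly

# cmp exits with status 0 when the two files are identical.
same_output=no
cmp -s "$tmpdir/a.txt" "$tmpdir/b.txt" && same_output=yes

# Likewise for input: descriptor 0 is assumed when omitted.
# tr strips the padding that some versions of wc add.
count_implicit=$(wc -l < "$tmpdir/a.txt" | tr -d ' ')
count_explicit=$(wc -l 0< "$tmpdir/a.txt" | tr -d ' ')

echo "$same_output $count_implicit $count_explicit"

rm -rf "$tmpdir"
```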
Duplicating File Descriptors
It is sometimes desirable to write more than one output stream (for example, standard output and standard error) to the same file. To do this, we would write something like this:
command > file 2>&1
We'll add the assumed file descriptor to the first redirection to make things a little clearer:
command 1> file 2>&1
This is an example of duplication. When we read this statement, we see that file descriptor 1 is changed from pointing to the terminal device to instead pointing to file. This is followed by the second redirection that causes file descriptor 2 to be a duplicate (i.e., it points to the same file) of file descriptor 1. When we look at things this way, it's easy to see why the order of redirections is important. For example, if we reverse the order:
command 2>&1 1> file
file descriptor 2 becomes a duplicate of file descriptor 1 (which points to the terminal) and then file descriptor 1 is set to point to file. When all is said and done, file descriptor 1 points to file while file descriptor 2 still points to the terminal.
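The effect of ordering can be demonstrated with a short sketch. The helper function both and the file names are hypothetical, and a command substitution stands in for the terminal so that we can capture whatever "escapes" the file.

```shell
tmpdir=$(mktemp -d)

# Hypothetical helper that writes one line to each output stream.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# fd 1 -> file first, then fd 2 duplicates fd 1: both lines reach the file.
both > "$tmpdir/first.txt" 2>&1
first_lines=$(wc -l < "$tmpdir/first.txt" | tr -d ' ')

# Reversed order: fd 2 duplicates fd 1 while fd 1 still points at the
# original destination (here, the command substitution), and only then
# is fd 1 pointed at the file. The stderr line escapes the file.
leaked=$(both 2>&1 1> "$tmpdir/second.txt")
second_content=$(cat "$tmpdir/second.txt")

echo "lines in first.txt: $first_lines"
echo "escaped the file:   $leaked"
echo "second.txt holds:   $second_content"

rm -rf "$tmpdir"
```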
Before we go any farther, we need to take a brief detour and talk about a shell builtin that we didn't cover in TLCL. This builtin is named
program is the name of the program that will start and take the place of the shell. redirections are the redirections to be used by the new program.
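As an illustration, and assuming the builtin being described is bash's exec, the following sketch runs it inside subshells so that only the subshell (not our interactive shell) is replaced. The file name exec-out.txt is hypothetical.

```shell
tmpdir=$(mktemp -d)

# The subshell created by $( ) is replaced by the echo program,
# so the second command is never reached.
result=$(
    exec echo "the subshell was replaced by echo"
    echo "this line is never reached"
)

# Redirections given to exec apply to the new program.
( exec echo "sent to a file" > "$tmpdir/exec-out.txt" )
file_content=$(cat "$tmpdir/exec-out.txt")

echo "$result"
echo "$file_content"

rm -rf "$tmpdir"
```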
One feature of
from that point on, every command using standard output would send its data to
and tried to invoke it with redirection:
the attempted redirection would have no effect. The word "Boo" would still be written to the file
Another way we can use
It's easy to open and use file descriptors 3-9 in the shell, and it's even possible to use file descriptors 10 and above, though the
So why would we want to use additional file descriptors? That's a little hard to answer. In most cases we don't need to. We could open several descriptors in a script and use them to redirect output to different files, but it's just as easy to specify (using shell variables, if desired) the names of the files to which we want to redirect since most commands are going to send their data to standard output anyway.
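For completeness, here is what opening several descriptors in a script might look like. This is only a sketch: it assumes bash's exec with redirections and no program name, and the file names report.txt and errors.txt are made up for the example.

```shell
tmpdir=$(mktemp -d)

# Open descriptors 3 and 4 for writing in the current shell.
exec 3> "$tmpdir/report.txt"
exec 4> "$tmpdir/errors.txt"

# Any command can now be pointed at them.
echo "all is well" >&3
echo "something failed" >&4

# Close both descriptors when we are done with them.
exec 3>&- 4>&-

report=$(cat "$tmpdir/report.txt")
errors=$(cat "$tmpdir/errors.txt")
echo "$report / $errors"

rm -rf "$tmpdir"
```

Writing `echo "all is well" > "$report_file"` with a shell variable would do the same job, which is why extra descriptors are rarely needed for this purpose.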
There is one case in which using an additional file descriptor would be helpful. It's the case of a filter program that accepts standard input and sends its filtered data to standard output. Such programs are quite common, for example
This program simply copies standard input to standard output, but it displays a running count of the number of lines that it has copied. If we invoke it this way, we can see it in action:
In this pipeline example, we generate a list of files using
The script works by reading a line from the standard input and writing the
The mysterious part of the script above is the
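A filter of this general kind can be sketched as follows. This is not necessarily the script discussed above. The function name count_lines and its optional argument are hypothetical, and the progress display goes to /dev/tty only by default so that the sketch can also be run without a terminal.

```shell
# count_lines [progress_target]
# Copies stdin to stdout and writes a running line count to a third
# descriptor, keeping the count out of the data stream.
count_lines() {
    local progress_target="${1:-/dev/tty}"
    local line count=0

    exec 3> "$progress_target"          # open descriptor 3
    while IFS= read -r line; do
        printf '%s\n' "$line"           # the filtered data, on stdout
        count=$((count + 1))
        printf '\r%06d' "$count" >&3    # the progress display, on fd 3
    done
    exec 3>&-                           # close descriptor 3
}

# Usage: send the progress display to a file instead of the terminal.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' | count_lines "$tmpdir/progress.txt" > "$tmpdir/copy.txt"

copied=$(cat "$tmpdir/copy.txt")
last_count=$(tr '\r' '\n' < "$tmpdir/progress.txt" | tail -n 1)
echo "last count shown: $last_count"

rm -rf "$tmpdir"
```

Because the count travels on descriptor 3, whatever reads the filter's standard output receives only the original lines.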
When the shell encounters a command with output redirection, such as:
command > file
the first thing that happens is that the output stream is started by either creating file or, if file already exists, truncating it to zero length. This means that if command completely fails or doesn't even exist, file will end up with zero length. This can be a safety issue for new users who might overwrite (or truncate) a valuable file.
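We can watch this happen with a deliberately nonexistent command. In this sketch, precious.txt and no_such_command_xyz are made-up names, and the error message is discarded to keep the output tidy.

```shell
tmpdir=$(mktemp -d)
echo "valuable data" > "$tmpdir/precious.txt"

# The file is truncated before the shell discovers that the
# command does not exist.
no_such_command_xyz > "$tmpdir/precious.txt" 2> /dev/null || true

size_after=$(wc -c < "$tmpdir/precious.txt" | tr -d ' ')
echo "bytes remaining: $size_after"

rm -rf "$tmpdir"
```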
To avoid this, we can do one of two things. First, we can use the ">>" operator instead of ">" so that output is appended to the end of file rather than replacing its contents. Second, we can set the "noclobber" shell option, which prevents redirection from overwriting an existing file. To activate this, we enter:
Once we set this option, attempts to overwrite an existing file will cause the following error:
The effect of the
To turn off the noclobber option we enter this command:
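The whole cycle can be tried in one short sketch. It assumes bash's `set -o noclobber` and `set +o noclobber` forms and the `>|` operator, which overrides noclobber for a single redirection. The file name keep.txt is hypothetical.

```shell
tmpdir=$(mktemp -d)
target="$tmpdir/keep.txt"
echo "original" > "$target"

set -o noclobber                      # turn the option on

# The overwrite is refused. The braces let us discard the shell's
# error message, which is printed before the command's own
# redirections take effect.
if { echo "replacement" > "$target"; } 2> /dev/null; then
    overwrite_blocked=no
else
    overwrite_blocked=yes
fi
after_block=$(cat "$target")

echo "appended" >> "$target"          # appending is still allowed
lines_after_append=$(wc -l < "$target" | tr -d ' ')

echo "forced" >| "$target"            # >| overrides noclobber once
after_force=$(cat "$target")

set +o noclobber                      # turn the option off again
echo "normal again" > "$target"
after_unset=$(cat "$target")

echo "$overwrite_blocked $after_block $lines_after_append $after_force"

rm -rf "$tmpdir"
```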
While this adventure may be more on the "interesting" side than the "fun" side, it does provide some useful insight into how redirection actually works and some of the interesting ways we can use it. In a later adventure, we will put this new knowledge to work expanding the power of our scripts.
© 2000-2017, William E. Shotts, Jr. Verbatim copying and distribution of this entire article is permitted in any medium, provided this copyright notice is preserved.
Linux® is a registered trademark of Linus Torvalds.