Title: Workbook 7 Standard I/O and Pipes
1Workbook 7Standard I/O and Pipes
Pace Center for Business and Technology
2Standard I/O and Pipes
- Key Concepts
- Terminal based programs tend to read information
from one source, and write information to one
destination.
- The source programs read from is referred to as
Standard In (stdin), and is usually connected to
a terminal's keyboard.
- The destination programs write to is referred to
as Standard Out (stdout), and is usually
connected to a terminal's display.
- When using the bash shell, stdout can be
redirected using > or >>, and stdin can be
redirected using <.
3Three types of programs
- This workbook shows how you can redirect where
input is read from and where output goes. The
output of one command can be used as the input for
another command, allowing simple commands to be
used together to perform more complicated tasks.
- Three types of programs
- In Linux (and Unix), programs can generally be
grouped into the following three designs.
- Graphical Programs
- Graphical programs are designed to run in the X
graphical environment. They expect the user to be
using a mouse, and use common graphical
components, such as popup menus and buttons, for
user input. The mozilla web browser is an example
of a graphical program.
- Screen Programs
- Screen based programs expect to use a text
console. They make use of the entire display, and
handle text placement and screen redraws in
sophisticated ways. They do not require a mouse,
and are appropriate for terminals and virtual
consoles. The vi and nano text editors, and links
web browser, are examples of screen based
programs.
- Terminal Programs
- Terminal programs collect input and display
output in a stream, seldom if ever redrawing the
screen, as if writing directly to a printer that
does not allow the cursor to move back up the
page. Because of their simplicity, terminal based
programs are often called simply commands. ls,
grep, and useradd are examples of terminal based
programs.
- This chapter focuses on the last type of
program. Do not let the simplicity of the way
these commands read input and write output fool you.
You will find that many of these commands are
very sophisticated, and allow you to use the
command line interface in powerful ways.
4Standard in (stdin) and Standard out (stdout)
- Terminal based programs generally read
information as a stream from a single source, such
as a terminal's keyboard. Likewise, they
generally write information as a stream to a
single destination, such as a display. In Linux
(and Unix), the input stream is referred to as
Standard In (usually abbreviated stdin), and the
output stream is referred to as Standard Out
(usually abbreviated stdout).
- Usually, stdin and stdout are connected to the
terminal that runs the command. Sometimes, in
order to automate commonly repeated commands, or
in order to record the output of a command for
later inclusion in a report or email, people find
it convenient to redirect stdin from or stdout
into files.
5Redirecting stdout
- Writing Output to a File
- When a terminal based program generates output,
it generally writes that output to its stdout
stream, without knowing what is connected to the
receiving end of that stream. Usually, the stdout
stream is connected to the terminal that started
the process, so the output is written to the
terminal's display. The bash shell uses > to
redirect a process's stdout stream to a file.
- For example, suppose the machine elvis is using
becomes very sluggish and non-responsive. In
order to diagnose the problem, elvis would like
to examine the currently running processes.
Because the machine is so sluggish, however, he
wants to collect the information now, but analyze
it later. He can redirect the output of the ps
aux command into the file sluggish.txt, and come
back to examine the file when the machine is more
responsive.
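The example from the slide is not reproduced here, so the following is only a minimal sketch of the redirection just described. The file name sluggish.txt comes from the text; placing it in a scratch directory (the workdir variable) is an assumption of this sketch, made so that no existing file is overwritten.

```shell
# Scratch directory (an assumption of this sketch) to avoid clobbering real files.
workdir=$(mktemp -d)

# bash opens sluggish.txt and connects it to the stdout of ps aux
# before the command starts; ps itself is unaware of the change.
ps aux > "$workdir/sluggish.txt"
```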
6Redirecting stdout
- Notice that no output is displayed to the
terminal. The ps command writes to stdout, as it
always does, but stdout is redirected by the bash
shell to the file sluggish.txt. The user elvis
can examine the file later, at a more convenient
time.
7Appending Output to a File
- If the file sluggish.txt already existed, its
original contents would be lost. This is often
referred to as clobbering a file. To append a
command's output to a file, rather than
clobbering it, bash uses >>.
- Suppose that elvis wanted to record a timestamp
of when the sluggish behavior was happening, as
well as a list of currently running processes. He
could first create (or clobber) the file with the
output of the date command, using >, and then
append to it the output of the ps aux command
using >>.
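As a rough sketch of that two-step sequence (the scratch directory and the report variable are assumptions added for illustration, not part of the original example):

```shell
workdir=$(mktemp -d)
report="$workdir/sluggish.txt"

# > creates the file, or clobbers an existing one, with the timestamp.
date > "$report"

# >> appends the process listing after the timestamp instead of replacing it.
ps aux >> "$report"

# The first line is the timestamp; the process table follows it.
head -n 3 "$report"
```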
8Redirecting stdin
- Just as bash uses > to coax commands into
delivering their output somewhere other than the
display, bash uses < to cause them to read input
from somewhere other than the keyboard. The user
elvis is still trying to figure out why his
machine was acting sluggish. He talked to his
local system administrator, who thought that
looking at the list of currently running
processes sounded like a good idea, and asked
elvis to mail him a copy.
- Using the terminal based mail command, elvis
first writes an email message to the
administrator "manually", from the keyboard. The
mail command expects a recipient as an argument,
and the subject line can be specified with the -s
command line switch. The email body is then
entered from the keyboard. The end of the message
text is signaled by a lone period on a line.
9Redirecting stdin
- For his follow-up message, elvis can easily mail
the output of the ps command he recorded in the
file sluggish.txt. He just redirects the mail
command's stdin stream to be read from the file.
- The system administrator will receive an email
from elvis, with "ps output" as its subject
line, and the contents of the file sluggish.txt
as its body.
- In the first case, the mail process's stdin was
connected to the terminal, and the message body
was provided by the keyboard. In the second case,
bash arranged for the mail process's stdin to be
connected to the file sluggish.txt, and the
message body was provided by its contents. The
mail command doesn't change its basic behavior:
it reads the body of the email message from
stdin.
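Because the mail command may not be installed or configured everywhere, the sketch below illustrates stdin redirection with wc as a stand-in; the file body.txt and its contents are made up for the illustration. The mail invocation itself is shown only as a comment, and its recipient name is hypothetical.

```shell
workdir=$(mktemp -d)
printf 'line one\nline two\nline three\n' > "$workdir/body.txt"

# bash opens body.txt and connects it to the command's stdin.
# wc -l is never given a file name, so it prints only the count.
count=$(wc -l < "$workdir/body.txt")
echo "lines read from stdin: $count"

# The mail command described above would be redirected the same way
# (sysadmin is a hypothetical recipient, so the line stays commented out):
# mail -s "ps output" sysadmin < sluggish.txt
```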
10Under the Hood Open Files and File Descriptors
- Open Files and File Descriptors
- To fully appreciate how processes manage Standard
In, Standard Out, and files, we must introduce
the concept of a file descriptor. In order to
read information from or write information to a
file, a process must open the file. Linux (and
Unix) processes keep track of the files they
currently have open by assigning each an integer.
The integer is called a file descriptor.
- The Linux kernel provides an easy way to examine
the open files and file descriptors of a
currently running process, using the /proc file
system. Every process has an associated
subdirectory under /proc, named after its PID
(process ID). The process's subdirectory in turn
has a subdirectory called fd (for file
descriptor). Within the /proc/pid/fd
subdirectory, a symbolic link exists for every
file the process has open. The name of the
symbolic link is the open file's integer file
descriptor, and the symbolic link resolves to the
open file itself.
- In the following, elvis cats the file
/usr/share/hwdata/oui.txt, and then almost
immediately suspends the program with CTRL-Z.
11Under the Hood Open Files and File Descriptors
- Using the ps command to look up the process's
PID, elvis next examines the process's
/proc/pid/fd directory.
- Not surprisingly, the cat process has the file
/usr/share/hwdata/oui.txt open (it must be able
to read the file to display its contents).
Perhaps a little surprisingly, it is not the only,
or even the first, file that the process has
open. The cat command has three open files before
it, or, more exactly, the same file open three
times: /dev/tty1.
12Under the Hood Open Files and File Descriptors
- As a Linux (and Unix) convention, every process
inherits three open files upon startup. The
first, file descriptor 0, is Standard In. The
second, file descriptor 1, is Standard Out, and
the third, file descriptor 2, is Standard Error
(to be discussed in the next Lesson). What open
files did the cat command inherit from the bash
shell that started it? The device node /dev/tty1
for all three.
- Recall that /dev/tty1 is the device node which
connects to the console serial driver within the
kernel. Whatever elvis types can be read from
this file, and whatever is written to this file
is displayed on elvis's terminal. What happens if
the cat process reads from stdin? It reads input
from elvis's keyboard. What happens if it writes
to stdout? Whatever is written is displayed on
elvis's terminal.
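The three inherited descriptors can be inspected for any process on a Linux system with /proc mounted. The sketch below looks at the shell that runs it rather than at a suspended cat, which is a simplification of the steps in the text; on a terminal the links point at a tty device, while in a script they may point at a pipe or a file.

```shell
# $$ expands to the PID of the running shell. Each entry in fd/ is a symbolic
# link named after a file descriptor and pointing at the open file.
ls -l "/proc/$$/fd"

# Descriptors 0, 1 and 2 are stdin, stdout and stderr.
for fd in 0 1 2; do
    echo "fd $fd -> $(readlink "/proc/$$/fd/$fd")"
done
```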
13Redirection
- In the next example, elvis cats the
/usr/share/hwdata/oui.txt file, but this time
redirects stdout to the file /tmp/foo. Again,
elvis suspends the command in mid-stride with the
CTRL-Z control sequence.
- Using the same technique as above, elvis examines
the files opened by the cat command, and the file
descriptors associated with them.
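The original listing is missing, so here is a hedged sketch of the same experiment. It makes two substitutions, both assumptions of this sketch: a background sleep stands in for the suspended cat, and a file created by mktemp stands in for /tmp/foo. It requires Linux's /proc.

```shell
outfile=$(mktemp)

# Start a long-running command in the background with stdout redirected.
sleep 30 > "$outfile" &
pid=$!
sleep 1   # give the child a moment to set up its redirection

# File descriptor 1 now resolves to the file instead of the terminal.
ls -l "/proc/$pid/fd"
target=$(readlink "/proc/$pid/fd/1")
echo "stdout of process $pid -> $target"

kill "$pid"
```

Descriptors 0 and 2 are untouched by the redirection and still point wherever the shell's own stdin and stderr point.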
14What happens when elvis redirects both Standard
Out and Standard In?
15What happens when elvis redirects both Standard
Out and Standard In?
- When the cat command is called without arguments
(i.e., without any filenames of files to
display), it displays Standard In instead. Rather
than opening a specified file (using file
descriptor 3, as above), the cat command reads
from stdin instead.
- What is the effective difference between the
following three commands?
- There is none. In order to appreciate the real
benefit of designing commands to read from
Standard In in lieu of named files, we must wait
until pipes are introduced in a subsequent
Lesson.
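The three commands from the slide were not preserved. The sketch below uses three commands of the kind the question suggests (a named file, redirected stdin, and a plain copy); treat them as an assumption rather than the original example. The file names are made up.

```shell
workdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$workdir/source.txt"

# 1. cat opens the named file itself.
cat "$workdir/source.txt" > "$workdir/copy1.txt"

# 2. cat is given no file name; bash connects the file to stdin instead.
cat < "$workdir/source.txt" > "$workdir/copy2.txt"

# 3. cp copies the file directly.
cp "$workdir/source.txt" "$workdir/copy3.txt"

# All three copies are byte-for-byte identical to the source.
cmp "$workdir/source.txt" "$workdir/copy1.txt" &&
cmp "$workdir/source.txt" "$workdir/copy2.txt" &&
cmp "$workdir/source.txt" "$workdir/copy3.txt" &&
echo "no difference"
```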
16ExamplesChapter 1. Standard In and Standard Out
- Automating Graph Generation with gnuplot
- About 20 minutes
- http://csis.pace.edu/adelgado/rha-030/scripts/workbook-07/chapter-1/Gnuplot-lab.htm
17Chapter 2. Standard Error
- Key Concepts
- Unix programs commonly report error conditions to
a destination called Standard Error (stderr).
- Usually, stderr is connected to a terminal's
display, and error messages are found intermixed
with standard output.
- When using the bash shell, the stderr stream can
be redirected to a file using 2>.
- When using bash, the stderr stream can be
combined with the stdout stream using 2>&1 or &>.
18Standard Error (stderr)
- We have discussed the standard input and output
streams, stdin and stdout, and how to use > and <
in the bash command line to redirect them. We are
now ready to confuse matters a little by
introducing a second output stream, commonly used
for reporting error conditions, called Standard
Error (often abbreviated stderr).
19Standard Error (stderr)
- In the following sequence, elvis is using the
head -1 command to generate a list of the first
lines of all the files in the /etc/rc.d
directory.
20Standard Error (stderr)
- The head command, when fed multiple file names as
arguments, conveniently prints the name of each
file as a header, followed by the first specified
number of lines (in this case, one). When the head
command encounters a directory, however, it merely
complains. Next, elvis runs the same command,
redirecting stdout to the file rcsummary.out.
- Most of the output is obediently redirected to
the file rcsummary.out, but the directory
complaints are still displayed. Although not
obvious at the outset, the head command is really
sending output to two independent streams. Normal
output is written to Standard Out, but error
messages are written to a separate stream called
Standard Error (often abbreviated stderr).
Usually, both streams are connected to the
terminal, and so the two are difficult to
distinguish. By redirecting stdout, however, the
information written to stderr is obvious.
21Redirecting stderr
- Just as bash uses > to redirect stdout, bash uses
2> to redirect stderr. For example, elvis repeats
the head command from above, but instead of
redirecting stdout to rcsummary.out, he redirects
stderr to the file rcsummary.err.
22Redirecting stderr
- The output is the complement to the previous
example. We now see the normal output displayed
to the screen, but no error messages. Where did
the error messages go? It shouldn't be hard to
guess.
- In the following example, both > and 2> are used
to redirect stdout and stderr independently.
- In this case, the standard output can be found in
the file rcsummary.out, error messages can be
found in rcsummary.err, and nothing is left over
to be displayed to the screen.
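Since the terminal sessions for these slides are missing, the following sketch recreates the situation in a scratch directory instead of the real /etc/rc.d. The directory layout and file contents are made up for the illustration; only the file names rcsummary.out and rcsummary.err come from the text.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/rc.d" "$workdir/rc.d/init.d"
echo "first line of rc" > "$workdir/rc.d/rc"
echo "first line of rc.local" > "$workdir/rc.d/rc.local"

# head writes file contents to stdout and complains about the directory on stderr.
# > captures the former and 2> the latter, so nothing reaches the terminal.
# (head exits non-zero because of the directory, hence the || true.)
head -n 1 "$workdir"/rc.d/* > "$workdir/rcsummary.out" 2> "$workdir/rcsummary.err" || true

echo "--- rcsummary.out ---"; cat "$workdir/rcsummary.out"
echo "--- rcsummary.err ---"; cat "$workdir/rcsummary.err"
```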
23Combining stdout and stderr Old School
- Often, someone would like to redirect the
combined stdout and stderr streams to a single
file. As a first attempt, elvis tries the
following command.
- Upon examining the file rcsummary.both, however,
elvis doesn't find what he expects.
24Combining stdout and stderr Old School
- The bash shell opened the file rcsummary.both
twice, but treated each open file independently.
When stdout and stderr both wrote to the file,
they clobbered each other's information. What is
needed instead is some way to tell bash to
effectively combine stderr and stdout into a
single stream, and then redirect that stream to a
single file. As you would expect, there is such a
way.
- Although awkward, the last token 2>&1 should be
thought of as saying "take stderr, and send it
wherever stdout is currently going". Now
rcsummary.both contains the expected output.
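A small sketch of the working form, using ls on one existing and one missing file as a stand-in for the head command of the text (the file names here are made up):

```shell
workdir=$(mktemp -d)
echo "hello" > "$workdir/exists.txt"

# ls prints the existing file on stdout and an error for the missing one on stderr.
# > sends stdout to the file first; 2>&1 then points stderr at that same open file.
ls "$workdir/exists.txt" "$workdir/missing.txt" > "$workdir/both.txt" 2>&1 || true

cat "$workdir/both.txt"
```

Because both descriptors share one open file, the two streams are written one after the other rather than overwriting each other.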
25Combining stdout and stderr New School
- Using 2>&1 to combine stdout and stderr was
introduced in the traditional Unix shell, the
Bourne shell (sh). Because bash is designed to be
backwards compatible with sh, it supports the
syntax as well. The syntax, however, is
inconvenient. Besides being difficult to write,
the order of the redirections is important. Using
">out.txt 2>&1" and "2>&1 >out.txt" does not have
the same effect!
- In order to simplify things, bash uses &> to
combine both stdout and stderr, as in the
following example.
- Summary
- The following table summarizes the syntax used by
the bash shell for redirecting stdin, stdout, and
stderr learned in this and the previous lesson.
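The summary table itself did not survive, so the sketch below restates the operators covered so far as comments and then demonstrates why the order of redirections matters. The file names are made up, and ls on a missing file is used only as a convenient source of an error message.

```shell
# cmd < file     read stdin from file
# cmd > file     write stdout to file (clobbering it)
# cmd >> file    append stdout to file
# cmd 2> file    write stderr to file
# cmd > file 2>&1   send stdout to file, then stderr to the same place
# cmd &> file    bash shorthand for the previous line

workdir=$(mktemp -d)
echo "hello" > "$workdir/exists.txt"
cd "$workdir" || exit 1

# Wrong order: stderr is duplicated from the current stdout (captured here by
# the command substitution) before stdout is pointed at the file.
leaked=$(ls exists.txt missing.txt 2>&1 > wrong-order.txt) || true
echo "escaped the file: $leaked"

# Right order: both streams land in the file.
ls exists.txt missing.txt > right-order.txt 2>&1 || true

# The &> shorthand is a bash extension, so it is run through bash explicitly
# in case this sketch is pasted into a plain sh.
bash -c 'ls exists.txt missing.txt &> shorthand.txt' || true
```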
26ExamplesChapter 2. Standard Error
- Using /dev/null to filter out stderr
- The user elvis has recently learned that,
besides the /home/elvis and /tmp directories he's
familiar with, he may also own files in the /var
directory. These files are usually spooling files
for received but not yet viewed email, print jobs
waiting to be sent to the printer, etc.
- Curious, he uses the find command to find all
files within the /var directory that he owns.
27ExamplesChapter 2. Standard Error
- Although the find command appropriately reported
the /var/spool/mail/elvis file, the output is
difficult to find among all of the "Permission
denied" error messages being reported from
various subdirectories of /var. In order to help
separate the wheat from the chaff, elvis
redirects stderr to some file in the /tmp
directory.
- While this works, elvis is left with a file
called /tmp/foo that he really didn't want. In
situations like this, when a user wants to
discard a stream of information, experienced Unix
users usually redirect output to a pseudo device
called /dev/null.
- As the following long listing shows, /dev/null is
a character device node, like those used for
conventional device drivers.
- When a user writes to /dev/null, the information
is merely discarded by the kernel. When a user
reads from /dev/null, they encounter an immediate
end of file. Notice that /dev/null is one of the
few files in Red Hat Enterprise Linux that has
world writable permissions by default.
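A minimal sketch of both behaviours follows. In the text the errors are "Permission denied" messages from /var; here a start path that does not exist is used to provoke an error instead, because that fails the same way for any user, including root. The file and directory names are made up.

```shell
workdir=$(mktemp -d)
touch "$workdir/mine.txt"

# find reports the file on stdout and an error for the bogus start path on
# stderr; 2> /dev/null discards the error so only the real result remains.
found=$(find "$workdir" "$workdir/no-such-dir" -name 'mine.txt' 2> /dev/null) || true
echo "$found"

# Reading from /dev/null yields an immediate end of file: zero bytes.
bytes=$(wc -c < /dev/null)
echo "bytes read from /dev/null: $bytes"

# The leading "c" in the long listing marks a character device node.
ls -l /dev/null
```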
28QuestionsChapter 2. Standard Error
29Chapter 3. Pipes
- Key Concepts
- The stdout stream from one process can be
connected to the stdin stream of another process,
using what Unix calls a "pipe".
- Many commands in Unix are designed to operate as
a filter, reading input from stdin and sending
output to stdout.
- bash uses "|" to create a pipe between two
commands.
30Pipes
- Pipes
- In the previous Lessons, we have seen that a
process's output can be redirected to somewhere
other than the terminal display, or that a
process can be asked to read input from some
location other than the terminal keyboard. One of
the most common, and most powerful, forms of
redirection is a combination of the two, where
the output (Standard Out) of one command is
"piped" directly into the input (Standard In) of
another command, forming what Linux (and Unix)
refers to as a pipe.
- When two commands are joined by a pipe, the
stdout stream of the first process is tied
directly to the stdin stream of the second
process, so that multiple processes can be
combined in a sequence. In order to create a pipe
using bash, the two commands are joined with a
vertical bar, |. (On most keyboards, this
character is found on the same key as the
backslash, above the RETURN key.) All processes
that are joined in a pipe are referred to as a
process group.
31Pipes
- As an example, consider prince, who is trying to
find the largest files underneath the /etc
directory. He begins by composing a find command
that will list all files with a size greater than
100 Kbytes.
- Observing that the find command seems to list the
files in no particular order, prince decides he
would like the files to be listed alphabetically.
He could redirect the output to a file, and then
sort the file. Instead, he takes advantage of the
fact that the sort command, when invoked without
arguments, looks to Standard In for the data to
sort. He pipes the output of his find command
into sort.
- The files are now listed in alphabetical order.
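The command lines from the slide are missing; against the real directory the pipe would presumably look like find /etc -size +100k | sort. The sketch below runs the same pipe over a scratch directory with made-up file names, so that the result is predictable.

```shell
workdir=$(mktemp -d)

# Create three files larger than 100 KB and one small one.
for name in zebra.dat apple.dat mango.dat; do
    dd if=/dev/zero of="$workdir/$name" bs=1024 count=200 2> /dev/null
done
echo "tiny" > "$workdir/small.dat"

# find lists matches in directory order; the pipe hands its stdout to the
# stdin of sort, which is called without a file argument.
sorted=$(find "$workdir" -size +100k | sort)
echo "$sorted"
```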
32Filtering output using grep
- The traditional Unix grep command is commonly
used in pipes to reduce data to only the
"interesting" parts. The grep command will be
discussed in detail in a later Workbook. Here, we
introduce grep in its simplest form.
- The grep command is used to search for and
extract lines which contain a specified string of
text. For example, in the following, prince
prints all lines that contain the text "root"
from the /etc/passwd file.
- The first argument to the grep command is the
string of text to be searched for, and any
remaining arguments are files to be searched for
the text. If the grep command is called with only
one argument (a string to be searched for, but no
files to search), it looks to Standard In as its
source of data on which to operate.
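The two calling conventions can be compared directly. This sketch assumes, as on most Linux systems, that /etc/passwd is readable and contains at least one line with the text "root".

```shell
# With a file argument, grep opens /etc/passwd itself.
from_file=$(grep root /etc/passwd)

# With only the search string, grep reads whatever arrives on stdin.
from_stdin=$(grep root < /etc/passwd)

echo "$from_file"
[ "$from_file" = "$from_stdin" ] && echo "same lines either way"
```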
33Filtering output using grep
- In the following, prince has so many files in his
home directory that he is having trouble keeping
track of them. He's trying to find a directory
called templates that he created a few months
ago. He uses the locate command to help him find
it.
- Unfortunately for prince, there are many files
which contain the text templates in their name on
the system, and prince becomes overwhelmed with
lines and lines of output. In order to reduce the
information to more relevant files, prince next
takes stdout from the locate command, and creates
a pipe to stdin of the grep command, "grepping"
for the word "prince".
- Because the grep command is not given a file to
search, it looks to stdin, where it finds the
stdout stream of the locate command. Filtering
the stream, grep only duplicates to its stdout
lines that matched the specified text, "prince".
The rest were discarded. The user prince easily
finds his directory under /proj, as well as
another directory created by the application
quanta.
34Pipes and stderr
- In the next example, prince is curious to see
where he shows up in the system's configuration
files, and "greps" for his name in the /etc
directory.
35Pipes and stderr
- Again, prince is overwhelmed by the amount of
output from this command. He tries the same
trick, "grepping" it down for all lines that
contain the word "passwd".
- While stdout from the first grep command was
appropriately filtered, stderr is unaffected, and
still gets displayed to the screen. How would
prince go about suppressing stderr as well?
- Commands as filters
- The concept of a pipe extends naturally, so that
multiple commands can be used together, each
reading information from stdin, somehow modifying
or filtering the information, and passing the
result to stdout. In a subsequent Workbook, you
will find that there are many standard Linux (and
Unix) commands that are designed for this
purpose, including some that you are already
familiar with: grep, head, tail, cut, sort, sed,
and awk, to name a few.
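One possible multi-stage pipeline, which also shows one way of keeping stderr off the screen, is sketched below. It searches a scratch directory rather than the real /etc; the three sample files and their contents are made up, and the missing start path exists only to provoke an error message.

```shell
workdir=$(mktemp -d)
printf 'prince:x:502:502::/home/prince:/bin/bash\n' > "$workdir/passwd"
printf 'prince:x:502:\n' > "$workdir/group"
printf 'unrelated\n' > "$workdir/hosts"

# Stage 1: grep -r searches the tree; 2> /dev/null discards its error messages,
#          which a pipe would otherwise leave on the terminal.
# Stage 2: a second grep keeps only lines mentioning "passwd".
# Stage 3: sort orders what is left.
result=$(grep -r prince "$workdir" "$workdir/no-such-dir" 2> /dev/null | grep passwd | sort)
echo "$result"
```

The redirection belongs to the first command only; each stage of a pipeline has its own stderr.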
36Pipes
- Listing Processes by Name
- Often, one would like to list information about
processes which are running a specific command.
While ps aux tabulates a lot of information about
currently running processes, the number of
processes running on the machine can make the
output overwhelming. The grep command can help
simplify the output.
- In the following, prince would like to list
information about the processes which are
implementing his web server, the httpd command.
He lists all processes, but then reduces the
output to only those lines which contain the text
httpd.
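A sketch of the command described above follows. The second form is a common idiom that is not part of the original text; it is included because the plain form usually lists the grep process itself.

```shell
# List every process, then keep only the lines that mention httpd.
# grep's own command line contains "httpd", so it usually appears as well.
ps aux | grep httpd || echo "no matching processes"

# Idiom (an addition, not from the text): [h]ttpd still matches "httpd",
# but grep's own command line now reads "[h]ttpd" and no longer matches.
ps aux | grep '[h]ttpd' || echo "no httpd processes running"
```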
37Examples
38QuestionsChapter 3. Pipes