Title: show file DIFFerences diff
1 show file DIFFerences - diff
- -i - ignore the case of letters
- -w - ignore all blanks (spaces and TABs), i.e., "a b" will compare equal to "ab"
- -c - produce a listing of differences with lines of context
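The options above can be tried on two small files. This is a minimal sketch: the scratch directory and the file names a and b are made up for illustration, and it relies on diff exiting with status 0 when it finds no differences.

```shell
# Hypothetical scratch files that differ only in letter case and spacing.
dir=$(mktemp -d)
printf 'Hello World\n' > "$dir/a"
printf 'hello   world\n' > "$dir/b"

# diff exits with status 0 when it finds no differences, non-zero otherwise.
if diff "$dir/a" "$dir/b" > /dev/null; then plain=same; else plain=different; fi
if diff -i -w "$dir/a" "$dir/b" > /dev/null; then relaxed=same; else relaxed=different; fi

echo "plain diff: $plain"
echo "diff -i -w: $relaxed"

# -c shows each difference with surrounding lines of context.
diff -c "$dir/a" "$dir/b" || true   # a non-zero status here only means "files differ"
rm -rf "$dir"
```

With -i and -w together the two files compare equal, while a plain diff reports them as different.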
2 list UNIQue items - uniq
- Remove or report adjacent duplicate lines
- Syntax: uniq [-cdu] [+|-n] input-file output-file
- -c - supersede the -u and -d options and generate an output report with each line preceded by an occurrence count
- -d - write one copy of only the duplicated lines
3 list UNIQue items - uniq
- -u - write only those lines which are not duplicated
- NOTE: the default output is the union (combination) of -d and -u
- The n arguments specify an initial portion of each line to skip
- +n - skip the first n characters
- -n - skip the first n fields and any blanks before them
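A short sketch of -d, -u, and the default behavior. The fruit names are made-up sample data, and the input is sorted first because uniq only compares adjacent lines. The +n and -n skip arguments are left out here, since many current uniq implementations spell them differently.

```shell
# Hypothetical sample data; sorting makes the duplicate lines adjacent.
sample=$(printf 'pear\napple\npear\nfig\napple\npear\n' | sort)

dups=$(printf '%s\n' "$sample" | uniq -d)      # one copy of each duplicated line
singles=$(printf '%s\n' "$sample" | uniq -u)   # only the lines that are not duplicated
all=$(printf '%s\n' "$sample" | uniq)          # default: the union of -d and -u

echo "duplicated: $dups"
echo "not duplicated: $singles"
echo "default: $all"

# -c precedes each line with its occurrence count.
printf '%s\n' "$sample" | uniq -c
```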
4 C Shell Variables
- There are two types of shell variables
- Environmental shell variables
- Ordinary shell variables
- Both types include variables that either take on values or act as switches
- Switches are on if they are declared and off if they are not
5 C Shell Variables
- Environmental shell variables are inherited by all child shells, while ordinary shell variables must be defined for each instance of the shell
- Shell variables are usually defined in either the .login or the .cshrc file
6 What Shell Variables Are In Use?
- set will display all ordinary shell variables and their values
- setenv will display all currently defined environmental variables and their values
- Traditionally, the ordinary variables have lowercase names and the environmental variables are all uppercase
7 Setting Variables
- Environmental shell variables are set using setenv
- setenv variable value
- Ordinary shell variables are set using set
- set variable=value
8 Setting Variables
- Note that the text shows spaces on both sides of the =. The C shell accepts spaces on both sides of the = or on neither side, but a space on only one side will not work.
- To unset a defined shell variable, use the unset and unsetenv commands
- unset variable
- unsetenv variable
9 Referring to Shell Variables
- To refer to shell variables, preface the variable name with a $
- Examples
- echo $HOME
- echo "This is the value of the history variable" $history
- echo "My environment path is" $PATH
- echo "My current path is" $path
- echo "My favorite shell," $shell "is a great shell."
10 Common Environmental Variables
- HOME - your home directory as defined in /etc/passwd
- PATH - your directory search path
- TERM - your terminal type
- SHELL - your default shell, used when forking other shells
- USER - your user name
11 Common Ordinary Shell Variables
- history - enables the history function and defines the number of commands to save
- savehist - specifies the number of commands saved when logging out
- shell - pathname of the shell
- home - your home directory
- prompt - sets the prompt string
12 Common Ordinary Shell Variables
- echo - if set, the shell displays each command before executing it (switch)
- filec - enables file name completion (switch)
- ignoreeof - tells the shell to ignore ^D, so you must use exit or logout to leave a shell (switch)
- noclobber - prevents you from accidentally overwriting a file when you redirect output (switch)
13 Redirecting Standard Input
- Redirection of standard input works in a similar manner
- < is the standard input redirection operator
- command [options] [arguments] < input_filename
14 Redirecting Standard Input
- command is any executable program
- options and arguments are the ones you are already familiar with
- input_filename is the place (file) that contains the data you want to use as input to command
- Obviously, input_filename must exist
15 Using Input and Output Redirection
- Standard input and standard output redirection can be used together on the same command
- command [options] [arguments] < input_filename > output_filename
- E.g.,
- crypt < password > cryptext
- sort < unsorted_file > sorted_file
16 Using Input and Output Redirection
- or
- command [options] [arguments] < input_filename >> output_filename
- E.g.,
- foo < input_file >> output_file
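The difference between the two output operators can be seen by running both against the same file. This is a minimal sketch: the scratch directory and its contents are made up, and the append step uses sort -r simply to produce different data for the second write.

```shell
# Hypothetical scratch directory and input data.
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$dir/unsorted_file"

# > creates sorted_file, or overwrites it if it already exists.
sort < "$dir/unsorted_file" > "$dir/sorted_file"

# >> appends to sorted_file instead of replacing it.
sort -r < "$dir/unsorted_file" >> "$dir/sorted_file"

result=$(cat "$dir/sorted_file")
echo "$result"
rm -rf "$dir"
```

After both commands, sorted_file holds the ascending list followed by the descending one.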
17 Redirecting Standard Error
- Under the C shell, you can redirect both standard output and standard error by using the > or >> symbols followed by an &
18 Redirecting Standard Error
- If you don't do this, commands like find may clutter up your screen with error messages while you are trying to accomplish other work.
- find / -name stdlib.c -print >& whereitis
- or
- find / -name stdlib.c -print >>& whereitis
19 Pipes (Inter-Process Communication)
- Suppose you want a printed, sorted list of who is currently logged on to the system
- You could issue the following command sequence
- who > temp_file
- sort < temp_file > sorted_file
- lpr sorted_file
- rm temp_file
- rm sorted_file
20 Pipes
- A pipe will let you do this in one command line without having to use temporary files
- who | sort | lpr
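The two approaches can be compared side by side. In this sketch, list_users is a made-up stand-in for who so the result is predictable, and the print step (lpr) is left out.

```shell
# list_users is a hypothetical stand-in for who, with fixed output.
list_users() { printf 'root\nalice\nbob\n'; }

# Version with temporary files.
dir=$(mktemp -d)
list_users > "$dir/temp_file"
sort < "$dir/temp_file" > "$dir/sorted_file"
with_files=$(cat "$dir/sorted_file")
rm -rf "$dir"

# Version with a pipe: no temporary files to create or remove.
with_pipe=$(list_users | sort)

echo "$with_pipe"
```

Both versions produce the same sorted list; the pipe version leaves nothing behind to clean up.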
21 So What Is a Pipe?
- The shell uses a pipe ( | ) to connect the standard output of one command directly to the standard input of another
- This has the same effect as redirecting the standard output of one command to a temporary file, and then using that file as the standard input to another command
22 So What Is a Pipe?
- You can use pipes with any command that accepts input from either a file specified on the command line or standard input
- Pipes can also be used with commands that only accept data from standard input
23 So What Happens in a Pipe?
- Consider the command string
- ls | wc -l
- The shell opens a pipe for communication between ls and wc
- ls and wc are then executed as concurrently running tasks
24 So What Happens in a Pipe?
- As ls writes data to the pipe, wc can read the data from the pipe
- Neither process knows that it is using a pipe instead of a standard file
25 Pipes
- Because the processes are running concurrently and not writing to intermediate files, the entire sequence is more efficient and should complete more quickly.
26 Pipes
- If the writing process produces data much more quickly than the reading process can accept it, the pipe can fill up
- In that case, the kernel puts the writing process to sleep until the reader has a chance to catch up
27 Common Mistakes
- Confusing redirection and pipes is a common error that can be deadly to your files
- A pipe ( | ) will take the output from a command and connect it to the input of another command
- Redirection ( > ) will take the output from a command and create or overwrite a file with the data
28 tee
- Copy standard input to standard output and one or more files
[Diagram: Unix command -> tee -> standard output and file-list]
29 tee
- Syntax: tee [-ai] file-list
- -a - append to the output file rather than overwrite it; the default is to overwrite (replace) the output file
- -i - ignore interrupts
- file-list - one or more file names for capturing output
30 tees and Pipes
- So, what happens if you use tee in a pipe?
- ls | tee dir_list | sort -r
- What will this do? Why would I want to do it?
31 tees and Pipes
- What about this?
- a.out | tee output_file
- Why do this?
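One way to explore these questions is to run the first pipeline on a small directory. This is a sketch with made-up names: work is a scratch directory to list, and out holds the copy that tee saves, kept separate so that dir_list does not appear in its own listing.

```shell
# Hypothetical scratch directories: one to list, one for tee's saved copy.
work=$(mktemp -d)
out=$(mktemp -d)
touch "$work/a.txt" "$work/b.txt" "$work/c.txt"

# tee saves what ls wrote into dir_list and also passes it on to sort -r.
reversed=$(ls "$work" | tee "$out/dir_list" | sort -r)
saved=$(cat "$out/dir_list")

echo "passed down the pipe and reverse sorted:"
echo "$reversed"
echo "saved by tee:"
echo "$saved"
rm -rf "$work" "$out"
```

The file keeps the data as it looked in the middle of the pipeline, while the rest of the pipeline carries on with its own copy.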
32 Filters
- A filter is a command that processes an input stream of data (from standard input) to produce an output stream (on standard output).
- A command line that uses a filter uses a pipe to connect the filter's input to the standard output of another command or filter
33 Filters
- Another pipe may connect the filter's standard output to another command's or filter's standard input
- Interactive utilities, such as mail and vi, cannot be used as filters
34 Commonly Used Filters
- sort - sort a file
- tr - translate characters to different characters
- spell - check the spelling of a list of words
- wc - count the number of words, characters, and lines in a file
35 Commonly Used Filters
- grep - search files for a pattern
- head and tail - list the first/last part of a file
- sed - stream editor
- awk - pattern-action pair programming language
36 Filter Examples
- who | sort - display a sorted list of who is logged on
- ls | wc -w - display the number of entries in a directory
- tr A-Z a-z <myfile >yourfile - make all characters in myfile lower case and save the output in yourfile
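Two of these examples can be run against a scratch directory. This is a sketch with made-up file contents; the count from wc is passed through tr -d ' ' only because some versions of wc pad their output with blanks.

```shell
# Hypothetical scratch directory and sample file.
dir=$(mktemp -d)
printf 'Hello World\nGOODBYE\n' > "$dir/myfile"

# Lower-case everything in myfile and save the result in yourfile.
tr A-Z a-z < "$dir/myfile" > "$dir/yourfile"
lowered=$(cat "$dir/yourfile")

# Count the entries in the directory (myfile, yourfile, extra).
touch "$dir/extra"
entries=$(ls "$dir" | wc -w | tr -d ' ')

echo "$lowered"
echo "entries: $entries"
rm -rf "$dir"
```

Note that ls | wc -w counts words, so it only matches the number of entries when no file name contains a blank.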
37 TRanslate - tr
- Copies standard input to standard output with substitution or deletion of selected characters
- Syntax: tr [-ds] string1 string2
- -d - delete all input characters contained in string1
38 TRanslate - tr
- -s - squeeze all strings of repeated output characters that are in string2 to single characters
- tr provides only simple text processing. It does not allow the full power of regular expressions. If you need the power of regular expressions, you need to use sed
39 TRanslate - tr
- tr reads from standard input.
- Any character that does not match a character in string1 is passed to standard output unchanged
- Any character that does match a character in string1 is translated into the corresponding character in string2 and then passed to standard output
40 TRanslate - tr
- Examples
- tr s z - replaces all instances of s with z
- tr so zx - replaces all instances of s with z and o with x
- tr a-z A-Z - replaces all lower case characters with upper case characters
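These three translations can be checked by piping short strings through tr. The input words are made up for illustration.

```shell
# Each command feeds a hypothetical sample string to tr on standard input.
one=$(echo 'sassy' | tr s z)               # every s becomes z
two=$(echo 'soso' | tr so zx)              # s becomes z, o becomes x
upper=$(echo 'hello world' | tr a-z A-Z)   # lower case becomes upper case

echo "$one"
echo "$two"
echo "$upper"
```

Characters that do not appear in string1, such as the blank in the last example, pass through unchanged.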
41 tr
- As you can see, tr establishes a single character to single character (or 1-to-1) mapping. This mapping is case sensitive.
- The output of tr can be redirected to a file or piped to another filter, and its input can be redirected from a file or piped from another command
42 tr
- This implies that certain characters must be protected from the shell by quotes or \, such as
- spaces ( ) < > \ ! newline TAB
- Example
- tr o \  (a backslash followed by a space) - replaces all o's with a blank (space)
43 tr
- Non-printing characters can also be specified
- Bell \07
- Backspace \010
- CR \015
- Escape \033
- Formfeed \014
- Newline \012
- TAB \011
44 tr
- string1 and string2 can use ranges of characters as follows
- tr a-z A-Z - translates all lower case to upper case
- tr a-m A-M - translates only lower case a through m to upper case A through M
45 tr
- Ranges must be in ascending ASCII order
- What would happen if you ran this?
- tr 1-9 9-1 < some_file
46 tr
- The tr -d option lets you delete any character matched in string1. string2 is not allowed with the -d option
- Examples
- tr -d a-z - deletes all lower case characters
- tr -d aeiou - deletes all vowels
47 tr
- tr -dc aeiou - deletes all characters except vowels (note this includes spaces, TABs, and newlines as well)
- Normal usage would be with pipes or redirection
- tr -d '\015' < in_file > out_file
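A short sketch of both forms of -d on made-up input. The octal escape is quoted so that the shell hands the backslash to tr untouched.

```shell
# Strip carriage returns (\015) from hypothetical DOS-style text.
clean=$(printf 'line one\r\nline two\r\n' | tr -d '\015')

# -c complements string1, so -dc deletes everything except the vowels,
# including the blank and the trailing newline.
vowels_only=$(echo 'hello world' | tr -dc aeiou)

echo "$clean"
echo "$vowels_only"
```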
48 tr
- Example from the tr man page. How does this work?
- tr -cs A-Za-z '\012' < in_file > out_file
- It replaces every character that is not a letter (-c) with a newline ( '\012' ) and then squeezes multiple newlines into a single newline (-s)
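The effect is easiest to see on a single sentence. This sketch feeds a made-up line to the same tr command through a pipe instead of from in_file.

```shell
# Every run of non-letters (punctuation, blanks, the newline) collapses
# into one newline, leaving one word per line.
words=$(printf 'Hello, world!  Hello again.\n' | tr -cs A-Za-z '\012')

echo "$words"
```

The one-word-per-line output is a convenient starting point for other filters such as sort and uniq.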
49 More tr Examples
- The following commands are equivalent
- tr abcdef xyzabc
- tr a-cd-f x-za-c
- This command implements the rotate 13 encryption
- tr A-MN-Za-mn-z N-ZA-Mn-za-m
50 More tr Examples
- To make the text intelligible again, reverse the arguments
- tr N-ZA-Mn-za-m A-MN-Za-mn-z
- The -d option will cause tr to delete selected characters
- echo If you can read this you can spot the missing vowels | tr -d aeiou
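A round trip through both rotate 13 commands, followed by the vowel deletion, can be sketched as follows. The plain text is a made-up sample.

```shell
# Encrypt a hypothetical sample, then reverse the arguments to decrypt it.
plain='Hello, World'
secret=$(echo "$plain" | tr A-MN-Za-mn-z N-ZA-Mn-za-m)
back=$(echo "$secret" | tr N-ZA-Mn-za-m A-MN-Za-mn-z)

# Delete the lower case vowels from a sentence.
no_vowels=$(echo 'If you can read this you can spot the missing vowels' | tr -d aeiou)

echo "$secret"
echo "$back"
echo "$no_vowels"
```

Because only lower case vowels are listed in string1, the capital I in the sentence survives the deletion.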
51 tr Gotchas
- The syntax varies between BSD and System V
- BSD uses a-z
- System V uses [a-z]
- System V allows [x*n] to indicate n occurrences of x
- If you leave out n, then x is duplicated as many times as necessary
52 tr Gotchas
- In BSD, if string2 does not contain as many characters as string1, the last character of string2 is duplicated as many times as necessary
- System V fails to translate these additional characters
- Would tr solve the lab problem of capitalizing the initial character of each word in a sentence? Try it!
53 What If?
- Suppose your boss wants a tool that will
- Accept any number of files to check, report each line that has doubled words, highlight the doubled words (using ANSI escape sequences), and display the file name with each line.
54 What If?
- Work across lines, even when the word at the end of a line is repeated at the beginning of the next line
- Find doubled words despite differences in capitalization, punctuation, or amount of separating whitespace
- Find doubled words even if they are separated by HTML tags
55 What If?
- Wow, this sounds like a BIG programming job.
Better not make any plans for this weekend!
56 You Can Have Your Weekend Back!
- If you know regular expressions, you can have your weekend back
- Using the regular expression engine from Perl, this problem can be solved in only a few lines of code
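57 You Can Have Your Weekend Back!
    $/ = ".\n";                  # read input in chunks that end with a period and newline
    while (<>) {
        # highlight doubled words, even across whitespace or HTML tags
        next if !s/\b([a-z]+)((\s|<[^>]+>)+)(\1\b)/\e[7m$1\e[m$2\e[7m$4\e[m/ig;
        s/^([^\e]*\n)+//mg;      # remove any lines without a highlight
        s/^/$ARGV: /mg;          # start each line with the file name
        print;
    }
- Code was written by Jeffrey E. F. Friedl as an example in Mastering Regular Expressions, O'Reilly Publishing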