CS 497C - PowerPoint PPT Presentation

CS 497C Introduction to UNIX, Lecture 31: Filters Using Regular Expressions - grep and sed. Chin-Chih Chang (chang_at_cs.twsu.edu)

Provided by: Kind155

1
CS 497C Introduction to UNIX
Lecture 31 - Filters Using Regular Expressions - grep and sed
  • Chin-Chih Chang (chang_at_cs.twsu.edu)

2
Substitution
  • sed's strongest feature is substitution, achieved
    with its s (substitute) command.
  • It has the following format:
  • [address]s/expression1/string2/flag
  • This is how you replace the | with a colon:
  • sed 's/|/:/g' emp.lst | head -2
  • To check whether substitution is performed, you
    can use the cmp command as follows:
  • sed 's/|/:/g' emp.lst | cmp -l - emp.lst | wc -l
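
The substitution and the cmp check can be tried on a small stand-in file. This is a minimal sketch: the two records and the temporary file are made up for illustration (the real emp.lst is not reproduced here), and | is assumed to be the field delimiter.

```shell
# Hypothetical two-line stand-in for emp.lst, with | as the field delimiter.
emp=$(mktemp)
printf '%s\n' '1001|a. kumar|director|sales' '1002|b. singh|manager|admin' > "$emp"

# Replace every | with a colon; the g flag covers all occurrences on a line.
converted=$(sed 's/|/:/g' "$emp")
printf '%s\n' "$converted"

# cmp -l prints one line per differing byte, so wc -l counts the changed bytes.
changed=$(sed 's/|/:/g' "$emp" | cmp -l - "$emp" | wc -l)
echo "bytes changed: $changed"

rm -f "$emp"
```

Because each | is replaced by a single character, the count reported by wc -l equals the number of substitutions made.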

3
Substitution
  • You can perform multiple substitutions with one
    invocation of sed by pressing Enter at the end
    of each instruction, and then closing the quote at
    the end (the > is the shell's secondary prompt):
  • sed 's/<I>/<EM>/g
  • > s/<B>/<STRONG>/g' form.html
  • You can compress multiple spaces as below:
  • sed 's/  */ /g' emp.lst | head -2
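
A minimal sketch of multiple substitutions in one sed invocation, using a made-up HTML fragment in place of form.html. The -e form is not on the slide; it is the standard sed option for giving several instructions and is shown here only for comparison. The last command is one possible way to compress runs of spaces.

```shell
# Hypothetical HTML fragment standing in for form.html.
html='<I>note</I> and <B>warning</B>'

# Two instructions in one quoted script, separated by a newline.
multi=$(printf '%s\n' "$html" | sed 's/<I>/<EM>/g
s/<B>/<STRONG>/g')
printf '%s\n' "$multi"

# The same two instructions, each passed with its own -e option.
with_e=$(printf '%s\n' "$html" | sed -e 's/<I>/<EM>/g' -e 's/<B>/<STRONG>/g')

# "space space *" means one or more spaces; each run is replaced by one space.
squeezed=$(printf '%s\n' 'a.   kumar    sales' | sed 's/  */ /g')
printf '%s\n' "$squeezed"
```

Only the opening tags change here, because the patterns <I> and <B> do not match the closing tags </I> and </B>.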

4
Substitution
  • sed '/director/s/director/member/' emp.lst
  • sed '/director/s//member/' emp.lst
  • The second command shows that sed remembers
    the scanned pattern, and stores it in // (2
    forward slashes).
  • The // representing an empty (or null) regular
    expression is interpreted to mean that the search
    and substituted patterns are the same. This is
    called the remembered pattern.
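
The remembered pattern can be checked by running both forms on the same input. A minimal sketch, with made-up records standing in for emp.lst:

```shell
# Hypothetical records; only the line containing "director" is addressed.
records='1001|a. kumar|director|sales
1002|b. singh|manager|admin'

# Explicit form: the search pattern is written again inside the s command.
explicit=$(printf '%s\n' "$records" | sed '/director/s/director/member/')

# Short form: the empty // reuses the last regular expression, /director/.
remembered=$(printf '%s\n' "$records" | sed '/director/s//member/')
printf '%s\n' "$remembered"
```

Both variables hold the same text, and the manager line passes through unchanged because the address /director/ does not select it.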

5
Substitution
  • When a pattern in the source string also occurs
    in the replacement string, you can use the special
    character & to represent it.
  • sed 's/director/executive director/' emp.lst
  • sed 's/director/executive &/' emp.lst
  • These two commands are equivalent. The &, known as
    the repeated pattern, expands to the entire source
    string.
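
A minimal sketch comparing the literal replacement with the & form on one made-up record. The last command is an addition not taken from the slide: it shows that a backslash before & yields a literal ampersand.

```shell
# Hypothetical record standing in for a line of emp.lst.
line='1001|a. kumar|director|sales'

# Replacement written out in full.
literal=$(printf '%s\n' "$line" | sed 's/director/executive director/')

# & expands to whatever the search pattern matched.
with_amp=$(printf '%s\n' "$line" | sed 's/director/executive &/')
printf '%s\n' "$with_amp"

# Escaping & with a backslash inserts the character itself.
escaped=$(printf '%s\n' "$line" | sed 's/director/R\&D/')
printf '%s\n' "$escaped"
```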

6
Regular Expressions
  • The interval regular expression (IRE) uses an
    escaped pair of curly braces, \{ and \}, with a
    single number or a pair of numbers between them.
  • We can use this sequence to display files which
    have write permission set for group:
  • ls -l | grep '^.\{5\}w'
  • The regular expression ^.\{5\}w matches five
    characters (.\{5\}) at the beginning (^) of the
    line, followed by the pattern (w).
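
The group-write test can be sketched without a real directory by feeding grep two made-up ls -l lines. The file names, owner, and sizes below are hypothetical; only the permission strings matter.

```shell
# Hypothetical ls -l output: the first file is group-writable, the second is not.
listing='-rw-rw-r-- 1 user staff 120 Jan  1 10:00 shared.txt
-rw-r--r-- 1 user staff  80 Jan  1 10:00 private.txt'

# Skip exactly five characters from the start of the line, then require a w.
# The sixth character of the permission string is the group write bit.
group_writable=$(printf '%s\n' "$listing" | grep '^.\{5\}w')
printf '%s\n' "$group_writable"
```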

7
Regular Expressions
  • The \{5\} signifies that the previous character
    (.) has to occur five times. The . (dot)
    character is used to match any character.
  • The IRE has three forms:
  • ch\{m\}    The metacharacter ch can occur m times.
  • ch\{m,n\}  ch can occur between m and n times.
  • ch\{m,\}   ch can occur at least m times.
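
The three IRE forms can be compared side by side on a made-up word list. A minimal sketch; the sample words are chosen only so that each form selects a different subset.

```shell
# Hypothetical input: one to four leading a's followed by b.
words='ab
aab
aaab
aaaab'

# a\{2\}   : exactly two a's before the b
exactly_two=$(printf '%s\n' "$words" | grep '^a\{2\}b')

# a\{2,3\} : between two and three a's
two_to_three=$(printf '%s\n' "$words" | grep '^a\{2,3\}b')

# a\{3,\}  : at least three a's
at_least_three=$(printf '%s\n' "$words" | grep '^a\{3,\}b')

printf 'm: %s\n' "$exactly_two"
printf 'm,n: %s\n' "$two_to_three"
printf 'm,: %s\n' "$at_least_three"
```

The ^ anchor matters here: without it, a\{2\}b would also match inside aaab, since any two a's followed by b would do.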

8
Regular Expressions
  • We can display the listing for those files that
    have the write bit set either for group or
    others
  • ls -l | grep '^.\{5,8\}w'
  • To locate the people born in 1945 in the sample
    database, use sed as follows
  • sed -n '/^.\{49\}45/p' emp.lst
  • The tagged regular expression (TRE) uses \( and
    \) to enclose a pattern.
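
The two interval searches on this slide can be sketched on made-up data. The listing and the fixed-width records below are hypothetical; in particular, the year is placed at column 17 here, so the offset is 16 rather than the 49 used for the sample database.

```shell
# Hypothetical ls -l lines: write bit for group, for others, and for neither.
listing='-rw-rw-r-- 1 user staff 1 Jan  1 10:00 group.txt
-rw-r--rw- 1 user staff 1 Jan  1 10:00 others.txt
-rw-r--r-- 1 user staff 1 Jan  1 10:00 neither.txt'

# A w preceded by five to eight characters covers the group and others write bits.
writable=$(printf '%s\n' "$listing" | grep -c '^.\{5,8\}w')
echo "group- or other-writable: $writable"

# Hypothetical fixed-width records: 10-character name, then dd/mm/yy.
people='a. kumar  12/03/45
b. singh  30/11/52
c. 45rao  01/01/60'

# Skip 16 characters, then require 45 at the year position; -n with p prints matches only.
born_45=$(printf '%s\n' "$people" | sed -n '/^.\{16\}45/p')
printf '%s\n' "$born_45"
```

The third record contains 45 in the name field but is not selected, which shows that the interval pins the match to a column.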

9
Regular Expressions
  • Suppose you want to replace the words John Wayne
    by Wayne, John. The sed substitution instruction
    will then look like this
  • echo "John Wayne" | sed 's/\(John\) \(Wayne\)/\2, \1/'
  • Because the TRE remembers a grouped pattern, you
    can look for these repeated words like this
  • grep '\([a-z][a-z][a-z]*\) \1' note
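
Both uses of the TRE can be sketched on inline text. The contents of the note file are not given on the slide, so the two lines below are made up.

```shell
# \1 and \2 refer to the first and second tagged groups, so the names swap places.
swapped=$(echo 'John Wayne' | sed 's/\(John\) \(Wayne\)/\2, \1/')
printf '%s\n' "$swapped"

# Hypothetical stand-in for the file note.
note='paris in the the spring
nothing repeated here'

# Tag a run of two or more lowercase letters, then require a space and the same run.
repeats=$(printf '%s\n' "$note" | grep '\([a-z][a-z][a-z]*\) \1')
printf '%s\n' "$repeats"
```

The pattern is not anchored to word boundaries, so it also fires when the end of one word repeats as the next word, as in "this is".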

10
Regular Expressions
  • These are pattern matching options used by grep,
    sed, and perl (Page 441)
  • abc  match the character string abc.
  • *    zero or more occurrences of previous
    character.
  • .    match any character except newline.
  • .*   nothing or any number of characters.
  • a?   match zero or one instance of a.
  • a*   match zero or more repetitions of a.

11
Regular Expressions
  • [abcde]   match any character within the
    brackets.
  • [a-b]     match any character within the range a to
    b.
  • [^abcde]  match any character except those
    within the brackets.
  • [^a-b]    match any character except those in the
    range a to b.
  • ^         match beginning of line, e.g., /^/.
  • ^$        lines containing nothing.

12
Regular Expressions
  • $         match end of line, e.g., /money.$/.
  • a\{2\}    match exactly two repetitions of a.
  • a\{4,\}   match four or more repetitions of a.
  • a\{2,4\}  match between two and four
    repetitions of a.
  • \(exp\)   tag expression exp for later referencing
    with \1, \2, etc.
  • a|b       match a or b.
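
A few entries from the table can be spot-checked with grep -c, which prints the number of matching lines. This is a minimal sketch on made-up input. One assumption to note: with POSIX grep the alternation a|b belongs to extended regular expressions, so it is run with the -E option here.

```shell
# Hypothetical input: five lines, the second one empty.
sample='abc

money
aa
xyz'

# ^$ matches lines containing nothing.
empty_lines=$(printf '%s\n' "$sample" | grep -c '^$')

# y$ matches lines that end in y.
ends_with_y=$(printf '%s\n' "$sample" | grep -c 'y$')

# ^[^abc] matches lines whose first character is not a, b, or c.
not_abc_first=$(printf '%s\n' "$sample" | grep -c '^[^abc]')

# (a|x) is alternation; -E selects extended regular expressions.
a_or_x_first=$(printf '%s\n' "$sample" | grep -E -c '^(a|x)')

echo "$empty_lines $ends_with_y $not_abc_first $a_or_x_first"
```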