Unix Filter Program:2 - PowerPoint PPT Presentation

Provided by: stormCis
1
Unix Filter Program: 2
  • CSRU3130, Spring 2008
  • Ellen Zhang

2
Outline
  • Quiz
  • grep family and regular expressions

3
grep Family
  • Syntax:
  • grep [-hilnv] [-e expression] [filename]
  • egrep [-hilnv] [-e expression] [-f filename]
    [expression] [filename]
  • fgrep [-hilnxv] [-e string] [-f filename]
    [string] [filename]
  • -h: Do not display filenames
  • -i: Ignore case
  • -l: List only filenames containing matching
    lines
  • -n: Precede each matching line with its line
    number
  • -v: Negate matches
  • -x: Match whole line only (fgrep only)
  • -e expression: Specify expression as option
  • -f filename: Take the regular expression (egrep)
    or a list of strings (fgrep) from filename

4
Regular Expressions
  • The simplest regular expression is a string of
    literal characters to match.
  • A line matches the regular expression if it
    contains that string as a substring.
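A minimal sketch of literal matching. The sample lines are made up for illustration and are not from the slides.

```shell
# Hypothetical sample input
lines='the women arrived
nothing here
amen to that'

# A literal pattern matches every line that contains it as a substring
matched=$(printf '%s\n' "$lines" | grep 'men')
printf '%s\n' "$matched"
```

Both "women" and "amen" contain men, so those two lines are printed.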

5
Character Classes
  • Syntax: [abc], [a-z1-9], [^1-9]
  • Named character classes
  • Commonly used character classes can be referred
    to by name (alpha, lower, upper, alnum, digit,
    punct, cntrl)
  • Syntax: [:name:]
  • [a-zA-Z] is the same as [[:alpha:]]
  • [a-zA-Z0-9] is the same as [[:alnum:]]
  • [45a-z] is the same as [45[:lower:]]
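A short sketch of a bracket expression and a named class, run on made-up input. The ^ and $ anchors are added here so the second pattern has to cover the whole line.

```shell
# Hypothetical sample input
words='alpha
42
b7
_x'

# Bracket expression: lines containing at least one digit
with_digit=$(printf '%s\n' "$words" | grep '[0-9]')

# Named class inside a bracket expression: lines made only of letters
letters_only=$(printf '%s\n' "$words" | grep '^[[:alpha:]]*$')

printf '%s\n' "$with_digit" "$letters_only"
```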

6
Protecting Regex Metacharacters
  • Regex metacharacters: ., [, ], ^, $, *, ?, +, (, ), |
  • Many of them also have special meaning to the shell
  • *, ? are used for filename expansion
  • $ is used to access variable values
  • To protect special characters, use single
    quotes
  • They ask the shell to leave the quoted string alone
    and pass it unchanged to the command (grep)
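A small sketch of why the quoting matters, using a temporary file and a variable x that are both invented for this example. It relies on the fact that in a basic regular expression a $ that is not at the end of the pattern is an ordinary character.

```shell
tmp=$(mktemp)
printf 'total $x\ntotal 5\n' > "$tmp"
x=5

# Single quotes: grep receives the two characters $x unchanged
single=$(grep '$x' "$tmp")

# Double quotes: the shell substitutes the value of x first, so grep sees 5
double=$(grep "$x" "$tmp")

rm -f "$tmp"
printf '%s\n' "$single" "$double"
```

The same pattern text selects different lines depending only on how the shell was told to treat it.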

7
Repetition Ranges
  • egrep syntax:
  • {n} means exactly n occurrences
  • {n,} means at least n occurrences
  • {n,m} means at least n occurrences but no more
    than m occurrences
  • Examples:
  • .{0,} is the same as .*
  • a{2,} is the same as aaa*
  • grep syntax: \{n,m\}
  • grep '[0-9]\{1,4\}' file.txt
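The two spellings can be compared side by side. This sketch uses grep -E, the POSIX spelling of egrep, on invented input, and anchors the pattern so the whole line must be one to four digits.

```shell
# Hypothetical sample input
nums='7
1234
12345
abc'

# Extended syntax: plain braces
ere=$(printf '%s\n' "$nums" | grep -E '^[0-9]{1,4}$')

# Basic syntax: the same interval with backslashed braces
bre=$(printf '%s\n' "$nums" | grep '^[0-9]\{1,4\}$')

printf '%s\n' "$ere"
```

Both commands select 7 and 1234; the five-digit line exceeds the upper bound.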

8
egrep sub-expressions
  • Use ( ) to group part of an expression into a
    sub-expression
  • Sub-expressions are treated like a single
    character
  • A following * or { } applies to the whole sub-expression
  • abc* matches ab, abc, abcc, abccc, ...
  • (abc)* matches zero or more repetitions of abc: abc, abcabc, abcabcabc, ...
  • (abc){2,3} matches abcabc or abcabcabc
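A sketch contrasting a quantifier on a single character with one on a group. The input is invented, and the anchors are added so each pattern must account for the whole line.

```shell
# Hypothetical sample input
samples='ab
abcc
abcabc
abcabcabcabc'

# Without parentheses the star applies to c only
star_c=$(printf '%s\n' "$samples" | grep -E '^abc*$')

# With parentheses the interval applies to the whole group abc
group=$(printf '%s\n' "$samples" | grep -E '^(abc){2,3}$')

printf '%s\n' "$star_c" "$group"
```

The last input line repeats abc four times, so it falls outside {2,3}.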

9
egrep alternation
  • Alternation character | for matching one or
    another sub-expression
  • (T|Fl)an will match Tan or Flan
  • ^(From|Subject|To) matches From, Subject, To lines
    of an email message
  • At(ten|nine)tion matches Attention or
    Atninetion
  • Atten|ninetion matches Atten or ninetion
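A sketch of how grouping changes the reach of the alternation, using an invented word list. Neither pattern is anchored, so a line is selected when it contains a match anywhere.

```shell
# Hypothetical sample input
words='Attention
Atninetion
Atten
ninetion
tion'

# Alternation limited to the group: At, then ten or nine, then tion
grouped=$(printf '%s\n' "$words" | grep -E 'At(ten|nine)tion')

# No group: the alternatives are the whole strings Atten and ninetion
ungrouped=$(printf '%s\n' "$words" | grep -E 'Atten|ninetion')

printf '%s\n' "$grouped"
```

The ungrouped form also selects the short lines Atten and ninetion, which the grouped form rejects.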

10
egrep Repetition Shorthands
  • * (star): zero or more occurrences of the
    immediately preceding character or
    sub-expression
  • a*, [1-9]*, (abc|ABC)*
  • + (plus): one or more occurrences
  • abc+d will match abcd, abccd, or abccccccd
    but will not match abd
  • Equivalent to {1,}
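A sketch comparing star, plus, and the equivalent interval on the slide's own strings. grep -E stands in for egrep, and the anchors are an addition for whole-line matching.

```shell
words='abd
abcd
abccccccd'

# Star: zero or more c, so abd is accepted too
star=$(printf '%s\n' "$words" | grep -E '^abc*d$')

# Plus: at least one c is required
plus=$(printf '%s\n' "$words" | grep -E '^abc+d$')

# The interval {1,} selects the same lines as plus
interval=$(printf '%s\n' "$words" | grep -E '^abc{1,}d$')

printf '%s\n' "$plus"
```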

11
egrep Repetition Shorthands
  • ? (question mark): makes the single character
    that immediately precedes it optional
  • July? will match Jul or July
  • Equivalent to {0,1}
  • Also equivalent to (Jul|July)
  • *, ?, and + are known as quantifiers because they
    specify the quantity of a match
  • Quantifiers can also be used with sub-expressions
  • (a*c)+ will match c, ac, aac or aacaacac
    but will not match a or a blank line
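A sketch of the optional character and of a quantified sub-expression. The extra inputs (Julyy and the empty line) are invented to show what gets rejected, and the anchors are an addition.

```shell
# ? makes the preceding y optional
optional=$(printf '%s\n' Jul July Julyy | grep -E '^July?$')

# (a*c)+ needs at least one group that ends in c;
# the last input line is empty
grouped=$(printf 'c\naacaacac\na\n\n' | grep -E '^(a*c)+$')

printf '%s\n' "$optional" "$grouped"
```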

12
grep Examples
  • grep 'men' GrepMe
  • grep 'fo*' GrepMe
  • egrep 'fo+' GrepMe
  • egrep -n '[Tt]he' GrepMe
  • fgrep 'The' GrepMe
  • egrep 'NC+[0-9]*A?' GrepMe
  • fgrep -f expfile GrepMe
  • Find all lines with signed numbers
  • egrep '[-+][0-9]+\.?[0-9]*' *.c
    bsearch.c: return -1;
    compile.c: strchr("+1-2*3", t->op)[1] - '0', dst,
    convert.c: Print integers in a given base 2-16 (default 10)
    convert.c: sscanf( argv[i+1], "% d", &base);
    strcmp.c: return -1;
    strcmp.c: return +1;
13
Grep Backreferences
  • Sometimes it is handy to refer to a
    match that was made earlier in a regex
  • Example: find all accounts whose userid is the same as the groupid
  • This is done using backreferences
  • Use \( and \) to mark sub-expressions (tagged
    expressions)
  • \n is the backreference specifier; it matches exactly the
    string matched by the nth tagged expression
14
Back-references
  • For example, search all lines where the first
    word is the same as the last
  • grep '^\([[:alpha:]]\{1,\}\) .* \1$' filename
  • The \([[:alpha:]]\{1,\}\) matches 1 or more
    letters
  • Find accounts whose uid = groupid
  • grep ':\([0-9]*\):\1:' /etc/passwd
  • Note: one regex can have multiple backreferences
15
Practical Regex Examples
  • Variable names in C
  • [a-zA-Z_][a-zA-Z_0-9]*
  • Dollar amount with optional cents
  • \$[0-9]+(\.[0-9][0-9])?
  • Time of day
  • (1[012]|[1-9]):[0-5][0-9] (am|pm)
  • HTML headers <h1> <H1> <h2> ...
  • <[hH][1-4]>
16
Examples
  • Interesting examples of grep commands
  • To search for lines that have no digit character
  • grep -v '[0-9]' filename
  • Look for users with uid = 0 (root permission)
  • grep '^[^:]*:[^:]*:0:' /etc/passwd
  • To search for users without passwords
  • grep '^[^:]*::' /etc/passwd
17
Specify pattern in files
  • The -f option is useful for complicated patterns; there is also
    no need to worry about shell interpretation.
  • Example
  • cat alphvowels
    ^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$
  • egrep -f alphvowels /usr/share/dict/words
    abstemious ... tragedious
18
Quick Reference
[Figure: the regular expression o.*o applied to the input line "This is one line of text", with regex features grouped by the tools that support them (fgrep, grep, egrep; grep, egrep; grep; egrep).]