awk and nawk - PowerPoint PPT Presentation

Author: Dr. Tim Gottleber

Provided by: drtimgo

1
Chapter 13
  • awk (and nawk)

2
Overview
  • The awk programming language was written by
  • Alfred Aho
  • Peter Weinberger
  • Brian Kernighan
  • It was designed as a generalized tool, based upon
    grep and sed, for handling numbers and text.

3
Overview (cont.)
  • awk is a fully functional programming language
  • With awk, we finally have field level
    addressability!
  • awk works field by field

4
awk command syntax
  • There are two ways to execute an awk
    program/script
  • awk -F field-separator program target-file
  • awk -F field-separator -f program.file target
  • From our discussion of sed, and Refrigerator Rule
    No. 5, I would hope you are firmly committed to
    the second form!
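
As a minimal sketch of the two forms, assuming a hypothetical colon-separated data set fed on standard input instead of a target file: the function names and the scratch program file are made up for illustration.

```shell
# Hypothetical sample data: colon-separated "name:score" records.
sample() { printf 'ann:90\nbob:72\n'; }

# Form 1: the program is given inline on the command line.
inline_form() { sample | awk -F: '{ print $1 }'; }

# Form 2: the program lives in its own file and is loaded with -f.
file_form() {
    prog=$(mktemp)                          # scratch file standing in for program.file
    printf '%s\n' '{ print $1 }' > "$prog"
    sample | awk -F: -f "$prog"
    rm -f "$prog"
}

inline_form
file_form
```

Both forms should print the same first fields; the second keeps the awk code out of the shell's quoting rules, which is one reason to prefer it.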

5
awk Variables
  • There are a number of awk variables that are very
    useful
  • FS (The field separator, defaults to white space)
  • OFS (Output field separator, can be critical)
  • NR (Number of records, a sequential counter)
  • NF (Number of fields in the current record)
  • FILENAME (Name of the current target file)

6
awk Variables (cont.)
  • $0 (The entire line as read from the target file)
  • $n (Where n is a field number, so $n is the nth
    field in the record. This is how we get field level
    addressability in awk)
  • nawk, gawk, etc. give us more variables, the most
    significant two are
  • ARGC (the count of the command line arguments)
  • ARGV (an array of the command line arguments)
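
A small sketch of these variables in use, assuming made-up whitespace-separated records and hypothetical function names. ARGC counts the command name itself, so the sketch subtracts one.

```shell
# NR, NF, $1 and $NF (the last field) for each record.
show_vars() {
    printf 'ann smith 90\nbob jones 72 late\n' |
    awk '{ print NR, NF, $1, $NF }'
}

# ARGC and ARGV in a BEGIN-only program; "first" and "second" are
# placeholder arguments and are never opened as files.
show_args() {
    awk 'BEGIN { print ARGC - 1, ARGV[1], ARGV[2] }' first second
}

show_vars
show_args
```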

7
The parts of a program
  • All programs are composed of one or more of the
    following three constructs
  • sequence (a series of instructions, one following
    the next, executed sequentially)
  • selection (the ability of the code to decide
    which instructions to execute, conditional
    execution)
  • iteration (adding looping so that selected code
    will be repeated over and over)

8
awk Program Format
  • Awk programs are composed of pattern { action }
    pairs (actions must be enclosed in French braces,
    { })
  • a pattern without a corresponding action takes
    the default action, print $0
  • an action without a corresponding pattern is
    applied to every line
  • each input line is submitted to every
    pattern/action pair

9
awk Program Format (cont.)
  • Placement of the open French brace is critical
  • pattern { action 1 action 2 } with the open brace on
    the same line as the pattern: both actions are
    executed for lines matching the pattern
  • pattern on a line by itself, followed by { action 1
    action 2 } starting on the next line: lines matching
    the pattern are printed, and both actions are
    performed on every line!
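
The brace placement rule can be sketched as follows, assuming two made-up input lines and hypothetical function names. The only difference between the two programs is where the open brace sits.

```shell
# Brace on the pattern line: the action runs only for matching lines.
same_line() {
    printf 'apple\nbanana\n' | awk '/apple/ {
        print "matched: " $0
    }'
}

# Brace on the next line: the pattern stands alone (default action,
# print the line) and the braced action applies to every line.
next_line() {
    printf 'apple\nbanana\n' | awk '/apple/
    {
        print "every: " $0
    }'
}

same_line
next_line
```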

10
Patterns
  • In an awk program, the pattern is the selection
    tool that decides what actions are applied to
    which lines.
  • Patterns can be
  • relational expressions
  • regular expressions
  • magic patterns

11
Relational Expression patterns
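
The content of this slide did not survive in the transcript. As a stand-in sketch, assuming made-up "name score" records: a relational expression compares a field or variable with a value, and the action runs only when the comparison is true.

```shell
# Select records whose second field is at least 85.
high_scores() {
    printf 'ann 90\nbob 72\ncat 85\n' | awk '$2 >= 85 { print $1 }'
}

# A relational pattern with no action prints the whole record.
first_record() {
    printf 'ann 90\nbob 72\ncat 85\n' | awk 'NR == 1'
}

high_scores
first_record
```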
12
Regular Expression patterns
  • Must be enclosed in slashes /RE/
  • Anchors apply to the entire line if they are used
    as the only pattern
  • Remember, you can use regular expressions in
    relational patterns with ~ and !~ to apply them
    to fields
  • Both true regular expressions and fixed patterns
    can be used as REs in awk
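
A brief sketch of the difference between a bare /RE/ pattern and a field match, assuming made-up records and hypothetical function names.

```shell
data() { printf 'ann 90\nbob 72\ncat 95\n'; }

# A bare /RE/ is tested against the whole line, so ^ anchors to its start.
whole_line()    { data | awk '/^b/'; }

# With ~ the RE is applied to one field; !~ selects the non-matches.
field_match()   { data | awk '$2 ~ /^9/ { print $1 }'; }
field_nomatch() { data | awk '$2 !~ /^9/ { print $1 }'; }

whole_line
field_match
field_nomatch
```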

13
Magic Patterns
  • There are two optional magic patterns in awk
  • BEGIN the action associated is performed before
    the target file is opened
  • END the action associated is performed after the
    target file is successfully closed
  • Both are coded in UPPER CASE
  • You can use one without the other, they are NOT
    related in any way.
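
One way to picture the magic patterns, as a sketch over made-up "name score" records: BEGIN prints a heading before any input is read, and END reports a total after the last record.

```shell
sum_scores() {
    printf 'ann 90\nbob 72\n' | awk '
        BEGIN { print "name score" }        # runs before any input is read
              { total += $2; print $1, $2 } # no pattern: runs for every record
        END   { print "total", total }      # runs after the last record
    '
}

sum_scores
```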

14
Actions
  • Actions are composed of different awk verbs
  • An action can be a single verb, or hundreds of
    them.
  • Each action is enclosed in a pair of French
    braces
  • Some awk scripts are just a single action
  • Let the commands do the work!

15
Actions comments
  • awk scripts need to be well documented so you
    know why you did what you did
  • Don't create comments that explain what you are
    doing; always explain why.

16
Actions print
  • The print command is the simplest output tool
    for awk
  • You can direct print to send its data to a file
    with the > operator
  • Generally print is used for simple output or
    debugging output
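
A minimal sketch of print with OFS and the > operator, assuming a scratch file created with mktemp; the variable name out and the function name are hypothetical.

```shell
print_demo() {
    out=$(mktemp)                           # scratch output file
    printf 'ann 90\nbob 72\n' |
    awk -v out="$out" 'BEGIN { OFS = "," } { print $1, $2 > out }'
    cat "$out"                              # show what awk wrote
    rm -f "$out"
}

print_demo
```

The commas in the print list are what bring OFS into play; writing print $1 $2 would simply concatenate the two fields.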

17
Actions printf
  • This is the outstanding formatting tool awk is
    famous for
  • the format of a printf command is
    printf("formatting string", variables)
  • The formatting characters correspond to the
    variables one for one
  • Each formatting character is prefixed by %

18
Actions printf (cont.)
  • The formatting specifiers contain the following
    characters
  • - indicates that the data should be left justified
  • n indicates the minimum width of the field
  • .n indicates the maximum width of the field (for
    strings)
  • %-5s indicates a string field, left justified, of
    width 5 bytes
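
The specifier parts can be sketched together like this, assuming made-up records. With %f, the .n part sets the number of digits after the decimal point, which is standard printf behavior and not stated on the slide.

```shell
format_demo() {
    printf 'ann 90.5\nbob 7.2\n' |
    # %-5s  : string, left justified, minimum width 5
    # %6.2f : number, minimum width 6, two digits after the point
    # %.2s  : string, at most 2 characters
    awk '{ printf("%-5s|%6.2f|%.2s\n", $1, $2, $1) }'
}

format_demo
```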

19
Actions printf
20
Actions printf formatting characters
21
Actions printf spacing characters
  • There are two characters available to change the
    spacing of your text
  • \n inserts a newline character. You must use
    this if you want your output to occur on
    successive lines.
  • \t inserts a tab character
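
A short sketch of the spacing characters, with made-up text. Unlike print, printf adds no newline of its own, so the second and third calls below land on the same output line.

```shell
spacing_demo() {
    awk 'BEGIN {
        printf("name\tscore\n")     # tab between columns, newline at the end
        printf("no newline")        # nothing ends this line...
        printf(" yet\n")            # ...until a \n is printed
    }'
}

spacing_demo
```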

22
Actions getline
  • getline is used to read from the keyboard
  • It can also capture the results of a command but
    this form is seldom used
  • Read from the keyboard using getline variable
    < "/dev/tty"
  • If you don't supply a variable, awk will read into
    $0, overwriting the current record, so in most
    cases you want to use a variable.
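
A sketch of getline that can run without a keyboard, assuming hypothetical function names. It uses the command form, and it shows what happens to the current record when no variable is given. The keyboard form appears only as a comment.

```shell
# Keyboard form (interactive, not run here):
#   getline answer < "/dev/tty"

# Command form: capture the first line of a command's output.
command_demo() {
    awk 'BEGIN {
        "echo hello" | getline greeting
        close("echo hello")
        print "got: " greeting
    }'
}

# No variable: getline replaces $0 with the next input record.
clobber_demo() {
    printf 'one\ntwo\n' | awk 'NR == 1 { getline; print }'
}

command_demo
clobber_demo
```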

23
Actions rand() srand()
  • The rand() function generates pseudo-random
    numbers in the range 0 to 1 (0 is possible, 1 is
    not).
  • Given the same seed, it will always generate the
    same series of numbers.
  • srand() is used to supply a new seed to rand().
  • If you don't supply srand() a value, it uses the
    current time as the seed.
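
The seeding behavior can be sketched as follows; the seed 42 and the function names are arbitrary choices for illustration. The actual numbers differ between awk implementations, so none are shown.

```shell
# Same seed, same series: two separate runs print identical numbers.
seeded() { awk 'BEGIN { srand(42); print rand(), rand() }'; }

# A common idiom: scale rand() to a whole number from 1 to 6.
in_range() {
    awk 'BEGIN {
        srand(42)
        for (i = 0; i < 100; i++) {
            roll = int(rand() * 6) + 1
            if (roll < 1 || roll > 6) { print "out of range"; exit }
        }
        print "all rolls in 1..6"
    }'
}

seeded
in_range
```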

24
Actions system()
  • The system() function allows you to execute
    system commands within an awk script.
  • You must enclose the system command in quotation
    marks.
  • You cannot capture the output from the system()
    function within the script but you can capture
    the return code.
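
A minimal sketch of system(), assuming two harmless commands chosen for illustration. The commands produce no output of their own; only the return codes are printed.

```shell
system_demo() {
    awk 'BEGIN {
        rc = system("test -d /")    # the command is a quoted string
        print "rc=" rc
        rc = system("exit 3")       # a command that fails with code 3
        print "rc=" rc
    }'
}

system_demo
```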

25
Actions length()
  • The length(argument) function returns the
    length of the argument in bytes.
  • If you give length() a number, it will return the
    number of digits in the number.
  • If you don't give length() an argument, it will
    use $0 by default.
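
The three cases above, sketched on one made-up input line with a hypothetical function name:

```shell
length_demo() {
    printf 'hello world\n' |
    # a field, a number, and no argument (the whole record)
    awk '{ print length($1), length(12345), length() }'
}

length_demo
```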

26
Actions index()
  • The index(string,target) function returns the
    position of the first occurrence of the target
    within the string.
  • This is useful for identifying sections of the
    string for later division.
  • The index() function is often used to set the
    boundary for the substr() function.
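
A quick sketch of index(), using a made-up address as the string. Positions count from 1, and a target that does not occur gives 0.

```shell
index_demo() {
    awk 'BEGIN { print index("user@example.com", "@"), index("abc", "z") }'
}

index_demo
```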

27
Actions substr()
  • The substr(string,start,length) function will
    return the part of the string beginning with
    start and continuing for length bytes.
  • If you don't give it a length, it will return all
    the bytes between the start and the end of the
    string.
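
A sketch of substr() with index() setting the boundary, using a made-up address; the variable names are hypothetical.

```shell
substr_demo() {
    awk 'BEGIN {
        s  = "user@example.com"
        at = index(s, "@")              # boundary between the two parts
        print substr(s, 1, at - 1)      # start and length given
        print substr(s, at + 1)         # no length: runs to the end
    }'
}

substr_demo
```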

28
Actions split()
  • You will use split(string, array, separator) to
    divide a string into parts using separator to
    parse them, storing the resultant parts in the
    array.
  • If you don't code a separator, the function will
    use the field separator to parse the string.
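
A sketch of split() with and without a separator, on made-up strings. The function returns the number of parts it stored, which the sketch prints first.

```shell
split_demo() {
    awk 'BEGIN {
        n = split("2024-01-15", parts, "-")     # explicit separator
        print n, parts[1], parts[3]
        n = split("a b  c", words)              # default: the field separator
        print n, words[3]
    }'
}

split_demo
```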

29
Actions if
  • Besides using patterns, if gives us another way
    to perform selection
  • The format of an if statement is if
    (condition) verb(s) else
    verb(s)
  • If you have more than one verb, they must be
    enclosed in French braces.

30
Actions if operators
31
Actions if
  • A sample if
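
The original sample did not survive in the transcript. As a stand-in sketch over made-up "name score" records: the true branch has two verbs, so it needs French braces, while the single-verb else branch does not.

```shell
grade() {
    printf 'ann 90\nbob 72\n' | awk '{
        if ($2 >= 80) {             # two verbs: braces required
            status = "pass"
            passed++
        } else                      # one verb: braces optional
            status = "retry"
        print $1, status
    }
    END { print passed " passed" }'
}

grade
```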

32
Actions exit
  • The input file is closed
  • Control is transferred to the action associated
    with the END magic pattern if there is one
  • Generally used as a bailout in case of
    catastrophic errors
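
A sketch of exit as a bailout, assuming made-up records in which a first field of "bad" stands for the catastrophic error. The END action still runs, and the value given to exit becomes awk's exit status.

```shell
exit_demo() {
    printf 'ok 1\nbad x\nok 3\n' | awk '
        $1 == "bad" { print "bailing at record " NR; exit 1 }
                    { print $0 }
        END         { print "END still ran" }
    '
}

exit_demo || echo "awk exit status: $?"
```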

33
Actions for loop
  • Our first example of iteration
  • This is a counted loop
  • Will execute until the counter reaches the target
    value
  • Can increment (count up) or decrement (count
    down)
  • for also works with the elements of an array
  • multiple verbs must be enclosed in { }

34
Actions for loop example
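
The original example did not survive in the transcript. As a stand-in sketch on one made-up input line: a counted loop over the fields, then the array form of for.

```shell
for_demo() {
    printf '3 4 3\n' | awk '{
        for (i = 1; i <= NF; i++) {     # counted loop, counting up
            sum += $i
            seen[$i] = 1                # remember each distinct value
        }
        for (key in seen)               # array form; key order is unspecified
            distinct++
        print sum, distinct
    }'
}

for_demo
```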
35
Actions while loop
  • The while loop is an example of conditional
    execution
  • The loop cycles as long as the condition
    specified is true
  • A while loop always checks to see if it should
    execute
  • multiple verbs must be enclosed in { }

36
Actions while loop example
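
The original example did not survive in the transcript. As a stand-in sketch with made-up numbers: the loop keeps halving n for as long as the condition holds, and the condition is checked before every pass.

```shell
while_demo() {
    awk 'BEGIN {
        n = 100
        while (n > 10) {            # tested before each pass
            n = int(n / 2)
            steps++
        }
        print n, steps
    }'
}

while_demo
```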
37
Actions do/while
  • Even though it has a while in it, this is an
    example of until logic.
  • Until logic is shunned by conscientious coders.
  • 'Nuff said.
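
For reference, a sketch of the practical difference, with hypothetical counter names: the body of a do/while runs once before the condition is first tested, while a while loop with the same false condition never runs.

```shell
do_while_demo() {
    awk 'BEGIN {
        n = 0
        do { runs++ } while (n > 0)     # body runs once, then the test fails
        while (n > 0) loops++           # test fails first, body never runs
        print runs, loops + 0           # + 0 prints the unset counter as 0
    }'
}

do_while_demo
```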

38
Actions break
  • Used to exit from a loop
  • Control is passed to the line following the end
    of the loop
  • Causes an exit from the loop but NOT the awk
    script. If you want to bail out of the whole
    script, use the exit command.

39
Actions break example
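
The original example did not survive in the transcript. As a stand-in sketch on one made-up input line: the loop stops at the first negative field, and the script carries on with the print that follows the loop.

```shell
break_demo() {
    printf '5 8 -1 7\n' | awk '{
        for (i = 1; i <= NF; i++) {
            if ($i < 0) break           # leave the loop, not the script
            sum += $i
        }
        print "sum before first negative:", sum
    }'
}

break_demo
```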
40
Actions continue
  • Causes awk to skip the rest of the body of the
    loop for the current value
  • In a for loop the counter is incremented, and the
    next cycle of the loop is started
  • In a while loop, the next iteration of the loop
    starts

41
Actions continue example
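
The original example did not survive in the transcript. As a stand-in sketch on one made-up input line: negative fields are skipped, and the loop moves on to the next field.

```shell
continue_demo() {
    printf '5 8 -1 7\n' | awk '{
        for (i = 1; i <= NF; i++) {
            if ($i < 0) continue        # skip the rest of the body for this field
            sum += $i
        }
        print sum
    }'
}

continue_demo
```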
42
Actions next
  • Causes the script to start over at the first
    pattern/action pair
  • takes the next record from standard input or the
    target file
  • Like exit, this command affects the whole script

43
Actions next example
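
The original example did not survive in the transcript. As a stand-in sketch on made-up input: comment lines and blank lines are dropped by next, so the second pattern/action pair never sees them.

```shell
next_demo() {
    printf '# header\nann 90\n\nbob 72\n' | awk '
        /^#/ || NF == 0 { next }        # skip comments and blank lines
        { print NR ": " $1 }
    '
}

next_demo
```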
44
A very handy little vi trick
  • In vi place your cursor on an open {, [, or ( or
    a close }, ], or )
  • Touch the % (percent) key
  • The cursor will jump to the corresponding open or
    close element!
  • This is an easy way to ensure that you have
    closed all your containers!