Practical Extraction

1
Practical Extraction and Report Language
PERL
Joseph Beltran
2
What is PERL?
  • An interpreted language that is optimized for
    I/O, system tasks, and string manipulation
  • Larry Wall originally created PERL because he
    saw the need for a language that combines the
    best features of other scripting languages

3
Uses of PERL
  • Text Processing
      • can manipulate textual data, email, news
        articles, log files, or just about any kind of
        text, with great ease
  • System Administration
      • particularly useful for tying together lots of
        smaller scripts, working with file systems,
        networking, and so on
  • CGI and Web Programming
      • can be used to process and generate HTML
  • Other Uses
      • DNA sequencing for the Human Genome Project
      • NASA's Satellite Systems Control
      • Perl Data Language for number-crunching
      • Perl Object Environment for event-driven machines

4
Variables
  • PERL provides three kinds of variables
  • Scalars, Arrays, and Associative Arrays
  • The initial character of the name identifies the
    particular type of variable and, hence, its
    functionality.

5
Variables
  • Scalar Variables
  • $name
  • Strings and numbers, whether integers or decimals,
    are treated in the same way
  • $aVar = 4;
  • $bVar = 4.5;                  # a decimal number
  • $cVar = 3.14e10;              # a floating point number
  • $dVar = "a string of words";
  • $eVar = $aVar . $bVar;        # note use of . to
                                  # concatenate strings
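
The scalar rules above can be tried out with a minimal sketch like the following. The variable names follow the slide; `$sum` and `$joined` are names added here for illustration only.

```perl
use strict;
use warnings;

# Scalars hold numbers or strings; Perl converts between them as needed.
my $aVar = 4;                       # an integer
my $bVar = 4.5;                     # a decimal number
my $dVar = "a string of words";     # a string

my $sum    = $aVar + $bVar;         # numeric addition gives 8.5
my $joined = $aVar . $bVar;         # string concatenation gives "44.5"

print "sum = $sum, joined = $joined, text = $dVar\n";
```

The same two values give different results depending on the operator, because the operator, not the variable, decides whether a scalar is treated as a number or a string.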

6
Variables
  • Arrays
  • @name
  • Single-dimension list of scalars
  • @aList = (2, 4, 6, 8);             # explicit values
  • @aList = (1..4);                   # range of values
  • @aList = (1, "two", 3, "four");    # mixed values
  • @aList = ();                       # empty list
  • $#aList                            # index of last item
  • $aList[0]                          # first item in @aList
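
A short sketch of the array notation above, assuming a list with four mixed values. The names `$first`, `$last_index`, `$count`, and `@range` are illustrative, not from the slides.

```perl
use strict;
use warnings;

my @aList = (1, "two", 3, "four");   # mixed values

my $first      = $aList[0];          # first item: 1
my $last_index = $#aList;            # index of last item: 3
my $count      = scalar @aList;      # number of items: 4

my @range = (1..4);                  # expands to 1, 2, 3, 4

print "first=$first last_index=$last_index count=$count\n";
print "range: @range\n";
```

Note that a single element is written with `$` rather than `@`, because the element itself is a scalar.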

7
Variables
  • Associative Arrays
  • %name
  • Conceptually a two-column table (a hash), for use
    with attribute/value pairs.
  • The first element in each row is a key and the
    second element is a value associated with that
    key.
  • $aAA{"A"} = 1;               # creates first row of associative
                                 # array
  • $aAA{"B"} = 2;               # creates second row of
                                 # associative array
  • %aAA = ("A", 1, "B", 2);     # same as first two
                                 # statements
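
The key/value behaviour described above might look like this in practice. This is a minimal sketch; the extra key `"C"` and the names `@keys` and `$has_b` are assumptions added for illustration.

```perl
use strict;
use warnings;

my %aAA = ("A", 1, "B", 2);     # list of key, value, key, value
$aAA{"C"} = 3;                  # add one more pair

# keys() returns the keys in no particular order, so sort them for display.
my @keys  = sort keys %aAA;
my $has_b = exists $aAA{"B"} ? 1 : 0;

for my $key (@keys) {
    print "$key => $aAA{$key}\n";
}
```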

8
Operators
  • If variables are the nouns, PERL provides
    operators, which are the verbs.
  • Operators access and change the values of
    variables.
  • Some assignments apply to all three kinds of
    variables. However, most are specialized with
    respect to their types.

9
Operators
  • Numeric Operators
  • + plus
  • - minus
  • * multiply
  • / divide
  • ** exponentiation
  • % modulus
  • == equal
  • != not equal
  • < less than
  • > greater than
  • <= less than or equal to
  • >= greater than or equal to
  • += binary assignment
  • -= same, subtraction
  • *= same, multiplication
  • ++ auto increment
  • -- auto decrement
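
A brief sketch of several numeric operators from the list above. The starting value 7 and the variable names are arbitrary choices for illustration.

```perl
use strict;
use warnings;

my $n = 7;
my $power = $n ** 2;    # exponentiation: 49
my $rem   = $n % 3;     # modulus: 1

$n += 3;                # binary assignment: $n is now 10
$n *= 2;                # $n is now 20
$n++;                   # auto increment: 21
$n--;                   # auto decrement: back to 20

my $is_twenty = ($n == 20) ? "yes" : "no";
print "power=$power rem=$rem n=$n is_twenty=$is_twenty\n";
```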

10
Operators
  • Literal Operators
  • . concatenate
  • x n repetition, e.g., "A" x 3 => "AAA"
  • eq equal
  • ne not equal
  • lt less than
  • gt greater than
  • le less than or equal to
  • ge greater than or equal to
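
The string operators matter because they compare character by character rather than by numeric value. The following sketch contrasts the two families on the same operands; the variable names are illustrative.

```perl
use strict;
use warnings;

my $repeated = "A" x 3;                        # "AAA"

my $num_equal = ("10" == "10.0") ? 1 : 0;      # 1: equal as numbers
my $str_equal = ("10" eq "10.0") ? 1 : 0;      # 0: different strings

my $num_less  = (9 < 10)      ? 1 : 0;         # 1: 9 is numerically smaller
my $str_less  = ("9" lt "10") ? 1 : 0;         # 0: "9" sorts after "1"

print "$repeated $num_equal $str_equal $num_less $str_less\n";
```

Choosing `==` versus `eq` is therefore a statement about how the values should be interpreted, not just a matter of style.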

11
Control Structures
  • PERL is an imperative language in which control
    flows from the first statement in the program to
    the last statement unless something interrupts.
  • Some of the things that can interrupt this linear
    flow are conditional branches and loop
    structures.

12
Control Structures
  • If Conditional Statement
  • if (expression_A) {
  •     A_true_stmt_1;
  •     A_true_stmt_2;
  • } elsif (expression_B) {
  •     B_true_stmt_1;
  •     B_true_stmt_2;
  • } else {
  •     false_stmt_1;
  • }
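
A concrete instance of the conditional form above. Note that Perl spells the keyword `elsif`, and the braces are required even around a single statement. The `classify` subroutine is a hypothetical helper written for this sketch.

```perl
use strict;
use warnings;

# classify() is a hypothetical helper, not part of the original slides.
sub classify {
    my ($n) = @_;
    if ($n < 0) {
        return "negative";
    } elsif ($n == 0) {
        return "zero";
    } else {
        return "positive";
    }
}

print classify(-5), " ", classify(0), " ", classify(12), "\n";
```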

13
Control Structures
  • While Loop Statement
  • LABEL: while (expression) {
  •     stmt_1;
  •     stmt_2;
  • }
  • Until Loop Statement
  • LABEL: until (expression) {
  •     stmt_1;
  •     stmt_2;
  • }
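
The difference between the two loops is only the sense of the test: `while` repeats as long as the expression is true, `until` repeats as long as it is false. A minimal sketch, with counters chosen for illustration:

```perl
use strict;
use warnings;

# while: runs as long as the condition is true.
my $i = 0;
my @seen;
while ($i < 3) {
    push @seen, $i;
    $i++;
}

# until: runs as long as the condition is false.
my $j     = 3;
my $steps = 0;
until ($j == 0) {
    $j--;
    $steps++;
}

print "seen: @seen, steps: $steps\n";
```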

14
Control Structures
  • For Loop Statement
  • LABEL: for (initial exp; test exp; increment exp) {
  •     stmt_1;
  •     stmt_2;
  • }
  • For Each Loop Statement
  • LABEL: foreach $i (@aList) {
  •     stmt_1;
  •     stmt_2;
  • }
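
The optional LABEL becomes useful with nested loops, where `next LABEL` or `last LABEL` names the loop to act on. The sketch below sums a list with `foreach` and then uses a labelled `for` loop; the label `OUTER` and the variable names are illustrative.

```perl
use strict;
use warnings;

# foreach: visit every element of a list.
my @aList = (2, 4, 6, 8);
my $total = 0;
foreach my $i (@aList) {
    $total += $i;
}

# Labelled for loop: "next OUTER" skips to the next iteration of the outer loop.
my @pairs;
OUTER: for (my $x = 1; $x <= 3; $x++) {
    for (my $y = 1; $y <= 3; $y++) {
        next OUTER if $y > $x;
        push @pairs, "$x$y";
    }
}

print "total=$total pairs=@pairs\n";
```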

15
Input / Output
  • PERL uses filehandles to control input and output
  • The standard ones are STDIN for reading input,
    STDOUT for printing output, and STDERR for writing
    error messages
  • Additional filehandles are created by the open
    function

16
Input / Output
  • Opening Files
  • Syntax: open (FILEHANDLE, "filename");
  • Examples
  • open (INPUT, "index.html");       # for reading
  • open (OUTPUT, ">index.html");     # for writing
  • open (OUTPUT, ">>index.html");    # for appending
  • Closing Files
  • Syntax: close (FILEHANDLE);
  • Example
  • close (INPUT);
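
A minimal sketch of writing, appending, and reading a file. It deliberately swaps in the three-argument form of `open` with lexical filehandles, which is the style generally recommended for current Perl, instead of the two-argument bareword form shown on the slide. The temporary file and all variable names are assumptions made so the example is safe to run.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example does not touch real data.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# '>' opens for writing (truncating the file).
open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "first line\n";
close($out);

# '>>' opens for appending.
open($out, '>>', $path) or die "Cannot append to $path: $!";
print $out "second line\n";
close($out);

# '<' opens for reading.
open(my $in, '<', $path) or die "Cannot read $path: $!";
my @lines = <$in>;
close($in);
chomp @lines;

print "read: $_\n" for @lines;
```

Checking the return value of `open` with `or die` reports the reason for a failure through `$!` instead of silently continuing with an unusable filehandle.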

17
Regular Expressions
  • Regular expressions give us extreme power to do
    pattern matching on text documents.
  • Patterns
  • Literal String Pattern
  • if ($a =~ /cat/) { print "cat found in $a"; }
  • Single-Character Pattern
  • /.at/ matches "cat" and "bat"
  • /[0-9]/ matches any one digit, 0 to 9
  • /[0123456789]/ matches any one digit, 0 to 9
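
The patterns above can be tested against a handful of sample strings. This is a sketch with made-up data; it uses `$text` rather than `$a` because `$a` is also the special variable used by `sort`.

```perl
use strict;
use warnings;

my @words = ("cat", "bat", "at", "cart", "room 101");

# "." matches any single character, so one character must precede "at".
my @dot_at    = grep { /.at/ } @words;     # cat, bat
# A character class matches exactly one of the listed characters.
my @has_digit = grep { /[0-9]/ } @words;   # room 101

my $text  = "the cat sat";
my $found = ($text =~ /cat/) ? 1 : 0;      # literal string pattern

print "dot_at: @dot_at\n";
print "has_digit: @has_digit\n";
print "cat found in $text\n" if $found;
```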

18
Regular Expressions
  • Operators
  • Substitution
  • s/cat/dog/ replaces the first "cat" with "dog"
  • s/cat/dog/gi replaces every "cat" with "dog",
    ignoring case
  • Splitting
  • @a = split(/cat/, $a); splits $a into a list at
    each "cat"
  • Joining
  • $a = join("", "cat", "dog", "bird"); returns
    "catdogbird"

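The three operators can be seen side by side in a sketch like this one. The sample strings and variable names are invented for illustration; the first argument to `join` is the separator placed between the items.

```perl
use strict;
use warnings;

my $text = "Cat and cat";

# Copy the string, then substitute in the copy.
(my $first_only = $text) =~ s/cat/dog/;     # "Cat and dog" (first case-sensitive match)
(my $all        = $text) =~ s/cat/dog/gi;   # "dog and dog" (global, ignoring case)

# split returns the pieces found between the separators.
my @pieces = split(/cat/, "onecattwo");     # ("one", "two")
my @parts  = split(/,/, "cat,dog,bird");    # ("cat", "dog", "bird")

# join glues a list together with the given separator.
my $glued  = join("", @parts);              # "catdogbird"
my $dashed = join("-", @parts);             # "cat-dog-bird"

print "$first_only | $all | @pieces | $glued | $dashed\n";
```
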
19
Examples
  • Example 1
  • STDIN and STDOUT, Looping and Conditions
  • Example 2
  • SEARCH and REPLACE strings
  • Example 3
  • FILE READING and WRITING
  • Sample scripts are run using ActivePerl 5.6
    from www.activestate.com