Perl tutorial - PowerPoint PPT Presentation

Provided by: AriJ3

1
Perl tutorial
2
Running Perl code
  • from command line
  • perl -e 'print 12/4, "\n"'
  • from a file
  • perl myprogram.pl
  • interactively
  • gdsh

3
Variables
  • A scalar represents a single value
  • $room = 219;
  • $name = 'Ari';
  • $sentence = "$name sits in room $room";
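
A minimal sketch of scalar assignment and string interpolation, using the values from the slide. The `use strict`, `use warnings` and `my` declarations are additions that the slides do not show.

```perl
use strict;
use warnings;

# Scalars hold a single value: a number or a string.
my $room = 219;
my $name = 'Ari';

# Double-quoted strings interpolate scalar variables.
my $sentence = "$name sits in room $room";
print "$sentence\n";   # prints: Ari sits in room 219
```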

4
Data structures in Perl
  • An array represents a list of values
  • @data = ('Ari', 219, 'Maa-123.490');
  • $data[0], $#data, @data[2..3], @data
  • A hash represents a set of key/value pairs
  • %data = (name => 'Ari', room => 219,
    course => 'Maa-123.490');
  • @k = keys %data;
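
The array and hash operations above can be sketched as follows. The hash is named `%info` here only to keep it visually apart from `@data`; that name, and the `sort` around `keys`, are assumptions made for the illustration.

```perl
use strict;
use warnings;

# An array is an ordered list of scalars.
my @data = ('Ari', 219, 'Maa-123.490');
my $first      = $data[0];      # element access uses the $ sigil
my $last_index = $#data;        # index of the last element
my @tail       = @data[1..2];   # a slice returns a list
my $count      = scalar @data;  # an array in scalar context gives its length

# A hash maps keys to values.
my %info = (name => 'Ari', room => 219, course => 'Maa-123.490');
my @k = sort keys %info;        # keys come back in no fixed order, so sort them

print "$first, $last_index, @tail, $count\n";
print "keys: @k\n";
```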

5
References
  • References are the key to complex data structures
  • $people->{Ari}->{room}->{size}
  • $people is a reference to a hash
  • $people = {
  • Ari => {
  • room => {
  • size => 12 } } };
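
A minimal sketch of the nested structure on this slide, written out with all braces closed and then dereferenced with the arrow operator. The shorter form without the inner arrows is standard Perl but is not shown in the slides.

```perl
use strict;
use warnings;

# $people is a reference to a hash whose values are themselves hash references.
my $people = {
    Ari => {
        room => {
            size => 12,
        },
    },
};

# Follow the references with the arrow operator.
my $size = $people->{Ari}->{room}->{size};

# Arrows between adjacent subscripts are optional.
my $same = $people->{Ari}{room}{size};

print "size: $size\n";   # prints: size: 12
```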

6
Conditional constructs
  • if ( condition ) { ... }
  • elsif ( other condition ) { ... }
  • Negated version of if is unless

7
Looping constructs
  • while ( condition ) { ... }
  • Negated version of while is until

8
Looping constructs (for)
  • for ( $i = 0; $i < $max; $i++ ) { ... }
  • for ( @array ) { ... }
  • for my $key ( keys %array ) { ... }
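
The looping forms from the last two slides can be sketched together as below. The sample data and the variable names (`%ages`, `@seen`, and so on) are hypothetical and chosen only for the illustration.

```perl
use strict;
use warnings;

my @array = (3, 1, 2);
my %ages  = (a => 1, b => 2);   # hypothetical sample data

# C-style loop with an explicit counter.
my $sum = 0;
for ( my $i = 0; $i < @array; $i++ ) {
    $sum += $array[$i];
}

# List-style loop: each element is aliased to $_ in turn.
my $product = 1;
for (@array) {
    $product *= $_;
}

# Loop over hash keys with a named loop variable.
my @seen;
for my $key ( sort keys %ages ) {
    push @seen, "$key=$ages{$key}";
}

# until is the negated form of while.
my $n = 0;
until ( $n >= 3 ) {
    $n++;
}

print "$sum $product @seen $n\n";   # prints: 6 6 a=1 b=2 3
```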

9
Builtin operators and functions
  • Arithmetic, boolean logic, ...
  • @sorted = sort @data;
  • @sorted = sort { $a <=> $b } @data;
  • @sorted = sort { $a cmp $b } @data;
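
The difference between the three `sort` calls only shows up with numeric data, so here is a small sketch with hypothetical numbers. It assumes nothing beyond core Perl.

```perl
use strict;
use warnings;

my @data = (10, 9, 100, 1);

# The default sort compares elements as strings.
my @as_strings = sort @data;                 # 1, 10, 100, 9

# <=> compares numerically; $a and $b are set by sort itself.
my @as_numbers = sort { $a <=> $b } @data;   # 1, 9, 10, 100

# cmp is the explicit string comparison (same order as the default).
my @explicit = sort { $a cmp $b } @data;

print "@as_strings\n@as_numbers\n";
```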

10
File I/O
  • open(INFILE, "input.txt") or die "Can't open
    input.txt: $!";
  • open(OUTFILE, ">output.txt") or die "Can't open
    output.txt: $!";
  • open(LOGFILE, ">>my.log") or die "Can't open
    logfile: $!";
  • my $line = <INFILE>;
  • my @lines = <INFILE>;
  • while (<INFILE>) assigns each line in
    turn to $_
  • print "Just read in this line: $_";
  • close INFILE;
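
A self-contained sketch of the same read and write steps. Note that it swaps in the three-argument form of `open` with lexical filehandles instead of the bareword handles on the slide, and uses the core `File::Temp` module to create a scratch directory; the file name and contents are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a throwaway directory so the sketch does not touch real files.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/input.txt";    # hypothetical file name

# '>' opens for writing and truncates; '>>' would append instead.
open( my $out, '>', $path ) or die "Can't open $path: $!";
print $out "first line\n", "second line\n";
close $out;

# '<' opens for reading.
open( my $in, '<', $path ) or die "Can't open $path: $!";
my $line  = <$in>;   # scalar context: reads one line
my @lines = <$in>;   # list context: reads all remaining lines
close $in;

# while (<$in>) assigns each line in turn to $_.
open( $in, '<', $path ) or die "Can't open $path: $!";
my $count = 0;
while (<$in>) {
    chomp;            # removes the trailing newline from $_
    $count++;
}
close $in;

print "read $count lines\n";   # prints: read 2 lines
```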

11
Regular expressions
12
What are regular expressions?
  • at command prompt: cp *.html
  • in web search: garden?
  • *, ? are wildcards
  • search with different criteria
  • search and replace
  • splitting input
  • concise, cryptic looking code...
  • s/([A-Z]{1})([a-z]+)\.sgml/\1\2\.html/g

13
What can I do with regexes?
  • look out for interesting things in input stream
  • input stream: text files on your computer, web
    pages, data produced by a program, ...
  • you do not need to know the exact format of the
    input stream
  • often line by line, but not necessarily
  • text
  • for example: sort spam

14
Drawbacks and alternatives
  • regexes are easier to write than to read
  • maintenance is a problem
  • alternatives
  • text editor / word processor macros
  • spreadsheet import
  • ordinary code
  • more readable, easier to use
  • regexes: flexibility, portability, conciseness,
    fault-tolerance, programmability

15
How do I use regexes?
  • regexes are not a programming language but an
    extension in many
  • Perl, Tcl, Python, C, Java, Visual Basic
  • sed and grep unix tools
  • generally similar but not all implementations are
    alike!
  • in these slides I use Perl unless stated otherwise

16
Why regular expressions?
  • Stephen Kleene (b. 1909, d. 1994)
  • the algebra of regular sets
  • Kleene star

17
Howto (1)
  • In normal text, the string geo is one instance.
  • In a regex, the string geo is a template that
    matches all instances of the characters g, e, and
    o all in a row
  • grep geo mytextfile > hits
  • → all lines in mytextfile which contain the string
    geo are stored in a new file hits

18
Howto (2)
  • sed s/geo/ego/ mytextfile > hits
  • → all lines in mytextfile are copied to hits, but
    lines containing the string geo are changed so that
    the first geo becomes ego
  • you may want to use the modifier /g to change all
    instances of geo to ego
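
Since the rest of the slides use Perl, the grep and sed commands from the two Howto slides can be sketched in Perl as follows. The sample lines stand in for mytextfile and are hypothetical; the copy-then-substitute idiom is one common way to leave the input untouched.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for mytextfile.
my @lines = ("geology and geography\n", "biology\n", "geodesy\n");

# Like grep geo: keep only the lines that contain the string geo.
my @hits = grep { /geo/ } @lines;

# Like sed s/geo/ego/: change only the first geo on each line.
my @first_only = map { (my $copy = $_) =~ s/geo/ego/; $copy } @lines;

# With /g every occurrence on the line is changed.
my @all = map { (my $copy = $_) =~ s/geo/ego/g; $copy } @lines;

print @hits;
print $first_only[0];   # prints: egology and geography
print $all[0];          # prints: egology and egography
```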

19
More complicated patterns
  • alphanumeric characters match themselves
  • character classes
  • \w \W \s \S \d \D [...] [^...]
  • metacharacters and anchors
  • . ^ $
  • quantifiers
  • * + ? {num} {num,} {min,max}
  • grouping and alternation
  • () |
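
A short sketch of these building blocks in use. The test strings and variable names are hypothetical; each pattern exercises one item from the list above.

```perl
use strict;
use warnings;

# Character class \d with a counted quantifier {4}, anchored at both ends.
my $is_year  = "2024"  =~ /^\d{4}$/ ? 1 : 0;   # 1
my $too_long = "20245" =~ /^\d{4}$/ ? 1 : 0;   # 0

# A bracketed class with a {min,max} quantifier.
my $short_word = "perl" =~ /^[a-z]{2,5}$/ ? 1 : 0;   # 1

# Grouping with alternation: either extension is accepted.
my $is_markup = "page.sgml" =~ /\.(html|sgml)$/ ? 1 : 0;   # 1

# ? makes the preceding item optional.
my $colour = "color" =~ /^colou?r$/ ? 1 : 0;   # 1

print "$is_year $too_long $short_word $is_markup $colour\n";   # prints: 1 0 1 1 1
```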

20
More patterns
  • Metacharacters
  • \t \n \r
  • Word boundary (zero width)
  • \b = \w\W or \W\w
  • Escape, i.e. how to match . for example
  • \. \\
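
To make the word boundary and the escaping rule concrete, here is a small sketch with hypothetical strings. `quotemeta` is a Perl builtin that the slides do not mention; it is included as a convenient way to escape a whole string.

```perl
use strict;
use warnings;

# \b matches between a word character and a non-word character,
# so /\bgeo\b/ finds geo only as a whole word.
my $whole_word = "a geo tag" =~ /\bgeo\b/ ? 1 : 0;   # 1
my $inside     = "geology"   =~ /\bgeo\b/ ? 1 : 0;   # 0

# An unescaped . matches any character except newline; \. matches a literal dot.
my $loose  = "indexXhtml" =~ /index.html/  ? 1 : 0;   # 1
my $strict = "indexXhtml" =~ /index\.html/ ? 1 : 0;   # 0

# quotemeta escapes every metacharacter in a string for us.
my $escaped = quotemeta("a.b");   # a\.b

print "$whole_word $inside $loose $strict $escaped\n";
```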

21
Capturing
  • ( ... ) creates capture buffers
  • to refer to the nth buffer use
  • \n within the match
  • $n outside the match
  • Examples
  • s/^([^ ]*) *([^ ]*)/$2 $1/
  • if (/(.)\1/) { print "$1 is now\n" }
  • if (/Time: (..):(..):(..)/) { print "it is
    $1:$2.$3 now\n" }
  • match variables are dynamically scoped until the
    end of the enclosing block or next successful
    match
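
The capture examples can be tried out with the sketch below. The input strings are hypothetical, and the matches are bound with `=~` rather than run against `$_` so that each step is explicit.

```perl
use strict;
use warnings;

# Swap the first two words; $1 and $2 refer to the capture groups.
my $words = "hello world";
(my $swapped = $words) =~ s/^([^ ]*) *([^ ]*)/$2 $1/;   # "world hello"

# \1 inside the pattern refers back to what group 1 matched.
my $doubled;
if ( "bookkeeper" =~ /(.)\1/ ) {
    $doubled = $1;   # "o", the first doubled character
}

# Several groups in one match.
my $clock;
if ( "Time: 12:34:56" =~ /Time: (..):(..):(..)/ ) {
    $clock = "$1:$2.$3";   # "12:34.56"
}

print "$swapped, $doubled, $clock\n";
```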

22
Modifiers
  • Case insensitive matching
  • /i
  • Global, i.e. matching all occurrences
  • /g
  • Evaluate the replacement of s/// as code
  • /e
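
A minimal sketch of the three modifiers, with hypothetical input strings. Note that `/e` applies to the replacement part of a substitution.

```perl
use strict;
use warnings;

# /i ignores case.
my $found = "GeoInformatics" =~ /geo/i ? 1 : 0;   # 1

# /g in list context returns every match.
my @numbers = "3 apples, 14 pears" =~ /\d+/g;     # (3, 14)

# /e evaluates the replacement part of s/// as Perl code.
(my $doubled = "3 apples, 14 pears") =~ s/(\d+)/$1 * 2/ge;

print "$found | @numbers | $doubled\n";   # prints: 1 | 3 14 | 6 apples, 28 pears
```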

23
  • For a regex to match, the entire regex must match

24
Greediness
  • By default, a quantified subpattern is greedy,
    that is, it will match as many times as possible
    (given a particular starting location) while
    still allowing the rest of the pattern to match.
  • To match a minimum number of times, follow the
    quantifier with a ?.
  • For example: * vs. *?

25
Example of greediness
  • $_ = "I have 2 numbers: 53147";
  • if ( /(.*)(\d*)/ ) {
  • print "Beginning is <$1>, number is <$2>.\n"; }
  • Output: Beginning is <I have 2 numbers: 53147>,
    number is <>.
  • (.*)(\d+) <I have 2 numbers: 5314> <7>
  • (.*?)(\d*) <> <>
  • (.*?)(\d+) <I have > <2>
  • (.*)(\d+)$ <I have 2 numbers: 5314> <7>
  • (.*?)(\d+)$ <I have 2 numbers: > <53147>
  • (.*)\b(\d+)$ <I have 2 numbers: > <53147>
  • (.*\D)(\d+)$ <I have 2 numbers: > <53147>
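
Three of the variants above, sketched as runnable Perl so the captured pieces can be inspected. The match is bound to a named variable (`$text`, an assumption for this sketch) instead of `$_`.

```perl
use strict;
use warnings;

my $text = "I have 2 numbers: 53147";

# Greedy .* takes as much as it can and leaves one digit for \d+.
my @greedy = $text =~ /(.*)(\d+)/;

# Non-greedy .*? stops at the first place where \d+ can match.
my @lazy = $text =~ /(.*?)(\d+)/;

# Anchoring at the end forces \d+ to take the whole trailing number.
my @anchored = $text =~ /(.*?)(\d+)$/;

print "<$greedy[0]> <$greedy[1]>\n";       # prints: <I have 2 numbers: 5314> <7>
print "<$lazy[0]> <$lazy[1]>\n";           # prints: <I have > <2>
print "<$anchored[0]> <$anchored[1]>\n";   # prints: <I have 2 numbers: > <53147>
```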

26
  • A regular expression is merely a set of
    assertions that gives a definition of success.
  • Warning: it is possible to devise regexes which
    take exponential time to solve.
  • Sometimes it is best to use several (simple)
    regexes instead of one (complicated)

27
Perlishnesses (1)
  • In if (/(\d)/) { ... } Perl tries to match the
    contents of the default variable ($_) against the
    regex
  • Sometimes the default variable gets the value you
    want.
  • Sometimes you need to set it explicitly.
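
A small sketch of both situations: a loop that sets `$_` implicitly, and a match bound explicitly to another variable with `=~`. The strings and variable names are hypothetical.

```perl
use strict;
use warnings;

# for aliases each element to $_, so a bare /regex/ tests that element.
my @with_digits;
for ( "room 219", "no number", "slide 27" ) {
    push @with_digits, $1 if /(\d+)/;
}

# For any other variable, bind the match explicitly with =~.
my $line = "Maa-123.490";
my ($course) = $line =~ /(\d+\.\d+)/;

print "@with_digits $course\n";   # prints: 219 27 123.490
```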

28
Perlishnesses (2)
  • split /regex/, string splits string into a
    list of substrings and returns that list
  • $x = "1.618,2.718, 3.142";
  • @const = split /,\s*/, $x;
  • $const[0] = '1.618'
  • $const[1] = '2.718'
  • $const[2] = '3.142'
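
The split example as a runnable sketch. The optional third argument (a field limit) is part of the builtin `split` but is an addition to what the slide shows.

```perl
use strict;
use warnings;

my $x = "1.618,2.718,   3.142";

# Split on a comma followed by any amount of whitespace.
my @const = split /,\s*/, $x;

# A limit as the third argument caps the number of fields returned.
my @two = split /,\s*/, $x, 2;

print join( '|', @const ), "\n";   # prints: 1.618|2.718|3.142
```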