The Simplest NL Applications: Text Searching and Pattern Matching

Slides: 20
Provided by: elaine8
Transcript and Presenter's Notes

1
The Simplest NL Applications: Text Searching and Pattern Matching
Read J&M, Chapter 2
2
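Searching for a Single String Using a Nondeterministic FSM
[Figure: nondeterministic FSM with states 1-8 in a chain, arcs labeled c, o, c, o, n, u, t; two further arc labels were lost in extraction.]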
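A minimal sketch of how such a machine can be simulated, by tracking the set of states it could be in after each character. It assumes the two arc labels lost from the figure are "any character" self-loops on the first and last states, which is the usual construction for finding a string anywhere in a text; the function name `nfa_search` is illustrative only.

```python
def nfa_search(pattern, text):
    """Return True if `pattern` occurs somewhere in `text`.

    States are numbered 1..len(pattern)+1, as in the slide's figure.
    Assumption: states 1 and len(pattern)+1 loop on every character.
    """
    start, accept = 1, len(pattern) + 1
    current = {start}                      # every state the NFA might be in
    for ch in text:
        following = set()
        for state in current:
            if state in (start, accept):   # assumed "any character" self-loops
                following.add(state)
            if state < accept and pattern[state - 1] == ch:
                following.add(state + 1)   # advance along the chain
        current = following
    return accept in current


print(nfa_search("coconut", "cococonuts"))  # True
print(nfa_search("coconut", "cocoa nut"))   # False
```

Tracking a set of states avoids backtracking: the text is read once, left to right.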
3
Searching for a Single String Using the Boyer-Moore Algorithm
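The slide gives no details of the algorithm, so the following is only a sketch of the core idea: compare the pattern right to left and, on a mismatch, skip ahead using a precomputed shift table. It implements the simplified Horspool variant (bad-character shifts only), not the full Boyer-Moore algorithm with its good-suffix rule; the name `horspool_search` is illustrative.

```python
def horspool_search(pattern, text):
    """Return the index of the first occurrence of `pattern` in `text`, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # For each character except the pattern's last one, the distance from
    # its rightmost occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:   # compare right to left
            j -= 1
        if j < 0:
            return pos
        # Shift by the table entry for the text character under the
        # pattern's last position; unseen characters allow a full jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1


print(horspool_search("coconut", "lococonut"))  # 2
```

On "lococonut" the first window ends in "n", so the pattern jumps two positions and matches on the second attempt.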
4
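Searching for Multiple Strings
[Figure: nondeterministic FSM with two branches, one with arcs labeled l, o, c, o, s and one with arcs labeled c, o, c, o, n, u, t; the remaining arc labels were lost in extraction.]
Example: lococonut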
5
Converting to a Deterministic FSM
[Figure: FSM over the same two branches (l, o, c, o, s and c, o, c, o, n, u, t); the remaining arc labels were lost in extraction.]
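One standard way to do this conversion is the subset construction, in which each deterministic state stands for a set of nondeterministic states. The sketch below assumes the two search strings are "locos" and "coconut" (read off the figure's arc labels) and that the start and accept states loop on every character; `START`, `ACCEPT`, and the function names are illustrative, not taken from the slides.

```python
START, ACCEPT = "start", "accept"


def nfa_step(patterns, state, ch):
    """States reachable from `state` on `ch` in the multi-string NFA."""
    def advance(p, i):
        return ACCEPT if i == len(patterns[p]) else (p, i)

    if state == START:
        # Assumed self-loop, plus the first arc of each matching branch.
        return {START} | {advance(p, 1) for p, pat in enumerate(patterns)
                          if pat[0] == ch}
    if state == ACCEPT:
        return {ACCEPT}                    # assumed self-loop
    p, i = state
    return {advance(p, i + 1)} if patterns[p][i] == ch else set()


def determinize(patterns, alphabet):
    """Subset construction: returns (start_state, transition_table)."""
    start = frozenset({START})
    table, todo = {}, [start]
    while todo:
        subset = todo.pop()
        if subset in table:
            continue
        table[subset] = {}
        for ch in alphabet:
            target = frozenset(t for s in subset
                               for t in nfa_step(patterns, s, ch))
            table[subset][ch] = target
            todo.append(target)
    return start, table


def dfa_accepts(start, table, text):
    """Run the DFA; every character of `text` must be in the alphabet."""
    state = start
    for ch in text:
        state = table[state][ch]
    return ACCEPT in state


patterns = ("locos", "coconut")
start, table = determinize(patterns, set("locosnut"))
print(dfa_accepts(start, table, "lococonut"))  # True
print(dfa_accepts(start, table, "locococo"))   # False
```

Only subsets reachable from the start state are built, so the table stays far smaller than the worst-case exponential bound.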
6
Regular Expressions
  • Two different (but related) uses of the term
  • Expressions that define all and only the regular
    languages
  • (aa?? ab ? ba ? bb)
  • Expressions in a useful pattern language

Matching IP addresses:
    s!<emphasis>([0-9]+(\.[0-9]+){3})</emphasis>!<inet>$1</inet>!
Finding doubled words:
    \<([A-Za-z]+)\s+\1\>
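As a hedged illustration, the two patterns can be tried out with Python's `re` module. The exact quantifiers are an assumption, since the extracted slide text lost its brackets and operators. Python has no `\<` and `\>` word-boundary anchors, so `\b` stands in for both, and the replacement string uses `\1` where Perl writes `$1`.

```python
import re

# Dotted-quad address wrapped in <emphasis> tags: digits, then ".digits" three times.
IP_IN_EMPHASIS = re.compile(r"<emphasis>([0-9]+(?:\.[0-9]+){3})</emphasis>")

# A word, whitespace, then the same word again (backreference \1).
DOUBLED_WORD = re.compile(r"\b([A-Za-z]+)\s+\1\b")


def tag_ip_addresses(text):
    """Rewrite <emphasis>ADDR</emphasis> as <inet>ADDR</inet>."""
    return IP_IN_EMPHASIS.sub(r"<inet>\1</inet>", text)


def find_doubled_words(text):
    """Return each word that is immediately repeated."""
    return [match.group(1) for match in DOUBLED_WORD.finditer(text)]


print(tag_ip_addresses("host <emphasis>192.168.0.1</emphasis> is up"))
# host <inet>192.168.0.1</inet> is up
print(find_doubled_words("Paris in the the spring"))
# ['the']
```

The backreference in the doubled-word pattern goes beyond the formal regular expressions defined on the following slides, which is one reason the two uses of the term are only related, not identical.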
7
REs: Syntax and Semantics
Syntax: The regular expressions over an alphabet Σ are all strings over the alphabet Σ ∪ {(, ), ∅, ∪, *} that can be obtained as follows:
1. ∅ and each member of Σ is a regular expression.
2. If α, β are regular expressions, then so is αβ.
3. If α, β are regular expressions, then so is α∪β.
4. If α is a regular expression, then so is α*.
5. If α is a regular expression, then so is (α).
6. Nothing else is a regular expression.
8
REs: Syntax and Semantics
Regular expressions define languages via a semantic interpretation function we'll call L:
1. L(∅) = ∅ and L(a) = {a} for each a ∈ Σ.
2. If α, β are regular expressions, then L(αβ) = L(α) L(β) = all strings that can be formed by concatenating to some string from L(α) some string from L(β).
3. If α, β are regular expressions, then L(α∪β) = L(α) ∪ L(β).
4. If α is a regular expression, then L(α*) = L(α)*.
5. If (α) is a regular expression, then L((α)) = L(α).
A language is regular if and only if it can be described by a regular expression.
Note: L is compositional.
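Compositionality means that the language of an expression is computed only from the languages of its parts, which maps directly onto a recursive function. The sketch below is one way to show this: it represents expressions as nested tuples (an encoding invented here for illustration) and, because L(α*) is generally infinite, it enumerates only strings up to a length bound.

```python
def lang(expr, max_len):
    """Return every string of length <= max_len in L(expr).

    `expr` is a nested tuple: ("empty",), ("sym", a), ("cat", r1, r2),
    ("union", r1, r2) or ("star", r). Parentheses need no case of their
    own because the tuple nesting already records grouping.
    """
    kind = expr[0]
    if kind == "empty":                     # L(empty set) is the empty set
        return set()
    if kind == "sym":                       # L(a) = {a}
        return {expr[1]} if max_len >= 1 else set()
    if kind == "cat":                       # concatenate one string from each side
        left, right = lang(expr[1], max_len), lang(expr[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "union":                     # set union of the two languages
        return lang(expr[1], max_len) | lang(expr[2], max_len)
    if kind == "star":                      # zero or more concatenations
        base = lang(expr[1], max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind}")


a, b = ("sym", "a"), ("sym", "b")
a_or_b = ("union", a, b)
pairs = ("star", ("cat", a_or_b, a_or_b))   # ((a U b)(a U b))*
print(sorted(lang(pairs, 2)))               # ['', 'aa', 'ab', 'ba', 'bb']
```

Each branch mirrors one clause of the definition of L, and no branch inspects anything other than the results for its immediate subexpressions.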
9
The Importance of Compositionality
What is the meaning of:
Mary cooked the yujutes.
Mary tyroked the yujutes.
10
Morphological Analysis
  • Read J&M, Chapter 3
  • Recognize words
  • Parse words

11
Morphological Parsing
Goal: to represent the facts declaratively so that a single representation can be used for both recognition and generation.
Note: ^ marks morpheme boundaries. # marks word boundaries.
12
From Lexical to Intermediate
Note: All the transducers in the book are described as lexical:intermediate, but they can run in the other direction.
13
Where Did reg-noun-stem Come From?
14
We Can Cascade or Compose
15
From Intermediate to Surface
For text, we need spelling rules.
ε → e / {x, s, z} ^ ___ s #
Read this as: Replace ε with e in the context given after the /.
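To make the rule concrete, here is a small sketch that applies e-insertion to an intermediate form and then erases the boundary symbols. It assumes the rule's context is "after a morpheme-final x, s, or z and before a word-final s", with ^ and # as the morpheme and word boundary markers. It uses a regular-expression substitution as a stand-in for the transducer, and the function name is illustrative.

```python
import re

# Context of the rule: x, s or z, then a morpheme boundary, then "s" at word end.
E_INSERTION = re.compile(r"([xsz]\^)(s#)")


def intermediate_to_surface(form):
    """Map an intermediate form such as 'fox^s#' to its surface spelling."""
    with_e = E_INSERTION.sub(r"\1e\2", form)           # insert e in the rule's context
    return with_e.replace("^", "").replace("#", "")    # drop boundary markers


for form in ("fox^s#", "cat^s#", "fox#"):
    print(form, "->", intermediate_to_surface(form))
# fox^s# -> foxes
# cat^s# -> cats
# fox# -> fox
```

Unlike a transducer, this substitution runs in one direction only, so it covers generation but not recognition.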
16
Turning the Rule into a Transducer
foxes xerox foxsat
17
Disambiguation - Local
Local ambiguities
s
asses
luxury
18
Disambiguation - Harder
Sometimes additional knowledge is necessary:
foxes: fox N PL or fox V SG
Can we think of nouns that cannot also be verbs?
19
Search
  • For FSMs, we can build a deterministic machine.
  • In other cases, we will have to search
  • Depth-first
  • Breadth-first chart parsing
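The difference between the two search orders comes down to how the agenda of pending configurations is managed. The sketch below shows this on a toy nondeterministic FSM, chosen only to keep the example small; the slide's "other cases" involve richer formalisms, and chart parsing is not shown. The machine `TOY_NFA` and all function names are invented for illustration.

```python
from collections import deque

# Toy NFA over {a, b} that accepts strings ending in "ab"; state 2 accepts.
TOY_NFA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}


def toy_step(state, ch):
    return TOY_NFA.get((state, ch), set())


def search_accepts(step, start, is_accepting, text, depth_first=True):
    """Explore (state, position) configurations one at a time."""
    agenda = deque([(start, 0)])
    seen = set()
    while agenda:
        # Depth-first takes the newest configuration (stack);
        # breadth-first takes the oldest (queue).
        config = agenda.pop() if depth_first else agenda.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, pos = config
        if pos == len(text):
            if is_accepting(state):
                return True
            continue
        for following in step(state, text[pos]):
            agenda.append((following, pos + 1))
    return False


for strategy in (True, False):
    print(search_accepts(toy_step, 0, lambda s: s == 2, "aab", strategy))  # True
```

Both orders give the same answer here; they differ in which configurations are visited first and in how much of the agenda must be held in memory.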

[Figure: two parse trees (with S, VP, NP, and PP nodes) for the sentence "I hit the boy with a bat."]