Data-Oriented Natural Language Processing using Lexical-Functional Grammar - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
Data-Oriented Natural Language Processing using
Lexical-Functional Grammar
Mary Hearne, School of Computing, Dublin City University
2
Data-Oriented Natural Language Processing using
Lexical-Functional Grammar
  • Data-Oriented Parsing (DOP): a review
  • Parsing with Lexical-Functional Grammar: LFG-DOP
  • LFG-based models: what are the challenges?

3
Experience-Based Parsing: PCFGs
Basic strategy: extract grammar rules and their relative frequencies from a treebank
Vanilla PCFG
4
Experience-Based Parsing: DOP
Basic strategy: extract tree fragments (subtrees) and their relative frequencies from a treebank
[Figure: example tree fragments with relative frequencies — ten fragments at 1/10 each, one at 1, two at 1/2 and four at 1/4]
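The basic strategy above can be sketched in a few lines of Python; the tuple tree encoding and all helper names are illustrative assumptions, not from the slides:

```python
from collections import Counter

# A treebank tree as (label, children...); leaves are plain strings.
tree = ("S", ("NP", "john"), ("VP", ("V", "left")))

def extract_rules(node, rules):
    """Collect CFG rules (LHS, RHS) from one tree."""
    if isinstance(node, str):                 # terminal: contributes no rule
        return
    lhs, children = node[0], node[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules[(lhs, rhs)] += 1
    for c in children:
        extract_rules(c, rules)

def relative_frequencies(trees):
    """Vanilla PCFG estimate: count(rule) / count(all rules with same LHS)."""
    rules = Counter()
    for t in trees:
        extract_rules(t, rules)
    lhs_totals = Counter()
    for (lhs, _), n in rules.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in rules.items()}

probs = relative_frequencies([tree])
```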
5
Non-local dependencies
  • keep an eye on NP
  • from last N to first N
6
Tree-DOP: Decomposition
7
Tree-DOP: Parsing new input
8
Tree-DOP: Ranking output parses
Relative frequency:
  • |e| - the number of occurrences of subtree e in the set of fragments
  • r(e) - the root node category of subtree e
  • giving P(e) = |e| / Σ_{e′ : r(e′) = r(e)} |e′|

Parse probability:
  • Multiply fragment probabilities to calculate
    derivation probability
  • Sum derivation probabilities to calculate parse
    probability

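The ranking scheme above can be sketched as follows; the fragment inventory and counts are invented for illustration:

```python
from collections import Counter

# Invented fragment inventory: fragment id -> (root category, count in treebank).
fragments = {
    "f1": ("S", 2), "f2": ("S", 1),
    "f3": ("NP", 1), "f4": ("NP", 3),
}

root_totals = Counter()
for root, count in fragments.values():
    root_totals[root] += count

def frag_prob(fid):
    """Relative frequency: |e| / total count of fragments with root r(e)."""
    root, count = fragments[fid]
    return count / root_totals[root]

def derivation_prob(derivation):
    """Multiply fragment probabilities to get the derivation probability."""
    p = 1.0
    for fid in derivation:
        p *= frag_prob(fid)
    return p

def parse_prob(derivations):
    """Sum derivation probabilities to get the parse probability."""
    return sum(derivation_prob(d) for d in derivations)
```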
9
Tree-DOP: Some results

English, full parses only (90.83):

  Depth  | F-Score (MPP) | Exact Match (MPP) | F-Score (MPD) | Exact Match (MPD)
  d ≤ 1  | 94.78         | 69.14             | 94.79         | 76.54
  d ≤ 2  | 97.55         | 86.42             | 97.83         | 87.65
  d ≤ 3  | 96.86         | 83.95             | 98.17         | 88.89
  d ≤ 4  | 95.43         | 72.84             | 95.83         | 76.54

French, full parses only (92.36):

  Depth  | F-Score (MPP) | Exact Match (MPP) | F-Score (MPD) | Exact Match (MPD)
  d ≤ 1  | 92.68         | 52.94             | 92.22         | 55.29
  d ≤ 2  | 96.13         | 72.94             | 96.10         | 68.24
  d ≤ 3  | 96.09         | 70.59             | 97.06         | 76.47
  d ≤ 4  | 96.62         | 70.59             | 96.65         | 74.12
10
LFG-DOP
Lexical-Functional Grammar (LFG): a constraint-based theory of language
  • c-structure: context-free phrase structure trees
  • f-structure: attribute-value matrix
  • φ-links: mapping from c- to f-structure

11
LFG-DOP: Fragmentation
Root: select any non-frontier non-terminal c-structure node as root, and
Frontier: select a (possibly empty) set of non-root non-terminal nodes in the root-created c-structure.

Root and Frontier:
  • C-structure: delete all except the new root and the subtree it dominates; then delete all c-structure subtrees dominated by frontier nodes
  • φ-links: delete all φ-links corresponding to deleted c-structure nodes
  • F-structure: delete all f-structure units not φ-accessible from the remaining c-structure nodes
  • Forms: delete all semantic forms corresponding to deleted terminals
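The c-structure side of root/frontier fragmentation (as in Tree-DOP, before any f-structure pruning) can be sketched as below; the tuple tree encoding and function names are illustrative assumptions:

```python
from itertools import product

def nonterminals(node):
    """All non-terminal subtrees of a tuple-encoded tree."""
    if isinstance(node, str):
        return []
    return [node] + [t for c in node[1:] for t in nonterminals(c)]

def expansions(node):
    """All versions of a node: cut here (bare frontier category),
    or keep it and expand each child in every possible way."""
    if isinstance(node, str):
        return [node]                       # terminals are always kept
    cut = (node[0],)
    kept = [(node[0],) + combo
            for combo in product(*(expansions(c) for c in node[1:]))]
    return [cut] + kept

def fragments(tree):
    """Root + Frontier over c-structure: each non-terminal may be a root;
    below it, any set of non-terminals may be frontier (cut) nodes."""
    return [frag
            for root in nonterminals(tree)
            for frag in expansions(root)
            if len(frag) > 1]               # drop the bare-category cut at the root
```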
12
LFG-DOP: Parsing new input
  • C-structure: left-most substitution
    • category matching
  • F-structure: unification
    • uniqueness, completeness, coherence

Fragments are combined with the composition operator ∘.
13
LFG-DOP: Computing probabilities
Parse probability:
  • Multiply fragment probabilities to calculate derivation probability
  • Sum derivation probabilities to calculate parse probability
  • Normalise over the probabilities of valid parses

P_LFG-DOP(T | T is valid) = P_DOP(T) / Σ_{T_x valid} P_DOP(T_x)
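The normalisation step can be sketched as follows (parse names and inputs are invented for illustration):

```python
def normalised_parse_probs(parse_probs, valid):
    """P(T | T is valid) = P_DOP(T) / sum of P_DOP over valid parses."""
    z = sum(p for t, p in parse_probs.items() if t in valid)
    return {t: p / z for t, p in parse_probs.items() if t in valid}
```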
14
LFG-DOP: Robustness via discard

Discard operation:
  • Delete attribute-value pairs from the f-structure while keeping c-structure and φ-links constant
  • Restriction: pairs whose values are φ-linked to remaining c-structure nodes are not deleted

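A minimal sketch of discard over a flat f-structure, modelled as a dict; the `protected` set stands in for the φ-linked pairs that may not be deleted (names and data are invented):

```python
from itertools import combinations

def discard_variants(fstruct, protected):
    """All f-structures obtainable by deleting attribute-value pairs,
    never deleting attributes phi-linked to remaining c-structure nodes."""
    optional = [a for a in fstruct if a not in protected]
    variants = []
    for r in range(len(optional) + 1):
        for drop in combinations(optional, r):
            variants.append({a: v for a, v in fstruct.items() if a not in drop})
    return variants

variants = discard_variants({"PRED": "john", "NUM": "sg", "CASE": "nom"},
                            protected={"PRED"})
```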
[Figure: derivation of "john sees mary" — c-structure S → NP VP, VP → V NP, with f-structure variants carrying PRED 'see<SUBJ,OBJ>', TNS pres, SUBJ PRED 'john' and OBJ PRED 'mary' NUM sg; one variant's NUM pl clashes with NUM sg, motivating discard]
15
LFG-DOP: What are the challenges?
  • To define fragmentation operations such that
    • phenomena such as recursion and re-entrancy are handled
    • constraints are applied appropriately
    • discard is used only to handle ill-formed input
  • To somehow distinguish between constraining and informative features
    • translation vs. parsing
  • To address the fact that substitution is local but unification is global, i.e.
    • to enforce LFG well-formedness conditions in an accurate and efficient manner
    • to sample for the best parse in an accurate and efficient manner
    • to define a probability model that doesn't leak

16
LFG-DOP: Recursion and Re-entrancy

[Figure: c-structure and f-structure for a French example with PRED 'venir<SUBJ,XCOMP>' (vient) taking XCOMP PRED 'tomber<SUBJ>' whose SUBJ is re-entrant with the matrix SUBJ, alongside an NP "the yellow LED" with SPEC and ADJUNCT units]

  • How can we adequately express the constraints on the composition of fragments such as (Adj yellow) and (V tomber)?

[Figure: the fragments (Adj yellow) and (V tomber) with their f-structures — the Adj fragment retains SPEC-TYPE def and an ADJUNCT unit; the V fragment retains PRED 'tomber<SUBJ>' with its re-entrant SUBJ]
17
LFG-DOP: Constraint over-specification

[Figure: fragment (V flashing) with f-structure PRED 'flash<SUBJ>', whose SUBJ retains CASE nom, NUM sing, PERS 3, SPEC-TYPE def and an ADJUNCT unit]

  • Is it appropriate to insist that the subject of flashing have an adjunct?
  • Is it appropriate to be forced to use discard to allow the subject of flashing to have an indefinite specifier?
  • Important: we also want to remain language-independent

18
LFG-DOP: An alternative fragmentation process
  • Determine a c-structure fragment using root and frontier as for Tree-DOP, but retain the full f-structure given in the original representation.
  • Delete all f-structure units (and the attributes with which they are associated) which are not φ-linked from one or more remaining c-structure nodes, unless that unit is the value of an attribute subcategorised for by a PRED value whose corresponding terminal is dominated by the current fragment root node in the original representation.
  • Where we have floating f-structure units, also retain the minimal f-structure unit which contains them both, i.e. the unit containing both floating f-structures along with their (nested sequence of) attributes.
  • Delete all semantic forms (including PRED attributes and their values) not associated with one of the remaining c-structure terminals.

19
LFG-DOP: constraints vs. information
  • To prune attribute-value pairs based on a language-specific algorithm
    • e.g. English: subj-verb agreement but not obj-verb agreement
  • To automatically learn which attribute-value pairs should be pruned for a particular dataset
  • To do soft pruning: distinguish between constraining features and informative features, and account for the difference during unification

Which option is best suited to translation, and which to parsing?
20
LFG-DOP: substitution vs. unification
Substitution is local, but unification is global.

[Figure: composition ∘ of fragments for "john loves mary" — the verb fragment carries PRED 'love<SUBJ,OBJ>', TNS pres, SUBJ NUM sg and OBJ NUM pl; the OBJ NUM pl clashes with the NUM sg of the NP mary fragment, and the clash only surfaces at unification]

To be enforced: category matching, uniqueness, coherence and completeness
  • Model M1: enforce category matching during parsing
  • Model M2: enforce category matching and uniqueness during parsing
  • Model M3: enforce category matching, uniqueness and coherence during parsing
  • There is no Model M4: completeness can never be checked until a complete parse has been obtained
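The uniqueness check that separates the models above can be sketched as a unification over f-structures encoded as nested dicts; this encoding and the function name are assumptions for illustration:

```python
def unify(f1, f2):
    """Unify two f-structures (nested dicts); None signals a
    uniqueness violation (same attribute, conflicting values)."""
    out = dict(f1)
    for attr, v2 in f2.items():
        if attr not in out:
            out[attr] = v2
        elif isinstance(out[attr], dict) and isinstance(v2, dict):
            sub = unify(out[attr], v2)      # recurse into shared sub-structures
            if sub is None:
                return None
            out[attr] = sub
        elif out[attr] != v2:
            return None                     # conflicting atomic values
    return out
```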
21
LFG-DOP: sampling
The exact probability of sampling f_x at site ⟨i, j, VP⟩ is:
  • P_DOP(f_x)
  • multiplied by the sampling probability mass available at each of its substitution sites ⟨i, k, V⟩ and ⟨i+k, j−k, NP⟩
  • and divided by the sampling probability mass available at ⟨i, j, VP⟩

[Figure: fragment f_x = (VP → V NP) spanning ⟨i, j⟩, with substitution sites ⟨i, k, V⟩ and ⟨i+k, j−k, NP⟩, and its f-structure carrying SUBJ NUM sg, TNS pres, NUM sg and an OBJ unit]

Problem for computing the exact probability of sampling f_x at ⟨i, j, VP⟩:
  • We cannot know the sampling probability mass available at substitution site ⟨i+k, j−k, NP⟩ until ⟨i+k, j−k, NP⟩ is the leftmost substitution site, unless we stick with Model M1

Problem for establishing when enough samples have been taken:
  • We cannot know how many valid parses there are until all constraints have been resolved

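Exact sampling probabilities are hard for the reasons above, but plain Monte Carlo sampling of derivations can be sketched as follows; the toy fragment inventory and all names are invented for illustration:

```python
import random

# Invented toy inventory: category -> list of (fragment, probability),
# where a fragment is (root, child...) and children are categories or terminals.
FRAGMENTS = {
    "S":  [(("S", "NP", "VP"), 1.0)],
    "NP": [(("NP", "john"), 0.5), (("NP", "mary"), 0.5)],
    "VP": [(("VP", "left"), 1.0)],
}
TERMINALS = {"john", "mary", "left"}

def sample_parse(cat, rng):
    """Sample one derivation top-down, expanding open sites left to right."""
    frags, weights = zip(*FRAGMENTS[cat])
    frag = rng.choices(frags, weights=weights)[0]
    children = [c if c in TERMINALS else sample_parse(c, rng) for c in frag[1:]]
    return (cat, *children)

def estimate_mpp(n, seed=0):
    """Estimate the most probable parse by frequency over n sampled derivations."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        t = sample_parse("S", rng)
        counts[t] = counts.get(t, 0) + 1
    return max(counts, key=counts.get)
```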
22
LFG-DOP: leaked probability mass

[Figure: derivation of "john left" — an S fragment with SUBJ NUM pl, TNS pres and PRED 'leave<SUBJ>' composed with an NP fragment PRED 'john' NUM sg; fragment probabilities 0.05 × 0.007 × 0.001 = 0.00000035]

  • This derivation will be thrown out because it does not satisfy the uniqueness condition
  • Its probability is thrown out with it → leaked probability mass
  • Normalisation camouflages the problem but does not solve it

23
Data-Oriented Natural Language Processing using
Lexical-Functional Grammar
Questions?