Title: Data-Oriented Natural Language Processing using Lexical-Functional Grammar

Slide 1: Data-Oriented Natural Language Processing using Lexical-Functional Grammar
Mary Hearne, School of Computing, Dublin City University
Slide 2: Data-Oriented Natural Language Processing using Lexical-Functional Grammar
- Data-Oriented Parsing (DOP): a review
- Parsing with Lexical-Functional Grammar: LFG-DOP
- LFG-based models: what are the challenges?
Slide 3: Experience-Based Parsing: PCFGs
Basic strategy: extract grammar rules and their relative frequencies from a treebank ("vanilla" PCFG).
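The extraction strategy can be sketched in a few lines of Python. The toy treebank and the tuple encoding of trees below are illustrative assumptions, not part of the original experiments:

```python
from collections import Counter, defaultdict

# Toy treebank (hypothetical): trees are tuples (label, children...),
# leaves are plain strings.
treebank = [
    ("S", ("NP", "john"), ("VP", ("V", "left"))),
    ("S", ("NP", "mary"), ("VP", ("V", "left"))),
]

def rules(tree):
    """Yield (lhs, rhs) CFG rules read off one tree, top-down."""
    if isinstance(tree, str):
        return                       # leaf: no rule
    lhs, *children = tree
    yield lhs, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        yield from rules(child)

rule_counts = Counter(r for t in treebank for r in rules(t))
lhs_totals = defaultdict(int)
for (lhs, _), n in rule_counts.items():
    lhs_totals[lhs] += n

# Relative-frequency estimate: P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)
pcfg = {(lhs, rhs): n / lhs_totals[lhs]
        for (lhs, rhs), n in rule_counts.items()}
```

Here S → NP VP occurs in both trees, so it gets probability 1.0, while NP → john gets 0.5.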
Slide 4: Experience-Based Parsing: DOP
Basic strategy: extract grammar fragments and their relative frequencies from a treebank.
[Figure: example fragment probabilities — ten fragments with relative frequency 1/10 each, versus fragments weighted 1, 1/2 and 1/4.]
Slide 5: Non-local dependencies
Examples: "keep an eye on NP", "from last N to first N"
Slide 6: Tree-DOP: Decomposition
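Tree-DOP decomposition takes every subtree rooted at an internal node and, within each, independently cuts or keeps each internal descendant; cut nodes become open substitution sites. A minimal sketch, assuming trees encoded as tuples (label, children...) with string leaves:

```python
from itertools import product

def fragments(tree):
    """Enumerate Tree-DOP fragments of `tree`: choose any internal node
    as the fragment root, then cut or keep each internal descendant
    independently; a cut node becomes an open substitution site,
    encoded here as the one-element tuple (label,)."""
    def internal_nodes(t):
        if isinstance(t, str):
            return
        yield t
        for child in t[1:]:
            yield from internal_nodes(child)

    def cut_or_keep(t):
        # All ways to realise the subtree at t inside a fragment.
        if isinstance(t, str):
            return [t]
        kept = [(t[0],) + combo
                for combo in product(*(cut_or_keep(c) for c in t[1:]))]
        return [(t[0],)] + kept          # (label,) alone = substitution site

    result = []
    for node in internal_nodes(tree):
        result += [f for f in cut_or_keep(node) if f != (node[0],)]
    return result

# A two-word tree yields 6 fragments: the full tree, three partial
# S-fragments with open sites, and the two lexical subtrees.
frags = fragments(("S", ("NP", "john"), ("VP", "left")))
```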
Slide 7: Tree-DOP: Parsing new input
Slide 8: Tree-DOP: Ranking output parses
Relative frequency
- |e|: the number of occurrences of subtree e in the set of fragments
- r(e): the root node category of subtree e
- P(e) = |e| / Σ_{e': r(e') = r(e)} |e'|
Parse probability
- Multiply fragment probabilities to calculate derivation probability
- Sum derivation probabilities to calculate parse probability
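The ranking scheme can be made concrete. The fragment inventory and the two derivations below are invented for illustration:

```python
from functools import reduce
from operator import mul

# Hypothetical fragment counts, keyed by (fragment, root category r(e)).
counts = {("S -> NP VP", "S"): 2, ("S -> NP[john] VP", "S"): 1,
          ("NP -> john", "NP"): 3, ("VP -> left", "VP"): 3}

root_totals = {}
for (_, root), n in counts.items():
    root_totals[root] = root_totals.get(root, 0) + n

def frag_prob(frag, root):
    """Relative frequency: |e| divided by the total count of fragments
    sharing e's root category."""
    return counts[(frag, root)] / root_totals[root]

def derivation_prob(derivation):
    """Multiply fragment probabilities along one derivation."""
    return reduce(mul, (frag_prob(f, r) for f, r in derivation), 1.0)

# Two distinct derivations of the same parse tree; the parse probability
# is the sum of their derivation probabilities (2/3 + 1/3 here).
d1 = [("S -> NP VP", "S"), ("NP -> john", "NP"), ("VP -> left", "VP")]
d2 = [("S -> NP[john] VP", "S"), ("VP -> left", "VP")]
parse_prob = derivation_prob(d1) + derivation_prob(d2)
```

This is why the most probable parse (MPP) and the most probable derivation (MPD) can disagree: a parse with many modest derivations can outweigh one with a single high-probability derivation.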
Slide 9: Tree-DOP: Some results
English, full parses only (coverage 90.83):

            F-Score           Exact Match
            MPP      MPD      MPP      MPD
  d = 1     94.78    94.79    69.14    76.54
  d = 2     97.55    97.83    86.42    87.65
  d = 3     96.86    98.17    83.95    88.89
  d = 4     95.43    95.83    72.84    76.54
French, full parses only (coverage 92.36):

            F-Score           Exact Match
            MPP      MPD      MPP      MPD
  d = 1     92.68    92.22    52.94    55.29
  d = 2     96.13    96.10    72.94    68.24
  d = 3     96.09    97.06    70.59    76.47
  d = 4     96.62    96.65    70.59    74.12
Slide 10: LFG-DOP
Lexical-Functional Grammar (LFG): a constraint-based theory of language
- c-structure: context-free phrase-structure trees
- f-structure: attribute-value matrix
- φ-links: mapping from c- to f-structure
Slide 11: LFG-DOP: Fragmentation
Root: select any non-frontier, non-terminal c-structure node as root, and:
- C-structure: delete all except this new root and the subtree it dominates
- φ-links: delete all φ-links corresponding to deleted c-structure nodes
- F-structure: delete all f-structure units not φ-accessible from the remaining c-structure nodes
- Forms: delete all semantic forms corresponding to deleted terminals
Frontier: select a (possibly empty) set of non-root, non-terminal nodes in the root-created c-structure, and:
- C-structure: delete all c-structure subtrees dominated by frontier nodes
[Figure: the Root and Frontier operations, and φ-accessibility]
Slide 12: LFG-DOP: Parsing new input
- C-structure: leftmost substitution, with category matching
- F-structure: unification, subject to uniqueness, completeness and coherence
[Figure: LFG-DOP composition (∘) of two fragments]
Slide 13: LFG-DOP: Computing probabilities
Parse probability
- Multiply fragment probabilities to calculate derivation probability
- Sum derivation probabilities to calculate parse probability
- Normalise over the probabilities of valid parses:

  P_LFG-DOP(T | T is valid) = P_DOP(T) / Σ_{T_x is valid} P_DOP(T_x)
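The normalisation step can be shown directly. The parse probabilities and validity flags below are hypothetical numbers, not taken from the slides:

```python
# Hypothetical unnormalised P_DOP values, with a flag recording whether
# each parse satisfies the LFG well-formedness conditions.
parses = {"T1": (0.030, True), "T2": (0.010, True), "T3": (0.005, False)}

# Probability mass of the valid parses only.
valid_mass = sum(p for p, valid in parses.values() if valid)

def p_lfg_dop(t):
    """P_LFG-DOP(T | T is valid): renormalise P_DOP over valid parses."""
    p, valid = parses[t]
    return p / valid_mass if valid else 0.0
```

Here T1 receives 0.030 / 0.040 = 0.75 of the mass and the invalid T3 receives none; the 0.005 assigned to T3 by P_DOP is the "leaked" mass discussed on slide 22.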
Slide 14: LFG-DOP: Robustness via discard
Discard operation
- Delete attribute-value pairs from the f-structure while keeping c-structure and φ-links constant
- Restriction: pairs whose values are φ-linked to remaining c-structure nodes are not deleted
[Figure: discard variants of a fragment for "john see mary" (c-structure S → NP VP, f-structure PRED 'see<SUBJ,OBJ>', TNS pres, SUBJ [PRED 'john'], OBJ [PRED 'mary', NUM sg]), with non-φ-linked NUM values deleted.]
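The discard operation can be sketched over a flat f-structure encoded as a dict. This is a simplification — real discard also applies inside embedded f-structures — and the attribute-value pairs and the φ-linked set here are assumptions:

```python
from itertools import combinations

# Hypothetical f-structure; attributes in `protected` have values that are
# phi-linked to remaining c-structure nodes and therefore may not be deleted.
fstruct = {"PRED": "see<SUBJ,OBJ>", "TNS": "pres", "NUM": "sg",
           "SUBJ": {"PRED": "john", "NUM": "sg"}}
protected = {"PRED", "SUBJ"}

def discard_variants(fs, protected):
    """Every f-structure reachable by deleting a subset of the
    unprotected top-level attribute-value pairs."""
    deletable = [k for k in fs if k not in protected]
    return [{k: v for k, v in fs.items() if k not in drop}
            for r in range(len(deletable) + 1)
            for drop in combinations(deletable, r)]

# TNS and NUM may be dropped independently: 2^2 = 4 variants.
variants = discard_variants(fstruct, protected)
```

The combinatorics are the point: every deletable pair doubles the number of fragments, which is why the slides stress that discard should be reserved for ill-formed input.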
Slide 15: LFG-DOP: What are the challenges?
- To define fragmentation operations such that
  - phenomena such as recursion and re-entrancy are handled
  - constraints are applied appropriately
  - discard is used only to handle ill-formed input
- To somehow distinguish between constraining and informative features
  - translation vs. parsing
- To address the fact that substitution is local but unification is global, i.e.
  - to enforce LFG well-formedness conditions in an accurate and efficient manner
  - to sample for the best parse in an accurate and efficient manner
  - to define a probability model that doesn't leak
Slide 16: LFG-DOP: Recursion and Re-entrancy
[Figure: c/f-structure pairs for the French VP "vient de tomber" (PRED 'venir<SUBJ,XCOMP>', FIN -, with a re-entrant SUBJ shared with the XCOMP's 'tomber<SUBJ>') and for the NP "the yellow LED" (PRED 'LED', CASE nom, definite SPEC, ADJUNCT 'yellow'), together with the candidate fragments (Adj yellow) and (V tomber).]
- How can we adequately express the constraints on the composition of fragments such as (Adj yellow) and (V tomber)?
Slide 17: LFG-DOP: Constraint over-specification
[Figure: fragment for (V flashing) with PRED 'flash<SUBJ>', whose SUBJ retains CASE nom, NUM sing, PERS 3, a definite SPEC-TYPE and an ADJUNCT inherited from the original representation.]
- Is it appropriate to insist that the subject of "flashing" have an adjunct?
- Is it appropriate to be forced to use discard to allow the subject of "flashing" to have an indefinite specifier?
- Important: we also want to remain language-independent
Slide 18: LFG-DOP: An alternative fragmentation process
- Determine a c-structure fragment using root and frontier as for Tree-DOP, but retain the full f-structure given in the original representation.
- Delete all f-structure units (and the attributes with which they are associated) which are not φ-linked from one or more remaining c-structure nodes, unless that unit is the value of an attribute subcategorised for by a PRED value whose corresponding terminal is dominated by the current fragment root node in the original representation.
- Where we have floating f-structure units, also retain the minimal f-structure unit which contains them both, i.e. the unit containing both floating f-structures along with their (nested sequences of) attributes.
- Delete all semantic forms (including PRED attributes and their values) not associated with one of the remaining c-structure terminals.
Slide 19: LFG-DOP: constraints vs. information
- Prune attribute-value pairs using a language-specific algorithm
  - e.g. enforce English subject-verb agreement but not object-verb agreement
- Automatically learn which attribute-value pairs should be pruned for a particular dataset
- Soft pruning: distinguish between constraining features and informative features, and account for the difference during unification
- Which option is best suited to translation, and which to parsing?
Slide 20: LFG-DOP: substitution vs. unification
Substitution is local but unification is global.
[Figure: composing fragments for "john loves mary" (PRED 'love<SUBJ,OBJ>', TNS pres); an OBJ bearing NUM pl clashes with a NUM sg introduced later in the derivation.]
To be enforced: category matching, uniqueness, coherence and completeness
- Model M1: enforce category matching during parsing
- Model M2: enforce category matching and uniqueness during parsing
- Model M3: enforce category matching, uniqueness and coherence during parsing
- There is no Model M4: completeness can never be checked until a complete parse has been obtained
21LFG-DOP sampling
ij,VP
The exact probability of sampling fx at ij,VP
is
- PDOP(fx)
- Multiplied by the sampling probability mass
available at each of its substitution sites
ik,V and ikj-k,NP - And divided by the sampling probability mass
available at ij,VP
VP
fx
V
NP
ik
ikj-k
ij,VP
Problem for computing the exact probability of
sampling fx at ij,VP
f1
SUBJ
NUM sg
f2
- We cannot know the sampling probability mass
available at substitution site ikj-k,NP
until ikj-k,NP is the leftmost substitution
site unless we stick with Model M1
TNS pres
VP
fx
NUM sg
V
NP
OBJ
f3
ik
ikj-k
Problem for establishing when enough samples have
been taken
- We cannot know how many valid parses there are
until all constraints have been resolved
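Sampling under Model M1 (only category matching enforced) can be sketched as a toy Monte Carlo estimator; the fragment inventory and its probabilities below are hypothetical:

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical fragment inventory: per root category, a list of
# (fragment, P_DOP) pairs; a fragment is (label, open substitution sites).
inventory = {
    "S":  [(("S -> NP VP", ["NP", "VP"]), 1.0)],
    "NP": [(("NP -> john", []), 0.6), (("NP -> mary", []), 0.4)],
    "VP": [(("VP -> left", []), 1.0)],
}

def sample_derivation(cat):
    """Sample one derivation top-down, expanding open sites left to right
    (Model M1: no unification constraints checked during sampling)."""
    frags, weights = zip(*inventory[cat])
    label, sites = random.choices(frags, weights=weights)[0]
    return (label, [sample_derivation(site) for site in sites])

# Estimate parse probabilities by sampling; under full LFG-DOP, extra
# samples are needed because some derivations later fail validity checks.
tallies = Counter(str(sample_derivation("S")) for _ in range(2000))
```

The estimated frequency of the "john" parse converges on its P_DOP of 0.6; the slide's point is that once uniqueness or coherence is also enforced, the mass available at a non-leftmost site is unknown at sampling time, so this simple scheme no longer yields exact probabilities.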
Slide 22: LFG-DOP: leaked probability mass
[Figure: composition (∘) of fragments for "john left" (PRED 'leave<SUBJ>', TNS pres) with probabilities 0.05, 0.007 and 0.001; one fragment's SUBJ carries NUM pl while another contributes NUM sg, and the derivation's probability is 0.05 × 0.007 × 0.001 = 0.00000035.]
- This derivation will be thrown out because it does not satisfy the uniqueness condition
- Its probability is thrown out with it → leaked probability mass
- Normalisation camouflages the problem but does not solve it
Slide 23: Data-Oriented Natural Language Processing using Lexical-Functional Grammar
Questions?