Title: Data-Oriented Natural Language Processing using Lexical-Functional Grammar

Slide 1: Data-Oriented Natural Language Processing using Lexical-Functional Grammar
Mary Hearne, School of Computing, Dublin City University
Slide 2: Data-Oriented Natural Language Processing using Lexical-Functional Grammar
- Data-Oriented Parsing (DOP): a review
- Parsing with Lexical-Functional Grammar: LFG-DOP
- LFG-based models: what are the challenges?
Slide 3: Experience-Based Parsing: PCFGs
Basic strategy: extract grammar rules and their relative frequencies from a treebank ("vanilla" PCFG).
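The extraction strategy can be sketched in a few lines of Python. The toy treebank and the tuple encoding of trees below are illustrative assumptions, not part of the original experiments:

```python
from collections import Counter, defaultdict

# Toy treebank (hypothetical): trees are tuples (label, children...),
# leaves are plain strings.
treebank = [
    ("S", ("NP", "john"), ("VP", ("V", "left"))),
    ("S", ("NP", "mary"), ("VP", ("V", "left"))),
]

def rules(tree):
    """Yield (lhs, rhs) CFG rules read off one tree, top-down."""
    if isinstance(tree, str):
        return                       # leaf: no rule
    lhs, *children = tree
    yield lhs, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        yield from rules(child)

rule_counts = Counter(r for t in treebank for r in rules(t))
lhs_totals = defaultdict(int)
for (lhs, _), n in rule_counts.items():
    lhs_totals[lhs] += n

# Relative-frequency estimate: P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)
pcfg = {(lhs, rhs): n / lhs_totals[lhs]
        for (lhs, rhs), n in rule_counts.items()}
```

Here S → NP VP occurs in both trees, so it gets probability 1.0, while NP → john gets 0.5.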
Slide 4: Experience-Based Parsing: DOP
Basic strategy: extract grammar fragments and their relative frequencies from a treebank.
[Figure: example fragment probabilities — ten fragments with relative frequency 1/10 each, versus fragments weighted 1, 1/2 and 1/4.]
Slide 5: Non-local dependencies
Examples: "keep an eye on NP", "from last N to first N"
Slide 6: Tree-DOP: Decomposition
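Tree-DOP decomposition takes every subtree rooted at an internal node and, within each, independently cuts or keeps each internal descendant; cut nodes become open substitution sites. A minimal sketch, assuming trees encoded as tuples (label, children...) with string leaves:

```python
from itertools import product

def fragments(tree):
    """Enumerate Tree-DOP fragments of `tree`: choose any internal node
    as the fragment root, then cut or keep each internal descendant
    independently; a cut node becomes an open substitution site,
    encoded here as the one-element tuple (label,)."""
    def internal_nodes(t):
        if isinstance(t, str):
            return
        yield t
        for child in t[1:]:
            yield from internal_nodes(child)

    def cut_or_keep(t):
        # All ways to realise the subtree at t inside a fragment.
        if isinstance(t, str):
            return [t]
        kept = [(t[0],) + combo
                for combo in product(*(cut_or_keep(c) for c in t[1:]))]
        return [(t[0],)] + kept          # (label,) alone = substitution site

    result = []
    for node in internal_nodes(tree):
        result += [f for f in cut_or_keep(node) if f != (node[0],)]
    return result

# A two-word tree yields 6 fragments: the full tree, three partial
# S-fragments with open sites, and the two lexical subtrees.
frags = fragments(("S", ("NP", "john"), ("VP", "left")))
```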
Slide 7: Tree-DOP: Parsing new input
Slide 8: Tree-DOP: Ranking output parses
Relative frequency
- |e|: the number of occurrences of subtree e in the set of fragments
- r(e): the root node category of subtree e
- P(e) = |e| / Σ_{e': r(e') = r(e)} |e'|
Parse probability
- Multiply fragment probabilities to calculate derivation probability
- Sum derivation probabilities to calculate parse probability
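The ranking scheme can be made concrete. The fragment inventory and the two derivations below are invented for illustration:

```python
from functools import reduce
from operator import mul

# Hypothetical fragment counts, keyed by (fragment, root category r(e)).
counts = {("S -> NP VP", "S"): 2, ("S -> NP[john] VP", "S"): 1,
          ("NP -> john", "NP"): 3, ("VP -> left", "VP"): 3}

root_totals = {}
for (_, root), n in counts.items():
    root_totals[root] = root_totals.get(root, 0) + n

def frag_prob(frag, root):
    """Relative frequency: |e| divided by the total count of fragments
    sharing e's root category."""
    return counts[(frag, root)] / root_totals[root]

def derivation_prob(derivation):
    """Multiply fragment probabilities along one derivation."""
    return reduce(mul, (frag_prob(f, r) for f, r in derivation), 1.0)

# Two distinct derivations of the same parse tree; the parse probability
# is the sum of their derivation probabilities (2/3 + 1/3 here).
d1 = [("S -> NP VP", "S"), ("NP -> john", "NP"), ("VP -> left", "VP")]
d2 = [("S -> NP[john] VP", "S"), ("VP -> left", "VP")]
parse_prob = derivation_prob(d1) + derivation_prob(d2)
```

This is why the most probable parse (MPP) and the most probable derivation (MPD) can disagree: a parse with many modest derivations can outweigh one with a single high-probability derivation.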
Slide 9: Tree-DOP: Some results
English, full parses only (coverage 90.83):

            F-Score           Exact Match
            MPP      MPD      MPP      MPD
  d = 1     94.78    94.79    69.14    76.54
  d = 2     97.55    97.83    86.42    87.65
  d = 3     96.86    98.17    83.95    88.89
  d = 4     95.43    95.83    72.84    76.54
French, full parses only (coverage 92.36):

            F-Score           Exact Match
            MPP      MPD      MPP      MPD
  d = 1     92.68    92.22    52.94    55.29
  d = 2     96.13    96.10    72.94    68.24
  d = 3     96.09    97.06    70.59    76.47
  d = 4     96.62    96.65    70.59    74.12
Slide 10: LFG-DOP
Lexical-Functional Grammar (LFG): a constraint-based theory of language
- c-structure: context-free phrase-structure trees
- f-structure: attribute-value matrix
- φ-links: mapping from c- to f-structure
Slide 11: LFG-DOP: Fragmentation
Root: select any non-frontier, non-terminal c-structure node as root, and:
- C-structure: delete all except this new root and the subtree it dominates
- φ-links: delete all φ-links corresponding to deleted c-structure nodes
- F-structure: delete all f-structure units not φ-accessible from the remaining c-structure nodes
- Forms: delete all semantic forms corresponding to deleted terminals
Frontier: select a (possibly empty) set of non-root, non-terminal nodes in the root-created c-structure, and:
- C-structure: delete all c-structure subtrees dominated by frontier nodes
[Figure: the Root and Frontier operations, and φ-accessibility]
Slide 12: LFG-DOP: Parsing new input
- C-structure: leftmost substitution, with category matching
- F-structure: unification, subject to uniqueness, completeness and coherence
[Figure: LFG-DOP composition (∘) of two fragments]
Slide 13: LFG-DOP: Computing probabilities
Parse probability
- Multiply fragment probabilities to calculate derivation probability
- Sum derivation probabilities to calculate parse probability
- Normalise over the probabilities of valid parses:

  P_LFG-DOP(T | T is valid) = P_DOP(T) / Σ_{T_x is valid} P_DOP(T_x)
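The normalisation step can be shown directly. The parse probabilities and validity flags below are hypothetical numbers, not taken from the slides:

```python
# Hypothetical unnormalised P_DOP values, with a flag recording whether
# each parse satisfies the LFG well-formedness conditions.
parses = {"T1": (0.030, True), "T2": (0.010, True), "T3": (0.005, False)}

# Probability mass of the valid parses only.
valid_mass = sum(p for p, valid in parses.values() if valid)

def p_lfg_dop(t):
    """P_LFG-DOP(T | T is valid): renormalise P_DOP over valid parses."""
    p, valid = parses[t]
    return p / valid_mass if valid else 0.0
```

Here T1 receives 0.030 / 0.040 = 0.75 of the mass and the invalid T3 receives none; the 0.005 assigned to T3 by P_DOP is the "leaked" mass discussed on slide 22.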
Slide 14: LFG-DOP: Robustness via discard
Discard operation
- Delete attribute-value pairs from the f-structure while keeping c-structure and φ-links constant
- Restriction: pairs whose values are φ-linked to remaining c-structure nodes are not deleted
[Figure: discard variants of a fragment for "john see mary" (c-structure S → NP VP, f-structure PRED 'see<SUBJ,OBJ>', TNS pres, SUBJ [PRED 'john'], OBJ [PRED 'mary', NUM sg]), with non-φ-linked NUM values deleted.]
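The discard operation can be sketched over a flat f-structure encoded as a dict. This is a simplification — real discard also applies inside embedded f-structures — and the attribute-value pairs and the φ-linked set here are assumptions:

```python
from itertools import combinations

# Hypothetical f-structure; attributes in `protected` have values that are
# phi-linked to remaining c-structure nodes and therefore may not be deleted.
fstruct = {"PRED": "see<SUBJ,OBJ>", "TNS": "pres", "NUM": "sg",
           "SUBJ": {"PRED": "john", "NUM": "sg"}}
protected = {"PRED", "SUBJ"}

def discard_variants(fs, protected):
    """Every f-structure reachable by deleting a subset of the
    unprotected top-level attribute-value pairs."""
    deletable = [k for k in fs if k not in protected]
    return [{k: v for k, v in fs.items() if k not in drop}
            for r in range(len(deletable) + 1)
            for drop in combinations(deletable, r)]

# TNS and NUM may be dropped independently: 2^2 = 4 variants.
variants = discard_variants(fstruct, protected)
```

The combinatorics are the point: every deletable pair doubles the number of fragments, which is why the slides stress that discard should be reserved for ill-formed input.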
Slide 15: LFG-DOP: What are the challenges?
- To define fragmentation operations such that
  - phenomena such as recursion and re-entrancy are handled
  - constraints are applied appropriately
  - discard is used only to handle ill-formed input
- To somehow distinguish between constraining and informative features
  - translation vs. parsing
- To address the fact that substitution is local but unification is global, i.e.
  - to enforce LFG well-formedness conditions in an accurate and efficient manner
  - to sample for the best parse in an accurate and efficient manner
  - to define a probability model that doesn't leak
Slide 16: LFG-DOP: Recursion and Re-entrancy
[Figure: c/f-structure pairs for the French VP "vient de tomber" (PRED 'venir<SUBJ,XCOMP>', FIN -, with a re-entrant SUBJ shared with the XCOMP's 'tomber<SUBJ>') and for the NP "the yellow LED" (PRED 'LED', CASE nom, definite SPEC, ADJUNCT 'yellow'), together with the candidate fragments (Adj yellow) and (V tomber).]
- How can we adequately express the constraints on the composition of fragments such as (Adj yellow) and (V tomber)?
Slide 17: LFG-DOP: Constraint over-specification
[Figure: fragment for (V flashing) with PRED 'flash<SUBJ>', whose SUBJ retains CASE nom, NUM sing, PERS 3, a definite SPEC-TYPE and an ADJUNCT inherited from the original representation.]
- Is it appropriate to insist that the subject of "flashing" have an adjunct?
- Is it appropriate to be forced to use discard to allow the subject of "flashing" to have an indefinite specifier?
- Important: we also want to remain language-independent
Slide 18: LFG-DOP: An alternative fragmentation process
- Determine a c-structure fragment using root and frontier as for Tree-DOP, but retain the full f-structure given in the original representation.
- Delete all f-structure units (and the attributes with which they are associated) which are not φ-linked from one or more remaining c-structure nodes, unless that unit is the value of an attribute subcategorised for by a PRED value whose corresponding terminal is dominated by the current fragment root node in the original representation.
- Where we have floating f-structure units, also retain the minimal f-structure unit which contains them both, i.e. the unit containing both floating f-structures along with their (nested sequences of) attributes.
- Delete all semantic forms (including PRED attributes and their values) not associated with one of the remaining c-structure terminals.
Slide 19: LFG-DOP: constraints vs. information
- Prune attribute-value pairs using a language-specific algorithm
  - e.g. enforce English subject-verb agreement but not object-verb agreement
- Automatically learn which attribute-value pairs should be pruned for a particular dataset
- Soft pruning: distinguish between constraining features and informative features, and account for the difference during unification
- Which option is best suited to translation, and which to parsing?
Slide 20: LFG-DOP: substitution vs. unification
Substitution is local but unification is global.
[Figure: composing fragments for "john loves mary" (PRED 'love<SUBJ,OBJ>', TNS pres); an OBJ bearing NUM pl clashes with a NUM sg introduced later in the derivation.]
To be enforced: category matching, uniqueness, coherence and completeness
- Model M1: enforce category matching during parsing
- Model M2: enforce category matching and uniqueness during parsing
- Model M3: enforce category matching, uniqueness and coherence during parsing
- There is no Model M4: completeness can never be checked until a complete parse has been obtained
21LFG-DOP sampling
ij,VP
The exact probability of sampling fx at ij,VP
is
- PDOP(fx)
- Multiplied by the sampling probability mass
available at each of its substitution sites
ik,V and ikj-k,NP - And divided by the sampling probability mass
available at ij,VP
VP
fx
V
NP
ik
ikj-k
ij,VP
Problem for computing the exact probability of
sampling fx at ij,VP
f1
SUBJ
NUM sg
f2
- We cannot know the sampling probability mass
available at substitution site ikj-k,NP
until ikj-k,NP is the leftmost substitution
site unless we stick with Model M1
TNS pres
VP
fx
NUM sg
V
NP
OBJ
f3
ik
ikj-k
Problem for establishing when enough samples have
been taken
- We cannot know how many valid parses there are
until all constraints have been resolved
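Sampling under Model M1 (only category matching enforced) can be sketched as a toy Monte Carlo estimator; the fragment inventory and its probabilities below are hypothetical:

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical fragment inventory: per root category, a list of
# (fragment, P_DOP) pairs; a fragment is (label, open substitution sites).
inventory = {
    "S":  [(("S -> NP VP", ["NP", "VP"]), 1.0)],
    "NP": [(("NP -> john", []), 0.6), (("NP -> mary", []), 0.4)],
    "VP": [(("VP -> left", []), 1.0)],
}

def sample_derivation(cat):
    """Sample one derivation top-down, expanding open sites left to right
    (Model M1: no unification constraints checked during sampling)."""
    frags, weights = zip(*inventory[cat])
    label, sites = random.choices(frags, weights=weights)[0]
    return (label, [sample_derivation(site) for site in sites])

# Estimate parse probabilities by sampling; under full LFG-DOP, extra
# samples are needed because some derivations later fail validity checks.
tallies = Counter(str(sample_derivation("S")) for _ in range(2000))
```

The estimated frequency of the "john" parse converges on its P_DOP of 0.6; the slide's point is that once uniqueness or coherence is also enforced, the mass available at a non-leftmost site is unknown at sampling time, so this simple scheme no longer yields exact probabilities.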
Slide 22: LFG-DOP: leaked probability mass
[Figure: composition (∘) of fragments for "john left" (PRED 'leave<SUBJ>', TNS pres) with probabilities 0.05, 0.007 and 0.001; one fragment's SUBJ carries NUM pl while another contributes NUM sg, and the derivation's probability is 0.05 × 0.007 × 0.001 = 0.00000035.]
- This derivation will be thrown out because it does not satisfy the uniqueness condition
- Its probability is thrown out with it → leaked probability mass
- Normalisation camouflages the problem but does not solve it
Slide 23: Data-Oriented Natural Language Processing using Lexical-Functional Grammar
Questions?