1
Treebank-Based Wide Coverage Probabilistic LFG
Resources
Josef van Genabith, Aoife Cahill, Grzegorz
Chrupala, Jennifer Foster, Deirdre Hogan, Conor
Cafferkey, Mick Burke, Ruth O'Donovan, Yvette
Graham, Karolina Owczarzak, Yuqing Guo, Ines
Rehbein, Natalie Schluter and Djamé Seddah
National Centre for Language Technology (NCLT),
School of Computing, Dublin City University
2
Overview
  • Context/Motivation
  • Treebank-Based Acquisition of Wide-Coverage LFG
    Resources (Penn-II)
  • LFG
  • Automatic F-Structure Annotation Algorithm
  • Acquisition of Lexical Resources
  • Parsing
  • Parsing Architectures
  • LDD-Resolution
  • Comparison with Hand-Crafted (XLE, RASP) and
    Treebank-Based (CCG, HPSG) Resources
  • Generation
  • Basic Generator
  • Generation Grammar Transforms
  • History-Based Generation
  • MT Evaluation

3
Motivation
  • What do grammars do?
  • Grammars define languages as sets of strings
  • Grammars define what strings are grammatical and
    what strings are not
  • Grammars tell us about the syntactic structure of
    (associated with) strings
  • Shallow vs. Deep grammars
  • Shallow grammars do all of the above
  • Deep grammars (in addition) relate text to
    information/meaning representation
  • Information: predicate-argument-adjunct
    structure, deep dependency relations, logical
    forms, ...
  • In natural languages, linguistic material is not
    always interpreted locally where you encounter
    it: long-distance dependencies (LDDs)
  • Resolution of LDDs is crucial to construct accurate
    and complete information/meaning representations.
  • Deep grammars = (text <-> meaning) + (LDD
    resolution)

4
Motivation
  • Constraint-Based Grammar Formalisms (FUG, GPSG,
    PATR-II, ...)
  • Lexical-Functional Grammar (LFG)
  • Head-Driven Phrase Structure Grammar (HPSG)
  • Combinatory Categorial Grammar (CCG)
  • Tree-Adjoining Grammar (TAG)
  • Traditionally, deep constraint-based grammars are
    hand-crafted
  • LFG: ParGram; HPSG: LinGO/ERG; Core Language Engine
    (CLE); Alvey Tools; RASP; ALPINO; ...
  • Wide-coverage, deep constraint-based grammar
    development is very time consuming, knowledge
    extensive and expensive!
  • Very hard to scale hand-crafted grammars to
    unrestricted text!
  • English XLE (Riezler et al. 2002); German XLE
    (Forst and Rohrer 2006); Japanese XLE (Masuichi
    and Okuma 2003); RASP (Carroll and Briscoe 2002);
    ALPINO (Bouma, van Noord and Malouf, 2000)

5
Motivation
  • Instance of knowledge acquisition bottleneck
    familiar from classical rationalist
    rule/knowledge-based AI/NLP
  • Alternative to classical rationalist
    rule/knowledge-based AI/NLP
  • Empiricist data-driven research paradigm
    (AI/NLP)
  • Corpora, machine-learning-based and
    statistical approaches, ...
  • Treebank-based grammar acquisition, probabilistic
    parsing
  • Advantage: grammars can be induced (learned)
    automatically
  • Very low development cost, wide coverage, robust,
    but ...
  • Most treebank-based grammar induction/parsing
    technology produces shallow grammars
  • Shallow grammars don't resolve LDDs (but see
    (Johnson 2002)) and do not map strings to
    information/meaning representations

6
Motivation
  • Poses a number of research questions
  • Can we address the knowledge acquisition
    bottleneck for deep grammar development by
    combining insights from rationalist and
    empiricist research paradigms?
  • Specifically
  • Can we automatically acquire wide-coverage
    deep, probabilistic, constraint-based grammars
    from treebanks?
  • How do we use them in parsing?
  • Can we use them for generation?
  • Can we acquire resources for different languages
    and treebank encodings?
  • How do these resources compare with hand-crafted
    resources?
  • How do they fare in applications?

7
Context
  • TAG (Xia, 2001)
  • LFG (Cahill, McCarthy, van Genabith and Way,
    2002)
  • CCG (Hockenmaier & Steedman, 2002)
  • HPSG (Miyao and Tsujii, 2003)
  • LFG
  • (van Genabith, Sadler and Way, 1999)
  • (Frank, 2000)
  • (Sadler, van Genabith and Way, 2000)
  • (Frank, Sadler, van Genabith and Way, 2003)

8
Lexical-Functional Grammar (LFG)
  • Parsing

9
LFG Acquisition for English - Overview
  • Treebank-Based Acquisition of LFG Resources
    (Penn-II)
  • Lexical Functional Grammar LFG
  • Penn-II Treebank Preprocessing/Clean-Up
  • F-Str Annotation Algorithm
  • Grammar and Lexicon Extraction
  • Parsing Architectures (LDD Resolution)
  • Comparison with best hand-crafted resources XLE
    and RASP
  • Comparison with treebank-based CCG and HPSG
    resources

10
Lexical-Functional Grammar (LFG)
  • Lexical-Functional Grammar (LFG) (Bresnan &
    Kaplan 1981, Bresnan 2001, Dalrymple 2001) is a
    constraint-based theory of grammar.
  • Two (basic) levels of representation
  • C-structure represents surface grammatical
    configurations such as word order, annotated CFG
    rules/trees
  • F-structure represents abstract syntactic
    functions such as SUBJ(ject), OBJ(ect),
    OBL(ique), PRED(icate), COMP(lement), ADJ(unct);
    AVMs: attribute-value matrices/feature
    structures
  • F-structure approximates basic
    predicate-argument structure, dependency
    representation, logical form (van Genabith and
    Crouch, 1996, 1997)

11
Lexical-Functional Grammar (LFG)
12
Lexical-Functional Grammar (LFG)
  • Subcategorisation
  • Semantic forms (subcat frames): see<SUBJ,OBJ>
  • Completeness: all GFs in the semantic form are
    present at the local f-structure
  • Coherence: only the GFs in the semantic form are
    present at the local f-structure
  • Long Distance Dependencies (LDDs) resolved at
    f-structure with
  • Functional Uncertainty Equations (regular
    expressions specifying paths in f-structure),
    e.g. ↑TOPICREL = ↑COMP* OBJ
  • subcat frames
  • Completeness/Coherence (a minimal check is sketched below).
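
A minimal Python sketch of the Completeness/Coherence check (illustration only, not the DCU implementation; the dict-based f-structure representation and the set of governable GFs are assumptions made here):

    # Toy check of LFG Completeness and Coherence against a semantic form.
    GOVERNABLE = {"SUBJ", "OBJ", "OBJ2", "OBL", "COMP", "XCOMP"}

    def check_subcat(fstr, semantic_form):
        """fstr: dict of GFs/features; semantic_form: set of subcategorised GFs,
        e.g. {"SUBJ", "OBJ"} for see<SUBJ,OBJ>."""
        local_gfs = {gf for gf in fstr if gf in GOVERNABLE}
        complete = semantic_form <= local_gfs   # every subcategorised GF is present
        coherent = local_gfs <= semantic_form   # no extra governable GF is present
        return complete, coherent

    # "John saw Mary" with see<SUBJ,OBJ>
    f = {"PRED": "see", "SUBJ": {"PRED": "John"}, "OBJ": {"PRED": "Mary"}, "TENSE": "past"}
    print(check_subcat(f, {"SUBJ", "OBJ"}))     # (True, True)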

13
Lexical-Functional Grammar (LFG)
14
Introduction Penn-II LFG
  • If we had an f-structure-annotated version of
    Penn-II, we could use (standard) machine learning
    methods to extract probabilistic, wide-coverage
    LFG resources
  • How do we get an f-structure-annotated Penn-II?
  • Manually? No: ~50,000 trees!
  • Automatically! Yes: f-structure annotation
    algorithm!
  • Penn-II is a 2nd-generation treebank: it contains
    lots of annotations to support derivation of deep
    meaning representations
  • trees, Penn-II functional tags (-SBJ, -TMP,
    -LOC), traces and coindexation
  • f-structure annotation algorithm exploits those.

15
Treebank Annotation Penn-II LFG
16
Treebank Annotation Penn-II LFG
17
Treebank Preprocessing/Clean-Up Penn-II LFG
  • The Penn-II treebank often has flat analyses
    (coordination, NPs, ...) and a certain amount of noise:
    inconsistent annotations, errors
  • No treebank preprocessing or clean-up in the LFG
    approach (unlike CCG- and HPSG-based approaches)
  • Take the Penn-II treebank as is, but
  • Remove all trees with FRAG or X labelled
    constituents
  • FRAG: fragments; X: constituents whose annotation is unknown
  • Total of 48,424 trees, used as they are.

18
Treebank Annotation Penn-II LFG
  • Annotation-based (rather than conversion-based)
  • Automatic annotation of nodes in Penn-II treebank
    trees with f-structure equations
  • Annotation Algorithm exploits
  • Head information
  • Categorial information
  • Configurational information
  • Penn-II functional tags
  • Trace information

19
Treebank Annotation Penn-II LFG
  • Architecture of a modular algorithm to assign LFG
    f-structure equations to trees in the Penn-II
    treebank

Head-Lexicalisation (Magerman, 1994)
Left-Right Context Annotation Principles
Proto F-Structures
Coordination Annotation Principles
Proper F-Structures
Catch-All and Clean-Up
Traces

20
Treebank Annotation Penn-II LFG
  • Head Lexicalisation: modified rules based on
    (Magerman, 1994); a toy head-finding sketch follows below
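
A minimal sketch of Magerman-style head finding in Python; the head-rule fragment below is invented for illustration and differs from the actual modified rules used here:

    # Toy head finder: for each category, a search direction and a priority
    # list of candidate head daughters.
    HEAD_RULES = {
        "NP": ("right", ["NN", "NNS", "NNP", "NP"]),
        "VP": ("left",  ["VBD", "VBZ", "VBP", "VB", "VP"]),
        "S":  ("left",  ["VP", "S"]),
    }

    def find_head(category, children):
        """children: daughter categories; returns the index of the head daughter."""
        direction, priorities = HEAD_RULES.get(category, ("left", []))
        order = list(range(len(children)))
        if direction == "right":
            order.reverse()
        for cand in priorities:                 # try candidates in priority order
            for i in order:
                if children[i] == cand:
                    return i
        return order[0]                         # fall back to first daughter searched

    print(find_head("NP", ["DT", "JJ", "NN"]))  # 2: the rightmost noun heads the NP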

21
Treebank Annotation Penn-II LFG
  • Left-Right Context Annotation Principles
  • Head of NP likely to be rightmost noun
  • Mother → Left Context  Head  Right Context

22
Treebank Annotation Penn-II LFG
Left-Right Annotation Matrix for NP
Left Context: DT: ↑SPEC:DET = ↓; QP: ↑SPEC:QUANT = ↓; JJ, ADJP: ↓ ∈ ↑ADJUNCT
Head: NN, NNS: ↑ = ↓
Right Context: NP: ↓ ∈ ↑APP; PP: ↓ ∈ ↑ADJUNCT; S, SBAR: ↓ ∈ ↑RELMOD
[Example tree on the slide: the NP "a very politicized deal", with DT annotated
↑SPEC:DET = ↓, ADJP annotated ↓ ∈ ↑ADJUNCT and the head NN annotated ↑ = ↓;
a toy application of the matrix is sketched below]
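
A minimal sketch of applying such a left-right annotation matrix to one NP rule (Python; '^' stands for the up-arrow, the mother's f-structure, and '!' for the down-arrow, the node's own f-structure; the dictionary mirrors the slide but is only an illustration):

    NP_MATRIX = {
        "left":  {"DT": "^SPEC:DET=!", "QP": "^SPEC:QUANT=!",
                  "JJ": "! elem ^ADJUNCT", "ADJP": "! elem ^ADJUNCT"},
        "head":  {"NN": "^=!", "NNS": "^=!"},
        "right": {"NP": "! elem ^APP", "PP": "! elem ^ADJUNCT",
                  "S": "! elem ^RELMOD", "SBAR": "! elem ^RELMOD"},
    }

    def annotate_np(daughters, head_index):
        """daughters: daughter categories; returns (category, equation) pairs."""
        annotated = []
        for i, cat in enumerate(daughters):
            if i == head_index:
                eq = NP_MATRIX["head"].get(cat, "^=!")
            elif i < head_index:
                eq = NP_MATRIX["left"].get(cat)
            else:
                eq = NP_MATRIX["right"].get(cat)
            annotated.append((cat, eq))
        return annotated

    # NP -> DT ADJP NN ("a very politicized deal"), head daughter = NN
    print(annotate_np(["DT", "ADJP", "NN"], head_index=2))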
23
Treebank Annotation Penn-II LFG
24
Treebank Annotation Penn-II LFG
  • Build an annotation matrix for each of the monadic
    categories (without functional tags) in Penn-II
  • Based on analysing the most frequent rule types
    for each category, such that
  • the sum total of token frequencies of these rule
    types is greater than 85% of the total number of rule
    tokens for that category (a selection sketch follows below)
  • Rule types (100%) vs. rule types needed for 85% coverage:
    NP: 6595 / 102    VP: 10239 / 307
    S: 2602 / 20      ADVP: 234 / 6
  • Apply the annotation matrix to all (i.e. also unseen)
    rules/sub-trees, i.e. also NP-LOC, NP-TMP
    etc.
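
A minimal sketch of the 85% rule-type selection in Python (the counts in the example are invented; the real figures are in the table above):

    from collections import Counter

    def rules_for_coverage(rule_token_counts, threshold=0.85):
        """Smallest set of most frequent rule types whose summed token
        frequency reaches the coverage threshold."""
        total = sum(rule_token_counts.values())
        selected, covered = [], 0
        for rule, freq in rule_token_counts.most_common():
            selected.append(rule)
            covered += freq
            if covered / total >= threshold:
                break
        return selected

    np_rules = Counter({("DT", "NN"): 500, ("NN",): 300,
                        ("DT", "JJ", "NN"): 150, ("NP", "PP"): 50})
    print(rules_for_coverage(np_rules))   # the rule types analysed by hand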

25
Treebank Annotation Penn-II LFG
  • Traces Module
  • Long Distance Dependencies (LDDs)
  • Topicalisation
  • Questions
  • Wh- and wh-less relative clauses
  • Passivisation
  • Control constructions
  • ICH (interpret constituent here)
  • RNR (right node raising)
  • Translate Penn-II traces and coindexation into
    corresponding reentrancy in f-structure

26
Treebank Annotation Control Wh-Rel. LDD
27
Treebank Annotation Penn-II LFG
Head-Lexicalisation (Magerman, 1994)
Left-Right Context Annotation Principles
Proto F-Structures
Coordination Annotation Principles
Proper F-Structures
Catch-All and Clean-Up
Traces

Constraint Solver
28
Treebank Annotation Penn-II LFG
  • Collect f-structure equations
  • Send them to the constraint solver
  • This generates the f-structures (a toy solver sketch
    follows below)
  • F-structure annotation algorithm in Java,
    constraint solver in Prolog
  • ~3 min to annotate the 50,000 Penn-II trees
  • ~5 min to produce the 50,000 f-structures
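
A toy illustration (Python) of what the constraint-solving step does: instantiated f-structure equations over f-structure variables are resolved into nested structures. The real solver is in Prolog and performs full unification; the equation format below is an assumption made for this sketch:

    def solve(equations):
        fs = {}                                    # variable name -> attribute dict
        def node(v):
            return fs.setdefault(v, {})
        for var, attr, value in equations:
            # values naming another variable ("f2", ...) are shared structure,
            # which is how reentrancies (e.g. from traces) arise
            target = node(value) if isinstance(value, str) and value.startswith("f") else value
            node(var)[attr] = target
        return fs

    eqs = [("f1", "PRED", "see<SUBJ,OBJ>"), ("f1", "TENSE", "past"),
           ("f1", "SUBJ", "f2"), ("f2", "PRED", "John"),
           ("f1", "OBJ",  "f3"), ("f3", "PRED", "Mary")]
    print(solve(eqs)["f1"])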

29
Treebank Annotation Penn-II LFG
  • Evaluation (Quantitative)
  • Coverage
  • Over 99.8% of Penn-II sentences (without X and
    FRAG constituents) receive a single covering and
    connected f-structure

0 f-structures: 45 (0.093%)
1 f-structure: 48,329 (99.804%)
2 f-structures: 50 (0.103%)
30
Treebank Annotation Penn-II LFG
  • F-structure quality evaluation against DCU 105
    Dependency Bank, a manually annotated dependency
    gold standard of 105 sentences randomly extracted
    from WSJ section 23.
  • Triples are extracted from the gold standard
  • Evaluation software from (Crouch et al. 2002) and
    (Riezler et al. 2002); a toy triple-scoring sketch
    follows below
  • relation(predicate0, argument1)

DCU 105 All Annotations Preds-Only
Precision 97.06 94.28
Recall 96.80 94.28
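
For concreteness, a minimal Python sketch of the triple-based scoring (the actual evaluation software is the one cited above; the triples here are invented):

    def triple_prf(test_triples, gold_triples):
        test, gold = set(test_triples), set(gold_triples)
        correct = len(test & gold)
        p = correct / len(test) if test else 0.0
        r = correct / len(gold) if gold else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f

    gold = {("subj", "resign", "john"), ("adjunct", "resign", "yesterday")}
    test = {("subj", "resign", "john"), ("obj", "resign", "yesterday")}
    print(triple_prf(test, gold))   # (0.5, 0.5, 0.5)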
31
Treebank Annotation Penn-II LFG
  • Following (Kaplan et al. 2004) evaluation against
    PARC 700 Dependency Bank calculated for
  • all annotations ? PARC features ?
    preds-only
  • Mapping required (Burke 2004, 2006)

PARC 700 PARC features
Precision 88.31
Recall 86.38
32
Grammar and Lexicon Extraction Penn-II LFG
  • Lexical Resources
  • Lexical information extremely important in modern
    lexicalised grammar formalisms
  • LFG, HPSG, CCG, TAG,
  • Lexicon development is time consuming and
    extremely expensive
  • Rarely if ever complete
  • Familiar knowledge acquisition bottleneck
  • Treebank-based subcategorisation frame induction
    (LFG semantic forms) from Penn-II and III
  • Parser-based induction from British National
    Corpus (BNC)
  • Evaluation against COMLEX, OALD, Korhonen's data
    set

33
Grammar and Lexicon Extraction Penn-II LFG
  • Lexicon Construction
  • Manual vs. Automated
  • Our Approach
  • Subcat Frames not Predefined
  • Functional and/or Categorial Information
  • Parameterised for Prepositions and Particles
  • Active and Passive
  • Long Distance Dependencies
  • Conditional Probabilities

34
Grammar and Lexicon Extraction Penn-II LFG
35
Grammar and Lexicon Extraction Penn-II LFG
apply<SUBJ,OBL:for>   win<SUBJ,OBJ>
(a toy semantic-form extraction sketch follows below)
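
A minimal sketch of the basic extraction idea in Python: for each local PRED in an f-structure, collect the subcategorisable GFs present at that level. Parameterisation for prepositions/particles, passive, LDDs and conditional probabilities (as in the actual system) are omitted; the dict representation is an assumption:

    GOVERNABLE = ["SUBJ", "OBJ", "OBJ2", "OBL", "COMP", "XCOMP"]

    def extract_frames(fstr, frames=None):
        if frames is None:
            frames = []
        if isinstance(fstr, dict):
            if "PRED" in fstr:
                gfs = [gf for gf in GOVERNABLE if gf in fstr]
                frames.append(f"{fstr['PRED']}<{','.join(gfs)}>")
            for value in fstr.values():
                extract_frames(value, frames)     # recurse into sub-f-structures
        return frames

    f = {"PRED": "win", "SUBJ": {"PRED": "team"}, "OBJ": {"PRED": "match"}}
    print(extract_frames(f))   # ['win<SUBJ,OBJ>', 'team<>', 'match<>']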
36
Grammar and Lexicon Extraction Penn-II LFG
Lexicon extracted from Penn-II (O'Donovan et al.
2005)
37
Grammar and Lexicon Extraction Penn-II LFG
38
Grammar and Lexicon Extraction Penn-II LFG
  • Parsing-Based Subcat Frame Extraction (O'Donovan
    2006)
  • Treebank-based vs. parsing-based subcat frame
    extraction
  • Parsed the British National Corpus BNC (100 million
    words) with our automatically induced LFGs
  • 19 days on a single machine, ~5 million words per
    day
  • Subcat frame extraction for 10,000 verb lemmas
  • Evaluation against COMLEX and OALD
  • Evaluation against the Korhonen (2002) gold standard
  • Our method is statistically significantly better
    than that of Korhonen (2002)

39
Parsing Penn-II and LFG
  • Overview Parsing Architectures
  • Pipeline Integrated
  • Long-Distance Dependency (LDD) Resolution at
    F-Structure
  • Evaluation Comparison with Hand-Crafted
    Resources (XLE and RASP)
  • Comparison against Treebank-Based CCG and HPSG
    Resources

40
Parsing Penn-II and LFG
41
Lexical-Functional Grammar (LFG)
42
Parsing Penn-II and LFG
  • Require
  • subcategorisation frames (O'Donovan et al., 2004,
    2005; O'Donovan 2006)
  • functional uncertainty equations
  • Previous Example
  • claim(subj,comp), deny(subj,obj)
  • ↑TOPICREL = ↑COMP* OBJ (search along a path of
    0 or more COMPs)

43
Parsing Penn-II and LFG
  • Subcat frames as above (O'Donovan et al. 2004,
    2005)
  • Functional Uncertainty equations
  • Automatically acquire finite approximations of
    FU-equations
  • Extract paths between co-indexed material in
    automatically generated f-structures from
    sections 02-21 of Penn-II
  • 26 TOPIC, 60 TOPICREL, 13 FOCUS path types
  • 99.69% coverage of paths in WSJ Section 23
  • Each path type is associated with a probability
  • LDD resolution ranked by path x subcat
    probabilities (Cahill et al., 2004); a toy ranking
    sketch follows below
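
A toy Python sketch of the ranking idea (all probabilities below are invented for illustration; the real model uses path and subcat-frame probabilities estimated from sections 02-21):

    PATH_PROBS = {("COMP", "OBJ"): 0.20, ("OBJ",): 0.45}
    SUBCAT_PROBS = {("claim", ("SUBJ", "COMP")): 0.6, ("deny", ("SUBJ", "OBJ")): 0.7}

    def rank_resolutions(candidates):
        """candidates: (path, predicate, frame) triples for one unresolved
        TOPIC/TOPICREL/FOCUS value, ranked by path x subcat probability."""
        def score(cand):
            path, pred, frame = cand
            return PATH_PROBS.get(path, 0.0) * SUBCAT_PROBS.get((pred, frame), 0.0)
        return sorted(candidates, key=score, reverse=True)

    cands = [(("COMP", "OBJ"), "deny",  ("SUBJ", "OBJ")),    # 0.20 * 0.7 = 0.14
             (("OBJ",),        "claim", ("SUBJ", "COMP"))]   # 0.45 * 0.6 = 0.27
    print(rank_resolutions(cands)[0])    # the OBJ-path resolution is ranked first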

44
Parsing Penn-II and LFG
  • How do treebank-based constraint grammars compare
    to deep hand-crafted grammars like XLE and RASP?
  • XLE (Riezler et al. 2002, Kaplan et al. 2004)
  • hand-crafted, wide-coverage, deep,
    state-of-the-art English LFG and XLE parsing
    system with log-linear-based probability models
    for disambiguation
  • PARC 700 Dependency Bank gold standard (King et
    al. 2003), Penn-II Section 23-based
  • RASP (Carroll and Briscoe 2002)
  • hand-crafted, wide-coverage, deep,
    state-of-the-art English probabilistic
    unification grammar and parsing system (RASP:
    Robust Accurate Statistical Parsing)
  • CBS 500 Dependency Bank gold standard (Carroll,
    Briscoe and Sanfilippo 1999), Susanne-based

45
Parsing Penn-II and LFG
  • (Bikel 2002) retrained to retain Penn-II
    functional tags (-SBJ, -LOC, -TMP, -CLR,
    -LGS, etc.)
  • Pipeline architecture
  • tagged text → retrained Bikel parser → f-structure
    annotation algorithm → LDD resolution →
    f-structures → automatic conversion → evaluation
    against the XLE/RASP gold standards (PARC-700/CBS-500
    Dependency Banks)

46
Parsing Penn-II and LFG
  • Systematic differences between f-structures and
    PARC 700 and CBS 500 dependency representations
  • Automatic conversion of f-structures to PARC 700
    / CBS 500-like structures (Burke et al. 2004,
    Burke 2006, Cahill et al. 2008)
  • Evaluation software (Crouch et al. 2002) and
    (Carroll and Briscoe 2002)
  • Approximate Randomisation Test (Noreen 1989) for
    statistical significance; a toy version is sketched below
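
A minimal Python sketch of the Approximate Randomisation Test over paired per-sentence scores (a simplified stand-in for the published procedure; the scores below are invented):

    import random

    def approx_randomisation(scores_a, scores_b, trials=10000, seed=0):
        rng = random.Random(seed)
        n = len(scores_a)
        observed = abs(sum(scores_a) - sum(scores_b)) / n
        extreme = 0
        for _ in range(trials):
            a_sum = b_sum = 0.0
            for x, y in zip(scores_a, scores_b):
                if rng.random() < 0.5:          # randomly swap the paired outputs
                    x, y = y, x
                a_sum += x
                b_sum += y
            if abs(a_sum - b_sum) / n >= observed:
                extreme += 1
        return (extreme + 1) / (trials + 1)     # estimated p-value

    p = approx_randomisation([0.82, 0.79, 0.90, 0.85], [0.80, 0.78, 0.88, 0.83])
    print(p)   # roughly 0.12 here: four sentences are far too few for significance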

47
Parsing Penn-II and LFG
  • Results: dependency f-scores (CL 2008 paper)
  • PARC 700: XLE vs. DCU-LFG
  • 80.55 XLE
  • 82.73 DCU-LFG (+2.18)
  • CBS 500: RASP vs. DCU-LFG
  • 76.57 RASP
  • 80.23 DCU-LFG (+3.66)
  • Results statistically significant at the 95% level
    (Noreen 1989)
  • Best result now against PARC 700: 84.00 (+3.45),
    using the Charniak reranking parser plus Grzegorz
    Chrupala's Penn-II function-tag labeler

48
Parsing Penn-II and LFG
PARC 700 Evaluation
49
Parsing Penn-II and LFG
50
Parsing Penn-II and LFG
51
Parsing Penn-II and LFG
52
Parsing Penn-II and LFG
53
Parsing Penn-II and LFG
54
Evaluation against Treebank-Based CCG and HPSG
  • CCG: Combinatory Categorial Grammar (Steedman
    2000)
  • HPSG: Head-Driven Phrase Structure Grammar
    (Pollard & Sag 1994)
  • Both constraint-based grammar formalisms
  • Treebank-based CCG resources (Hockenmaier &
    Steedman 2002, Hockenmaier 2003, Clark & Curran
    2004, ...)
  • Treebank-based HPSG resources (Miyao, Ninomiya &
    Tsujii 2003, Miyao & Tsujii 2004, ...)
  • DepBank: reannotated version of PARC 700
    (Briscoe & Carroll 2006) with CBS 500-style GRs
  • RASP (version 2) (Briscoe & Carroll 2006)

55
Evaluation against Treebank-Based CCG and HPSG
  • CCG
  • Small set of basic categories: NP, N, PP, S
  • Complex categories, e.g. VP = S\NP, intransitive
    verb = S\NP, transitive verb = (S\NP)/NP
  • Small set of combination rules (sketched below)
  • X/Y  Y  ⇒  X
  • Y  X\Y  ⇒  X
  • X/Y  Y/Z  ⇒  X/Z
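
A toy Python encoding of these three combination rules, with categories as nested tuples (e.g. (S\NP)/NP as ("/", ("\\", "S", "NP"), "NP")); purely illustrative:

    def fwd_apply(x, y):          # X/Y  Y  =>  X
        if isinstance(x, tuple) and x[0] == "/" and x[2] == y:
            return x[1]

    def bwd_apply(y, x):          # Y  X\Y  =>  X
        if isinstance(x, tuple) and x[0] == "\\" and x[2] == y:
            return x[1]

    def fwd_compose(x, y):        # X/Y  Y/Z  =>  X/Z
        if (isinstance(x, tuple) and x[0] == "/" and
                isinstance(y, tuple) and y[0] == "/" and x[2] == y[1]):
            return ("/", x[1], y[2])

    VT = ("/", ("\\", "S", "NP"), "NP")        # transitive verb (S\NP)/NP
    print(fwd_apply(VT, "NP"))                 # ('\\', 'S', 'NP'), i.e. S\NP
    print(bwd_apply("NP", ("\\", "S", "NP")))  # 'S'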

56
Evaluation against Treebank-Based CCG and HPSG
  • HPSG
  • Uniform representation: typed feature structures
    and inheritance
  • Sign: PHON, SYNSEM, DTRS
  • Inheritance hierarchy
  • Principles (HEAD-FEATURE, VALENCE, ...)
  • ID-Schemata (HEAD-COMP, HEAD-MOD, ...)

57
Evaluation against Treebank-Based CCG and HPSG
58
Evaluation against Treebank-Based CCG and HPSG
59
Evaluation against Treebank-Based CCG and HPSG
60
Probability Models Penn-II LFG
61
Probability Models Penn-II LFG
  • Evaluation Results

62
Probability Models Penn-II LFG
  • These results are interesting because
  • there is extensive treebank preprocessing (clean-up,
    correction and restructuring) in CCG and (some
    in) HPSG
  • none in LFG
  • custom-designed parsers and sophisticated
    (log-linear, max-ent) parse selection probability
    models in HPSG and CCG
  • a mix of off-the-shelf and custom-designed
    components, each with its own probability model,
    in an early-disambiguation processing pipeline in
    LFG; no proper overall probability model, at best
    an approximation
  • Still competitive results

63
Probability Models Penn-II LFG
  • Probability Models
  • Our approach does not constitute a proper
    probability model (Abney, 1996)
  • Why? The probability model leaks
  • The highest-ranking parse tree may feature
    f-structure equations that cannot be resolved
    into an f-structure
  • The probability associated with that parse tree is
    lost
  • Doesn't happen often in practice (coverage >99.5%
    on unseen data)
  • Research on appropriate discriminative,
    log-linear or maximum entropy models is important
    (Miyao and Tsujii, 2002; Riezler et al. 2002)

64
Demo System
  • http://lfg-demo.computing.dcu.ie/lfgparser.html

65
Applications Generation
  • Applications Generation

66
Applications Generation
  • Research Question
  • Can we make the automatically induced LFG
    resources reversible/bi-directional?
  • Can they be used for both (probabilistic) parsing
    and generation?

67
Generation Penn-II LFG
68
Generation Penn-II LFG
69
Generation Penn-II LFG
70
Generation Penn-II LFG
71
Generation Penn-II LFG
72
Generation Penn-II LFG
73
Generation Penn-II LFG
74
Generation Penn-II LFG
Problem: conditioning of generation rules on
purely local f-structure features. Solution I:
generation grammar transformation (Cahill et al.
2006). Solution II: history-based probabilistic
generation (Hogan et al. 2007, Cafferkey et al.
2007): condition generation rules on the parent GF
(a toy probability-estimation sketch follows below)
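
A minimal Python sketch of the conditioning idea: estimate generation-rule probabilities given the local features plus the parent GF, rather than the local features alone (counts and feature names are invented; this is not the actual DCU generator):

    from collections import Counter

    rule_counts = Counter()       # (parent_gf, local_features, rule) -> count
    context_counts = Counter()    # (parent_gf, local_features) -> count

    def observe(parent_gf, local_features, rule):
        rule_counts[(parent_gf, local_features, rule)] += 1
        context_counts[(parent_gf, local_features)] += 1

    def prob(parent_gf, local_features, rule):
        c = context_counts[(parent_gf, local_features)]
        return rule_counts[(parent_gf, local_features, rule)] / c if c else 0.0

    observe("SUBJ", ("PRED", "SPEC"), "NP -> DT NN")
    observe("OBJ",  ("PRED", "SPEC"), "NP -> DT JJ NN")
    print(prob("SUBJ", ("PRED", "SPEC"), "NP -> DT NN"))   # 1.0 on these toy counts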
75
Generation Penn-II LFG
76
Generation Penn-II LFG
77
Generation Penn-II LFG
78
Generation the Good, the Bad and the Ugly
  • Orig Supporters of the legislation view the bill
    as an effort to add stability and certainty to
    the airline-acquisition process , and to preserve
    the safety and fitness of the industry .
  • Gen Supporters of the legislation view the bill
    as an effort to add stability and certainty to
    the airline-acquisition process , and to preserve
    the safety and fitness of the industry.
  • Orig The upshot of the downshoot is that the A
    's go into San Francisco 's Candlestick Park
    tonight up two games to none in the best-of-seven
    fest .
  • Gen The upshot of the downshoot is that the A 's
    tonight go into San Francisco 's Candlestick Park
    up two games to none in the best-of-seven fest .
  • Orig By this time , it was 4:30 a.m. in New York
    , and Mr. Smith fielded a call from a New York
    customer wanting an opinion on the British stock
    market , which had been having troubles of its
    own even before Friday 's New York market break .
  • Gen Mr. Smith fielded a call from New a customer
    York wanting an opinion on the market British
    stock which had been having troubles of its own
    even before Friday 's New York market break by
    this time and in New York , it was 4:30 a.m. .
  • Orig Only half the usual lunchtime crowd
    gathered at the tony Corney Barrow wine bar on
    Old Broad Street nearby .
  • Gen At wine tony Corney Barrow the bar on Old
    Broad Street nearby gathered usual , lunchtime
    only half the crowd , .

79
Generation Penn-II LFG
80
Generation Penn-II LFG
Problem: conditioning of generation rules on
purely local f-structure features. Solution I:
generation grammar transformation (Cahill et al.
2006). Solution II: history-based probabilistic
generation (Hogan et al. 2007, Cafferkey et al.
2007): condition generation rules on the parent GF
81
Generation the Good, the Bad and the Ugly
  • Orig By this time , it was 4:30 a.m. in New York
    , and Mr. Smith fielded a call from a New York
    customer wanting an opinion on the British stock
    market , which had been having troubles of its
    own even before Friday 's New York market break .
  • Gen Mr. Smith fielded a call from New a customer
    York wanting an opinion on the market British
    stock which had been having troubles of its own
    even before Friday 's New York market break by
    this time and in New York , it was 4:30 a.m. .
    (Cahill et al. 2006) GGT
  • Gen By this time , in New York , it was 4:30
    a.m. , and Mr. Smith fielded a call from New a
    customer York , wanting an opinion on the market
    British stock which had been having troubles of
    its own even before Friday 's New York market
    break . (Hogan et al. 2007) HB
  • Gen By this time , in New York , it was 4:30
    a.m. , and Mr. Smith fielded a call from a New
    York customer , wanting an opinion on the market
    British stock which had been having troubles of
    its own even before Friday 's New York market
    break . (Hogan et al. 2007) HB MWU

82
Generation Chinese CTB2
  • CTB2 (Yuqing Guo - Toshiba China Beijing R&D Lab)
  • (Cahill et al. 2006) out of the box
  • Training: articles 1-270 (3,480 sentences)
  • Testing: articles 301-325 (351 sentences)

83
Applications Machine Translation
  • Applications Machine Translation
  • Labelled Dependency-Based MT Evaluation (LaDEva)
  • Automatic Acquisition of Transfer Rules

84
Applications Machine Translation
  • Labelled-Dependency-Based MT Evaluation
  • Most automatic MT evaluation metrics (BLEU, NIST)
    are string (n-gram) based.
  • They unfairly punish perfectly legitimate
    syntactic and lexical variation
  • Yesterday John resigned.
  • John resigned yesterday.
  • Yesterday John quit.
  • Legitimate lexical variation: throw WordNet
    synonyms into the string match
  • What about syntactic variation?

85
Applications Machine Translation
  • Idea: use labelled dependencies for MT evaluation
  • Why? Dependencies abstract away from some
    particulars of surface realisation
  • Adjunct placement, order of conjuncts in a
    coordination, topicalisation, ...

86
Applications Machine Translation
  • The idea is intuitive
  • To make it happen you need a robust parser that
    can parse MT output
  • Treebank-induced parsers parse anything!
  • How do we judge whether the labelled dependency-based
    method is better than string-based methods?
  • We compare (correlation) with human
    judgement/evaluation performance
  • Why? Humans are not fooled by legitimate syntactic
    variation

87
Applications Machine Translation
  • Experiment use LDC Multiple Translation Chinese
    (MTC) Parts 2 and 4 data
  • 16,807 translation-reference human score segments
  • 5,007 test, rest for training (weights etc.)
  • To make this work, we throw in
  • n-best parsing
  • WordNet synonyms
  • partial matching
  • training weights
  • etc
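
A minimal Python sketch of the core scoring step, labelled-dependency overlap between candidate and reference (n-best parsing, WordNet synonyms, partial matching and trained weights are all left out; the triples are hand-written rather than parser output):

    def dep_fscore(candidate_deps, reference_deps):
        cand, ref = set(candidate_deps), set(reference_deps)
        match = len(cand & ref)
        p = match / len(cand) if cand else 0.0
        r = match / len(ref) if ref else 0.0
        return 2 * p * r / (p + r) if p + r else 0.0

    # "Yesterday John resigned." vs. "John resigned yesterday." share all
    # dependency triples, so legitimate word-order variation is not punished.
    ref  = {("subj", "resign", "John"), ("adjunct", "resign", "yesterday")}
    cand = {("subj", "resign", "John"), ("adjunct", "resign", "yesterday")}
    print(dep_fscore(cand, ref))   # 1.0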

88
Applications Machine Translation
89
Applications Machine Translation
90
References (MT Eval)
  • Karolina Owczarzak, Yvette Graham and Josef van
    Genabith Using F-structures in Machine
    Translation Evaluation. In Proceedings of the
    12th International Conference on Lexical
    Functional Grammar, July 28-30, 2007, Stanford,
    CA
  • Karolina Owczarzak, Josef van Genabith, and Andy
    Way. Labelled Dependencies in Machine Translation
    Evaluation. In Proceedings of ACL 2007 Workshop
    on Statistical Machine Translation, pages
    104-111, Prague, Czech Republic
  • Karolina Owczarzak, Josef van Genabith, and Andy
    Way. Dependency-Based Automatic Evaluation for
    Machine Translation. In Proceedings of HLT-NAACL
    2007 Workshop on Syntax and Structure in
    Statistical Translation. Rochester, NY.

91
References (Parsing)
  • Aoife Cahill, Michael Burke, Ruth O'Donovan,
    Stefan Riezler, Josef van Genabith and Andy Way.
    2008. Wide-Coverage Statistical Parsing Using
    Automatic Dependency Structure Annotation.
    Computational Linguistics, Volume 34, 1, MIT
    Press, March 2008. (accepted for publication)
  • Joachim Wagner, Djamé Seddah, Jennifer Foster and
    Josef van Genabith C-Structures and F-Structures
    for the British National Corpus. In Proceedings
    of the 12th International Conference on Lexical
    Functional Grammar, July 28-30, 2007, Stanford,
    CA
  • A. Cahill, M. Burke, R. O'Donovan, J. van
    Genabith, and A. Way. Long-Distance Dependency
    Resolution in Automatically Acquired
    Wide-Coverage PCFG-Based LFG Approximations, In
    Proceedings of the 42nd Annual Meeting of the
    Association for Computational Linguistics
    (ACL-04), July 21-26 2004, pages 320-327,
    Barcelona, Spain, 2004
  • Cahill A, M. McCarthy, J. van Genabith and A.
    Way. Parsing with PCFGs and Automatic F-Structure
    Annotation, In M. Butt and T. Holloway-King
    (eds.) Proceedings of the Seventh International
    Conference on LFG CSLI Publications, Stanford,
    CA., pp.76--95. 2002

92
References (Generation, Lex. Acq.)
  • Deirdre Hogan, Conor Cafferkey, Aoife Cahill and
    Josef van Genabith, Exploiting Multi-Word Units
    in History-Based Probabilistic Generation, in
    Proceedings of the Joint Conference on Empirical
    Methods in Natural Language Processing and
    Natural Language Learning (EMNLP-CoNLL 2007),
    Prague, Czech Republic. pp.267-276
  • A. Cahill and J. Van Genabith, Robust PCFG-Based
    Generation using Automatically Acquired
    LFG-Approximations, COLING/ACL 2006, Sydney,
    Australia
  • R. O'Donovan, M. Burke, A. Cahill, J. van
    Genabith and A. Way. Large-Scale Induction and
    Evaluation of Lexical Resources from the Penn-II
    and Penn-III Treebanks, Computational
    Linguistics, 2005
  • R. O'Donovan, M. Burke, A. Cahill, J. van
    Genabith, and A. Way. Large-Scale Induction and
    Evaluation of Lexical Resources from the Penn-II
    Treebank, In Proceedings of the 42nd Annual
    Meeting of the Association for Computational
    Linguistics (ACL-04), July 21-26 2004, pages
    368-375, Barcelona, Spain, 2004