FATE: a FrameNet Annotated corpus for Textual Entailment

1
FATE: a FrameNet Annotated corpus for Textual Entailment
LREC 2008, Marrakech, 28 May 2008
  • Marco Pennacchiotti, Aljoscha Burchardt
  • Computerlinguistik
  • Saarland University, Germany

SALSA II - The Saarbrücken Lexical Semantics
Acquisition Project
2
Summary
  • FrameNet and Textual Entailment
  • FATE annotation schema
  • Annotation examples and statistics
  • Conclusions

3
Frame Semantics
(Fillmore 1976, 2003)
  • Frame: conceptual structure modeling a
    prototypical situation
  • Frame Elements (FE): participants of the
    situation
  • Frame Evoking Elements (FEE): predicates evoking
    the situation

Predicate-argument level normalizations:
Evelyn spoke about her past
Evelyn's statement about her past
STATEMENT(Speaker: Evelyn, Topic: her past)
  • FrameNet (Berkeley Project) (1)
  • Database of frames for the core lexicon of
    English
  • 800 frames, 10,000 lemmas, 135,000 annotated
    sentences

(1) http://framenet.icsi.berkeley.edu
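
The normalization above can be sketched in a few lines of Python. This is only an illustration: the `FrameInstance` container and the `make_instance` helper are hypothetical names introduced here, not part of FrameNet or FATE. The point is that the verbal and the nominal realization end up as the same frame structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameInstance:
    """One evoked frame with its role fillers (hypothetical container)."""
    frame: str
    roles: frozenset  # set of (frame element, filler) pairs


def make_instance(frame, **roles):
    """Build a FrameInstance from keyword arguments such as Speaker='Evelyn'."""
    return FrameInstance(frame, frozenset(roles.items()))


# "Evelyn spoke about her past" (verbal FEE: spoke)
verbal = make_instance("STATEMENT", Speaker="Evelyn", Topic="her past")
# "Evelyn's statement about her past" (nominal FEE: statement)
nominal = make_instance("STATEMENT", Speaker="Evelyn", Topic="her past")

# Both surface forms normalize to the same predicate-argument structure.
same_structure = verbal == nominal
```

Storing the roles as a frozen set makes the comparison independent of the order in which the frame elements are listed.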
4
Textual Entailment (TE)
Given two text fragments, the Text T and the
Hypothesis H, T entails H if the meaning of H
can be inferred from the meaning of T, as would
typically be interpreted by people (Dagan 2005)
  • T: Yahoo has recently acquired Overture
  • H: Yahoo owns Overture
  • T → H
  • Recognizing Textual Entailment (RTE)
  • Recognize whether entailment holds for a given (T, H)
    pair
  • Models core inferences of many NLP applications
    (QA, IE, MT, ...)
  • RTE Challenges (Dagan et al., 2005; Giampiccolo
    et al., 2007)
  • Compare systems for RTE
  • Corpus: 800 training pairs, 800 test pairs,
    evenly split into positive (+) and negative (-) pairs

5
Predicate-argument and RTE
  • Predicate-level inference plays a relevant role
    in TE (20% of positive examples in RTE-2;
    Garoufi, 2007)

T: An avalanche has struck a popular skiing resort
in Austria, killing at least 11 people.
H: Humans died in an avalanche.
DEATH(Protagonist: 11 people / humans, Cause:
avalanche / avalanche)
  • Implementation gap
  • Burchardt et al., 2007: FrameNet system
    comparable to lexical overlap
  • Hickl et al., 2006: PropBank-based features are
    not effective
  • Rana et al., 2005: DIRT paraphrase repository
    does not help
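
To make the "lexical overlap" comparison point concrete, here is a minimal sketch of a word-overlap score applied to the avalanche pair. It is an assumption-laden toy, not the baseline used in any of the cited systems: the tokenizer, the `lexical_overlap` function, and the choice of score are all introduced here for illustration.

```python
import re


def tokens(text):
    """Lowercase word types; a deliberately crude tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_overlap(t, h):
    """Fraction of hypothesis word types that also occur in the text."""
    h_tokens = tokens(h)
    if not h_tokens:
        return 0.0
    return len(h_tokens & tokens(t)) / len(h_tokens)


T = ("An avalanche has struck a popular skiing resort in Austria, "
     "killing at least 11 people.")
H = "Humans died in an avalanche."

score = lexical_overlap(T, H)
```

In this pair only "in", "an" and "avalanche" are shared, so "humans" and "died" stay unmatched. That is the kind of gap a frame-level analysis such as DEATH(Protagonist, Cause) is meant to close.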

6
FATE corpus
FATE: a manually frame-annotated Textual
Entailment corpus, to study the role of frame
semantics in RTE
  • Reference corpus: RTE-2 test set, 800 pairs,
    29,000 tokens
  • Frame resource: FrameNet version 1.3
  • Corpus format: SALSA/TIGER XML (Burchardt
    et al., 2006)
  • Pre-processing: annotation on top of Collins
    parser syntactic analysis
  • T and H are randomly reordered to avoid
    biases
  • Annotation performed by one highly experienced
    annotator
  • Inter-annotator agreement over 5% of the
    corpus
  • FEE agreement: 82%
  • Frame agreement: 88%
  • Role agreement: 91%
  • Annotation carried out using the SALTO tool (1)

(1) http://www.coli.uni-saarland.de/projects/salsa/salto/doc
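
As a rough idea of how frame annotation stored in an XML format might be read, the sketch below parses a small, hand-written fragment with Python's standard library. The element and attribute names (`frame`, `target`, `fe`, `fenode`, `idref`, `t`, `word`) are assumptions modeled on the general shape of SALSA/TIGER XML and should be checked against the actual corpus files. The fragment is invented for this example and, unlike real annotation, points only at terminals.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified fragment written for this sketch;
# it is not copied from the FATE files.
SAMPLE = """
<s id="s1">
  <graph>
    <terminals>
      <t id="s1_1" word="Humans"/>
      <t id="s1_2" word="died"/>
      <t id="s1_3" word="in"/>
      <t id="s1_4" word="an"/>
      <t id="s1_5" word="avalanche"/>
    </terminals>
  </graph>
  <sem>
    <frames>
      <frame name="Death" id="s1_f1">
        <target><fenode idref="s1_2"/></target>
        <fe name="Protagonist" id="s1_f1_e1"><fenode idref="s1_1"/></fe>
        <fe name="Cause" id="s1_f1_e2">
          <fenode idref="s1_4"/><fenode idref="s1_5"/>
        </fe>
      </frame>
    </frames>
  </sem>
</s>
"""


def yield_of(node, words):
    """Join the words of all terminals referenced below this node."""
    return " ".join(words[n.get("idref")] for n in node.iter("fenode"))


def read_frames(xml_text):
    """Return one dict per frame: frame name, FEE text, role fillers."""
    root = ET.fromstring(xml_text.strip())
    words = {t.get("id"): t.get("word") for t in root.iter("t")}
    result = []
    for frame in root.iter("frame"):
        result.append({
            "frame": frame.get("name"),
            "fee": yield_of(frame.find("target"), words),
            "roles": {fe.get("name"): yield_of(fe, words)
                      for fe in frame.findall("fe")},
        })
    return result


frames = read_frames(SAMPLE)
```

A reader for the real corpus would also need to resolve references to nonterminal nodes of the syntactic analysis, which this sketch leaves out.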
7
FATE annotation process: an example
[Figure: Collins syntactic analysis of the example sentence]
Full-text annotation (all words considered)
(Ruppenhofer, 2007)
8
FATE annotation process: an example
[Figure: a frame and its FEE marked on top of the Collins syntactic analysis]
9
FATE annotation process: an example
[Figure: frame, FEE, FE, and FE filler marked on top of the Collins syntactic analysis]
Maximization principle: choose the largest
constituent possible when annotating
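
One possible way to operationalize the maximization principle is sketched below. It rests on an assumption that is not stated on the slides: the "largest constituent possible" is taken to be the widest constituent that contains the filler's head word but not the FEE. The `Constituent` type, the `largest_filler` function and the hand-written bracketing are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Constituent(NamedTuple):
    label: str
    start: int  # index of the first token (inclusive)
    end: int    # index past the last token (exclusive)


def largest_filler(constituents: Sequence[Constituent],
                   head: int, fee: int) -> Optional[Constituent]:
    """Widest constituent covering the head token but not the FEE token."""
    candidates = [c for c in constituents
                  if c.start <= head < c.end and not (c.start <= fee < c.end)]
    return max(candidates, key=lambda c: c.end - c.start, default=None)


# Tokens: 0 Humans, 1 died, 2 in, 3 an, 4 avalanche
# Hand-written bracketing, used only for this illustration.
constituents = [
    Constituent("NP", 0, 1),  # Humans
    Constituent("NP", 3, 5),  # an avalanche
    Constituent("PP", 2, 5),  # in an avalanche
    Constituent("VP", 1, 5),  # died in an avalanche
    Constituent("S", 0, 5),   # whole sentence
]

# FEE is "died" (token 1); the head of the filler is "avalanche" (token 4).
filler = largest_filler(constituents, head=4, fee=1)
```

Under these assumptions the selected filler is the PP "in an avalanche" rather than the smaller NP "an avalanche".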
10
Annotation Schema
Relevance Principle
  • Intuition: annotate as FEE only those words
    evoking a relevant situation (frame) in the
    sentence at hand
  • Very intuitive flavor, but high agreement: 83% on
    a pilot set of 15 sentences

Example: Authorities in Brazil hold 200 people as hostage
KIDNAPPING (annotated FEs: Victim, Place, Perpetrator)
11
Annotation Schema
Span Annotation
  • On T of positive pairs, annotate only the
    fragments (spans) contributing to the inferential
    process
  • Spans are obtained from the ARTE annotation
    (Garoufi, 2007)
  • For negative pairs it is not straightforward to
    derive spans, hence we do full annotation

T: Soon after the EZLN had returned to Chiapas,
Congress approved a different version of the
COCOPA Law, which did not include the autonomy
clauses, claiming they were in contradiction with
some constitutional rights (private property and
secret voting); this was seen as a betrayal by
the EZLN and other political groups.
H: EZLN is a political group.
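
Span annotation can be pictured as a filter over candidate FEEs: only the words that fall inside an inference-relevant span are annotated. The sketch below is hypothetical in every detail. The token indices, the span boundaries and the `fees_in_spans` helper are assumptions chosen to illustrate the idea on the last clause of T, and they are not taken from the ARTE annotation.

```python
def fees_in_spans(candidates, spans):
    """Keep candidate FEEs whose token index lies inside any (start, end) span.

    candidates: list of (token_index, word) pairs
    spans: list of (start, end) pairs, end exclusive
    """
    return [word for index, word in candidates
            if any(start <= index < end for start, end in spans)]


# Last clause of T, tokenized by whitespace:
# 0 this, 1 was, 2 seen, 3 as, 4 a, 5 betrayal, 6 by,
# 7 the, 8 EZLN, 9 and, 10 other, 11 political, 12 groups
candidates = [(2, "seen"), (5, "betrayal"), (12, "groups")]

# Assumed span supporting H ("EZLN is a political group"):
# "the EZLN and other political groups"
spans = [(7, 13)]

kept = fees_in_spans(candidates, spans)
```

With this assumed span, "seen" and "betrayal" are skipped and only "groups" remains as a word to annotate.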
12
Annotation Schema
Other guidelines
  • Unknown frames: use an Unknown frame for words
    evoking situations not present in the FrameNet
    database
  • Anaphora
  • Copula and support verbs
  • Modal expressions
  • Metaphors
  • Existential constructions

13
Corpus statistics
  • Annotated pairs: 800 (400 positive, 400
    negative)
  • Annotated frames: 4,500
  • avg. 5.6 frames per pair
  • 1,600 frames in positive pairs
  • 2,800 frames in negative pairs
  • Annotated roles: 9,500
  • avg. 2.1 roles per frame
  • Annotation time: 230 hours
  • 90 h for positive pairs (13 min/pair)
  • 140 h for negative pairs (21 min/pair)
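
The averages on this slide follow from the totals by simple division. The short check below recomputes them. The variable names are introduced here, and the totals are the rounded figures from the slide, so the results are approximate.

```python
pairs = 800
pairs_per_class = 400          # positive and negative pairs
frames = 4500                  # rounded total from the slide
roles = 9500                   # rounded total from the slide
hours = {"positive": 90, "negative": 140}

frames_per_pair = frames / pairs          # about 5.6
roles_per_frame = roles / frames          # about 2.1
total_hours = sum(hours.values())         # 230

# Minutes of annotation per pair, by class.
minutes_per_pair = {label: h * 60 / pairs_per_class
                    for label, h in hours.items()}
```

The per-pair times come out at 13.5 and 21 minutes, in line with the roughly 13 and 21 minutes reported.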

14
FrameNet and RTE (simple case)
[Figure: annotated T and H sentences]
  • Syntactic normalization
  • Active / Passive

EDUCATIONAL_TEACHING(Student: ground soldiers /
soldiers, Material: virtual reality / virtual
reality)
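
In the simple case, entailment can be checked by comparing the frame structures of T and H directly. The sketch below is a toy under stated assumptions: `filler_matches` uses a crude word-subset test, and both function names and the dictionary layout are hypothetical. It is not the model evaluated by the authors.

```python
def filler_matches(t_filler, h_filler):
    """Crude test: every word of H's filler also occurs in T's filler."""
    return set(h_filler.lower().split()) <= set(t_filler.lower().split())


def frame_match(t_frame, h_frame):
    """True if both evoke the same frame and each H role is covered by T."""
    if t_frame["frame"] != h_frame["frame"]:
        return False
    return all(
        role in t_frame["roles"]
        and filler_matches(t_frame["roles"][role], filler)
        for role, filler in h_frame["roles"].items()
    )


t_frame = {"frame": "EDUCATIONAL_TEACHING",
           "roles": {"Student": "ground soldiers",
                     "Material": "virtual reality"}}
h_frame = {"frame": "EDUCATIONAL_TEACHING",
           "roles": {"Student": "soldiers",
                     "Material": "virtual reality"}}

matched = frame_match(t_frame, h_frame)
```

Because the comparison runs on frames and roles, it does not matter whether T is active and H passive.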
15
Implementation gap: insights
  1. Resource coverage is too low
  2. Models for predicate-argument inference are weak
  3. Automatic annotation models (SRL) are not good
    enough to be safely used in RTE
  • FrameNet coverage is good
  • 373 Unknown frames (8% of total frames)
  • Unknown roles: 1% of total roles
  • Coverage is unlikely to be a limiting factor for
    using FrameNet in applications

16
Why should you use FATE?
  • Resource coverage is too low
  • Models for predicate-argument inference are weak
  • Automatic annotation models (SRL) are not good
    enough to be safely used in RTE
  • To better study predicate-argument inference in
    RTE
  • To experiment with frame-based RTE models on a
    gold-standard corpus
  • To learn better SRL models, by training on FATE
  • Corpus is freely available on-line

17
  • Thank you!
  • Questions?

FATE download: http://www.coli.uni-saarland.de/projects/salsa/fate
  • pennacchiotti_at_coli.uni-sb.de
  • www.coli.uni-saarland.de/pennacchiotti

19
FrameNet and RTE
[Figure: annotated T and H sentences]
  • Syntactic normalization
  • Apposition to copula

PEOPLE_BY_VOCATION(Person: Andreotti / Andreotti,
Place: Italy / Italy, Age: elder / elder)
20
FrameNet and RTE
[Figure: annotated T and H sentences]
  • Frame-to-frame inference
  • Sentencing --- HR ---> Imprisonment
  • Convict maps to Prisoner
  • Place maps to Place
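
Frame-to-frame inference of this kind can be sketched as a lookup of role mappings between related frames. Only the Sentencing to Imprisonment relation and its two role mappings come from the slide. The `RELATIONS` table layout, the `project` function, the example fillers and the extra `Court` role are assumptions made for illustration.

```python
# (source frame, target frame) -> {source role: target role}
RELATIONS = {
    ("Sentencing", "Imprisonment"): {"Convict": "Prisoner", "Place": "Place"},
}


def project(instance, target_frame, relations=RELATIONS):
    """Rewrite a frame instance into a related frame, or None if unrelated.

    Roles without a mapping are dropped from the projected instance.
    """
    mapping = relations.get((instance["frame"], target_frame))
    if mapping is None:
        return None
    roles = {mapping[role]: filler
             for role, filler in instance["roles"].items()
             if role in mapping}
    return {"frame": target_frame, "roles": roles}


# Hypothetical instance annotated on T; the fillers are invented.
t_instance = {"frame": "Sentencing",
              "roles": {"Convict": "the defendant",
                        "Place": "Texas",
                        "Court": "the court"}}

projected = project(t_instance, "Imprisonment")
```

The projected instance can then be compared with the frame annotated on H, in the same way as in the single-frame case.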

21
Annotation Schema
Anaphora
  • Locality principle
  • Annotate the local referent of a role filler
  • Link the local referent to the external referent
    via the Anaphora frame

22
Annotation Schema
Support and Copula Verbs
  • Verbs carrying minimal semantic content (e.g. be,
    seem)
  • Annotate the noun as FEE, instead of the verb
    (Ruppenhofer, 2007)

23
Annotation Schema
Modal Expressions
  • Modal expressions (e.g. modal verbs, particles,
    modal triggers) are annotated only when the modal
    meaning is prevalent in the sentence

24
Annotation Schema
Other guidelines
  • Metaphors are annotated with their figurative
    meaning
  • Existential constructions (e.g. there is) are
    annotated with the frame Existence, only when it
    is the only meaning conveyed in the sentence
    (e.g. There are 11 official languages)
  • Unknown frames: use an Unknown frame for words
    evoking situations not present in the FrameNet
    database
  • Maximization principle: choose the largest
    constituent possible when annotating

25
Motivations
  • Semantic knowledge at the predicate-argument
    level is critical in NLP tasks
  • From whom did BMW buy Rover?
  • Rover was bought by BMW from British
    Aerospace
  • BMW acquired Rover from British Aerospace
  • BMW's purchase of Rover from British
    Aerospace
  • British Aerospace sold Rover to BMW
  • Predicate-argument resources (e.g. PropBank and
    FrameNet) make it possible to map meaning-preserving
    alternations to the same predicative structure
  • BUY_EVENT(Buyer: BMW, Seller: British Aerospace,
    Good: Rover)

26
Motivations
  • Implementation gap: very limited impact of
    predicate-argument resources in NLP applications
    (Fliedner, 2007; Frank et al., 2006)
  • Possible reasons
  1. Resource coverage is too low
  2. Modeling predicate knowledge is too hard
  3. Automatic annotation (SRL) is not good enough
  • Our goal: create a gold-standard corpus,
    manually annotated with predicate-argument
    structure, to investigate (1)-(3)
  • Corpus: Second Recognizing Textual Entailment
    (RTE) Challenge
  • Annotation: FrameNet

27
FATE Corpus annotation: an example
[Figure: Collins syntactic analysis of T]
Full-text annotation (all words considered)
(Ruppenhofer, 2007)
28
Frame Semantics
(Fillmore 1976, 2003)
  • Frames are organized in a hierarchy with various
    frame-to-frame relations

[Figure: frame hierarchy, with legend]
  • FrameNet (Berkeley Project) (1)
  • Database of frames for the core lexicon of
    English
  • 800 frames, 10,000 lemmas, 135,000 annotated
    sentences
  • Hierarchy: 7 frame relations, 1136 edges, 86
    roots

(1) http://framenet.icsi.berkeley.edu
29
FATE Corpus annotation: an example
[Figure: a frame and its FEE marked on top of the Collins syntactic analysis of T]
30
FATE Corpus annotation: an example
[Figure: frame, FEE, FE, and FE filler marked on top of the Collins syntactic analysis of T]
Maximization principle: choose the largest
constituent possible when annotating
31
FATE Corpus annotation: an example
[Figure: frame, FEE, FE, and FE filler marked on T and H]
DEATH(Protagonist: Hiddleston / person, Cause:
avalanche)
32
FrameNet and Salsa Project
  • FrameNet (Berkeley Project) (1)
  • Database of frames for the core lexicon of
    English
  • 800 frames, 10,000 lemmas, 135,000 annotated
    sentences from BNC
  • SALSA Project (2)
  • A German corpus with frame annotation (20,000
    verbal instances)
  • Semantic frame-based lexicon for German
  • Methods for automation and application of
    frame-semantic information (SRL, RTE, discourse
    interpretation, etc.)

(1) http://framenet.icsi.berkeley.edu/
(2) http://www.coli.uni-saarland.de/projects/salsa/
34
FrameNet and RTE
[Figure: annotated T and H sentences]
  • Frame-to-frame inference
  • KILLING --- cause ---> DEATH
  • Cause maps to Cause
  • Victim maps to Protagonist