Approximating Textual Entailment with LFG and FrameNet Frames

1
Approximating Textual Entailment with LFG and
FrameNet Frames
  • Aljoscha Burchardt, Computational Linguistics
    Department, Saarland University, Saarbrücken
  • Anette Frank, Language Technology Lab, DFKI GmbH,
    Saarbrücken

SALSA Workshop "Multilingual semantic annotation:
theory and applications", Saarbrücken, June 27-28,
2006
2
Overview
  • The PASCAL Recognizing Textual Entailment task
    (RTE): What is it, and how to approach it?
  • The SALSA RTE System: A baseline system for
    approximating Textual Entailment
  • Building on LFG-based syntactic analysis and
    frame semantics
  • Computing structural and semantic overlap as an
    approximation of textual entailment in a learning
    architecture
  • Open architecture for future extensions towards
    deeper modelling
  • Linguistic analysis: LFG and FrameNet frames
  • Approximating Textual Entailment
  • Computing a match graph for structural and
    semantic overlap
  • Feature extraction and machine learning
  • Results of this year's RTE task
  • Discussion, error analysis and perspectives
  • Conclusion

3
The PASCAL RTE Task: What is it?
  • A recently established Challenge for the NLP/AI
    community
  • Testing a system's capacity to recognize Textual
    Entailment
  • Realistic, open-domain data set
  • drawn from system outputs in NLP applications:
    IR, IE, QA, SUM
  • Controlled set-up: balanced training and test
    sets
  • 800/800 text-hypothesis pairs

4
Taking a look at the data
  • Fine-grained linguistic analysis
  • T: Oscar-winning actor Nicolas Cage's new son
    and Superman have something in common ...
  • H: Nicolas Cage's new son was awarded an Oscar.
    No (IE)
  • Lexical semantics and paraphrases
    (nominalisation, synonymy)
  • T: On December 10th 1936 King Edward VIII gave
    up his right to the British throne.
  • H: King Edward VIII abdicated on the 10th of
    December, 1936. Yes (QA)
  • Inference and world knowledge
  • T: Olson, 62, previously worked as a partner at
    Ernst & Young LLP, before joining the Fed board
    in 2001, to serve a term ending in 2010.
  • H: Olson is a member of the Fed board. Yes
    (IE)
  • Modality
  • T: U.S. Secretary of State Condoleezza Rice said
    Thursday that North Korea should return to
    nuclear disarmament talks and ...
  • H: North Korea says it will rejoin nuclear talks.
    No (SUM)
  • Temporal and local restrictions (monotonicity)
  • T: In most Pacific countries there are very few
    women in parliament.
  • H: Women are poorly represented in parliament.
    Yes (!) (IR)

5
Textual Entailment
We say that T entails H if the meaning of H can
be inferred from the meaning of T, as would
typically be interpreted by people. This
somewhat informal definition is based on (and
assumes) common human understanding of language
as well as common background knowledge. Cases
in which inference is very probable (but not
completely certain) are still judged
True. (Dagan, Glickman, Magnini, RTE 2005
Workshop Proceedings)

Circumscribing Textual Entailment? See
discussions in Zaenen, Karttunen and Crouch
(2005), Manning (2006), Crouch, Karttunen and
Zaenen (2006).
6
A Challenge, ... in fact
  • T: Hundreds of divers and treasure hunters,
    including the Duke of Argyll, have risked their
    lives in the dangerous waters of the Isle of Mull
    trying to discover the reputed 30,000,000 pounds
    in Gold carried by this vessel--the target of the
    most enduring treasure hunt in British history.
  • H: Shipwreck salvaging was attempted. (Yes, IR)
  • T: The 26-member International Energy Agency
    said, Friday, that member countries would release
    oil to help relieve the U.S. fuel crisis caused
    by Hurricane Katrina.
  • H: Responding to a plea from the International
    Energy Agency for member countries to release
    reserves, Canada is prepared to help. (No, SUM)

7
Approximating Textual Entailment
  • How to reconcile obvious complexity and required
    depth?
  • Parsing complexity
  • Semantic analysis
  • Argument structure, anaphora, lexical meaning,
    semantic and discourse relations, presupposition,
    ...
  • Inferences based on linguistic meaning and world
    knowledge
  • Statistical/ML approximation of Textual
    Entailment
  • Based on state-of-the-art syntactic and shallow
    semantic analysis
  • Measuring structural and semantic overlap
  • With possibilities for extensions towards deeper
    modelling
  • Inference on partial structures (lexical
    entailment)
  • Targeted modelling of specific aspects, e.g.
    modality contexts

8
A baseline system for approximating Textual
Entailment
  • Fine-grained LFG-based syntactic analysis
  • English LFG grammar (Riezler et al. 2002):
    broad-coverage with high-quality
    probabilistic disambiguation
  • Frame Semantics
  • Coarse-grained lexical-semantic classification of
    predicates with role-based argument structure
    encoding
  • Extended semantic representations: WordNet
    senses, SUMO concepts
  • Computing structural and semantic overlap
  • Hypothesis: high/low ratio of H/T overlap =>
    entailment yes/no

9
A baseline system for approximating Textual
Entailment
  • Computing structural and semantic overlap
  • A learning problem: measures of overlap, weighted
    entailment decision

10
The SALSA RTE System
Linguistic analysis components and integration
  • XLE parsing: LFG f-structure
  • Fred/Detour & Rosy: frames & roles
  • WordNet-based WSD: WordNet & SUMO
  • f-structure with (extended) frame-semantic
    projection
  • Using XLE term rewriting system (Crouch 2005)
11
Linguistic Components: LFG analysis combined with
FrameNet frames
  • Deep syntactic LFG analysis
  • Broad-coverage grammar with probabilistic
    disambiguation
  • Fine-grained grammatical function analysis with
    integrated NER
  • Performance on RTE-II development and test set
  • Coverage ≈ 99% (≈ 86% full parses, ≈ 13% partial
    parses)
  • On RTE H/T pairs: ≈ 76% fully analysed pairs, ≈
    2% single analysis only
  • Frame semantic analysis
  • Focusing on lexical semantic classes and
    role-based argument structure
  • Disregarding aspects of deep semantics:
    modality, quantification, ...
  • Normalisation over syntactic and lexical
    alternations (diatheses, lexicalisation, PoS)

12
Linguistic Components: Frame and role assignment
  • Shalmaneser (Erk & Pado, 2006)
  • Shallow semantic parser for FrameNet frame and
    role assignment
  • Fred: statistical frame assignment
  • WSD system for predicates, in terms of frames
  • Rosy: semantic role assignment
  • Argument recognition and argument labelling
  • Using state-of-the-art features from robust
    syntactic parsing
  • Detour (to FrameNet via WordNet) (Burchardt et
    al., 2005)
  • Aim: overcome lexical gaps in FrameNet
  • A rule-based frame assignment system that takes a
    detour to FrameNet via WordNet
  • Determine similarity of unknown LUs to existing
    frames (their LUs) based on WordNet-similarity
    measures

13
Linguistic Components: Frame and role assignment
  • [Figure: example annotations labelled Fred,
    Detour, and Rosy]

14
Linguistic Components: Frame and role assignment
  • Fred & Detour: different sense assignments (FN
    coverage)

15
Linguistic Components: Integration and extended
semantics projection
  • Porting frame and role assignments to LFG
    f-structure
  • Defining a frame semantics projection using head
    lemmata as interface layer (accounts for parser
    discrepancies)
  • Using XLE rewrite system (Crouch 2005)

Head-indexed frame and role assignments
16
Linguistic Components: Integration and extended
semantics projection
  • Rule-based extensions of LFG-frame structures
  • Frames corresponding to LFG NE classes
  • Locations, companies, dates, ...
  • Extra-thematic roles, based on LFG adjunct
    classes, etc.
  • Time, Reason, Location, Concessive, ...
  • adjunct(Z,Y), ntype_sem(Y,time) =>
    s(Z,SemZ), s(Y,SemY), time(SemZ,SemY).
  • Extended semantics projection: WordNet and SUMO
    classes
  • WSD: Banerjee & Pedersen, 2003
  • WordNet-to-SUMO/MILO mapping: Niles and Pease
    (2001)

17
Linguistic Components: Integration and extended
semantics projection
  • Normalisations of syntactic structure
  • Passive: mapping SUBJ and OBJ to dsubj and dobj
    argument slots
  • Coindexing relative pronouns and relativised
    head, appositives, etc.
  • Heuristic rules collect antecedent candidate sets
    for pronominals
  • FEF: Frame-Exchange-Format
  • (Partial) visualisation of extended
    syntactic-semantic graph structures in FEFViewer
    (Alexander Koller, Coli Saarbrücken)

18
A walk-through-example from RTE 2006
  • Pair 716
  • Text
  • In 1983, Aki Kaurismäki directed his first
    full-time feature.
  • Hypothesis
  • Aki Kaurismäki directed a film.

19
LFG F-Structures in XLE graphical display
20
Automatic Frame Annotation for Text in SALTO
Viewer
Collins Parse
21
Automatic Frame Annotation for Hypothesis
  • 716_h: Aki Kaurismäki directed a film.

22
LFG and Frames for Hypothesis in FEFViewer
Aki Kaurismäki directed a film.
23
The SALSA RTE System
Linguistic analysis components and integration
(applied to both text and hypothesis)
  • XLE parsing: LFG f-structure
  • Fred/Detour & Rosy: frames & roles
  • WordNet-based WSD: WordNet & SUMO
  • f-structure with (extended) frame-semantic
    projection, i.e. f-structure with frames and
    concepts

Recognizing Textual Entailment: Graph matching and
statistical approximation
  • text-hypothesis-match graph
  • matching nodes and edges
  • different match types (similarity types)
  • extensions for deeper modelling (modality,
    lexical entailment)
  • Feature extraction
  • Model training and classification
24
Hypothesis-Text-Match Graphs: Computing structural
and semantic overlap
  • Computing structural and semantic overlap
  • Computing a match graph from text and
    hypothesis graphs
  • Matches are established by different aspects and
    degrees of similarity
  • Approximating textual entailment
  • High/low overlap ratio of hypothesis and match
    graph => entailment yes/no

25
Hypothesis-Text-Match Graphs: Different matching
strategies
  • (I) Match graph/Text overlap: ratio of matched
    material and non-matched material in Text
  • (II) Match graph/Hypothesis overlap: ratio of the
    matched material and non-matched material in
    Hypothesis
  • T: Leo Fender invented the first electric guitar
    and the electric bass guitar.
  • H: Leo Fender invented the first electric guitar.
  • I: 7/12 = 58%; II: 7/7 = 100%
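
As a rough illustration of the two overlap measures, the following Python sketch counts matched units relative to the text (I) and relative to the hypothesis (II). It assumes, purely for illustration, that the units are whitespace-separated word tokens and that a match is exact string identity; the system described here matches nodes and edges of LFG/frame graphs instead. The names `overlap_ratios` and `tokens` are hypothetical.

```python
from collections import Counter


def overlap_ratios(text_units, hyp_units):
    """Return (matched / text size, matched / hypothesis size).

    Assumption: units are plain tokens and a match is exact identity;
    this stands in for the graph nodes and edges matched by the system.
    """
    text_counts = Counter(text_units)
    hyp_counts = Counter(hyp_units)
    # Multiset intersection: a unit matches at most as often as it occurs on both sides.
    matched = sum((text_counts & hyp_counts).values())
    return matched / len(text_units), matched / len(hyp_units)


def tokens(sentence):
    # Naive tokenisation (illustrative only): drop the final period, split on spaces.
    return sentence.rstrip(".").split()


text = "Leo Fender invented the first electric guitar and the electric bass guitar."
hyp = "Leo Fender invented the first electric guitar."
ratio_text, ratio_hyp = overlap_ratios(tokens(text), tokens(hyp))
print(f"I: {ratio_text:.0%}  II: {ratio_hyp:.0%}")  # prints "I: 58%  II: 100%"
```

Under this token-based assumption the sketch reproduces the 7/12 and 7/7 figures of the example; a high hypothesis-side ratio alone does not decide entailment.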

26
Hypothesis-Text-Match Graphs: Computing structural
and semantic overlap
  • Graph matching using XLE rewrite system
  • Defining different types of match conditions on
    t- and h-graph, triggering new nodes and edges
    in m-graph, with match-type info
  • Matching algorithm tied to rewrite-logic
  • Locally defined matches (no graph traversal)
  • Starting with (multiple) node matches
  • Edge matches restricted to connect matched nodes

text-hypothesis => text-hypothesis-match
Example: frame(hx1,killing), frame(ty1,killing)
=> frame(m(z1,x1,y1),killing),
match_type(m(z1,x1,y1),killing,frame)
Rewrite rule: frame(hX1,Frame), frame(tY1,Frame)
=> frame(m(Z1,X1,Y1),Frame),
match_type(m(Z1,X1,Y1),Frame,frame).
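
The effect of the frame-matching rewrite rule can be mimicked outside XLE. The Python sketch below is not XLE rewrite syntax; it represents facts as tuples and assumes a hypothetical helper `match_frames` with freshly numbered match-node ids (z1, z2, ...), only to show how a local frame identity check produces match-graph facts with match-type information.

```python
from itertools import count


def match_frames(h_frames, t_frames):
    """Pair hypothesis and text nodes that carry the same frame.

    h_frames, t_frames: lists of (node_id, frame_name) facts.
    Returns match-graph facts as tuples, mirroring the rule's right-hand side.
    """
    fresh = count(1)  # hypothetical supply of fresh match-node ids z1, z2, ...
    m_facts = []
    for h_node, h_frame in h_frames:
        for t_node, t_frame in t_frames:
            if h_frame == t_frame:  # local condition only, no graph traversal
                m_node = ("m", f"z{next(fresh)}", h_node, t_node)
                m_facts.append(("frame", m_node, h_frame))
                m_facts.append(("match_type", m_node, h_frame, "frame"))
    return m_facts


facts = match_frames([("x1", "killing")], [("y1", "killing"), ("y2", "attack")])
for fact in facts:
    print(fact)
```

Because every hypothesis node is compared with every text node, one hypothesis node can yield several match nodes, which corresponds to starting with (multiple) node matches.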
27
Hypothesis-Text-Match Graphs: Computing structural
and semantic overlap
  • Aspects of similarity
  • Syntax-based (i.e. lexical and structural)
    similarity
  • Identical PREDs and attribute values trigger node
    matches
  • Identical ATTRIBUTES (GF, morph. features)
    trigger edge matches
  • Semantics-based similarity
  • Identical FRAMES and CONCEPTS trigger node
    matches
  • Identical ROLES trigger edge matches
  • Match graph consists of identical partial
    syntactic and semantic graphs
  • Degrees of similarity (strict vs. weak matching)
  • Non-identical, but structurally related PREDs
  • coreferentially related (relative clauses,
    appositives, pronominals)
  • Non-identical, but semantically related PREDs
    (WN-related, path < 3)
  • Non-identical, but semantically related FRAMES
    (FN-/Detour-related)
  • Match graph establishes overlapping partial
    graphs (marked by match types)

28
T: In 1983, Aki Kaurismäki directed his first
full-time feature.
29
Approximating Textual Entailment: Extensions for
deeper modelling: Modality
  • Detecting indicators of inconsistent modality
    types
  • T: A pet must have rabies protection confirmed by
    a blood test.
  • H: A case of rabies was confirmed.
  • Marking modal contexts in text and hypothesis
  • 5 modality types: conditional, future, diamond,
    box, negation
  • Handling inconsistent modality types in matching
    process
  • Introducing negatively marked match nodes
  • Blocking embedded structures for similarity-based
    matches
  • Thus, reducing the size of the match graph
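
One way to picture the handling of inconsistent modality types is sketched below in Python. It assumes that each candidate node match arrives with the set of modality types marked on its text-side and hypothesis-side context; this data layout, the function name `classify_match`, and the decision to require identical sets are assumptions for illustration, not the system's actual rules.

```python
MODALITY_TYPES = {"conditional", "future", "diamond", "box", "negation"}


def classify_match(text_modality, hyp_modality):
    """Label a candidate node match by comparing modal contexts.

    Assumption: a match counts as consistent only if both sides carry
    exactly the same set of modality types (empty set = unmarked context).
    """
    unknown = (text_modality | hyp_modality) - MODALITY_TYPES
    if unknown:
        raise ValueError(f"unknown modality types: {sorted(unknown)}")
    if text_modality == hyp_modality:
        return "match"
    return "modal_ctxt_mismatch"  # negatively marked match node


# T: "A pet must have rabies protection confirmed ..." -> assumed to be marked as box
# H: "A case of rabies was confirmed."               -> unmarked context
label = classify_match({"box"}, set())
print(label)
```

A mismatch label of this kind would keep the matched predicate from counting as ordinary overlap, in line with the goal of reducing the size of the match graph.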

30
Approximating Textual Entailment: Extensions for
deeper modelling: Lexical Entailments
  • Bridging partial non-matching text and hypothesis
    pairs
  • T: Olson, 62, previously worked as a partner at
    Ernst & Young LLP, as a Minnesota bank president
    and as a congressional aide, before joining the
    Fed board in 2001, to serve a term ending in
    2010.
  • H: Olson is a member of the Fed board.
  • Lexically induced inferences, defined as rewrite
    rules on h/t/m graphs
  • Similar non-lexical heuristic inferences
  • Appositions: prime minister X => X is prime
    minister
  • Possessive constructions: X's Y => the Y of X

t: (X1) joins X2, h: (Y1) member-of Y2,
m(Z2,Y2,X2) => match_type(heuristic_entailment_match).
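
The lexical entailment rule can be read as: if the text predicate lexically entails the hypothesis predicate and their second arguments are already matched, record a heuristic entailment match. The Python sketch below follows that reading. The entailment table, the triple representation of facts, and the function name are hypothetical and only illustrate the join / member-of case.

```python
# Hypothetical lexical entailment table: text predicate -> entailed hypothesis predicate.
LEXICAL_ENTAILMENTS = {"join": "member-of"}


def heuristic_entailment_match(t_fact, h_fact, matched_pairs):
    """Return a match-type label if the text fact lexically entails the hypothesis fact.

    t_fact, h_fact: (predicate, arg1, arg2) triples.
    matched_pairs: set of (hypothesis_node, text_node) pairs already in the match graph.
    """
    t_pred, _, t_arg2 = t_fact
    h_pred, _, h_arg2 = h_fact
    entailed = LEXICAL_ENTAILMENTS.get(t_pred) == h_pred
    # Mirrors m(Z2,Y2,X2): the second arguments must already be matched.
    if entailed and (h_arg2, t_arg2) in matched_pairs:
        return "heuristic_entailment_match"
    return None


t_fact = ("join", "olson_t", "fed_board_t")
h_fact = ("member-of", "olson_h", "fed_board_h")
result = heuristic_entailment_match(t_fact, h_fact, {("fed_board_h", "fed_board_t")})
print(result)
```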
31
Approximating Textual Entailment: Machine learning
  • Feature selection with WEKA Classifiers
  • Many learners select intuitively important
    features, but also idiosyncratic ones
  • Selected learners and models
  • Model 1: Simple Conjunctive Rule classifier
    generated a single rule
  • Medium/high threshold on pred/frame matches as
    criterion for rejection
  • High degree of frame similarity with medium
    predicate similarity models entailment
  • Model 2: Meta-classifier LogitBoost (additive
    logistic regression)
  • Features (1.-4.) used in iteration; final
    feature set: 1., 2., 4.

1. No. of predicate matches relative to hypothesis
2. No. of frame (Fred, Detour) matches relative to hypothesis
3. No. of role (Rosy) matches relative to hypothesis
4. Match graph size relative to hypothesis, incl. syntactic, semantic, and ontological info
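
To make the four features concrete, the Python sketch below computes each as a ratio relative to the hypothesis and applies a single conjunctive rule in the spirit of Model 1. The dictionary keys, the function names, and both threshold values are hypothetical; the thresholds actually learned by the WEKA classifier are not given here, and Python merely stands in for the WEKA setup.

```python
def overlap_features(counts):
    """Compute features 1.-4. as ratios relative to the hypothesis.

    counts: dict of match counts and hypothesis totals (hypothetical keys).
    """
    def ratio(matched, total):
        return matched / total if total else 0.0

    return {
        "pred_ratio": ratio(counts["pred_matches"], counts["hyp_preds"]),
        "frame_ratio": ratio(counts["frame_matches"], counts["hyp_frames"]),
        "role_ratio": ratio(counts["role_matches"], counts["hyp_roles"]),
        "graph_ratio": ratio(counts["match_graph_size"], counts["hyp_graph_size"]),
    }


def conjunctive_rule(features, pred_threshold=0.5, frame_threshold=0.7):
    """Single conjunctive rule; both thresholds are illustrative, not learned values."""
    return (features["pred_ratio"] >= pred_threshold
            and features["frame_ratio"] >= frame_threshold)


counts = {
    "pred_matches": 2, "hyp_preds": 3,
    "frame_matches": 1, "hyp_frames": 1,
    "role_matches": 1, "hyp_roles": 2,
    "match_graph_size": 9, "hyp_graph_size": 12,
}
features = overlap_features(counts)
decision = conjunctive_rule(features)
print(features, decision)
```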
32
Results in RTE-II
  • SALSA RTE system results
  • Both models score SUM > IR > QA > IE
  • Refined model better on QA; simple model better
    on SUM
  • Overall RTE-II results
  • Average accuracy 60% (median 59%)
  • Shallow overlap measures vary considerably
    between data sets, whereas deeper approaches
    remain more stable
  • Tendency towards deeper, knowledge-rich methods

Dev set      all tasks
Model 1      61.1
Model 2      59.8

RTE-II       all tasks   IE     IR     QA     SUM
Model 1      59.0        49.5   59.5   54.5   72.5
Model 2      57.8        48.5   58.5   57.0   67.0

Accuracy range (in %)   53-56   58-61   62-64   74-75
No. of groups           7       11      3       2
33
Discussion of Results: True positives
  • High ratio of matching predicates, frames, and
    f-structure
  • Typical phenomena
  • Non-identical predicates compensated by matching
    frames (626)
  • Missing frame assignments compensated by WN
    relatedness
  • die / pass away (wn-related, 103)
  • Active-passive diathesis resolved by f-structure
    normalisation (129)
  • Relative overlap measures also work for longer
    hypotheses

T: Everest summiter David Hiddleston has passed away in an avalanche of Mt. Tasman.
H: A person died in an avalanche. (103)
T: An earthquake has hit the east coast of Hokkaido, Japan, with a magnitude of 7.0 Mw.
H: An earthquake occurred on the east coast of Hokkaido, Japan. (626)
T: In one of the latest attacks, a US soldier on patrol was killed by a single shot from a sniper in northern Baghdad, the military said yesterday.
H: A sniper killed a U.S. soldier on patrol in Baghdad with a single shot. (129)
34
Discussion of Results: True negatives
  • Modal context marking seems to be effective
  • 27% of all true negatives involved modality
    mismatches, while only 11.9% of all sentences
    involve marked modal contexts
  • Future plans
  • Extend to lexically induced modality/facticity
    indicators
  • Testing for non-monotonicity contexts

T: The goal of preserving indigenous culture can hardly be achieved by a handful of researchers and curators at museums of ethnology and folk culture.
H: Indigenous folk art is preserved. (233)
T: Even today, within the deepest recesses of our mind, lies a primordial fear that will not allow us to enter the sea without thinking about the possibility of being attacked by a shark.
H: A shark attacked a human being. (322)
35
Error analysis: False positives
  • Typical cases
  • Semantic dissimilarity
  • Non-matching predicates within larger match
    graphs, which are in fact semantically dissimilar
  • Structural distance
  • Matching nodes within a match graph correspond to
    far distant nodes in the text graph compared to
    neighbouring nodes in the match graph

36
Error analysis: False positives
Unconnected nodes matched with distant nodes in
text graph
T: Some 420 people have been hanged in Singapore
since 1991, mostly for drug trafficking, an
Amnesty International 2004 report said. That
gives the country of 4.4 million people the
highest execution rate in the world relative to
population.
H: 4.4 million people were executed in
Singapore. (198) False positive
37
Error analysis: False positives
  • Graph matching process
  • Not a top-down process
  • Starts by relating any nodes, and builds growing
    clusters by finding matching edges
  • This allows criss-cross matching of nodes in the
    match graph
  • Introduce weighted edges that reflect the
    relative distance of pairs of match nodes in
    text and hypothesis (path distance)
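
The proposed distance-weighted edges could look roughly like the Python sketch below. It computes shortest path lengths between two matched nodes in the text graph and in the hypothesis graph and turns their difference into a weight. The weighting formula, the adjacency-dict graph encoding, and the function names are assumptions; the slides only state that weights should reflect relative path distance.

```python
from collections import deque


def path_distance(graph, start, goal):
    """Shortest path length between two nodes of an undirected graph (BFS)."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # nodes are not connected


def edge_weight(text_graph, hyp_graph, match_a, match_b):
    """Weight in [0, 1]: 1.0 if both graphs agree on the distance, 0.0 if unconnected.

    match_a, match_b: (text_node, hyp_node) pairs. The formula is an assumption.
    """
    d_text = path_distance(text_graph, match_a[0], match_b[0])
    d_hyp = path_distance(hyp_graph, match_a[1], match_b[1])
    if d_text is None or d_hyp is None:
        return 0.0
    return 1.0 / (1.0 + abs(d_text - d_hyp))


# Toy graphs: a chain of four text nodes, two adjacent hypothesis nodes.
text_graph = {"t1": ["t2"], "t2": ["t1", "t3"], "t3": ["t2", "t4"], "t4": ["t3"]}
hyp_graph = {"h1": ["h2"], "h2": ["h1"]}
near = edge_weight(text_graph, hyp_graph, ("t1", "h1"), ("t2", "h2"))
far = edge_weight(text_graph, hyp_graph, ("t1", "h1"), ("t4", "h2"))
print(near, far)
```

Neighbouring hypothesis nodes that map to far distant text nodes, as in the criss-cross case, receive a low weight under this scheme.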

39
Conclusions
  • A medium-depth approach: Approximating Textual
    Entailment
  • Lexical and syntactic overlap, semantic
    similarity (WordNet)
  • Frame semantics: lexical semantic classes and
    argument structure
  • Flexible graph matching method with extensions to
    deeper processing
  • Modality contexts, lexical inferences
  • Perspectives for future extensions
  • Engineering and fine-tuning
  • Combination with shallow (and deeper) methods in
    voting architecture
  • Frame and role assignment
  • Sense discrimination: outlier detection (Erk,
    2006)
  • Coverage: integration with other resources
    (VerbNet, NomBank)
  • Modelling dissimilarity
  • Semantic distance measures and distance-weighted
    graph edges
  • Acquisition of lexical modality indicators and
    (lexical) entailment rules

40
References
  • RTE Proceedings
  • RTE Challenge Homepage:
    http://www.pascal-network.org/Challenges/RTE2
  • I. Dagan, O. Glickman, and B. Magnini (2005): The
    PASCAL recognising textual entailment challenge.
    In Proceedings of the RTE-1 Workshop,
    Southampton, UK.
  • B. Magnini and I. Dagan, editors (2006):
    Proceedings of the Second PASCAL Recognising
    Textual Entailment Challenge, Venice, Italy.
  • Electronic proceedings and slides:
    http://ir-srv.cs.biu.ac.il:64080/RTE2/proceedings/
  • Discussion about RTE Task
  • Zaenen, Karttunen and Crouch (2005): Local
    Textual Inference: can it be defined or
    circumscribed? In ACL 2005 Workshop on
    Empirical Modelling of Semantic Equivalence and
    Entailment, Ann Arbor, Michigan.
  • Manning (2006): Local Textual Inference: It's
    hard to circumscribe, but you know it when you
    see it - and NLP needs it. MS, Stanford
    University.
  • Crouch, Karttunen and Zaenen (2006):
    Circumscribing is not excluding: A reply to
    Manning. MS, Palo Alto Research Center.
  • All papers:
    http://www2.parc.com/istl/members/zaenen/

41
References
  • A. Burchardt and A. Frank (2006): Approximating
    Textual Entailment with LFG and FrameNet Frames.
    In Proceedings of the Second Recognising Textual
    Entailment Workshop, Venice, Italy.
    http://www.coli.uni-saarland.de/projects/salsa/page.php?id=publications
  • K. Erk and S. Pado (2006): Shalmaneser - a
    flexible toolbox for semantic role assignment.
    In Proceedings of LREC-06, Genoa.
    http://www.coli.uni-saarland.de/projects/salsa/page.php?id=publications
  • A. Burchardt, K. Erk, and A. Frank (2005): A
    WordNet Detour to FrameNet. In Proceedings of
    the GLDV 2005 Workshop GermaNet II, Bonn.
    http://www.coli.uni-saarland.de/projects/salsa/page.php?id=publications
  • R. Crouch (2005): Packed Rewriting for Mapping
    Semantics to KR. In Proceedings of the Sixth
    International Workshop on Computational
    Semantics, Tilburg.
    http://www2.parc.com/istl/groups/nltt/papers/iwcs05_crouch.pdf

43
Approximating Textual Entailment: Similarity/Entailment
measures and feature extraction
Feature sources: text graph, hypothesis graph, match graph, proportional h/t and m/h ratio
  • Lexical: lex_id (text graph), lex_id (hypothesis graph), lex_id (match graph), ratio_lexid (ratio)
  • Syntactic: node_m (pred, coref, pro), edge_syn_m (all, gf, subc) (match graph); ratio_nodes, ratio_edges (ratio)
  • Semantic, strict: (lfg_)frames_t, (lfg_)roles_t (text graph); (lfg_)frames_h, (lfg_)roles_h (hypothesis graph); (lfg_)frames_m, (lfg_)roles_m (match graph); ratio_(lfg_)frames, ratio_(lfg_)roles (ratio)
  • Semantic, weak: node_frameFN/derived_m, mode_framerel/detour/wnrel_m, node_heuristic_entailment_m, node_modal_ctxt_mismatch_m (match graph)
  • Connectedness: clusters_no, clusters_avg_size, clusters_avgsize_rel_h, clusters_abssize_rel_h
  • Other: fragmentary, fragmentary, rte_task
44
Error analysis: Sparse features
  • Feature set
  • High-frequency features that measure similarity
  • Few, low-frequency features that model
    dissimilarity
  • Bias towards similarity
  • 29.5% false positives
  • 12.75% false negatives
  • Plans for further development
  • Introducing distance measures (semantic and
    structural)
  • Getting a grip on remaining differences, i.e.
    non-matched edges between matching clusters