Systematic Mismatches Across Annotations

1
Systematic Mismatches Across Annotations
  • Alan Lee and Aravind Joshi
  • Institute for Research in Cognitive Science
    Department of Computer and Information Science,
    University of Pennsylvania
  • ULA Workshop,
  • U of Colorado, Boulder
  • March 2008

2
Preliminaries
  • We observe that certain annotated features of the
    Penn Discourse Treebank 2.0 (PDTB) do not match
    up neatly with annotations at the syntactic
    level.
  • What do certain mismatches suggest for linguistic
    theory? How do we get from syntax to discourse?
  • How does this affect NLP applications?

3
Outline
  • Attribution spans
  • Parallel Connectives
  • AltLex
  • Polarity and Determinacy

5
Attribution Spans
  • Relation between agents and abstract objects
    (discourse relations or their arguments)
  • Annotation: text spans and four features
    (source, type, polarity, determinacy). More on
    the features later.

6
  • There have been no orders for the Cray-3 so far,
    though the company says it is talking with
    several prospects.
  • Discourse semantics: contrary-to-expectation
    relation between there being no orders for the
    Cray-3 and there being a possibility of some
    prospects.
  • Sentence semantics: contrary-to-expectation
    relation between there being no orders for the
    Cray-3 and the company saying something.

7
  • Although takeover experts said they doubted Mr.
    Steinberg will make a bid by himself, the
    application by his Reliance Group Holdings Inc.
    could signal his interest in helping revive a
    failed labor-management bid.
  • Discourse semantics: contrary-to-expectation
    relation between Mr. Steinberg not making a bid
    by himself and the RGH application signaling
    his bidding interest.
  • Sentence semantics: contrary-to-expectation
    relation between experts saying something and
    the RGH application signaling Mr. Steinberg's
    bidding interest.

8
  • Mismatches occur with other relations as well,
    such as causal relations:
  • Investors are nervous about the issue because
    they say the company's ability to meet debt
    payments is dependent on too many variables,
    including the sale of assets and the need to
    mortgage property to retire some existing debt.
  • Discourse semantics: causal relation between
    investors being nervous and problems with the
    company's ability to meet debt payments.
  • Sentence semantics: causal relation between
    investors being nervous and investors saying
    something!

9
How to address mismatch?
  • One possibility: treat attribution as a
    different layer of structure in discourse (and
    also in syntax?).
  • This has the effect of reducing the complexity of
    the discourse structure.

10
Discourse Graphbank (Wolf & Gibson 2005)
  • (1) Farm prices in October edged up 0.7% from
    September
  • (2) as raw milk prices continued their rise,
  • (3) the Agriculture Department said.
  • (4) Milk sold to the nation's dairy plants and
    dealers averaged $14.50 for each hundred pounds,
  • (5) up 50 cents from September and up $1.50 from
    October 1988,
  • (6) the department said.

11
[Figure: Discourse Graphbank graph over segments 1-6,
with group nodes 1-2 and 4-5 and arcs labeled ce,
elab, elab, sim, attr and attr.]
ce - cause/effect; elab - elaboration; sim -
similarity; attr - attribution

12
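[Figure: the same graph with attribution folded into
segments 3 and 6 as a feature (3,attr and 6,attr);
the remaining arcs are labeled ce, elab and elab.]
ce - cause/effect; elab - elaboration; sim -
similarity; attr - attribution
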
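The reduction in graph complexity can be sketched in Python. Only the relation labels (ce, elab, sim, attr) and the node names come from the slides; the arc endpoints below are assumed for illustration, and `fold_attribution` is a hypothetical helper, not part of the Discourse Graphbank or PDTB tooling.

```python
# Arcs as (label, from_node, to_node). Endpoints are ASSUMED for
# illustration; only the labels and node names appear on the slides.
EDGES = [
    ("ce", "2", "1"),
    ("attr", "3", "1-2"),
    ("elab", "5", "4"),
    ("attr", "6", "4-5"),
    ("elab", "4-5", "1-2"),
    ("sim", "3", "6"),
]


def fold_attribution(edges):
    """Drop attr arcs and arcs touching attribution nodes.

    Returns the remaining arcs plus a mapping from each attributed
    node to the segment that carries its attribution.
    """
    attr_nodes = {src for label, src, _ in edges if label == "attr"}
    features = {dst: src for label, src, dst in edges if label == "attr"}
    kept = [
        (label, src, dst)
        for label, src, dst in edges
        if label != "attr" and src not in attr_nodes and dst not in attr_nodes
    ]
    return kept, features


kept, features = fold_attribution(EDGES)
print(len(EDGES), "arcs before,", len(kept), "arcs after")
print(features)
```

Under these assumed endpoints, the attr arcs and the sim arc between the two attribution segments disappear from the graph, and the attribution survives only as a feature on the attributed nodes.
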
13
Residual issues
  • Does attribution scope over the entire relation,
    or just Arg1?
  • Even if B.A.T receives approval for the
    restructuring, the company will remain in play,
    say shareholders and analysts, though the
    situation may unfold over the next 12 months,
    rather than six.
  • Arg1 attributed to shareholders and analysts;
    Rel and Arg2 attributed to Writer.
  • Guideline: in case of doubt, attribute to the
    Writer.

14
Residual issues
  • Attribution cannot always be excluded by default
  • Advocates said the 90-cent-an-hour rise, to $4.25
    an hour by April 1991, is too small for the
    working poor, while opponents argued that the
    increase will still hurt small business and cost
    many thousands of jobs.
  • What implications does this have for the approach
    of treating attribution as an independent layer
    of discourse?

15
Outline
  • Attribution spans
  • Parallel Connectives
  • AltLex
  • Polarity and Determinacy

16
Parallel Connectives
  • Either he wasn't being real in the past or he
    isn't being real right now. (1549)
  • You've either got a chair or you don't. (2428)
  • If the answers to these questions are
    affirmative, then these institutional investors
    are likely to be favorably disposed toward a
    specific poison pill. (0275)
  • Parallel connectives are annotated
    discontinuously
  • In the PDTB, both parts of a parallel connective
    are treated as equally prominent (no hierarchical
    relationship)
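A minimal Python sketch of what a discontinuous annotation looks like: the connective is stored as two separate character spans rather than one contiguous span. The function name and the naive word matching are assumptions for illustration; PDTB annotation itself is done by hand, not by string search.

```python
import re


def parallel_connective_spans(sentence, first="either", second="or"):
    """Return [(start, end), (start, end)] for the two connective parts.

    Naive matching: the first whole-word `first`, then the next
    whole-word `second` after it. Returns None if either is missing.
    """
    m1 = re.search(rf"\b{first}\b", sentence, flags=re.IGNORECASE)
    if m1 is None:
        return None
    offset = m1.end()
    m2 = re.search(rf"\b{second}\b", sentence[offset:], flags=re.IGNORECASE)
    if m2 is None:
        return None
    return [(m1.start(), m1.end()), (offset + m2.start(), offset + m2.end())]


sentence = "You've either got a chair or you don't."
spans = parallel_connective_spans(sentence)
print(spans, [sentence[a:b] for a, b in spans])
```

Both spans sit at the same level in the returned list, mirroring the PDTB choice not to rank one part of the connective above the other.
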

17
In the Penn Treebank, the treatment of a parallel
connective depends on its position within the
sentence. When either is sentence-initial,
both either and or are annotated as CC.
  • Either he wasn't being real in the past or he
    isn't being real right now. (wsj_1549)

(S (CC Either)
   (S he wasn't being real in the past)
   (CC or)
   (S he isn't being real right now))

18
This is not possible when either is
sentence-medial. Here, either is treated as an
RB and or as a CC.
  • You've either got a chair or you don't.
    (wsj_2428)

(S (S (NP-SBJ You)
      (VP 've
          (ADVP (RB either))
          (VP got a chair)))
   (CC or)
   (S you don't))

19
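  • How to represent a parallel connective?
  • D-LTAG approach: elementary discourse tree with
    two lexical anchors (DC = discourse clause)

[Figure: elementary discourse trees. Parallel
connective, two anchors: DC dominates Either, DC, or,
DC. Ordinary connective, one anchor: DC dominates DC,
because, DC.]
  • But the question remains: how do we transition
    from syntactic structure to discourse structure?
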
20
Outline
  • Attribution spans
  • Parallel Connectives
  • AltLex
  • Polarity and Determinacy

21
Alternative Lexicalization (AltLex)
  • A discourse relation is inferred between two
    sentences which do not contain an Explicit
    connective, but insertion of an Implicit
    connective leads to redundancy. This is because
    the relation is alternatively lexicalized by some
    non-connective expression.
  • Under a post-1987 crash reform, the Chicago
    Mercantile Exchange wouldn't permit the December
    S&P futures to fall further than 12 points for a
    half hour. AltLex (consequence): That caused a
    brief period of panic selling of stocks on the
    Big Board.

22
Discourse Connectives and Syntactic Constituency
  • Most explicit connectives correspond to syntactic
    constituents, e.g. because (IN), but (CC),
    as a result (PP), etc.
  • Some small exceptions with parallel connectives,
    as we have seen.

23
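  • AltLex expressions often do not correspond to
    syntactic constituents.
  • Under a post-1987 crash reform, the Chicago
    Mercantile Exchange wouldn't permit the December
    S&P futures to fall further than 12 points for a
    half hour. AltLex (consequence): That caused a
    brief period of panic selling of stocks on the
    Big Board.

[Figure: Penn Treebank tree for "That caused a brief
period of panic selling..": S dominates NP-SBJ (DT
That) and VP (VBD caused, followed by "a brief period
of panic selling.."), so "That caused" does not form
a single constituent. Node labels shown: S, NP-SBJ,
VP, VBD, DT, PP-LOC.]
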
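The constituency mismatch can be checked mechanically. The sketch below, in Python, lists every token span covered by a node in a small hand-built tree and tests whether the AltLex string "That caused" is one of them. The tree is a simplified stand-in for the Penn Treebank parse (the internal structure of the object NP is assumed), and both helper functions are hypothetical.

```python
def leaves(tree):
    """Return the tokens under a (label, child, ...) tuple tree."""
    _, *children = tree
    out = []
    for child in children:
        out.extend([child] if isinstance(child, str) else leaves(child))
    return out


def constituent_spans(tree, start=0):
    """Return the set of (start, end) token spans covered by any node."""
    _, *children = tree
    spans = set()
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1
        else:
            spans |= constituent_spans(child, pos)
            pos += len(leaves(child))
    spans.add((start, pos))
    return spans


# Simplified tree; the shape of the object NP is an assumption.
TREE = ("S",
        ("NP-SBJ", ("DT", "That")),
        ("VP", ("VBD", "caused"),
               ("NP", ("DT", "a"), ("JJ", "brief"), ("NN", "period"))))

spans = constituent_spans(TREE)
altlex = (0, 2)  # tokens "That caused"
print(altlex in spans)
```

The subject NP covers tokens 0-1 and the VP covers tokens 1-5, so no node covers exactly tokens 0-2.
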
24
  • For a list of AltLex expressions annotated in
    the PDTB:
    http://www.seas.upenn.edu/~pdtb/altlex-strings.txt
  • Or search using PDTB Browser (shameless plug):
    http://www.seas.upenn.edu/~pdtb/PDTBAPI/pdtbbrowser.jnlp

25
Outline
  • Attribution spans
  • Parallel Connectives
  • AltLex
  • Polarity and Determinacy

26
Attribution Features
  • Attribution is annotated on relations and
    arguments, with FOUR features.
  • Source: encodes the different agents to whom a
    proposition is attributed
    • Wr: Writer agent
    • Ot: Other non-writer agent
    • Arb: Generic/Arbitrary non-writer agent
    • Inh: Used only for arguments; attribution
      inherited from relation
  • Type: encodes different types of Abstract Objects
    • Comm: Verbs of communication
    • PAtt: Verbs of propositional attitude
    • Ftv: Factive verbs
    • Ctrl: Control verbs
    • Null: Used only for arguments with no explicit
      attribution
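One way to picture the feature set is as a small record type. The Python sketch below is an illustration only: the class names, field names and the `is_valid` check are assumptions, not the PDTB file format or API. The value inventories and the "arguments only" restrictions are taken from the lists above.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    WR = "Wr"    # writer
    OT = "Ot"    # other non-writer agent
    ARB = "Arb"  # generic/arbitrary non-writer agent
    INH = "Inh"  # arguments only: inherited from the relation


class AttrType(Enum):
    COMM = "Comm"  # verbs of communication
    PATT = "PAtt"  # verbs of propositional attitude
    FTV = "Ftv"    # factive verbs
    CTRL = "Ctrl"  # control verbs
    NULL = "Null"  # arguments only: no explicit attribution


@dataclass(frozen=True)
class Attribution:
    source: Source
    attr_type: AttrType
    polarity: str = "Null"     # "Neg" or "Null"
    determinacy: str = "Null"  # "Indet" or "Null"


def is_valid(attr, on_argument):
    """Check value ranges and the 'arguments only' restrictions."""
    if attr.polarity not in ("Neg", "Null"):
        return False
    if attr.determinacy not in ("Indet", "Null"):
        return False
    if not on_argument and attr.source is Source.INH:
        return False
    if not on_argument and attr.attr_type is AttrType.NULL:
        return False
    return True


# "the company says it is talking with several prospects"
says = Attribution(Source.OT, AttrType.COMM)
print(is_valid(says, on_argument=True))
```
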

27
Polarity vs Determinacy
  • Polarity: indicates narrow scope of surface
    negated attributions (neg-raising, Klima 1964).
    Marked as Neg when neg-raising occurs, Null
    otherwise.
  • John doesn't think the book fell (= John thinks
    the book didn't fall)
  • Determinacy: attributions rendered indeterminate
    in certain contexts. Marked as Indet, or Null
    otherwise.
  • John didn't say the book fell (no lowering of
    negation)
  • Only a certain class of verbs can have negative
    polarity, i.e. induce neg-raising. Verbs of
    Propositional Attitude (PAtt) have this behavior,
    but not others.
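The contrast can be written as a small decision rule. This Python sketch is a simplification under stated assumptions: it treats surface negation as the only context that makes an attribution indeterminate, and it takes the verb class (PAtt or other) as given. The function name is hypothetical.

```python
def mark_negated_attribution(verb_type, negated):
    """Return polarity and determinacy for an attribution verb.

    verb_type: attribution type label such as "PAtt" or "Comm".
    negated:   True if the attribution verb is negated on the surface.
    """
    if not negated:
        return {"polarity": "Null", "determinacy": "Null"}
    if verb_type == "PAtt":
        # Neg-raising: the negation is interpreted on the argument.
        return {"polarity": "Neg", "determinacy": "Null"}
    # No lowering of negation: the attribution itself is indeterminate.
    return {"polarity": "Null", "determinacy": "Indet"}


print(mark_negated_attribution("PAtt", True))  # John doesn't think ...
print(mark_negated_attribution("Comm", True))  # John didn't say ...
```
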

28
Polarity vs Determinacy
  • I don't believe they have the culture to
    adequately service high-net-worth individuals.
    (0927)
  • Discourse semantics:
  • I believe they DO NOT have the culture to
    adequately service high-net-worth individuals.
    (0927)
  • Negation of believe is lowered onto the
    argument. The attribution is marked as negative
    polarity.
  • Note that the attribution event of believing
    did occur (is determinate).

29
Polarity vs Determinacy
  • It didn't say if its earlier results were
    influenced significantly by nonrecurring
    elements. (1711)
  • Negation of say is NOT lowered onto the
    argument. The attribution is marked as
    indeterminate.
  • The attribution event (of saying) did not
    actually occur.

30
At Syntactic Level
  • At which level should the discrepancy between the
    polarity and determinacy types of negation be
    captured?
  • In PropBank, negations of attribution verbs are
    uniformly marked as a negative feature on the
    adjunct argument ARGM.
  • In TimeML, they carry a polarity feature of Neg.
  • I don't BELIEVE they have the culture to
    adequately service high-net-worth individuals.
  • ARG1: I
  • ARG2: they have the culture
  • ARGM: Neg (PropBank); no Neg for the lower
    predicate have
  • POLARITY: Neg (TimeML)
  • Should the negation be marked as ARGM for the
    lower predicate (have) instead?

31
At Syntactic Level
  • It didn't SAY if its earlier results were
    influenced significantly by nonrecurring
    elements.
  • ARG1: It
  • ARG2: if its earlier results were influenced
    significantly by nonrecurring elements
  • ARGM: Neg (PropBank)
  • POLARITY: Neg (TimeML)
  • Saying event is indeterminate. Does this still
    count as an event?
  • How to order this temporally?

32
Some questions
  • How much of discourse is projected from syntax?
  • Is there a need for a different architecture,
    different building blocks?
  • How are these issues manifested
    cross-linguistically? Currently, discourse
    annotation work is being done for Hindi, Turkish,
    Czech and (possibly) Finnish.