CS3730/ISP3120 Discourse Processing and Pragmatics - PowerPoint PPT Presentation

Slides: 33
Provided by: DanJ64



1
CS3730/ISP3120: Discourse Processing and Pragmatics
  • Lecture Notes
  • Jan 10, 12

2
Outline
  • Finish going over the syllabus and list of
    assigned papers.
  • Signup for presentation dates.
  • Questions about Bonnie Webber's chapter in the
    Handbook of Discourse Processing?
  • Questions about the Intro to the Handbook of
    Discourse Processing?
  • Information about Reference
  • Information about Annotation

3
Syllabus and List of Assigned Papers
  • Note that the weighting for calculating final
    grades has changed, to give you more credit for
    presentations and reaction essays.
  • Instructions for the course yahoo! group were
    added to the syllabus. The data cannot be posted
    to the web, so information about where it is has
    been posted to the group.
  • Note that a journal article was removed from the
    list of assigned papers, and the schedule changed
    accordingly.
  • Auditors: the grade you will receive is NC, for
    no credit. This is a change effective this term.

4
List of Assigned Papers
  • Schedule

5
Signup for Presentation Dates
  • Non-auditors (NAs) should sign up for two
    lectures.
  • NAs should randomly select a number
  • NAs will select a presentation date in increasing
    order, and then select their second presentation
    date in decreasing order

6
Signup for presentation dates
  • Let doubles ( NAs 2) 24
  • At most doubles/2 (rounded up) paired
    presentations may be chosen during the first
    round.
  • An individual NA may be involved in at most one
    paired presentation
  • A day may have at most 2 presenters
  • The following papers must have sole presenters:
    Lappin & Leass 94, Mann & Thompson 88, Hobbs 79

7
Questions on Bonnie Webber's Chapter?
  • Good for pointers into the literature.
  • P. 800: ellipsis, eventualities, information
    structure (general idea of what they are?)
  • Pp. 802-803: understand the examples?

8
Questions about the Introduction to The Handbook of DP?
  • Many fields study discourse.
  • Different definitions, theoretical paradigms, and
    methodologies
  • Interesting links are described on page 6
  • Most of the topics in Part 1, Discourse Analysis
    and Linguistics, have influenced work in NLP
    (except historical, i.e. diachronic, linguistics)
  • Throughout, many possibilities for NLP, some in
    interdisciplinary work

9
Reference
  • Reference
  • Kinds of reference phenomena
  • Constraints on co-reference
  • Preferences for co-reference

From Dan Jurafsky's lecture notes. Here are his
acknowledgements: "Thanks to Diane Litman, Andy
Kehler, Jim Martin!!! This material is from J&M
Chapter 18, written by Andy Kehler; slides
inspired by Diane Litman & Jim Martin."

10
Reference Resolution
  • John went to Bill's car dealership to check out
    an Acura Integra. He looked at it for half an
    hour.
  • I'd like to get from Boston to San Francisco, on
    either December 5th or December 6th. It's ok if
    it stops in another city along the way.

11
Why reference resolution?
  • Conversational agents: an airline reservation
    system needs to know what "it" refers to in order
    to book the correct flight
  • Information extraction: "First Union Corp. is
    continuing to wrestle with severe problems
    unleashed by a botched merger and a troubled
    business strategy. According to industry
    insiders at Paine Webber, their president, John
    R. Georgius, is planning to retire by the end of
    the year."

12
Some terminology
  • John went to Bill's car dealership to check out
    an Acura Integra. He looked at it for half an
    hour.
  • Reference: the process by which speakers use words
    such as "John" and "he" to denote a particular person
  • Referring expression: "John", "he"
  • Referent: the actual entity (but as a shorthand
    we might call "John" the referent)
  • "John" and "he" corefer
  • Antecedent: "John"
  • Anaphor: "he"

13
Many types of reference
  • (after Webber 91)
  • According to John, Bob bought Sue an Integra, and
    Sue bought Fred a Legend
  • But that turned out to be a lie (a speech act)
  • But that was false (proposition)
  • That struck me as a funny way to describe the
    situation (manner of description)
  • That caused Sue to become rather poor (event)
  • That caused them both to become rather poor
    (combination of several events)

14
Reference Phenomena
  • Indefinite noun phrases: new to hearer
  • I saw an Acura Integra today
  • Some Acura Integras were being unloaded
  • I am going to the dealership to buy an Acura
    Integra today. (specific/non-specific)
  • I hope they still have it (specific)
  • I hope they have a car I like (non-specific)
  • Definite noun phrases: identifiable to hearer
    because
  • Mentioned: I saw an Acura Integra today. The
    Integra was white
  • Identifiable from beliefs: The Indianapolis 500
  • Inherently unique: The fastest car in

15
Indefinites (an aside)
  • Lots of complexities; an example of one type:
  • The king and his men don't know Merry and Pippin,
    and they can't even see what they are
    (superordinate term "figures" versus basic-level
    term "hobbits")

"There they [the King and his men] saw close
beside them a great rubble-heap; and suddenly they
were aware of two small figures lying on it at
their ease, grey-clad, hardly to be seen among
the stones." The Two Towers, Tolkien

16
Reference Phenomena: Pronouns
  • I saw an Acura Integra today. It was white
  • Compared to definite noun phrases, pronouns
    require more referent salience.
  • John went to Bob's party, and parked next to a
    beautiful Acura Integra.
  • He went inside and talked to Bob for more than an
    hour.
  • Bob told him that he recently got engaged.
  • ?? He also said that he bought it yesterday.
  • OK: He also said that he bought the Acura yesterday.

17
More on Pronouns
  • Cataphora: pronoun appears before referent
  • Before he bought it, John checked over the
    Integra very carefully.

18
Inferrables
  • I almost bought an Acura Integra today, but the
    engine seemed noisy.
  • Mix the flour, butter, and water.
  • Knead the dough until smooth and shiny
  • Spread the paste over the blueberries
  • Stir the batter until all lumps are gone.

19
Generics
  • I saw no less than 6 Acura Integras today. They
    are the coolest cars.

20
Pronominal Reference Resolution
  • Given a pronoun, find the referent (either in
    the text or as an entity in the world)
  • We will look at constraints. The first student
    presentations will look at resolution algorithms.
  • Hard constraints on reference
  • Soft constraints on reference

21
Hard constraints on coreference
  • Number agreement
  • John has an Acura. It is red.
  • Person and case agreement
  • John and Mary have Acuras. We love them (where
    We = John and Mary)
  • Gender agreement
  • John has an Acura. He/it/she is attractive.
  • Syntactic constraints
  • John bought himself a new Acura (himself = John)
  • John bought him a new Acura (him ≠ John)
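
The agreement constraints above can be read as a filter over candidate antecedents. The following is a minimal sketch, not a resolution algorithm from the assigned papers: the `Mention` class, the toy `PRONOUNS` lexicon, and the feature labels ("sg", "masc", and so on) are all assumptions made for illustration, and the syntactic (reflexive) constraints are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A referring expression with hand-assigned agreement features (hypothetical)."""
    text: str
    number: str  # "sg" or "pl"
    gender: str  # "masc", "fem", "neut", or "any"
    person: int  # 1, 2, or 3


# Toy pronoun lexicon, illustrative only and far from exhaustive.
PRONOUNS = {
    "he": Mention("he", "sg", "masc", 3),
    "she": Mention("she", "sg", "fem", 3),
    "it": Mention("it", "sg", "neut", 3),
    "they": Mention("they", "pl", "any", 3),
}


def agrees(pronoun, candidate):
    """True if the candidate matches the pronoun in number, person, and gender."""
    gender_ok = pronoun.gender == "any" or pronoun.gender == candidate.gender
    return (
        pronoun.number == candidate.number
        and pronoun.person == candidate.person
        and gender_ok
    )


def filter_candidates(pronoun_text, candidates):
    """Keep only the candidates that pass the hard agreement constraints."""
    pronoun = PRONOUNS[pronoun_text.lower()]
    return [c for c in candidates if agrees(pronoun, c)]


john = Mention("John", "sg", "masc", 3)
acura = Mention("an Acura", "sg", "neut", 3)
acuras = Mention("Acuras", "pl", "neut", 3)

# "John has an Acura. It is red." -> only "an Acura" survives for "it".
print([m.text for m in filter_candidates("it", [john, acura, acuras])])
```

Hard constraints only rule candidates out; when several candidates survive, the preferences on the following slides decide among them.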

22
Pronoun Interpretation Preferences
  • Selectional Restrictions
  • John parked his Acura in the garage. He had
    driven it around for hours.
  • Recency
  • John has an Integra. Bill has a Legend. Mary
    likes to drive it.

23
Pronoun Interpretation Preferences
  • Grammatical role: subject preference
  • John went to the Acura dealership with Bill. He
    bought an Integra.
  • Bill went to the Acura dealership with John. He
    bought an Integra
  • (?) John and Bill went to the Acura dealership.
    He bought an Integra

24
Repeated Mention preference
  • John needed a car to get to his new job. He
    decided that he wanted something sporty. Bill
    went to the Acura dealership with him. He bought
    an Integra.
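
One way to picture how recency, subject preference, and repeated mention interact is a weighted score over the candidates that survive the hard constraints. This is only a sketch under stated assumptions: the `Candidate` fields, the three weights, and the scoring formula are invented for illustration and are not the values used by any published algorithm.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate antecedent with hand-annotated features (hypothetical)."""
    name: str
    sentence_index: int  # sentence of the entity's most recent mention
    is_subject: bool     # was that most recent mention a subject?
    mention_count: int   # how many times the entity has been mentioned so far


# Hypothetical weights, chosen only to make the examples come out as on the slides.
W_RECENCY = 1.0
W_SUBJECT = 2.0
W_REPEAT = 1.5


def score(candidate, pronoun_sentence):
    """Combine the three soft preferences into one salience score."""
    distance = max(pronoun_sentence - candidate.sentence_index, 1)
    recency = W_RECENCY / distance
    subject = W_SUBJECT if candidate.is_subject else 0.0
    repeat = W_REPEAT * (candidate.mention_count - 1)
    return recency + subject + repeat


def resolve(candidates, pronoun_sentence):
    """Return the highest-scoring candidate."""
    return max(candidates, key=lambda c: score(c, pronoun_sentence))


# Recency: "John has an Integra. Bill has a Legend. Mary likes to drive it."
recency_case = [Candidate("Integra", 0, False, 1), Candidate("Legend", 1, False, 1)]

# Subject preference: "John went to the Acura dealership with Bill. He bought an Integra."
subject_case = [Candidate("John", 0, True, 1), Candidate("Bill", 0, False, 1)]

# Repeated mention: John is mentioned five times (John, his, He, he, him)
# before "He bought an Integra", while Bill is a first-mention subject.
repeat_case = [Candidate("John", 2, False, 5), Candidate("Bill", 2, True, 1)]

print(resolve(recency_case, 2).name)
print(resolve(subject_case, 1).name)
print(resolve(repeat_case, 3).name)
```

With these assumed weights, repeated mention outweighs the subject preference in the last example, which matches the intuition on the slide; different weights would change that trade-off.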

25
Parallelism Preference
  • Mary went with Sue to the Acura dealership.
    Sally went with her to the Mazda dealership.
  • Mary went with Sue to the Acura dealership.
    Sally told her not to buy anything.

26
Verb Semantics Preferences
  • John telephoned Bill. He lost the pamphlet on
    Acuras.
  • John criticized Bill. He lost the pamphlet on
    Acuras.
  • Implicit causality
  • Implicit cause of criticizing is object.
  • Implicit cause of telephoning is subject.
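
The implicit-causality preference can be sketched as a lookup from verb to the argument it biases the pronoun toward. The two-entry `IMPLICIT_CAUSE` table below is a hypothetical mini-lexicon containing only the two verbs from the slide; a real system would need a much larger resource.

```python
# Hypothetical mini-lexicon: which argument is the implicit cause of the event.
IMPLICIT_CAUSE = {
    "telephone": "subject",
    "criticize": "object",
}


def preferred_antecedent(verb, subject, obj):
    """Return the argument favored by the verb's implicit causality, or None if unknown."""
    role = IMPLICIT_CAUSE.get(verb)
    if role == "subject":
        return subject
    if role == "object":
        return obj
    return None


# "John telephoned Bill. He lost the pamphlet on Acuras."
print(preferred_antecedent("telephone", "John", "Bill"))
# "John criticized Bill. He lost the pamphlet on Acuras."
print(preferred_antecedent("criticize", "John", "Bill"))
```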

27
Manual Annotation
  • AKA coding, labeling

28
From Webber's Chapter
  • The aims of computational work in discourse and
    dialog
  • Modeling particular phenomena in discourse and
    dialog in terms of underlying computational
    processes
  • Providing useful natural language services, whose
    success depends in part on handling aspects of
    discourse and dialog
  • What computation contributes is a coherent
    framework for modeling these phenomena in terms
    of search through a space of possible candidate
    interpretations (in language analysis) or
    candidate realizations (in language generation)

29
Desiderata
  • Interesting and rich enough
  • Not so rich that automation is too far ahead of
    the current state of the art
  • Too complex logical structure
  • Knowledge bottleneck (without viable source)
  • Too fine-grained or subtle
  • Annotation instructions (aka coding manual)
    feasible
  • Time required for training is reasonable
  • Annotators can reliably perform the annotations
    in a reasonable amount of time

30
Minimal Process for NLP
  1. Develop initial coding manual
  2. At least two people perform sample annotations,
    and discuss their disagreements and experiences
  3. Revise coding manual
  4. Repeat 2-3 until agreement on training data is
    sufficient
  5. Independently annotate a fresh test set
  6. Evaluate agreement

31
Additional Steps
  1. Develop initial coding manual
  2. At least two people perform sample annotations,
    and discuss their disagreements and experiences.
    Analysis of patterns of agreement and
    disagreement using probability models (Wiebe et
    al. ACL-99; Bruce & Wiebe NLE-99; from work in
    applied statistics)
  3. Revise coding manual
  4. Repeat 2-3 until agreement on training data is
    sufficient
  5. Independently annotate a fresh test set
  6. Evaluate agreement
  7. Train more annotators, assess average time for
    training and annotation
  8. Evaluate other types of reliability (psychology,
    content analysis, applied statistics literatures)

32
Measures of Agreement
  • Percentage agreement: OK, but not sufficient
  • If the distribution of classes is highly skewed,
    then the baseline algorithm of always assigning
    the most frequent class would have high agreement
  • Kappa measures agreement over and above
    agreement expected by chance
  • Details available in section 3 of this paper by
    our group
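
To make the contrast between the two measures concrete, here is a small sketch that computes percentage agreement and a kappa statistic for two annotators. It assumes the two-annotator Cohen's kappa, (P_o - P_e) / (1 - P_e), with chance agreement P_e estimated from each annotator's own label distribution; the slides do not say which kappa variant is meant, and the function names and toy labels are illustrative.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Fraction of items on which the two annotators assigned the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Agreement over and above chance: (P_o - P_e) / (1 - P_e)."""
    n = len(labels_a)
    p_o = percent_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both pick the same label independently.
    p_e = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(counts_a) | set(counts_b)
    )
    if p_e == 1.0:
        return float("nan")  # undefined when both annotators use a single label
    return (p_o - p_e) / (1 - p_e)


# Skewed toy data: annotator A marks 95 of 100 items "neg";
# "annotator" B is the baseline that always assigns the most frequent class.
annotator_a = ["neg"] * 95 + ["pos"] * 5
baseline_b = ["neg"] * 100

print(percent_agreement(annotator_a, baseline_b))
print(cohen_kappa(annotator_a, baseline_b))
```

On this skewed toy data the always-majority baseline reaches 95% raw agreement, yet its kappa is zero, because all of that agreement is what chance alone would predict.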