Title: Sentence Ordering and Topic Segmentation


1
Sentence Ordering and Topic Segmentation
  • Vasileios Hatzivassiloglou
  • University of Texas at Dallas

2
Ordering of sentences
  • In certain applications, the content planning
    module is weak, but we receive a number of (more
    or less fully formed) sentences that need to be
    put together
  • Examples
  • Summarization of a single document
  • Summarization of multiple documents
  • Generating long answers for questions (e.g.,
    definitions)

3
Deciding the order
  • It may be possible to use certain clues
  • Presence of certain features, such as
    genus-species information in a definitional
    sentence
  • More generally, classifying the sentences in
    functional categories (e.g., genus-species,
    historical, example, etc.) and following a
    partial order between categories
  • Using known order, e.g., in single-document
    summarization
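
A minimal sketch of ordering by functional category, assuming each sentence already carries a category label. The rank table and the function name `order_sentences` are hypothetical; a single ranking is used here as one possible linearization of the partial order between categories.

```python
# Hypothetical ranks: a lower rank is placed earlier in the output.
# The category names come from the slide; the ranking itself is an assumption.
CATEGORY_RANK = {"genus-species": 0, "historical": 1, "example": 2}


def order_sentences(tagged):
    """Order (category, sentence) pairs by category rank.

    sorted() is stable, so sentences in the same category keep their
    input order. Unknown categories are placed last.
    """
    last = len(CATEGORY_RANK)
    ranked = sorted(tagged, key=lambda pair: CATEGORY_RANK.get(pair[0], last))
    return [sentence for _, sentence in ranked]


tagged = [
    ("example", "E1"),
    ("genus-species", "G"),
    ("historical", "H"),
    ("example", "E2"),
]
ordered = order_sentences(tagged)
```

Because the sort is stable, a known source order (as in single-document summarization) is preserved within each category.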

4
Cohesion
  • A major element in producing good text is
    cohesion, the semantic relatedness between the
    components (sentences)
  • Cohesive text flows naturally
  • How to estimate cohesion?
  • shared words between adjacent sentences,
    especially important words for the topic
  • co-reference (hard to recognize)
  • semantically related words between adjacent
    sentences
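
One rough way to estimate cohesion from shared words is sketched below. The stopword list, the Jaccard normalization, and the function names are assumptions for illustration; the slide only says that shared (important) words between adjacent sentences are a signal.

```python
import re

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "is", "was", "of", "in", "to", "and"}


def content_words(sentence):
    """Lowercased alphabetic tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS}


def adjacent_overlap(sentences):
    """Jaccard overlap of content words for each adjacent sentence pair."""
    scores = []
    for left, right in zip(sentences, sentences[1:]):
        a, b = content_words(left), content_words(right)
        union = a | b
        scores.append(len(a & b) / len(union) if union else 0.0)
    return scores


sentences = [
    "The surgeon used chloroform.",
    "Chloroform is an anaesthetic drug.",
    "The government fell.",
]
overlaps = adjacent_overlap(sentences)
```

Co-reference and semantically related words are not captured by this sketch; it counts exact word matches only.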

5
Lexical chains
  • Groups of related words and compounds
  • Built using
  • a source of semantic relations such as a
    thesaurus or WordNet
  • adjacency information from known, ordered sets of
    sentences
  • Can be constructed at various levels of
    granularity
  • Can be topic-dependent
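
A toy sketch of greedy chain building, assuming a hand-written relation table stands in for a thesaurus or WordNet. The `RELATED` pairs and the function `build_chains` are hypothetical, and real chainers also handle word senses and relation strength.

```python
# Hypothetical relation table standing in for a thesaurus lookup.
RELATED = {
    frozenset(pair)
    for pair in [
        ("drug", "chloroform"),
        ("chloroform", "anaesthetic"),
        ("surgery", "operation"),
    ]
}


def build_chains(words):
    """Add each word to the first chain holding an identical or related word."""
    chains = []
    for w in words:
        for chain in chains:
            if any(w == m or frozenset((w, m)) in RELATED for m in chain):
                if w not in chain:
                    chain.append(w)
                break
        else:
            # No existing chain accepted the word, so it starts a new one.
            chains.append([w])
    return chains


chains = build_chains(
    ["drug", "surgery", "chloroform", "operation", "anaesthetic", "drug"]
)
```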

6
WordNet
  • Developed at Princeton over the last 15 years
  • Contains more than 120,000 words and compounds
    organized into synsets, sets of synonyms that
    define a word sense
  • Hierarchically organized to indicate hyponymy
    (e.g., sugar maple → maple → tree → plant)
  • Limited antonymy (opposition) and meronymy
    (part-whole) links
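
The hyponymy hierarchy can be pictured as a chain of parent links. The sketch below hard-codes the slide's own example in a dictionary; it does not access WordNet, and `HYPERNYM` and `hypernym_path` are hypothetical names.

```python
# Toy parent links copied from the slide's example (not real WordNet access).
HYPERNYM = {"sugar maple": "maple", "maple": "tree", "tree": "plant"}


def hypernym_path(word):
    """Follow parent links upward until a word has no recorded parent."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


path = hypernym_path("sugar maple")
```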

7
Sample lexical chains
  • proportion 1, face mask 1, equipment 2, pump 2,
    device 1, computer 3, machine 8
  • Glasgow Royal infirmary 1, HCI 5, hospital 4
  • drug 1, chloroform 1, anaesthetic 4
  • surgery 2, operation 2
  • Tory government 1, government 1
  • anaesthetist 1, doctor 1, Scottish doctor 1,
    surgeon 1

8
Example definitional output
  • Who is Sonia Gandhi?
  • Congress President Sonia Gandhi, who married into
    what was once India's most powerful political
    family, is the first non-Indian since
    independence 50 years ago to lead the Congress.
    After Prime Minister Rajiv Gandhi was
    assassinated in 1991, Gandhi was persuaded by the
    Congress to succeed her husband to continue
    leading the party as the chief, but she refused.
    The BJP had shrugged off the influence of the
    51-year-old Sonia Gandhi when she stepped into
    politics early this year, dismissing her as a
    foreigner. Sonia Gandhi is now an Indian
    citizen. Gandhi, who is 51, met her husband when
    she was an 18-year-old student at Cambridge in
    London, the first time she was away from her
    native Italy.

12
Applications of lexical chains
  • Ordering, as discussed
  • Text classification
  • Text segmentation and filtering
  • Focused information retrieval

13
TextTiling
  • Step one: Estimate the cohesion score between
    left and right blocks at each potential gap
  • The cohesion score can be based on shared words,
    shared lexical chain elements, or new words
  • One option is to treat the blocks as vectors, and
    then calculate their inner product
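
A minimal sketch of the vector option, assuming sentences are already tokenized and each block spans a fixed number of sentences. The inner product is normalized by the block vector lengths (cosine similarity), which is an added assumption; `block_cohesion` and `block_size` are hypothetical names.

```python
import math
from collections import Counter


def block_cohesion(tokens_by_sentence, block_size=2):
    """Cohesion score at every gap between adjacent sentences.

    scores[k] is the score for the gap after sentence k (0-based).
    Each block is a term-count vector over up to block_size sentences.
    """
    scores = []
    for gap in range(1, len(tokens_by_sentence)):
        left_sents = tokens_by_sentence[max(0, gap - block_size):gap]
        right_sents = tokens_by_sentence[gap:gap + block_size]
        left = Counter(t for s in left_sents for t in s)
        right = Counter(t for s in right_sents for t in s)
        dot = sum(left[t] * right[t] for t in left)
        norm = math.sqrt(
            sum(v * v for v in left.values()) * sum(v * v for v in right.values())
        )
        # Normalizing the inner product is an assumption of this sketch.
        scores.append(dot / norm if norm else 0.0)
    return scores


sents = [["a", "b"], ["a", "c"], ["x", "y"], ["x", "z"]]
cohesion = block_cohesion(sents, block_size=2)
```

In this toy input the middle gap separates two blocks with no shared tokens, so it receives the lowest score.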

14
Depth scores
  • Cohesion scores are converted to depth scores
  • Intuitively, the depth score accounts for the
    variability in cohesion within a segment (e.g.,
    introduction vs. later detailed discussion)
  • One formula is
  • $d_i = (s_{i-1} - s_i) + (s_{i+1} - s_i)$
  • We are only interested in points where
    $s_i < s_{i-1}$ and $s_i < s_{i+1}$
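
The depth computation can be sketched as follows, assuming the cohesion scores are given as a list `s` with one entry per gap. The depth of gap i is the drop from each neighbour to s[i], summed over both sides, and only local minima receive a nonzero depth. Giving the first and last gaps a depth of 0 is an assumption of this sketch.

```python
def depth_scores(s):
    """Depth (s[i-1] - s[i]) + (s[i+1] - s[i]) at local minima, 0 elsewhere."""
    d = [0.0] * len(s)
    for i in range(1, len(s) - 1):
        # Only gaps that are lower than both neighbours are of interest.
        if s[i] < s[i - 1] and s[i] < s[i + 1]:
            d[i] = (s[i - 1] - s[i]) + (s[i + 1] - s[i])
    return d


depths = depth_scores([0.5, 0.25, 0.75, 0.75, 0.5])
```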

15
Smoothing cohesion scores
  • We need to detect gradual gaps and, conversely,
    to account for stronger nearby gaps when
    evaluating a particular gap
  • This can be achieved by replacing $s_i$ with
    $(s_{i-1} + s_i + s_{i+1})/3$, which smooths the
    cohesion score across successive gaps
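
A small sketch of the three-point average, assuming the cohesion scores are a list with one entry per gap. Leaving the two endpoints unchanged is an assumption, since they have only one neighbour.

```python
def smooth(s):
    """Replace each interior score by the mean of itself and its neighbours."""
    if len(s) < 3:
        return list(s)
    # Averages are computed from the original values, not from already
    # smoothed ones.
    interior = [(s[i - 1] + s[i] + s[i + 1]) / 3 for i in range(1, len(s) - 1)]
    return [s[0]] + interior + [s[-1]]


smoothed = smooth([3.0, 0.0, 3.0, 6.0])
```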

16
Final gap determination
  • The appropriate strength of the depth score
    varies according to the text
  • A self-adjusting approach
  • Calculate the mean depth score $\mu$ and its
    standard deviation $\sigma$
  • Choose gaps with $d_i > \mu - c\sigma$, where $c$
    is a constant
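
The self-adjusting cutoff can be sketched as below. The cutoff is taken here as the mean minus c standard deviations, the form commonly given for TextTiling; the sign, the default c = 0.5, and the use of the population standard deviation are assumptions of this sketch.

```python
from statistics import mean, pstdev


def select_boundaries(depths, c=0.5):
    """Indices of gaps whose depth exceeds mean - c * stdev (assumed form)."""
    if not depths:
        return []
    mu = mean(depths)
    sigma = pstdev(depths)  # population standard deviation (an assumption)
    cutoff = mu - c * sigma
    return [i for i, d in enumerate(depths) if d > cutoff]


depths = [0.1, 0.9, 0.2, 0.8, 0.1, 0.3]
boundaries = select_boundaries(depths, c=0.5)
```

A larger c lowers the cutoff and yields more boundaries; c = 0 keeps only gaps deeper than the mean.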

17
Reading
  • Section 15.5 on discourse segmentation and text
    tiling