1
Story Segmentation of Broadcast News
  • Mehrbod Sharifi (mehrbod@cs.columbia.edu)
  • Thanks to Andrew Rosenberg
  • mehrbod/presentations/SSegDec06.pdf

2
GALE (Global Autonomous Language Exploitation)
  • to absorb, analyze and interpret huge volumes
    of speech and text in multiple languages,
    eliminating the need for linguists and analysts
    and automatically providing relevant, distilled
    actionable information
  • Transcription Engines (ASR)
  • Translation Engines (MT)
  • Distillation Engines (QAIR)
  • http://projects.ldc.upenn.edu/gale/
  • http://www.darpa.mil/ipto/Programs/gale/

3
(No Transcript)
4
Task: Story Segmentation
  • Input
  • .sph audio files from the TDT-4 corpus distributed by the LDC
  • .rttmx output from other collaborators on the GALE project (all
    automated, one word per row; a parsing sketch follows this list)
  • Speaker boundaries (Chuck at ICSI)
  • ASR words, start and end times, confidence, phone durations
    (Andreas at SRI/ICSI)
  • Sentence boundary probabilities (Dustin at UW)
  • Gold-standard annotated story boundaries
  • Output
  • .rttmx files with story boundaries (generated by a method that
    performs well on unseen data)
  • /n/squid/proj/gale1/AA/eng-tdt4/tdt4-eng-rttmx-12192005/README
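
The exact .rttmx column layout is not reproduced on the slide, so the sketch below assumes a hypothetical whitespace-delimited one-word-per-row format carrying start and end time, the ASR word, its confidence, a speaker id, and the upstream sentence-boundary probability; the field names and order are illustrative assumptions only.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class WordToken:
    """One word per row, roughly as described for the .rttmx input.
    The column layout here is a hypothetical stand-in, not the real format."""
    start: float               # word start time (s)
    end: float                 # word end time (s)
    word: str                  # ASR hypothesis word
    confidence: float          # ASR confidence score
    speaker: str               # speaker label from diarization
    sent_boundary_prob: float  # upstream sentence-boundary probability

def load_words(path: str) -> List[WordToken]:
    """Parse a whitespace-delimited word-per-row file (hypothetical layout)."""
    tokens = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 6:
                continue  # skip comments / malformed rows
            tokens.append(WordToken(float(fields[0]), float(fields[1]),
                                    fields[2], float(fields[3]),
                                    fields[4], float(fields[5])))
    return tokens
```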

5
Task: Story Segmentation
  • Event: a specific thing that happens at a specific time and place,
    along with all necessary preconditions and unavoidable consequences
  • Example: when a U.S. Marine jet sliced a funicular cable in Italy in
    February 1998, the cable car's crash to earth and the subsequent
    injuries were all unavoidable consequences and thus part of the same
    event.
  • Topic: an event or activity, along with all directly related events
    and activities
  • Story: news stories may be of any length, even fewer than two
    independent clauses, as long as they constitute a complete, cohesive
    news report on a particular topic. Note that a single news story may
    discuss more than one related topic.
  • http://www.ldc.upenn.edu/Projects/TDT4/Annotation/annot_task_def_V1.4.pdf

6
Task: Story Segmentation
  • Example: 3,898 words / 263 sentences / 26 stories
    (? marks a rejected or low-confidence word)
  • ?
  • ? ? headlines ? ?
  • good evening everyone ... report on war ... gillian findlay a. b. c.
    news ?
  • turning to politics ... election - Gore ... a. b. c. news ? ?
  • this is ron claiborne ... election - Bush ... a. b. c. news ? ?
  • ? as for the two other candidates ... said the same
  • still ahead ... teaser ... camera man
  • this is world news ... commercials ... was a woman
  • turning to news overseas ... election ... no matter what
  • it's just days after a deadly ferry sinking in greece ... safety tests
  • mehrbod/rttmx/eng/20001001_1830_1900_ABC_WNT.rttmx
  • mehrbod/out/eng.ANC_WNT.txt

7
Task: Story Segmentation
  • How difficult is it?
  • Topic vs. Story
  • Segment classes
  • New story
  • Teaser
  • Misc.
  • Under-transcribed
  • Errors accumulated from previous processes

8
Current Approach - Summary
  • Align story boundaries with sentence boundaries
  • Extract sentence-level features
  • Lexical
  • Acoustic
  • Speaker-dependent
  • Train and evaluate a decision tree classifier (J48 or JRip); see the
    sketch after this list
  • http://www1.cs.columbia.edu/amaxwell/pubs/storyseg-final-hlt.pdf
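
The slides train J48 or JRip inside Weka; purely as a stand-in sketch, the following trains and cross-validates scikit-learn's DecisionTreeClassifier on a precomputed sentence-level feature matrix. The file names, the label encoding, and the min_samples_leaf value are assumptions, not details from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# X: one row per sentence; columns are the lexical / acoustic / speaker
# features (hypothetical, pre-extracted).  y: 1 if the sentence starts a
# new story (gold boundary aligned to the sentence), else 0.
X = np.load("sentence_features.npy")        # hypothetical file
y = np.load("story_boundary_labels.npy")    # hypothetical file

clf = DecisionTreeClassifier(min_samples_leaf=5)  # stand-in for Weka's J48
scores = cross_val_score(clf, X, y, cv=10, scoring="f1")
print("10-fold F1: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```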

9
Current Approach - Features
  • Lexical (various windows)
  • TextTiling, LCSeg, keywords, sentence position and length (a
    TextTiling-style sketch follows this list)
  • Acoustic
  • Pitch and intensity: min, max, median, mean, std. dev., mean absolute
    slope
  • Pause, speaking rate (voiced frames / total frames)
  • Vowel duration: mean vowel length, sentence-final vowel length,
    sentence-final rhyme length
  • Second-order versions of the above
  • Speaker
  • Speaker distribution, speaker turn, first in the show
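
As an illustration of the lexical-cohesion idea behind the TextTiling feature (and, loosely, LCSeg), the sketch below scores each candidate boundary by the cosine similarity between bags of words on either side. The five-sentence window and the raw-count cosine are assumptions, not the exact feature definitions used in the system.

```python
from collections import Counter
from math import sqrt
from typing import List

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def texttiling_scores(sentences: List[List[str]], window: int = 5) -> List[float]:
    """Lexical cohesion at each gap between sentences: cosine similarity of
    the word counts in `window` sentences before vs. after the gap.
    Low similarity suggests a topic (and possibly story) boundary."""
    scores = []
    for i in range(1, len(sentences)):
        before = Counter(w for s in sentences[max(0, i - window):i] for w in s)
        after = Counter(w for s in sentences[i:i + window] for w in s)
        scores.append(cosine(before, after))
    return scores
```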

10
Current Approach - Results
  • Results reported in the HLT paper for the full feature set at the
    sentence level (sketches of Pk and WindowDiff follow the table)

  Language   F1 (P, R)         Pk     WinDiff  Cseg
  English    .421 (.67, .32)   0.194  0.318    0.067
  Mandarin   .592 (.73, .50)   0.179  0.245    0.068
  Arabic     .300 (.65, .19)   0.264  0.353    0.085

  Pk (Beeferman et al., 1999); WindowDiff (Pevzner and Hearst, 2002);
  Cseg (Doddington, 1998)
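
Minimal sketches of the two windowed metrics cited above, Pk and WindowDiff, operating on '0'/'1' boundary strings (a '1' marks a boundary after that unit). Choosing k as half the mean reference segment length is the usual convention but is an assumption here; Cseg is not shown.

```python
def pk(ref: str, hyp: str, k: int = None) -> float:
    """Pk (Beeferman et al., 1999): fraction of position pairs k units apart
    that reference and hypothesis judge differently (same vs. different segment)."""
    if k is None:
        k = max(1, round(len(ref) / (ref.count("1") + 1) / 2))
    errors = 0
    for i in range(len(ref) - k):
        same_ref = "1" not in ref[i:i + k]
        same_hyp = "1" not in hyp[i:i + k]
        errors += same_ref != same_hyp
    return errors / (len(ref) - k)

def windowdiff(ref: str, hyp: str, k: int) -> float:
    """WindowDiff (Pevzner and Hearst, 2002): fraction of windows in which the
    number of boundaries differs between reference and hypothesis."""
    errors = sum(ref[i:i + k].count("1") != hyp[i:i + k].count("1")
                 for i in range(len(ref) - k))
    return errors / (len(ref) - k)
```
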
11
Improvements In Progress
  • Looking for ways to reduce the negative effect of errors inherited
    from upstream processes (ASR, SU and speaker detection)
  • Adding/modifying features to make them more robust to error
  • Analyzing the current features and discarding those that are not
    discriminative or descriptive enough
  • Improving the framework for the package

12
Word Level vs. Sentence Level
  • Pros
  • Eliminates the error from sentence boundary detection (it becomes a
    feature)
  • No need for story boundary alignment
  • Cons
  • More chances for error and a lower baseline
  • Higher risk of overfitting

13
Word Level vs. Sentence Level
14
Word Level - Features
  • Provide information about a window preceding, surrounding, or
    following the current word
  • Acoustic features are computed over windows of five words
  • A similar idea applies to other features, e.g. (a sketch generating
    these follows):
  • @attribute speaker_boundary {TRUE,FALSE}
  • @attribute same_speaker_5 {TRUE,FALSE}
  • @attribute same_speaker_10 {TRUE,FALSE}
  • @attribute same_speaker_20 {TRUE,FALSE}
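
A minimal sketch of how the same_speaker_N attributes above could be generated from per-word speaker labels. The slide does not pin down whether the window precedes, surrounds, or follows the word, so a following window is assumed here.

```python
from typing import Dict, List

def speaker_window_features(speakers: List[str],
                            widths=(5, 10, 20)) -> List[Dict[str, bool]]:
    """For each word: speaker_boundary (the speaker changes at this word) and
    same_speaker_N (the next N words all share this word's speaker)."""
    feats = []
    for i, spk in enumerate(speakers):
        row = {"speaker_boundary": i > 0 and speakers[i - 1] != spk}
        for n in widths:
            row["same_speaker_%d" % n] = all(s == spk
                                             for s in speakers[i + 1:i + 1 + n])
        feats.append(row)
    return feats
```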

15
Word Level - Features
  • Feature analysis of the sentence-level features
  • e.g., for the ABC show, using Weka (ranked lists; a ranking sketch
    follows the table)

  Chi-Square          Information Gain
  sent_position       sent_position
  pauselen            pauselen
  start_time          start_time
  keywords_after_5    speaker_distribution
  keywords_after_10   end_time
  end_time            keywords_after_5
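
The ranking above comes from Weka's chi-squared and information-gain attribute evaluators; as a stand-in, this sketch scores the same kind of sentence-level matrix with scikit-learn's chi2 and mutual_info_classif. Feature names and file names are illustrative assumptions.

```python
import numpy as np
from sklearn.feature_selection import chi2, mutual_info_classif

feature_names = ["sent_position", "pauselen", "start_time", "end_time",
                 "speaker_distribution", "keywords_after_5"]  # subset from the slide
X = np.abs(np.load("sentence_features_subset.npy"))  # hypothetical file; chi2 needs non-negative values
y = np.load("story_boundary_labels.npy")              # hypothetical file

chi_scores, _ = chi2(X, y)                 # analogue of ChiSquaredAttributeEval
ig_scores = mutual_info_classif(X, y)      # analogue of InfoGainAttributeEval
for name, c, g in sorted(zip(feature_names, chi_scores, ig_scores),
                         key=lambda t: -t[1]):
    print("%-22s chi2=%10.2f  info_gain=%.4f" % (name, c, g))
```
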
16
Word Level - Features
  • Word ASR confidence score: (@reject@ or score < 0.8) as a Boolean and
    as counts over various window widths (see the sketch below)
  • Word introduction
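
A minimal sketch of the confidence-based word features above: a per-word Boolean (the @reject@ token or a score below 0.8) plus counts of such words over several window widths. The specific widths and the use of following windows are assumptions.

```python
from typing import Dict, List

def low_confidence_features(words: List[str], confidences: List[float],
                            widths=(5, 10, 20),
                            threshold: float = 0.8) -> List[Dict[str, int]]:
    """Per-word Boolean for a rejected / low-confidence word, plus counts of
    such words in the window of n words starting at the current position."""
    is_low = [w == "@reject@" or c < threshold
              for w, c in zip(words, confidences)]
    feats = []
    for i, flag in enumerate(is_low):
        row = {"low_conf_word": int(flag)}
        for n in widths:
            row["low_conf_count_%d" % n] = sum(is_low[i:i + n])
        feats.append(row)
    return feats
```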

17
Word Level - Results
18
Future Directions
  • Finding a reasonable segmentation strategy, followed by clustering on
    features extracted from segments
  • Sentences > ALS
  • Pause > L
  • Acoustic tiling > LS
  • Sequential modeling
  • Performing more morphological analysis, particularly for Arabic
  • Using the rest of the story and topic labels
  • Using other parts of TDT and/or external information for training:
    WordNet, WSJ, etc.
  • Experimenting with other classifiers: JRip, SVM, Bayesian, GMM, etc.

19
Thank you. Questions?