Question-Answering of Large News Video Archives

CHUA, Tat-Seng; Yang, Hui; Chaisorn, Lekha; Zhao, Yun-Long. School of Computing, National University of Singapore.


1
Question-Answering of Large News Video Archives
  • CHUA, Tat-Seng,
  • Yang, Hui; Chaisorn, Lekha; Zhao, Yun-Long
  • School of Computing
  • National University of Singapore
  • Email: chuats_at_comp.nus.edu.sg
  • Web: http://www.comp.nus.edu.sg/chuats

2
Outline of Talk
  • Introduction and Motivation
  • News Video Processing: Story Segmentation
  • Video Transcript Correction
  • Question-answering on News Video
  • Results
  • Conclusion

3
Personalized News Video Retrieval
  • Infotainment, including news video, is one of the
    major applications of multimedia (MM) technology
  • In a personalized news video scenario, users
    interact with the system to request information
    such as:
  • "show me latest news video on Iraq" → Iraq
  • "highlight of last night's European football" →
    European football
  • Results are time-specific
  • Users increasingly want to see video news,
    supplemented with audio and text,
  • and summarized to as much detail as is necessary
  • In a more futuristic setup, this will be
    accomplished through natural, human-oriented I/O

4
Issues to Resolve
  • Imprecision of users' queries
  • "highlight of football match last night?"
  • Extraction of semantic contents of video
  • Multi-modality
  • Multi-sources
  • Segmentation of news video into story units with
    genre classifications
  • Summarization of info for viewing at different
    levels of detail

5
What Kinds of Data Do We Have?
  • Most research in the past has looked into only
    one source
  • Example: video and its accompanying audio track,
    ASR
  • In most real-life applications, information is
    readily available from multiple sources
  • Broadcast news: video and audio
  • Web-based news articles (by news stations)
  • On-line wired news (by news agencies)
  • Other general resources: ontologies, dictionaries,
    etc.
  • Other types of info increasingly used in the IR
    community
  • User models: query logs, user profiles, etc.
  • A challenge in developing usable systems:
  • → How to use these available data effectively?
  • In a co-training/testing type of framework?
  • Ignoring these obvious data resources will
    result in unsatisfactory solutions.

6
Outline of Our Approach
  • In this talk, I will describe our approach to
    developing systems that handle large-scale video
    corpuses (TREC video)
  • Sources of data used:
  • News video itself: visual and audio features, ASR
  • External sources: on-line news articles of the
    same period
  • General resources: ontology of countries,
    dictionary (WordNet)
  • Approach (see architecture)

7
Overview of QA on News Video
8
Outline of Talk
  • Introduction and Motivation
  • News Video Processing: Story Segmentation
  • Video Transcript Correction
  • Question-answering on News Video
  • Results
  • Conclusion

9
Video Story Segmentation for News Video
  • First basic problem: break the news video into
    meaningful units based on stories. Issues:
  • How to classify shots into the correct
    class/category?
  • How to detect story boundaries?
  • Most news programs adopt a structure similar to
    CNN's:

[Figure: Intro, News, Com1, News, Finance, News, Com2, Sports, News, Weather]
10
Video Story Segmentation for News Video (2)
  • To help alleviate the estimation problem in
    statistical learning, we adopt a two-stage
    process:
  • Stage 1: Shot classification
  • Stage 2: Scene segmentation and classification
  • The set of features considered:
  • Visual: color histogram, background (b/g) change
  • Temporal: motion activity, audio type, shot
    duration, speaker change
  • Mid-level: number of faces, shot type, number of
    text lines and text position, cue phrases

11
Stage 1: Shot Classification
  • Divide video sequence into shots
  • Consider 13 categories of shots:
  • Intro/Highlight
  • Anchor, 2-Anchor, Meeting, Speech
  • Still image shot, Text Scene
  • Sports, Live reporting
  • Finance, Weather, Commercial, Special
  • Perform classification using a Decision Tree (SEE
    6.0)

12
Stage 2: Scene Detection
  • Employ a Hidden Markov Model (HMM) to detect story
    boundaries
  • Features (sequence-level features) used at this
    stage:
  • Shot classes: shot tags
  • Scene change: c/u (changed/unchanged)
  • Speaker change: c/u (changed/unchanged)
  • Cue phrases at the beginning of new stories
  • Input to HMM:
  • 1cc 1uu 1cu .. 2cc 4c 4uu 6uu 6uu . 2cc .
  • Tested on 120 hours of TREC video, achieving
    around 76% F1 in story segmentation
  • TREC data may be downloaded from TREC web sites
    later (?)
  • (Chaisorn, Chua et al., ICME02, WWW Journal02,
    TREC03)
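
The Stage 2 idea can be sketched as follows. This is a minimal illustration, assuming each shot is encoded as a token made of its shot tag followed by one letter for scene change and one for speaker change (the letter order is an assumption). The two states `B` (first shot of a story) and `I` (inside a story), the three-token vocabulary, and every probability are hypothetical values chosen for the example, not the trained model from the talk.

```python
import math

def encode_shot(tag: int, scene_changed: bool, speaker_changed: bool) -> str:
    """Build a token such as '1cu': shot tag, then c/u for scene and speaker change."""
    return f"{tag}{'c' if scene_changed else 'u'}{'c' if speaker_changed else 'u'}"

# Hypothetical model: B = first shot of a story, I = inside a story.
STATES = ("B", "I")
START = {"B": 0.9, "I": 0.1}
TRANS = {"B": {"B": 0.1, "I": 0.9}, "I": {"B": 0.3, "I": 0.7}}
EMIT = {
    "B": {"2cc": 0.8, "4uu": 0.1, "6uu": 0.1},
    "I": {"2cc": 0.1, "4uu": 0.45, "6uu": 0.45},
}

def viterbi(tokens):
    """Return the most likely state sequence for the observed shot tokens."""
    # score[s] = best log-probability of any path ending in state s
    score = {s: math.log(START[s]) + math.log(EMIT[s][tokens[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for tok in tokens[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            pointers[s] = best
            score[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][tok])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

shots = [encode_shot(2, True, True), encode_shot(4, False, False),
         encode_shot(6, False, False), encode_shot(2, True, True),
         encode_shot(6, False, False)]
labels = viterbi(shots)
boundaries = [i for i, s in enumerate(labels) if s == "B"]
```

With these toy numbers the decoder labels the five shots B, I, I, B, I, so shots 0 and 3 are marked as story starts. A real model would estimate the tables from annotated video and would also need to handle tokens unseen in training.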

13
Outline of Talk
  • Introduction and motivation
  • News Video Processing: Story Segmentation
  • Video Transcript Correction
  • Question-answering on News Video
  • Results
  • Conclusion

14
Text Transcript from Speech to Text
  • Need an accurate transcript for QA
  • (not a problem for document or story retrieval)
  • Performance of speech recognition systems:
  • Accuracy is about 80% for news
  • Most errors are in named entities, the likely
    answer targets (ATs)
  • Most such errors are substitutions → homonym
    problem
  • Examples: pneumonia → "new area"; Tony Blair →
    "Teddy Bear"
  • How to correct errors in ATs?
  • → Use phonetic sound matching to correct the
    errors
  • May use a confusion matrix, as successfully used
    in spoken document retrieval
  • Problem: low precision → matches many irrelevant
    phrases
  • One solution: limit the scope of phonetic sound
    matching
  • By utilizing on-line text news of the same period
    (extract base noun phrases and named entities):
    reasonable

15
Use of External Resource to Correct Speech Errors
  • Extract all ATs from on-line news articles:
    A_i = (a_i1, ..., a_iq)
  • Given a video transcript T_i with a list of terms
    (t_i1, ..., t_ip)
  • The basic problem is then to select an a_ik ∈ A_i
    to replace a sequence of terms s_j in T_i that
    maximizes the probability
  • where s_j contains one or more consecutive terms
    in T_i
  • Basic idea: use co-occurrence probabilities and
    phonetic matching to find the most likely
    a_ik ∈ A_i to replace the sequence of terms s_j
    in T_i:
  • 1) Extract a list of probable ATs using
    co-occurrence probabilities
  • 2a) Match at the phonetic syllable level
  • 2b) Match at the confusion syllable string level
  • (see Wang & Chua, ACL03)
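
The effect of restricting phonetic matching to answer targets taken from on-line news can be illustrated with a small sketch. It is a stand-in, not the multi-tier method of the cited paper: it uses plain edit distance over phoneme sequences, with no co-occurrence probabilities and no confusion matrix. The phoneme strings are hand-written approximations, and the lexicon, the function names, and the 0.6 threshold are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]

# Candidate answer targets, as if extracted from news of the same period.
# Hand-written, approximate phoneme strings (hypothetical lexicon).
ANSWER_TARGETS = {
    "pneumonia": "N UW M OW N Y AH".split(),
    "Tony Blair": "T OW N IY B L EH R".split(),
    "Iraq": "IH R AA K".split(),
}

def best_answer_target(phonemes, targets=ANSWER_TARGETS, max_dist=0.6):
    """Return the closest answer target, or None if nothing is close enough."""
    def norm_dist(name):
        longest = max(len(phonemes), len(targets[name]))
        return edit_distance(phonemes, targets[name]) / longest
    best = min(targets, key=norm_dist)
    return best if norm_dist(best) <= max_dist else None

# Misrecognized transcript phrases, again with approximate phonemes.
new_area = "N UW EH R IY AH".split()
teddy_bear = "T EH D IY B EH R".split()
weather = "W EH DH ER".split()

fix_1 = best_answer_target(new_area)
fix_2 = best_answer_target(teddy_bear)
fix_3 = best_answer_target(weather)
```

Because only a short candidate list is searched, "new area" and "teddy bear" map to "pneumonia" and "Tony Blair", while an unrelated word such as "weather" exceeds the threshold and is left alone.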

16
Outline of Talk
  • Introduction and Motivation
  • News Video Processing: Story Segmentation
  • Video Transcript Correction
  • Question-answering on News Video
  • Results
  • Conclusion

17
Overview of QA on News Video
(Similar to our text-based QA work: Yang & Chua,
SIGIR03)
18
Question Processing
  • Users typically issue short queries (several
    keywords):
  • "development in North Korea"
  • "match last night"
  • The query is ambiguous!
  • Analyze the query to extract the following
    (example: "football match last night?"):
  • Key terms in query → football, match
  • Likely answer target → football team (ORG-NAME)
  • NP & NE in query → football match
  • Type of video genre → SPORTS
  • Temporal constraint → LAST-NIGHT
  • Duration constraint → 30 seconds (default)
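
One way to picture the output of query analysis is as a small record. The sketch below is a toy under stated assumptions: the cue tables, the stopword list, and the names `QueryAnalysis` and `analyze` are invented for illustration, and answer-target and NP/NE extraction are left out because the slide does not say how they are computed.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical cue tables, invented for this sketch.
GENRE_CUES = {"football": "SPORTS", "match": "SPORTS", "weather": "WEATHER"}
TEMPORAL_CUES = {"last night": "LAST-NIGHT", "today": "TODAY"}
STOPWORDS = {"of", "the", "in", "on", "a"}

@dataclass
class QueryAnalysis:
    key_terms: List[str]
    genre: Optional[str]
    temporal: Optional[str]
    duration_s: int = 30  # default duration constraint from the slide

def analyze(query: str) -> QueryAnalysis:
    """Split a short query into key terms, genre, and temporal constraint."""
    text = query.lower().strip(" ?")
    temporal = None
    for cue, tag in TEMPORAL_CUES.items():
        if cue in text:
            temporal = tag
            text = text.replace(cue, " ")  # the cue is not a key term
            break
    terms = [w for w in text.split() if w not in STOPWORDS]
    genre = next((GENRE_CUES[w] for w in terms if w in GENRE_CUES), None)
    return QueryAnalysis(terms, genre, temporal)

result = analyze("football match last night?")
```

For the slide's example this yields the key terms football and match, the genre SPORTS, the temporal constraint LAST-NIGHT, and the default 30-second duration.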

19
Query Reinforcement
  • The query, however, is ambiguous!
  • Use on-line news articles to provide the context
    (user independent)
  • Basic idea: given the original query q(0)
  • Use the web (or news sites) and a dictionary
    (WordNet)
  • Find terms (from web articles) that co-occur
    frequently with q(0)
  • Extract semantically related terms from WordNet
  • Add high-probability terms to q(0) to get q(1)
  • Expect q(1) to contain more context terms than
    q(0)
  • For the football example, we expect q(1) to also
    contain terms like "arsenal", "inter milan",
    "soccer", etc. (the big match last night)
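
A minimal sketch of the co-occurrence part of this step, assuming documents arrive as lists of lowercase content words and that "probability" means the share of query-matching documents that contain a term. The WordNet step is omitted, and the function name, the `top_m` and `min_prob` parameters, and the tiny corpus are all hypothetical.

```python
from collections import Counter

def reinforce(q0, documents, top_m=3, min_prob=0.5):
    """Return q(1): q(0) plus up to top_m terms that often co-occur with it."""
    q0 = [t.lower() for t in q0]
    # Keep only documents that mention at least one original query term.
    hits = [set(doc) for doc in documents if set(doc) & set(q0)]
    if not hits:
        return q0
    counts = Counter(term for doc in hits for term in doc if term not in q0)
    # Co-occurrence probability: share of matching documents containing the term.
    probs = {term: n / len(hits) for term, n in counts.items()}
    # Highest probability first; alphabetical order breaks ties deterministically.
    ranked = sorted(probs, key=lambda t: (-probs[t], t))
    extra = [t for t in ranked if probs[t] >= min_prob][:top_m]
    return q0 + extra

docs = [
    ["arsenal", "football", "match", "soccer"],
    ["football", "soccer", "arsenal", "goal"],
    ["football", "match", "soccer", "referee"],
    ["iraq", "war", "summit"],
]
q1 = reinforce(["football", "match"], docs)
```

Here "soccer" appears in all three matching documents and "arsenal" in two of them, so both are added, while "goal" and "referee" fall below the threshold.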

20
Query Reinforcement: Another Example
  • q(0): What are the symptoms of atypical
    pneumonia?
  • q(1): symptoms, pneumonia, virus, spread,
    fever, cough, breath, doctor

→ Use q(1) to retrieve a list of news transcripts
at story level
21
Candidate Sentence Extraction
  • For each retrieved transcript T_i, we select the
    sentences Sent_ij that best match the user query,
    based on:
  • noun phrases, w_nj
  • named entities, w_hj
  • original query words q(0), w_cj
  • expanded query words q(1-0) = q(1) - q(0), w_ej
  • video genre, w_vj
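
The scoring formula itself is not in the transcript, so the sketch below assumes the simplest reading: a weighted sum of matches over the five feature types listed above. The equal weights, the function name, and the exact matching rules (substring match for phrases, whole-word match for query words) are assumptions. The query features and the three sentences are taken from the pneumonia example used elsewhere in the talk.

```python
import re

def score_sentence(sentence, story_genre, query, weights):
    """Weighted count of query features found in one transcript sentence."""
    text = sentence.lower()
    words = set(re.findall(r"[a-z]+", text))
    return (
        weights["n"] * sum(p in text for p in query["noun_phrases"])
        + weights["h"] * sum(e in text for e in query["named_entities"])
        + weights["c"] * sum(w in words for w in query["q0"])
        + weights["e"] * sum(w in words for w in query["expanded"])
        + weights["v"] * (story_genre in query["genres"])
    )

query = {
    "noun_phrases": ["symptom", "atypical pneumonia"],
    "named_entities": ["atypical pneumonia"],
    "q0": ["symptoms", "atypical", "pneumonia"],
    "expanded": ["virus", "spread", "fever", "cough", "breath", "doctor"],
    "genres": {"General News"},
}
weights = {"n": 1.0, "h": 1.0, "c": 1.0, "e": 1.0, "v": 1.0}  # hypothetical

sentences = [
    "He and his two companions are now in isolation and the one hundred "
    "and fifty five passengers on the flight were briefly quarantined.",
    "Symptoms include high fever, coughing, shortness of breath and "
    "difficulty breathing.",
    "But health officials say there's no reason to panic.",
]
scores = [score_sentence(s, "General News", query, weights) for s in sentences]
best = scores.index(max(scores))
```

Under these assumptions the second sentence scores highest, since it matches a noun phrase, an original query word, and two expanded query words, while the other two sentences match only on genre.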

22
Outline of Talk
  • Introduction
  • News Video Processing: Story Segmentation
  • Video Transcript Correction
  • Question-answering on News Video
  • Results
  • Conclusion

23
Results
  • Used 7 days of CNN news video from 13-19 Mar 2003
  • containing a total of 350 minutes of news video
  • Retrieved about 600 news articles per day from
    the Alta Vista news web site during these 7 days
  • Designed 40 factoid questions
  • 28 general questions that are asked every day
  • 12 questions that are date-specific
  • Giving a total of 208 questions
  • Results:

Transcript | Correct Answers | Accuracy
without error correction | 116 | 55.8%
with error correction | 153 | 73.6%
(To be presented at ACM Multimedia 03)
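
The figures above can be checked with a few lines of arithmetic. The reading that each general question is asked once per day and each date-specific question once overall is an assumption; it is the one that reproduces the stated total of 208.

```python
DAYS = 7
GENERAL_PER_DAY = 28   # general questions, asked on each of the 7 days
DATE_SPECIFIC = 12     # date-specific questions, each asked once

total_questions = GENERAL_PER_DAY * DAYS + DATE_SPECIFIC

def accuracy(correct: int, total: int) -> float:
    """Percentage of correctly answered questions, to one decimal place."""
    return round(100 * correct / total, 1)

without_correction = accuracy(116, total_questions)
with_correction = accuracy(153, total_questions)
gained_answers = 153 - 116
```

The recomputed values agree with the table: 208 questions in total, 55.8% without error correction and 73.6% with it, a gain of 37 correctly answered questions.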
24
Results: Example
  • Query: "What are the symptoms of atypical
    pneumonia?"
  • The 3-sentence window selected by the QA engine
    is:
  • S1: He and his two companions are now in
    isolation and the one hundred and fifty five
    passengers on the flight were briefly
    quarantined.
  • S2: Symptoms include high fever, coughing,
    shortness of breath and difficulty breathing.
  • S3: But health officials say there's no reason to
    panic.

25
Outline of Talk
  • Introduction
  • News Video Processing: Story Segmentation
  • Video Transcript Correction
  • Question-answering on News Video
  • Results
  • Conclusion

26
Related Work
  • Research in correcting speech recognition errors
  • (ACL03, EMNLP02)
  • News story and dialogue segmentation (Columbia U)
  • (ICME03, ACL03)
  • Question-answering in text
  • (TREC02, SIGIR03)
  • Informedia Project
  • Uses multi-modality features effectively,
    especially speech
  • Insufficient emphasis on external resources
  • Work on Video-TREC: large-scale testing
  • Collaboration with Ramesh Jain (Georgia Tech) as
    part of the Video Tagging Project
  • Employ TV-Anytime metadata for news (in
    collaboration with ETRI, Korea)
  • Automatic tagging of TV-Anytime metadata, and its
    use as a basis for video QA

27
Summary
  • Work is preliminary
  • Many processes need to be automated
  • Participating in this year's Video-TREC and testing
    on large-scale corpuses (120 hours of news video)
  • On both story segmentation and retrieval
  • Experience:
  • Story segmentation: content features are
    important; text or ASR features are less important
  • Retrieval: text or ASR is important; content
    features help in enhancing precision
  • Current work:
  • Build an appropriate meta model to encode domain
    knowledge
  • Use higher-order statistics to analyze data
  • KEY MESSAGE: Must incorporate a domain model and
    utilize multi-modality, multi-source information

28
THANK YOU
29
Question classification and possible video genres
Answer Target | Likely Video Genre | Example
Human | Anchor, Meeting, Speech, General-news | Who is the Secretary of State of the United States?
Location | Live report, Anchor, General-news | Where is Saddam Hussein hiding?
Organization | Live report, Anchor | Which hospital is the center for SARS treatment in Singapore?
Time | Anchor, General-news | When did the Iraq war start?
Number | Finance | What is the expected GDP of Singapore this year?
Number | Sports, Text-scene | How many points did Yao Ming score?
Number | Weather, Text-scene | What is the highest temperature tomorrow?
Object | Anchor, Still-image, Text-scene | Which kinds of bombs are used in the current Iraq war?
Description | Anchor, Text-scene | What does SARS stand for?
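
The table can be read as a lookup from answer target to admissible genres, which a retrieval step could use to discard stories of an unlikely genre. The sketch below assumes that usage. Merging the three Number rows into one set is a simplification, and the `filter_by_genre` helper and the sample stories are hypothetical.

```python
# Genres copied from the table; the three "Number" rows are merged.
ANSWER_TARGET_GENRES = {
    "Human": {"Anchor", "Meeting", "Speech", "General-news"},
    "Location": {"Live report", "Anchor", "General-news"},
    "Organization": {"Live report", "Anchor"},
    "Time": {"Anchor", "General-news"},
    "Number": {"Finance", "Sports", "Weather", "Text-scene"},
    "Object": {"Anchor", "Still-image", "Text-scene"},
    "Description": {"Anchor", "Text-scene"},
}

def filter_by_genre(stories, answer_target):
    """Keep stories whose genre is plausible for the given answer target."""
    wanted = ANSWER_TARGET_GENRES.get(answer_target, set())
    return [s for s in stories if s["genre"] in wanted]

# Hypothetical story records for illustration.
stories = [
    {"id": 1, "genre": "Sports"},
    {"id": 2, "genre": "Anchor"},
    {"id": 3, "genre": "Weather"},
]
number_stories = filter_by_genre(stories, "Number")
kept_ids = [s["id"] for s in number_stories]
```

For a Number question such as "How many points did Yao Ming score?", the sports and weather stories are kept and the anchor story is dropped.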
30
Question analysis
Question | What is the score of the football match last night? | What are the symptoms of atypical pneumonia?
q(0) | score, football, match, last, night | symptoms, atypical, pneumonia
n (noun phrases) | football match, last night | symptom, atypical pneumonia
h (named entities) | football | atypical pneumonia
Answer Target | Number | Description
Video Genre | Sports, Text-scene | General News
31
List of Questions
  1. Who is the British Prime Minister?
  2. Who is elected to be China's President?
  3. Who is the President of the United States?
  4. What is the name of the former Premier of
    China?
  5. What is the name of the new Premier of China?
  6. Who will pay the heaviest tallies?
  7. Who was arrested in Pakistan?
  8. Which musician called off his US tour?
  9. When will NASA resume shuttle flights?
  10. When will Germany, France and Russia meet?
  11. When is the funeral of DjinDjic?
  12. Which are the three countries involved in the
    summit today?
  13. Where was the summit held?
  14. Which city is the capital of Central African
    Republic?
  15. Which are the three major war opponent
    countries?
  16. To whom US withdrew the aid offer?
  17. Which country vowed to veto the resolution
    today?
  18. Which country's compromise proposal was rejected
    by US?
  19. Where is Kashmir Hotel?

32
List of Questions (cont.)
  21. Which city has the largest anti war
    demonstration?
  22. Where did a AL QUEDA suspect arrested?
  23. How many people attended the rally in San
    Francisco?
  24. What is the cost of war?
  25. How many people were killed in a Kashmir Hotel?
  26. How many people participated in the rally in
    Madrid?
  27. How many people were killed by the new
    pneumonia?
  28. What are the symptoms of the atypical pneumonia?
  29. What sanction did President Bush lift?
  30. What was the name of the space shuttle broken
    apart in February?
  31. Which rally shows the support for President
    Bush?
  32. What is the official name for the mysterious
    pneumonia?
  33. Which company tests their new passenger
    profiling system?
  34. Name one Jewish holiday.
  35. What is British stance?
  36. How did Serbs Prime Minister die?
  37. How is the anti-war protest in Madrid?
  38. How is tomorrow's weather?
  39. What is the conflict between US and Turkey?
  40. What does the WHO call the new pneumonia?

33
Some Remarks on Story Segmentation Task
  • Our 2-stage approach helps alleviate the
    statistical estimation problem: it requires less
    training data
  • Similar work done at Columbia U
  • Using the maximum entropy method
  • For video segmentation (ICME03) and dialogue
    segmentation (ACL03)
  • Achieves similar performance
  • Our current work:
  • Integration of multiple machine learning methods:
    HMM, ME, heuristic rule methods, and a co-training
    approach
  • Fusion of multiple modal features: visual/audio
    features, text (speech to text), meta-data,
    domain knowledge
  • Note: using only text features (ASR) performs badly

34
Multi-tier mapping (Wang & Chua, ACL03)
  • We perform matching at 2 levels to find the most
    likely a_ik ∈ A_i to replace the sequence of terms
    s_j in T_i:
  • a) Phonetic syllable level
  • b) Confusion syllable string level

35
Query Reinforcement
  • The query, however, is ambiguous!
  • Use on-line news articles to provide the context
    (user independent)
  • Basic idea: given the original query q(0)
  • Go to the web (or news sites) to retrieve the top N
    documents
  • Extract terms with high co-location probabilities
    with q(0): C_q
  • Extract semantically related terms from WordNet:
    G_q, S_q
  • Extra terms to be added: K_q = C_q + (G_q ∪ S_q)
  • q(1) = q(0) + {top m terms ∈ K_q with weights > σ}
  • Expect q(1) to contain more context terms than
    q(0)
  • For the football example, expect q(1) to also
    contain terms like "real madrid", "manchester
    united", "soccer"

36
Query Reinforcement: Another Example
  • q(0): What are the symptoms of atypical
    pneumonia?
  • q(1): symptoms, pneumonia, virus, spread,
    fever, cough, breath, doctor

→ Use q(1) to retrieve a list of news transcripts
at story level