Question Answering from Errorful Multimedia Streams ARDA AQUAINT

About This Presentation

Description:

Questions contain text and visual components. Want a good image that ... people spending leisure time at the beach, one or more musicians with audible ...

Slides: 29

Provided by: alexhau

Transcript and Presenter's Notes

1
Question Answering from Errorful Multimedia Streams
ARDA AQUAINT
Finding Better Answers in Video Using Pseudo-Relevance Feedback
Informedia Project, Carnegie Mellon University
2
Outline

Pseudo-Relevance Feedback for Imagery
Experimental Evaluation
Results
Conclusions

3
Motivation

Question Answering from multimedia streams
Questions contain text and visual components
Want a good image that represents the answer
Improve performance of images retrieved as answers
Relevance feedback works for text retrieval!

4
Finding Similar Images by Color
5
Finding Similar Scenes
6
Similarity Challenge: Images containing similar content
7
What is Pseudo-Relevance Feedback?

Relevance Feedback (Human intervention)

8
Original System Architecture

Simple weighted linear combination of video, audio, and text retrieval scores

Retrieval Agents
9
System Architecture with PRF

New step: classification through Pseudo-Relevance Feedback (PRF)
Combine with all other information agents (text, image)

10
Classification from Modified PRF

Automatic retrieval technique
Modification: use negative data as feedback
Step-by-step:
Run base retrieval algorithm on image collection: K-nearest neighbor (KNN) on color and texture
Build classifier
Negative examples: least relevant images in the collection
Positive examples: image queries
Classify all data in the collection to obtain ranked results
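
The steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say which classifier is built, so a simple nearest-centroid margin stands in for it, and the function names, the two-dimensional feature vectors, and the image names are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def negative_prf_rank(collection, queries, num_negatives):
    """Rank images using the least relevant base results as negative feedback."""
    # Step 1: base retrieval, nearest-neighbor distance to the closest query.
    base_dist = {name: min(euclidean(vec, q) for q in queries)
                 for name, vec in collection.items()}
    # Step 2: negatives are the most distant (least relevant) images;
    # positives are the query examples themselves.
    by_distance = sorted(collection, key=base_dist.get)
    negatives = [collection[name] for name in by_distance[-num_negatives:]]
    pos_c, neg_c = centroid(queries), centroid(negatives)
    # Step 3: stand-in classifier (an assumption, not the slide's method):
    # margin is positive when an image is closer to the positive centroid.
    margin = {name: euclidean(vec, neg_c) - euclidean(vec, pos_c)
              for name, vec in collection.items()}
    # Step 4: classify the whole collection and return it ranked.
    return sorted(collection, key=margin.get, reverse=True)

# Hypothetical (color, texture) features for five images and two query images.
collection = {"a": [0.9, 0.8], "b": [0.8, 0.9], "c": [0.5, 0.5],
              "d": [0.1, 0.2], "e": [0.2, 0.1]}
queries = [[1.0, 1.0], [0.9, 0.9]]
ranking = negative_prf_rank(collection, queries, num_negatives=2)
```

Here the two images farthest from the queries serve as negatives, and the images nearest the queries end up at the top of `ranking`.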

11
The Basic PRF Algorithm for Image Retrieval

Input
Query examples $q_1, \ldots, q_n$
Target examples $t_1, \ldots, t_n$
Output
Final score $F_i$ and final ranking for every target $t_i$
Algorithm
Given initial score $s_i^0$ for each $t_i$ based on $f^0(t_i, q_1, \ldots, q_n)$, using an initial similarity measure $f^0$ as a base
Iterate $k = 1, \ldots, \max$
Given score $s_i^k$, sample positive instances $p_i^k$ and negative instances $n_i^k$ using sampling strategy $S$
Compute updated retrieval score $s_i^{k+1} = f_i^{k+1}(t_i)$, where $f_i^{k+1}$ is trained/learned using $n_i^k, p_i^k$
Combine all scores for final score $F_i = g(s_i^0, \ldots, s_i^{\max})$
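
The loop structure of this algorithm might be sketched as a generic function whose pieces are passed in as parameters. Everything concrete here is an assumption made for illustration: the one-dimensional targets, the base measure, the sampling strategy, the distance-based learner, and the averaging used for the combination function are hypothetical plug-ins, not the choices made in the presentation.

```python
def pseudo_relevance_feedback(targets, queries, f0, sample, train, combine, max_iter):
    """Generic PRF loop: initial scores, iterative re-scoring, final combination."""
    scores = [[f0(t, queries) for t in targets]]           # initial scores from f0
    for _ in range(max_iter):
        # Sampling strategy S picks positives and negatives from current scores.
        positives, negatives = sample(targets, scores[-1], queries)
        f_next = train(positives, negatives)                # learned scoring function
        scores.append([f_next(t) for t in targets])         # updated scores
    # Combine every round's score per target into the final score.
    final = [combine([s[i] for s in scores]) for i in range(len(targets))]
    ranking = sorted(range(len(targets)), key=lambda i: final[i], reverse=True)
    return final, ranking

# Hypothetical plug-ins on 1-D features.
def f0(t, queries):
    return -min(abs(t - q) for q in queries)               # nearer to a query is better

def sample(targets, current, queries, num_neg=2):
    order = sorted(range(len(targets)), key=lambda i: current[i])
    return list(queries), [targets[i] for i in order[:num_neg]]  # lowest scores are negatives

def train(positives, negatives):
    pos_mean = sum(positives) / len(positives)
    neg_mean = sum(negatives) / len(negatives)
    return lambda t: abs(t - neg_mean) - abs(t - pos_mean)

def mean(values):
    return sum(values) / len(values)

final, ranking = pseudo_relevance_feedback(
    targets=[0.9, 0.5, 0.1, 0.0], queries=[1.0],
    f0=f0, sample=sample, train=train, combine=mean, max_iter=1)
```

Passing the sampling strategy and learner as arguments keeps the loop independent of any particular feature type or classifier.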

12
Analysis: PRF on Synthetic Data
13
PRF on Synthetic Data
14
Evaluation using the 2002 TREC Video Retrieval Task

Independent collection, queries, relevant results available
Search Collection
Total length: 40.16 hours
MPEG-1 format
Collected from Internet Archive and Open Video websites, documentaries from the 50s
14,000 shots
292,000 I-frames (images)
Query
25 queries
Text, image (optional), video (optional)

15
Summary of 2002 Video Queries
16
Analysis of Queries (2002)

Specific item or person: Eddie Rickenbacker, James Chandler, George Washington, Golden Gate Bridge, Price Tower in Bartlesville, OK
Specific fact: Arch in Washington Square Park in NYC, map of continental US
Instances of a category: football players, overhead views of cities, one or more women standing in long dresses
Instances of events/activities: people spending leisure time at the beach, one or more musicians with audible music, crowd walking in an urban environment, locomotive approaching the viewer

17
Sample Query and Target

Query: Find pictures of Harry Hertz, Director of the National Quality Program, NIST

18
Sample Query and Target

Query: Find pictures of Harry Hertz, Director of the National Quality Program, NIST

19
Example Images
20
Example Images Selected for PRF
21
Combination of Agents

Multiple Agents
Text Retrieval Agent
Base Image Retrieval Agent
Nearest Neighbor on Color
Nearest Neighbor on Texture
Classification PRF Agent
Combination of multiple agents
Convert scores to posterior probability
Linear combination of probabilities
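
A minimal sketch of this combination step follows. The slide does not state how scores are converted to posterior probabilities, so the logistic mapping below is an assumption, as are the `slope` and `offset` parameters, the agent names, the example scores, and the weights.

```python
import math

def to_posterior(score, slope=1.0, offset=0.0):
    """Map a raw agent score to (0, 1) with a logistic function (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(slope * score + offset)))

def combine_agents(agent_scores, weights):
    """Linear combination of per-agent posterior probabilities for one shot."""
    return sum(weights[agent] * to_posterior(score)
               for agent, score in agent_scores.items())

# Hypothetical raw scores for one shot from each agent, and illustrative weights.
agent_scores = {"text": 2.0, "color": 0.0, "texture": 0.0, "prf": -1.0}
weights = {"text": 0.4, "color": 0.2, "texture": 0.2, "prf": 0.2}
combined = combine_agents(agent_scores, weights)
```

Converting to probabilities first puts agents with very different score ranges on a common scale before they are weighted. With weights that sum to 1, the combined value also stays between 0 and 1.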

22
2002 Results
Video OCR was not relevant in this collection
23
Distance Function for Query 75
24
Distance Function for Query 89
25
Effect of Pos/Neg Ratio and Combination Weight
26
Selection of Negative Images and Combination
27
Discussion and Future Work

Discussion
Results are sensitive to queries with small numbers of answers
Images alone cannot fully represent the query semantics
Future Work
Incorporate more agents
Utilize the relationships among information from multiple agents
Better combination scheme
Include web image search (e.g. Google) as query expansion

28
Conclusions

Pseudo-relevance feedback works for text retrieval
This is not directly applicable to image retrieval from video due to low precision in the top answers
Negative PRF was effective for finding better images
