Title: Question Answering from Errorful Multimedia Streams ARDA AQUAINT
1. Question Answering from Errorful Multimedia Streams (ARDA AQUAINT)
Finding Better Answers in Video Using Pseudo-Relevance Feedback
Informedia Project, Carnegie Mellon University
2. Outline
- Pseudo-Relevance Feedback for Imagery
- Experimental Evaluation
- Results
- Conclusions
3. Motivation
- Question Answering from multimedia streams
- Questions contain text and visual components
- Want a good image that represents the answer
- Improve performance of images retrieved as answers
- Relevance feedback works for text retrieval!
4. Finding Similar Images by Color
5. Finding Similar Scenes
6. Similarity Challenge: Images containing similar content
7. What is Pseudo-Relevance Feedback?
- Relevance Feedback (Human intervention)
8. Original System Architecture
- Simple weighted linear combination of video, audio and text retrieval scores
Retrieval Agents
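A weighted linear combination of agent scores can be sketched as follows. This is a minimal illustration, not the Informedia implementation: the function name `combine_scores`, the shot identifiers, the score values, and the weights are all made up for the example.

```python
def combine_scores(agent_scores, weights):
    """Weighted linear combination of per-agent retrieval scores.

    agent_scores: dict mapping agent name -> dict of shot id -> score
    weights:      dict mapping agent name -> combination weight
    Returns a list of (shot id, combined score), best first.
    """
    combined = {}
    for agent, scores in agent_scores.items():
        w = weights[agent]
        for shot_id, score in scores.items():
            combined[shot_id] = combined.get(shot_id, 0.0) + w * score
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores from three retrieval agents for two shots.
agent_scores = {
    "video": {"shot1": 0.2, "shot2": 0.9},
    "audio": {"shot1": 0.5, "shot2": 0.1},
    "text": {"shot1": 0.8, "shot2": 0.3},
}
weights = {"video": 0.5, "audio": 0.25, "text": 0.25}  # assumed weights
ranking = combine_scores(agent_scores, weights)
```

With these assumed weights the video agent dominates, so `shot2` ranks ahead of `shot1` even though the text agent prefers `shot1`.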
9. System Architecture with PRF
- New step
- Classification through Pseudo-Relevance Feedback (PRF)
- Combine with all other information agents (text, image)
10. Classification from Modified PRF
- Automatic retrieval technique
- Modification: use negative data as feedback
- Step-by-step
- Run base retrieval algorithm on image collection
- K-Nearest neighbor (KNN) on color and texture
- Build classifier
- Negative examples: least relevant images in the collection
- Positive examples: image queries
- Classify all data in the collection to obtain ranked results
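The step-by-step procedure above can be sketched in a few lines. This is only an illustration under stated assumptions: images are represented by small made-up feature vectors, the base retrieval is a nearest-neighbor distance to the query examples, and the classifier is a simple nearest-example margin used as a stand-in, since the slide does not name the classifier. All function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def base_scores(collection, queries):
    """Base retrieval: negative distance to the closest query (higher is better)."""
    return [-min(dist(img, q) for q in queries) for img in collection]


def negative_prf(collection, queries, num_negatives):
    # Step 1: run the base retrieval on the whole collection.
    s0 = base_scores(collection, queries)
    # Step 2: the lowest-ranked images serve as negative examples,
    # and the query images serve as positive examples.
    worst_first = sorted(range(len(collection)), key=lambda i: s0[i])
    negatives = [collection[i] for i in worst_first[:num_negatives]]
    positives = list(queries)

    # Step 3: stand-in classifier (an assumption): margin between the
    # distance to the nearest negative and the nearest positive.
    def classify(img):
        d_pos = min(dist(img, p) for p in positives)
        d_neg = min(dist(img, n) for n in negatives)
        return d_neg - d_pos

    # Step 4: classify everything in the collection and rank by score.
    scores = [classify(img) for img in collection]
    ranking = sorted(range(len(collection)), key=lambda i: scores[i], reverse=True)
    return ranking, scores


# Toy 2-D "color/texture" features, invented for illustration.
collection = [(0.9, 0.1), (0.8, 0.2), (0.5, 0.5), (0.1, 0.9), (0.0, 1.0)]
queries = [(1.0, 0.0)]
ranking, scores = negative_prf(collection, queries, num_negatives=2)
```

Because only the bottom of the base ranking is used as feedback, the sketch does not depend on the top-ranked results being correct.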
11. The Basic PRF Algorithm for Image Retrieval
- Input
- Query examples $q_1, \dots, q_n$
- Target examples $t_1, \dots, t_n$
- Output
- Final score $F_i$ and final ranking for every target $t_i$
- Algorithm
- Given initial score $s_i^0$ for each $t_i$ based on $f^0(t_i, q_1, \dots, q_n)$
- Using an initial similarity measure $f^0$ as a base
- Iterate $k = 1, \dots, \max$
- Given score $s_i^k$, sample positive instances $p_i^k$ and negative instances $n_i^k$ using sampling strategy $S$
- Compute updated retrieval score $s_i^{k+1} = f_i^{k+1}(t_i)$, where $f_i^{k+1}$ is trained/learned using $n_i^k, p_i^k$
- Combine all scores for final score $F_i = g(s_i^0, \dots, s_i^{\max})$
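The iteration above can be read as a generic loop with pluggable parts. The sketch below is one possible rendering: `prf`, `f0`, `sample`, `train`, and `combine` are hypothetical names standing in for the initial similarity measure, the sampling strategy S, the learner, and the combination function g. The toy choices (1-D features, centroid-based learner, mean as g) are assumptions made only so the example runs.

```python
def prf(targets, queries, f0, sample, train, combine, max_iter):
    """Generic pseudo-relevance feedback loop.

    Keeps the full score history so that the final score can combine
    the initial scores with the scores from every feedback pass.
    """
    history = [[f0(t, queries) for t in targets]]  # initial scores
    for _ in range(max_iter):
        # Sample positive and negative instances from the latest scores.
        positives, negatives = sample(targets, queries, history[-1])
        # Learn a new scoring function from the sampled instances.
        f_next = train(positives, negatives)
        history.append([f_next(t) for t in targets])
    # Combine each target's scores across all passes.
    final = [combine([s[i] for s in history]) for i in range(len(targets))]
    ranking = sorted(range(len(targets)), key=lambda i: final[i], reverse=True)
    return final, ranking


# Toy components (all assumptions, chosen for illustration only).
def f0(t, queries):
    return -min(abs(t - q) for q in queries)


def sample(targets, queries, scores):
    worst_first = sorted(range(len(targets)), key=lambda i: scores[i])
    negatives = [targets[i] for i in worst_first[:2]]
    return list(queries), negatives


def train(positives, negatives):
    pos_c = sum(positives) / len(positives)
    neg_c = sum(negatives) / len(negatives)
    return lambda t: abs(t - neg_c) - abs(t - pos_c)


def combine(scores_for_target):
    return sum(scores_for_target) / len(scores_for_target)


targets = [0.1, 0.2, 0.5, 0.8, 0.9]
queries = [0.0]
final, ranking = prf(targets, queries, f0, sample, train, combine, max_iter=2)
```

Each pass appends one score list, so after `max_iter` passes the history holds `max_iter + 1` lists, which is what the combination function receives per target.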
12. Analysis: PRF on Synthetic Data
13. PRF on Synthetic Data
14. Evaluation using the 2002 TREC Video Retrieval Task
- Independent collection, queries, relevant results available
- Search Collection
- Total length: 40.16 hours
- MPEG-1 format
- Collected from Internet Archive and Open Video websites, documentaries from the 50s
- 14,000 shots
- 292,000 I-frames (images)
- Query
- 25 queries
- Text, Image (optional), Video (optional)
15. Summary of '02 Video Queries
16. Analysis of Queries (2002)
- Specific item or person
- Eddie Rickenbacker, James Chandler, George Washington, Golden Gate Bridge, Price Tower in Bartlesville, OK
- Specific fact
- Arch in Washington Square Park in NYC, map of continental US
- Instances of a category
- football players, overhead views of cities, one or more women standing in long dresses
- Instances of events/activities
- people spending leisure time at the beach, one or more musicians with audible music, crowd walking in an urban environment, locomotive approaching the viewer
17. Sample Query and Target
- Query
- Find pictures of Harry Hertz, Director of the
National Quality Program, NIST
18. Sample Query and Target
- Query
- Find pictures of Harry Hertz, Director of the
National Quality Program, NIST
19. Example Images
20. Example Images Selected for PRF
21. Combination of Agents
- Multiple Agents
- Text Retrieval Agent
- Base Image Retrieval Agent
- Nearest Neighbor on Color
- Nearest Neighbor on Texture
- Classification PRF Agent
- Combination of multiple agents
- Convert scores to posterior probability
- Linear combination of probabilities
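The two combination steps above can be sketched as follows. The slide does not say how scores are converted to posterior probabilities, so the logistic (sigmoid) mapping here is an assumption, as are the names `to_probability` and `combine_agents`, the calibration parameters, and the example scores and weights.

```python
import math


def to_probability(score, slope=1.0, offset=0.0):
    """Map a raw agent score to (0, 1) with a logistic function.

    The logistic form and its parameters are assumptions for illustration.
    """
    return 1.0 / (1.0 + math.exp(-(slope * score + offset)))


def combine_agents(agent_scores, weights, calibration):
    """Linear combination of per-agent probabilities.

    agent_scores: dict agent -> dict item -> raw score
    weights:      dict agent -> combination weight
    calibration:  dict agent -> (slope, offset) for to_probability
    """
    combined = {}
    for agent, scores in agent_scores.items():
        slope, offset = calibration[agent]
        for item, raw in scores.items():
            p = to_probability(raw, slope, offset)
            combined[item] = combined.get(item, 0.0) + weights[agent] * p
    return combined


# Hypothetical raw scores from a text agent and a PRF classification agent.
agent_scores = {
    "text": {"a": 2.0, "b": -1.0},
    "prf": {"a": 0.0, "b": 3.0},
}
weights = {"text": 0.5, "prf": 0.5}
calibration = {"text": (1.0, 0.0), "prf": (1.0, 0.0)}
combined = combine_agents(agent_scores, weights, calibration)
```

Mapping every agent onto a common probability scale before summing keeps an agent with a wide raw score range from dominating the combination; when the weights sum to one, the combined value also stays between 0 and 1.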
22. 2002 Results
Video OCR was not relevant in this collection
23. Distance Function for Query 75
24. Distance Function for Query 89
25. Effect of Pos/Neg Ratio and Combination Weight
26. Selection of Negative Images Combination
27. Discussion and Future Work
- Discussion
- Results are sensitive to queries with small numbers of answers
- Images alone cannot fully represent the query semantics
- Future Work
- Incorporate more agents
- Utilize the relationship between multiple agent information
- Better combination scheme
- Include web image search (e.g. Google) as query expansion
28. Conclusions
- Pseudo-relevance feedback works for text retrieval
- This is not directly applicable to image retrieval from video due to low precision in the top answers
- Negative PRF was effective for finding better images