Title: Question Answering from Errorful Multimedia Streams ARDA AQUAINT
1. Question Answering from Errorful Multimedia Streams
Finding Better Answers in Video Using Pseudo Relevance Feedback
Informedia Project, Carnegie Mellon University
- Pseudo-Relevance Feedback for Imagery
- Experimental Evaluation
- Results
- Conclusions
- Question Answering from multimedia streams
- Questions contain text and visual components
- Want a good image that represents the answer
- Improve performance of images retrieved as answers
- Relevance feedback works for text retrieval!
4. Finding Similar Images by Color
5. Finding Similar Scenes
6. Similarity Challenge: Images containing similar
7. What is Pseudo Relevance Feedback?
- Relevance Feedback (Human intervention)
8. Original System Architecture
- Simple weighted linear combination of video, audio, and text retrieval scores
Retrieval Agents
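The weighted linear combination on this slide can be written out as a short worked equation. The symbols ($S$, $s_{\text{video}}$, $w_{\text{video}}$, and so on) and the example numbers are illustrative assumptions; the slides do not give the notation or the actual weights.

```latex
S(d) = w_{\text{video}}\, s_{\text{video}}(d)
     + w_{\text{audio}}\, s_{\text{audio}}(d)
     + w_{\text{text}}\, s_{\text{text}}(d)

% Illustrative numbers only (not from the slides):
% w = (0.2, 0.2, 0.6), scores = (0.5, 0.1, 0.8)
S(d) = 0.2 \cdot 0.5 + 0.2 \cdot 0.1 + 0.6 \cdot 0.8 = 0.10 + 0.02 + 0.48 = 0.60
```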
9. System Architecture with PRF
- New step
- Classification through Pseudo Relevance Feedback (PRF)
- Combine with all other information agents (text,
10. Classification from Modified PRF
- Automatic retrieval technique
- Modification: use negative data as feedback
- Step-by-step
- Run base retrieval algorithm on image collection
- K-Nearest neighbor (KNN) on color and texture
- Build classifier
- Negative examples: least relevant images in the collection
- Positive examples: image queries
- Classify all data in the collection to obtain ranked results
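The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system's implementation: images are plain feature lists standing in for color and texture features, the base retrieval is a nearest-neighbor distance to the query examples, and a simple mean-difference linear scorer stands in for the classifier, which the slides do not name. The names `negative_prf_rank` and `num_neg` are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def base_scores(collection, queries):
    # Step 1: nearest-neighbor base retrieval (higher score = closer to a query).
    return [-min(euclidean(img, q) for q in queries) for img in collection]

def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def negative_prf_rank(collection, queries, num_neg):
    scores = base_scores(collection, queries)
    # Step 2: negatives are the least relevant images; positives are the queries.
    worst_first = sorted(range(len(collection)), key=lambda i: scores[i])
    negatives = [collection[i] for i in worst_first[:num_neg]]
    mu_pos, mu_neg = mean_vector(queries), mean_vector(negatives)
    # Assumed stand-in classifier: linear boundary halfway between class means.
    w = [p - n for p, n in zip(mu_pos, mu_neg)]
    mid = [(p + n) / 2 for p, n in zip(mu_pos, mu_neg)]

    def classify(x):
        return sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mid))

    # Step 3: classify every image in the collection and rank by classifier score.
    prf_scores = [classify(img) for img in collection]
    ranking = sorted(range(len(collection)),
                     key=lambda i: prf_scores[i], reverse=True)
    return ranking, prf_scores

# Toy 2-D "features" (made up for illustration).
collection = [[0.9, 0.1], [0.8, 0.2], [0.5, 0.5], [0.1, 0.9], [0.0, 1.0]]
queries = [[1.0, 0.0]]
ranking, prf_scores = negative_prf_rank(collection, queries, num_neg=2)
print(ranking)
```

Only the negative examples come from the initial ranking here; the positives are the query images themselves, which matches the modification described on the slide.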
11. The Basic PRF Algorithm for Image Retrieval
- Input
- Query examples $q_1, \ldots, q_n$
- Target examples $t_1, \ldots, t_n$
- Output
- Final score $F_i$ and final ranking for every target $t_i$
- Algorithm
- Given initial score $s_i^0$ for each $t_i$ based on $f^0(t_i, q_1, \ldots, q_n)$
- Using an initial similarity measure $f^0$ as a base
- Iterate $k = 1, \ldots, \max$
- Given score $s_i^k$, sample positive instances $p_i^k$ and negative instances $n_i^k$ using sampling strategy $S$
- Compute updated retrieval score $s_i^{k+1} = f_i^{k+1}(t_i)$, where $f_i^{k+1}$ is trained/learned using $n_i^k$, $p_i^k$
- Combine all scores for final score $F_i = g(s_i^0, \ldots)$
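The loop on this slide can be sketched as a generic function whose sampling strategy, learner, and combination function are passed in as callables. Everything concrete below is an assumption for illustration: the names `iterative_prf`, `top_bottom_sampler`, `train_mean_margin`, and `average_scores` are hypothetical, the toy targets are single numbers rather than image features, and averaging is only one possible choice for the combination function, which the slide leaves unspecified.

```python
def iterative_prf(targets, initial_scores, sample, train, combine, max_iter):
    """Generic PRF loop.

    sample(targets, scores) -> (positives, negatives)   # sampling strategy S
    train(positives, negatives) -> scoring function     # learned scorer
    combine(history) -> final score per target          # combination g
    """
    history = [list(initial_scores)]  # history[0] holds the initial scores
    for _ in range(max_iter):
        positives, negatives = sample(targets, history[-1])
        scorer = train(positives, negatives)
        history.append([scorer(t) for t in targets])
    final = combine(history)
    ranking = sorted(range(len(targets)), key=lambda i: final[i], reverse=True)
    return final, ranking

def top_bottom_sampler(num_pos, num_neg):
    # Assumed strategy: top-ranked targets as positives, bottom-ranked as negatives.
    def sample(targets, scores):
        order = sorted(range(len(targets)), key=lambda i: scores[i], reverse=True)
        positives = [targets[i] for i in order[:num_pos]]
        negatives = [targets[i] for i in order[len(order) - num_neg:]]
        return positives, negatives
    return sample

def train_mean_margin(positives, negatives):
    # Toy 1-D learner: farther from the negative mean and closer to the
    # positive mean gives a higher score.
    mu_p = sum(positives) / len(positives)
    mu_n = sum(negatives) / len(negatives)
    return lambda t: abs(t - mu_n) - abs(t - mu_p)

def average_scores(history):
    # One possible combination: the mean of the scores over all rounds.
    return [sum(col) / len(col) for col in zip(*history)]

targets = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
query = 9.0
initial = [-abs(t - query) for t in targets]  # base similarity to the query
final, ranking = iterative_prf(targets, initial, top_bottom_sampler(2, 2),
                               train_mean_margin, average_scores, max_iter=2)
print(ranking)
```

Keeping the three pieces pluggable makes it easy to swap in the modified variant, where positives come from the query examples instead of the top-ranked results.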
12. Analysis: PRF on Synthetic Data
13. PRF on Synthetic Data
14. Evaluation using the 2002 TREC Video Retrieval
- Independent collection, queries, relevant results available
- Search Collection
- Total Length: 40.16 hours
- MPEG-1 format
- Collected from Internet Archive and Open Video websites, documentaries from the 50s
- 14,000 shots
- 292,000 I-frames (images)
- Query
- 25 queries
- Text, Image (Optional), Video (Optional)
15. Summary of '02 Video Queries
16. Analysis of Queries (2002)
- Specific item or person
- Eddie Rickenbacker, James Chandler, George Washington, Golden Gate Bridge, Price Tower in Bartlesville, OK
- Specific fact
- Arch in Washington Square Park in NYC, map of continental US
- Instances of a category
- football players, overhead views of cities, one or more women standing in long dresses
- Instances of events/activities
- people spending leisure time at the beach, one or more musicians with audible music, crowd walking in an urban environment, locomotive approaching the viewer
17. Sample Query and Target
- Query
- Find pictures of Harry Hertz, Director of the National Quality Program, NIST
18. Sample Query and Target
- Query
- Find pictures of Harry Hertz, Director of the National Quality Program, NIST
19. Example Images
20. Example Images Selected for PRF
21. Combination of Agents
- Multiple Agents
- Text Retrieval Agent
- Base Image Retrieval Agent
- Nearest Neighbor on Color
- Nearest Neighbor on Texture
- Classification PRF Agent
- Combination of multiple agents
- Convert scores to posterior probability
- Linear combination of probabilities
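A minimal sketch of the two combination steps follows. The slides do not say how scores are converted to posterior probabilities, so the logistic mapping, its per-agent `slope` and `offset` parameters, the agent names, the weights, and all scores below are assumptions chosen for illustration only.

```python
import math

def to_probability(score, slope=1.0, offset=0.0):
    # Assumed logistic mapping from a raw agent score to a value in (0, 1).
    return 1.0 / (1.0 + math.exp(-(slope * score + offset)))

def combine_agents(agent_scores, weights, calibrations):
    """Linearly combine per-agent probabilities.

    agent_scores: {agent name: [raw score per shot]}
    weights:      {agent name: combination weight}
    calibrations: {agent name: (slope, offset)} for the logistic mapping
    """
    names = list(agent_scores)
    num_shots = len(agent_scores[names[0]])
    combined = [0.0] * num_shots
    for name in names:
        slope, offset = calibrations[name]
        for i, raw in enumerate(agent_scores[name]):
            combined[i] += weights[name] * to_probability(raw, slope, offset)
    return combined

# Made-up scores for three shots from three hypothetical agents.
agent_scores = {
    "text": [2.0, 0.0, -2.0],
    "color_nn": [-0.1, -0.9, -0.2],   # negative distances
    "prf": [0.7, -0.8, 0.1],
}
weights = {"text": 0.5, "color_nn": 0.2, "prf": 0.3}
calibrations = {"text": (1.0, 0.0), "color_nn": (4.0, 2.0), "prf": (3.0, 0.0)}

combined = combine_agents(agent_scores, weights, calibrations)
print(sorted(range(len(combined)), key=lambda i: combined[i], reverse=True))
```

Mapping every agent's output onto the same (0, 1) scale before weighting keeps agents with very different raw score ranges, such as text relevance and image distance, from dominating the sum purely because of scale.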
22. 2002 Results
Video OCR was not relevant in this collection
23. Distance Function for Query 75
24. Distance Function for Query 89
25. Effect of Pos/Neg Ratio and Combination Weight
26. Selection of Negative Images Combination
27. Discussion and Future Work
- Discussion
- Results are sensitive to queries with small numbers of answers
- Images alone cannot fully represent the query semantics
- Future Work
- Incorporate more agents
- Utilize the relationship between multiple agent information
- Better combination scheme
- Include web image search (e.g. Google) as query
- Pseudo-relevance feedback works for text retrieval
- This is not directly applicable to image retrieval from video due to low precision in the top answers
- Negative PRF was effective for finding better