Title: Passage Retrieval using HMMs
1 Passage Retrieval using HMMs
- HARD 2004
- University of Illinois at Urbana-Champaign
- Jing Jiang, ChengXiang Zhai
2 Motivation: Variable-Length Passages
Nokia, the world's biggest acquired Sega
Japanese video game maker,
its mobile N-Gage game
features of a cell phone, MP3-player
Nokia is the cell phone market leader
Nintendo Co.'s now works as a videophone
which makes mobile and Internet
equipment
Nintendo has sold more than 10 million Game Boy
APE20030911.0887
APE20030922.0156
3 Motivation: Variable-Length Passages
Nintendo Co.'s now works as a videophone
which makes mobile and Internet
equipment
Nintendo has sold more than 10 million Game Boy
Nokia, the world's biggest acquired Sega
Japanese video game maker,
its mobile N-Gage game
features of a cell phone, MP3-player
Nokia is the cell phone market leader
HARD-422
video game crash
APE20030911.0887
APE20030922.0156
4 Motivation: Variable-Length Passages
Nintendo Co.'s now works as a videophone
which makes mobile and Internet
equipment
Nintendo has sold more than 10 million Game Boy
Nokia, the world's biggest acquired Sega
Japanese video game maker,
its mobile N-Gage game
features of a cell phone, MP3-player
Nokia is the cell phone market leader
HARD-422
video game crash
HARD-443
hand-held electronics
APE20030911.0887
APE20030922.0156
5 Research Question
- Passage length is
  - document-dependent
  - query-dependent
How to detect variable-length passages?
6 Previous Work on Passage Retrieval
- Structural or semantic boundary
  - Passage is not query-specific.
- Fixed-length
  - Passage length is not query-specific.
  - Passage content may not be coherent.
- Arbitrary (MultiText)
  - Only query words are considered.
  - Heuristics are used to reduce search space.
- HMM-based
  - The method is promising, but previous work didn't fully explore its potential.
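To make the fixed-length baseline concrete, here is a minimal sketch of how a document can be cut into overlapping fixed-length windows. The function name, the window size, and the half-window overlap are illustrative assumptions, not settings reported in the talk. Note that the boundaries depend only on token positions, which is why passage length cannot adapt to the query.

```python
def fixed_length_passages(tokens, size=5, step=None):
    """Split a token list into windows of at most `size` tokens.

    `step` is the offset between window starts; it defaults to half the
    window size so that consecutive windows overlap (an assumption made
    for this sketch).
    Returns a list of (start, end, window_tokens) with `end` exclusive.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if step is None:
        step = max(1, size // 2)
    passages = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + size]
        passages.append((start, start + len(window), window))
        # Stop once a window has reached the end of the document.
        if start + size >= len(tokens):
            break
    return passages
```

Each window would then be scored against the query independently; a relevant passage that straddles two windows, or fills only part of one, is the case the variable-length approach is meant to handle.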
7 HMM-Based Method
document
8 HMM-Based Method
Q: hand-held electronics
relevant passage
document
9 HMM-Based Method
Q: hand-held electronics
relevant passage
document
Figure labels (HMM states): B R B B R R R R B B R
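The labeling idea on this slide can be sketched with Viterbi decoding over a two-state HMM. This is a simplified illustration, not the model from the talk: it reads B as a background state and R as a relevant state (an assumption), uses a single `stay` probability for both self-transitions, and takes the two unigram language models as plain dictionaries. The names `viterbi_labels`, `stay`, and `floor` are hypothetical, and the default values are not taken from the slides.

```python
import math


def viterbi_labels(tokens, query_lm, background_lm, stay=0.9, floor=1e-6):
    """Label each token 'B' or 'R' with the most likely state sequence.

    query_lm / background_lm map word -> probability; words missing from a
    model get the small `floor` probability so that logs stay finite.
    """
    if not tokens:
        return []
    states = ("B", "R")
    emit = {"B": background_lm, "R": query_lm}

    def log_trans(prev, cur):
        return math.log(stay if prev == cur else 1.0 - stay)

    def log_emit(state, word):
        return math.log(emit[state].get(word, floor))

    # Uniform start distribution over the two states.
    score = {s: math.log(0.5) + log_emit(s, tokens[0]) for s in states}
    backpointers = []
    for word in tokens[1:]:
        new_score, pointers = {}, {}
        for cur in states:
            best_prev = max(states, key=lambda p: score[p] + log_trans(p, cur))
            new_score[cur] = (score[best_prev] + log_trans(best_prev, cur)
                              + log_emit(cur, word))
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

A maximal run of R labels in the returned sequence is then a candidate passage, so its length falls out of the decoding rather than being fixed in advance.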
10 Constructing the HMM
11 Constructing the HMM
end-of-doc state
12 Constructing the HMM
end-of-doc state
0.01
0.005
smoothing achieved by transitions
0.99
13 Constructing the HMM
end-of-doc state
0.01
0.005
expanded query LM to incorporate feedback
smoothing achieved by transitions
0.99
14 Constructing the HMM
transition probabilities trained for each document
end-of-doc state
0.01
0.005
expanded query LM to incorporate feedback
smoothing achieved by transitions
0.99
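The slide states only that transition probabilities are trained for each document. One standard way to do that is Baum-Welch re-estimation with the emission models held fixed, sketched below. This is an assumption about the training procedure, not a description of the authors' implementation; the state layout is simplified (no end-of-doc state), and the 0.01, 0.005, and 0.99 values on the slide are not mapped to specific arcs here. The name `train_transitions` and all parameters are hypothetical.

```python
def train_transitions(tokens, emit_probs, trans, start, iterations=10, floor=1e-6):
    """Re-estimate transition probabilities on a single document.

    emit_probs[s] maps word -> P(word | state s) and is kept fixed.
    trans[i][j] is P(next state = j | current state = i).
    start[i] is P(first state = i).
    Returns a new transition matrix; the inputs are not modified.
    """
    n, T = len(start), len(tokens)
    if T < 2:
        return [row[:] for row in trans]
    # b[t][s]: emission probability of token t in state s, floored.
    b = [[emit_probs[s].get(w, floor) for s in range(n)] for w in tokens]
    for _ in range(iterations):
        # Forward pass with per-step scaling to avoid underflow.
        alpha, scale = [], []
        for t in range(T):
            if t == 0:
                raw = [start[j] * b[0][j] for j in range(n)]
            else:
                raw = [sum(alpha[t - 1][i] * trans[i][j] for i in range(n)) * b[t][j]
                       for j in range(n)]
            c = sum(raw)
            scale.append(c)
            alpha.append([x / c for x in raw])
        # Backward pass using the same scaling factors.
        beta = [[1.0] * n for _ in range(T)]
        for t in range(T - 2, -1, -1):
            beta[t] = [sum(trans[i][j] * b[t + 1][j] * beta[t + 1][j]
                           for j in range(n)) / scale[t + 1]
                       for i in range(n)]
        # Expected transition counts.
        counts = [[0.0] * n for _ in range(n)]
        for t in range(T - 1):
            for i in range(n):
                for j in range(n):
                    counts[i][j] += (alpha[t][i] * trans[i][j] * b[t + 1][j]
                                     * beta[t + 1][j] / scale[t + 1])
        # Normalize each row; keep the old row if a state was never visited.
        new_trans = []
        for i in range(n):
            total = sum(counts[i])
            if total > 0:
                new_trans.append([counts[i][j] / total for j in range(n)])
            else:
                new_trans.append(trans[i][:])
        trans = new_trans
    return trans
```

Because the emissions stay fixed, only the tendency to enter, stay in, or leave each state adapts to the document, which is one way passage length can become document-dependent.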
15 Passage Extension
16 Retrieval Approach 1
17 Retrieval Approach 1
ranking
18 Retrieval Approach 1
passage extraction
ranking
19 Retrieval Approach 2
20 Retrieval: Our Approach
21 Passage-Level Results
- Overall, the baseline was the best.
22 Effectiveness of HMM method
- HMM method improved performance over fixed-length passages
- Less improvement if the fixed length is closer to the optimal length
23 Diagnosis Runs
KL-divergence works poorly on passages
non-optimal parameter setting
HMM improves boundaries
24 Discussions and Conclusions
- HMM method improved the performance over fixed-length passages
- LM (KL-divergence) method gives worse performance on passage ranking than on document ranking
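For reference, the kind of KL-divergence scoring mentioned above can be sketched as follows: a passage is scored by the negative KL divergence from the query language model to a smoothed passage language model. The Dirichlet smoothing, the value of `mu`, and the function name `kl_score` are assumptions made for this sketch; the slides do not state which smoothing or parameters were used.

```python
import math
from collections import Counter


def kl_score(query_lm, passage_tokens, collection_lm, mu=100.0, floor=1e-9):
    """Return -KL(query_lm || passage_lm); higher means a closer match.

    The passage model is Dirichlet-smoothed with the collection model:
        p(w | passage) = (count(w) + mu * p(w | collection)) / (length + mu)
    Words missing from collection_lm get the small `floor` probability.
    """
    counts = Counter(passage_tokens)
    length = len(passage_tokens)
    score = 0.0
    for word, p_q in query_lm.items():
        if p_q <= 0.0:
            continue  # zero-probability query terms contribute nothing
        p_c = collection_lm.get(word, floor)
        p_d = (counts[word] + mu * p_c) / (length + mu)
        score -= p_q * math.log(p_q / p_d)
    return score
```

With this form, the shorter the passage, the more the smoothed model is dominated by the collection model for a given `mu`, so a `mu` tuned for whole documents may not transfer to passages.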
25 The End