Finding Question-Answer Pairs from Online Forums

1
Finding Question-Answer Pairs from Online Forums
  • ACM SIGIR 2008

Gao Cong, Aalborg University, Aalborg, Denmark
Long Wang, Tianjin University, Tianjin, China
Chin-Yew Lin, Microsoft Research Asia, Beijing, China
Young-In Song, Korea University, Seoul, South Korea
Yueheng Sun, Tianjin University, Tianjin, China
2
Introduction
  • Yahoo! Answers.
  • Forums contain a huge amount of valuable user
    generated content on a variety of topics.
  • Goal: find question-answer pairs in forums.

3
Algorithms
  • Question Detection
  • 5W1H
  • Most questions do not begin with a 5W1H word.
  • Question Mark
  • 30% of questions do not end with a question mark.
  • Example: "I am wondering where I can buy cheap and good
    clothing in Beijing."
  • Labeled Sequential Pattern (LSP)
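
The two simple baselines named on this slide can be sketched as below. This is only an illustration of why they miss the example sentence; the function names and the tokenization are assumptions, and the LSP-based detector itself is not shown.

```python
import re

# The six 5W1H question words.
FIVE_W_ONE_H = {"what", "when", "where", "who", "why", "how"}


def starts_with_5w1h(sentence: str) -> bool:
    """Baseline 1: the first word of the sentence is a 5W1H word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return bool(words) and words[0] in FIVE_W_ONE_H


def ends_with_question_mark(sentence: str) -> bool:
    """Baseline 2: the sentence ends with a question mark."""
    return sentence.rstrip().endswith("?")


# The slide's example is a question, yet both baselines reject it.
example = "I am wondering where I can buy cheap and good clothing in Beijing."
print(starts_with_5w1h(example), ends_with_question_mark(example))
```

Both checks return False for the example, which is the kind of case that motivates a learned pattern-based detector.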

4
Graph-based propagation method
  • Building Graph
  • Given a question q and the set A_q of its
    candidate answers.
  • For two candidate answers a1 and a2, compute KL(a1 || a2).
  • If 1/(1 + KL(a1 || a2)) is larger than a threshold θ,
    then add an edge from a1 to a2.
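
A minimal sketch of the graph-building step, assuming each candidate answer is modelled as a unigram distribution with add-alpha smoothing over the shared vocabulary. The smoothing scheme, the whitespace tokenization, the function names, and the threshold value used in the example are illustrative assumptions, not taken from the slides.

```python
import math
from collections import Counter


def unigram_model(text, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a shared vocabulary (assumed smoothing)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def build_graph(answers, threshold):
    """Return directed edges (i, j, sim) where sim = 1 / (1 + KL(a_i || a_j)) > threshold."""
    vocab = sorted({w for a in answers for w in a.lower().split()})
    models = [unigram_model(a, vocab) for a in answers]
    edges = []
    for i, p in enumerate(models):
        for j, q in enumerate(models):
            if i == j:
                continue
            sim = 1.0 / (1.0 + kl_divergence(p, q))
            if sim > threshold:
                edges.append((i, j, sim))
    return edges


answers = [
    "cheap clothes at the silk market",
    "cheap clothes at the silk market",
    "take the train to the airport",
]
edges = build_graph(answers, threshold=0.9)  # 0.9 is an arbitrary demo value
```

Because KL-divergence is asymmetric, the edge from a1 to a2 and the edge from a2 to a1 are tested separately, so the resulting graph is directed.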

5
Graph-based propagation method
  • Edge Weight
  • Normalized
  • ?0.01

6
Computing Propagated Scores
  • Propagation without initial score
  • Propagation with initial score

7
Answer Detection
  • score(q,a)
  • Cosine Similarity.
  • Query likelihood language model.
  • KL-divergence language model.
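
As a rough illustration of two of the score(q,a) options listed above, the sketch below computes cosine similarity on raw term-frequency vectors and a query log-likelihood under an add-alpha smoothed unigram model of the answer. The term weighting, the smoothing, and the `vocab_size` parameter are assumptions made for the example; the slides do not specify them.

```python
import math
from collections import Counter


def cosine_similarity(question, answer):
    """Cosine between raw term-frequency vectors of the two texts."""
    q = Counter(question.lower().split())
    a = Counter(answer.lower().split())
    dot = sum(q[w] * a[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in a.values()))
    return dot / norm if norm else 0.0


def query_log_likelihood(question, answer, vocab_size, alpha=1.0):
    """log P(question | answer) under an add-alpha smoothed unigram model (assumed smoothing)."""
    a = Counter(answer.lower().split())
    total = sum(a.values()) + alpha * vocab_size
    return sum(math.log((a[w] + alpha) / total) for w in question.lower().split())


question = "cheap clothing beijing"
on_topic = "silk market in beijing has cheap clothing"
off_topic = "the train leaves at noon"
print(cosine_similarity(question, on_topic) > cosine_similarity(question, off_topic))
```

Either function can serve as score(q,a) for ranking the candidate answers of a question; higher values mean a closer match.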

8
Experiment
  • Data
  • Three forums of different scales were selected as
    data sources.
  • Two annotators
  • The kappa statistic for identifying questions is
    0.96.
  • The kappa statistic for linking answers to a
    given question is 0.69.

9
Experiment
  • Q-TInter: the intersection of the two annotators' labels.

10
Experiment
  • 1,535 questions from 600 threads; 284 of these
    questions have no answers.

11
Experiment
  • Improved results on subsets
  • Of the 486 first questions, only 21 have no
    answers in the A-TUnion data and 45 in the
    A-TInter data.

12
Experiment
  • G_K: edge weights computed with KL-divergence alone.
  • G_1: propagation without initial score.
  • G_2: propagation with initial score.

13
Experiment
  • Data from three forums:
  • Tripadvisor, Lonely Planet, Bootsnall.

14
  • Thank You.

15
Experiment