Chunyi Peng, Zaoyang Gong, Guobin Shen

MEASUREMENT AND MODELING OF A WEB-BASED QUESTION ANSWERING SYSTEM. Chunyi Peng, Zaoyang Gong, Guobin Shen, Microsoft Research Asia, HotWeb 2006. Firstly, we begin with a ...

1
MEASUREMENT AND MODELING OF A WEB-BASED QUESTION
ANSWERING SYSTEM
  • Chunyi Peng, Zaoyang Gong, Guobin Shen
  • Microsoft Research Asia
  • HotWeb 2006

2
Outline
  • A short introduction to Web-based QA systems
  • QA Measurement: behavior patterns over time,
    topics, and users, and incentive effects
  • QA Modeling
  • Discussion: how can it be better?

3
When you have a question
  • Solve it yourself! Ooh, out of our scope!
  • Usually, search for it! A common and good way in
    many cases, but
  • Search engines typically return pages of links,
    not direct answers.
  • Sometimes it is very difficult for people to
    describe their questions precisely.
  • Not all information is readily available on the
    web.
  • So, ask! A natural and effective way
  • Question-Answering (QA) utilizes grassroots
    intelligence and collaboration
  • Especially for acquiring specific information.

4
Difference from other QA systems
  • Different from AI-type QA
  • Dates back to the 1960s: eliminate semantic ambiguity
  • Web as a resource for QA: search plus natural
    language I/O
  • Limited to fact-/knowledge-based questions
  • However, many questions are
  • communicative-specific
  • location-specific
  • time-specific
  • Another (interactive) kind of QA system enables
    grassroots intelligence and collaboration

5
So, our goals
  • Measurement and modeling of a real large-scale
    QA system
  • How does a real QA system work?
  • What are the typical user behaviors and their
    impacts?
  • Seek a better QA system
  • How to design a QA system?
  • How to make performance tradeoffs?

6
iAsk (http://iask.sina.com.cn)
  • A topic-based web-QA system
  • Question lifecycle
  • questioning -> wait for reply -> confirmation
    (closed)
  • Provides optimal reply selection and reply rewarding

7
Measurement Results
  • Data Set
  • 2 months (Nov 22, 2005 to Jan 23, 2006)
  • 350K questions and 2M replies
  • 220K users, 1901 topics
  • Measurement on
  • Question/reply patterns over time
  • Question/reply patterns over topics
  • Question/reply patterns across users
  • Question/reply incentive mechanisms

8
Behavior Pattern over Time
  • On an hourly scale: a consistent usage pattern
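
An hourly usage pattern of this kind is typically obtained by bucketing question or reply timestamps by hour of day. The following is a minimal sketch of that bucketing; the function name and the sample timestamps are hypothetical and are not taken from the iAsk data set.

```python
from collections import Counter
from datetime import datetime


def hourly_histogram(timestamps):
    """Count events per hour of day (0-23), aggregated over all days."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical sample timestamps, for illustration only.
sample = [
    datetime(2005, 11, 22, 9, 15),
    datetime(2005, 11, 22, 9, 40),
    datetime(2005, 11, 23, 21, 5),
]
hist = hourly_histogram(sample)
for hour, count in enumerate(hist):
    if count:
        print(f"{hour:02d}:00  {count}")
```

Comparing the 24-bin histograms of different days or weeks is one simple way to check whether the pattern is consistent.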

9
Behavior Pattern over Topics
  • Topic characteristics
  • P--Popularity (Q) (Zipf-Popularity)
  • questioning and replying activities
  • Q--Question Proneness (Q/U)
  • the likelihood that a user will ask a question
  • R--Reply Proneness (R/U)
  • the likelihood that a user will reply to a question
  • Our measurement shows that topic characteristics
    vary widely and users behave quite
    differently.
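
The three topic characteristics can be computed from per-topic counts. The sketch below assumes that U denotes the number of users active in a topic, which the slides do not state explicitly; the `TopicStats` type, the function names, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TopicStats:
    questions: int  # Q: questions posted in the topic
    replies: int    # R: replies posted in the topic
    users: int      # U: users active in the topic (assumed meaning of U)


def proneness(stats):
    """Return (question proneness Q/U, reply proneness R/U) for one topic."""
    if stats.users == 0:
        return 0.0, 0.0
    return stats.questions / stats.users, stats.replies / stats.users


def rank_by_popularity(topics):
    """Topic names sorted by question count, most popular first."""
    return sorted(topics, key=lambda name: topics[name].questions, reverse=True)


# Hypothetical topics, for illustration only.
topics = {
    "games": TopicStats(questions=100, replies=400, users=50),
    "health": TopicStats(questions=300, replies=600, users=200),
}
for name in rank_by_popularity(topics):
    q_prone, r_prone = proneness(topics[name])
    print(name, q_prone, r_prone)
```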

10
Behavior Pattern across Users
  • Active and non-active users
  • about 9% of users account for 80% of replies, vs.
  • about 22% of users account for 80% of questions
  • Asymmetric questioning/replying pattern
  • 4.7% altruists
  • vs. 17.7% free-riders
  • Narrow user interests
  • topics (Q): 1.8
  • topics (R): 3.3
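
Concentration figures of this kind (a small share of users producing most of the replies or questions) can be derived from per-user activity counts. This is a minimal sketch, not the authors' procedure; the function name and the sample counts are hypothetical.

```python
def user_share_for_activity(counts, target=0.8):
    """Smallest fraction of users whose combined activity reaches
    `target` of the total, taking the most active users first."""
    total = sum(counts)
    if total == 0:
        return 0.0
    running = 0
    for i, count in enumerate(sorted(counts, reverse=True), start=1):
        running += count
        if running >= target * total:
            return i / len(counts)
    return 1.0


# Hypothetical replies-per-user counts for ten users.
replies_per_user = [60, 25, 5, 4, 3, 1, 1, 1, 0, 0]
print(user_share_for_activity(replies_per_user))
```

Running the same function on questions-per-user counts gives the corresponding figure for questioning, so the two shares can be compared directly.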

11
Performance Metric
  • Reply-Rate
  • how likely a user's question is to receive a reply
  • Reply-Number
  • how likely a user's question is to get the expected
    answer
  • Reply-Latency
  • how quickly a user can get an answer
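
The slides define these metrics only informally. The sketch below picks one plausible operationalization, and each choice is an assumption: Reply-Rate is the fraction of questions with at least one reply, Reply-Number is the mean number of replies per question, and Reply-Latency is the mean time from a question to its first reply. The log format and the sample values are hypothetical.

```python
from statistics import mean


def performance(questions):
    """questions: list of (ask_time, reply_times), with times in hours.

    Returns (reply_rate, reply_number, reply_latency) under the
    assumed definitions described above.
    """
    if not questions:
        return 0.0, 0.0, None
    replied = [(asked, replies) for asked, replies in questions if replies]
    reply_rate = len(replied) / len(questions)
    reply_number = mean(len(replies) for _, replies in questions)
    # Latency is only defined for questions that received a reply.
    reply_latency = (
        mean(min(replies) - asked for asked, replies in replied)
        if replied else None
    )
    return reply_rate, reply_number, reply_latency


# Hypothetical question log: (ask time, reply times), in hours.
log = [(0.0, [1.0, 3.0]), (2.0, [8.0]), (5.0, [])]
rate, number, latency = performance(log)
print(rate, number, latency)
```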

12
iAsk performance
  • Long-term performance
  • Reply-Rate: 99.8%
  • Reply-Number: about 5
  • Reply-Latency: about 10 hr
  • Within 24 hr
  • Reply-Rate: 85%
  • Reply-Number: about 4
  • Reply-Latency: about 6 hr
  • In summary, the performance is quite satisfactory,
    except that users sometimes need to tolerate a
    relatively long delay

13
Measurement on Incentive Mechanism
14
Modeling
  • Question arrival distribution: Poisson
    distribution
  • Reply behavior: an approximately
    exponentially decaying model
  • => Performance formula
  • Define dynamic performance
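
The two modeling ingredients named above can be illustrated with a small sketch. It is not the authors' performance formula, which this transcript does not preserve. It assumes that questions arrive as a homogeneous Poisson process and that replies to a single question arrive as a Poisson process whose intensity decays as lam0 * exp(-t / tau). The parameters `rate`, `lam0`, and `tau` and all numeric values are hypothetical.

```python
import math
import random


def poisson_arrivals(rate, horizon, rng):
    """Arrival times of a homogeneous Poisson process on [0, horizon)."""
    times = []
    t = rng.expovariate(rate)  # exponential inter-arrival gaps
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def expected_replies(lam0, tau, window):
    """Integral of lam0 * exp(-t / tau) over [0, window]."""
    return lam0 * tau * (1.0 - math.exp(-window / tau))


def reply_probability(lam0, tau, window):
    """P(at least one reply within `window`) for a Poisson reply
    process with the decaying intensity assumed above."""
    return 1.0 - math.exp(-expected_replies(lam0, tau, window))


rng = random.Random(0)
arrivals = poisson_arrivals(rate=3.0, horizon=24.0, rng=rng)
print(len(arrivals), "simulated questions in 24 hours")
for window in (1.0, 6.0, 24.0):
    print(window,
          expected_replies(0.5, 4.0, window),
          reply_probability(0.5, 4.0, window))
```

Under these assumptions both quantities grow with the observation window and saturate, which matches the qualitative gap between the 24-hour and long-term figures reported on the performance slide.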

15
Parameter Impact
16
Possible Improvement
  • Active or Push-based Question Delivery
  • Better Webpage Layout, e.g. adding shortcuts
  • Better Incentive mechanism
  • Utilize Power of Social Networks

17
Conclusions
  • Web-QA that leverages grassroots
    intelligence and collaboration is hot and getting
    hotter
  • Our measurement and model revealed that the QA
    system's QoS depends heavily on three key factors:
    user scale, user reply probability, and system
    design artifacts, e.g. webpage design.
  • The current simple Web-QA system achieves
    acceptable performance, but there is still
    room for improvement

18
Backup
19
Behavior Pattern over Topics
  • Topic characteristics
  • P--Popularity (Q) (Zipf-Popularity)
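
A Zipf-like popularity claim is commonly checked by fitting a straight line to the rank-frequency curve in log-log space. The sketch below uses an ordinary least-squares fit, which is a generic technique and not necessarily the one used for this slide; the function name and the synthetic counts are hypothetical.

```python
import math


def zipf_exponent(counts):
    """Estimate s in count ~ C / rank**s via least squares on
    (log rank, log count). Needs at least two positive counts."""
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive counts")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # slope is negative, so the exponent is its negation


# Synthetic question counts that follow an exact 1/rank law.
synthetic = [1000.0 / rank for rank in range(1, 51)]
print(zipf_exponent(synthetic))
```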

20
Behavior Pattern over Topics
  • Topic characteristics
  • P--Popularity (Q), Zipf-Popularity
  • Q--Question Proneness (Q/U)
  • R-- Reply Proneness (R/U)

21
Narrow User Interest Scope
22
Reply distribution (measured)
23
Static Performance Formula
Reply-Rate, Reply-Number, Reply-Latency (formulas not preserved in this transcript)
24
Dynamic Performance Formula
Define dynamic performance. We have, (formula not preserved in this transcript)