Chunyi Peng, Zaoyang Gong, Guobin Shen

Measurement and Modeling of a Web-Based Question Answering System. Chunyi Peng, Zaoyang Gong, Guobin Shen. Microsoft Research Asia. HotWeb 2006. Firstly, we begin with a ...

Slides: 23

Provided by: chun60

Learn more at: http://web.cs.ucla.edu

Transcript and Presenter's Notes

1
MEASUREMENT AND MODELING OF A WEB-BASED QUESTION
ANSWERING SYSTEM

Chunyi Peng, Zaoyang Gong, Guobin Shen
Microsoft Research Asia
HotWeb 2006

2
Outline

A short introduction to Web-based QA systems
QA Measurement: behavior patterns over time,
topics, and users, plus incentive effects
QA Modeling
Discussion: how can it be better?

3
When you have a question

Solve it yourself! Ooh, out of our scope!
Usually, search for it! A common and good approach in
many cases, but:
Search engines typically return pages of links,
not direct answers.
Sometimes it is very difficult for people to
describe their questions precisely.
Not all information is readily available on the
Web.
So, ask! A natural and effective way.
Question-Answering (QA) utilizes grassroots
intelligence and collaboration,
especially for acquiring specific information.

4
Difference from other QA systems

Different from AI-type QA
Dates back to the 1960s: eliminate semantic ambiguity
Web as a resource for QA: search, natural
language I/O
Limited to fact-/knowledge-based questions
However, many questions are
communicative-specific
location-specific
time-specific
A different (interactive) QA system enables
grassroots intelligence and collaboration

5
So, our goals

Measurement and modeling of a real large-scale
QA system
How does a real QA system work?
What are the typical user behaviors and their
impacts?
Seek a better QA system
How to design a QA system?
How to make performance tradeoffs?

6
iAsk (http://iask.sina.com.cn)

A topic-based web-QA system
Question lifecycle
questioning -> wait for reply -> confirmation
(closed)
Provides optimal reply selection and reply rewarding

7
Measurement Results

Data Set
2-month (Nov 22, 2005 to Jan 23, 2006)
350K questions and 2M replies
220K users, 1901 topics
Measurement on
Question/reply patterns over time
Question/reply patterns over topics
Question/reply patterns across users
Question/reply incentive mechanisms

8
Behavior Pattern over Time

On an hourly scale: a consistent usage pattern

9
Behavior Pattern over Topics

Topic characteristics
P--Popularity (Q) (Zipf-Popularity)
questioning and replying activities
Q--Question Proneness (Q/U)
the likelihood that a user will ask a question
R--Reply Proneness (R/U)
the likelihood that a user will reply to a question
Our measurement shows that topic characteristics
vary widely and users behave quite
differently.
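The per-topic ratios above can be computed from a simple activity log. The following is a minimal sketch, not the authors' code: the `(topic, user, kind)` record layout and the function name are hypothetical, U is assumed to be the number of distinct users active in a topic, and popularity is assumed to be questions plus replies.

```python
from collections import defaultdict


def topic_characteristics(events):
    """Compute per-topic popularity and proneness ratios.

    events: iterable of (topic, user, kind) tuples, where kind is
    "Q" for a question or "R" for a reply (hypothetical log layout).
    Returns {topic: (popularity, question_proneness, reply_proneness)}.
    """
    questions = defaultdict(int)
    replies = defaultdict(int)
    users = defaultdict(set)
    for topic, user, kind in events:
        users[topic].add(user)
        if kind == "Q":
            questions[topic] += 1
        elif kind == "R":
            replies[topic] += 1
        else:
            raise ValueError(f"unknown event kind: {kind!r}")

    result = {}
    for topic, members in users.items():
        u = len(members)  # assumption: U = distinct active users in the topic
        q, r = questions[topic], replies[topic]
        # assumption: popularity counts both questions and replies
        result[topic] = (q + r, q / u, r / u)
    return result
```

Sorting the topics by the first element of each tuple gives the rank order needed to check for a Zipf-like popularity curve.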

10
Behavior Pattern across Users

Active and non-active users
about 9% of users account for 80% of replies vs.
about 22% of users account for 80% of questions
Asymmetric questioning/replying pattern
4.7% altruists
vs. 17.7% free-riders
Narrow user interests
topics (Q): 1.8
topics (R): 3.3
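The active versus non-active split is a concentration statistic: the smallest fraction of users whose combined activity reaches 80% of the total. A minimal sketch of that calculation follows; the function name and the per-user count list are hypothetical, and the input is assumed to include users with zero activity.

```python
def user_share_for_activity(counts, target=0.8):
    """Smallest fraction of users accounting for `target` of all activity.

    counts: per-user activity counts (e.g. replies per user),
    including users with zero activity (hypothetical input layout).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no activity recorded")
    ordered = sorted(counts, reverse=True)  # most active users first
    running = 0
    for n, count in enumerate(ordered, start=1):
        running += count
        if running >= target * total:
            return n / len(ordered)
```

Running it once on reply counts and once on question counts gives the two figures being compared on this slide.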

11
Performance Metric

Reply-Rate
how likely a user's question is to be replied to
Reply-Number
how likely a user's question is to get the expected
answer
Reply-Latency
how quickly a user can get an answer
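One way to turn these three metrics into numbers is sketched below. The exact definitions are assumptions, not taken from the slides: reply-rate is the fraction of questions with at least one reply, reply-number is the mean replies per question, and reply-latency is the mean delay to the first reply. The function name, the input layout, and the `horizon` parameter are hypothetical.

```python
def qa_performance(questions, horizon=None):
    """Return (reply_rate, reply_number, reply_latency).

    questions: list of (ask_time, reply_times) pairs, times in hours.
    horizon: if given, only replies arriving within this many hours
    of the question are counted (e.g. 24 for a within-24-hours view).
    """
    if not questions:
        raise ValueError("no questions given")
    replied = 0
    total_replies = 0
    first_delays = []
    for ask_time, reply_times in questions:
        delays = sorted(t - ask_time for t in reply_times if t >= ask_time)
        if horizon is not None:
            delays = [d for d in delays if d <= horizon]
        total_replies += len(delays)
        if delays:
            replied += 1
            first_delays.append(delays[0])  # delay to the first reply

    n = len(questions)
    reply_rate = replied / n
    reply_number = total_replies / n
    # Latency is undefined when no question received a reply.
    reply_latency = sum(first_delays) / len(first_delays) if first_delays else None
    return reply_rate, reply_number, reply_latency
```

Calling it with `horizon=None` gives a long-term view, while `horizon=24` restricts the same log to replies within the first day.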

12
iAsk performance

Long-term performance
Reply-Rate: 99.8%
Reply-Number: about 5
Reply-Latency: about 10 hr
Within 24 hrs
Reply-Rate: 85%
Reply-Number: about 4
Reply-Latency: about 6 hr
In summary, the performance is quite satisfactory,
except that users sometimes need to tolerate a relatively
long delay

13
Measurement on Incentive Mechanism
14
Modeling

The question arrival distribution: Poisson
distribution
The reply behavior: an approximate
exponentially-decaying model
=> Performance formula
Define dynamic performance
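The slide's own performance formula is not recoverable from this transcript, so the sketch below is only an illustration of how the two modeling ingredients could be combined. It assumes questions arrive as a Poisson process with a constant rate, and that replies to a single question arrive at a rate `initial_rate * exp(-t / decay_time)`. All function and parameter names are hypothetical.

```python
import math
import random


def sample_question_arrivals(rate, duration, seed=0):
    """Sample Poisson arrival times on [0, duration) with the given rate."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential inter-arrival gaps
        if t >= duration:
            return arrivals
        arrivals.append(t)


def expected_replies(t, initial_rate, decay_time):
    """Expected replies within t hours under an exponentially decaying rate.

    This is the integral of initial_rate * exp(-s / decay_time) for s in [0, t].
    """
    return initial_rate * decay_time * (1.0 - math.exp(-t / decay_time))


def reply_probability(t, initial_rate, decay_time):
    """Probability of at least one reply within t hours.

    Assumes replies form a non-homogeneous Poisson process, so the
    chance of zero replies is exp(-expected_replies).
    """
    return 1.0 - math.exp(-expected_replies(t, initial_rate, decay_time))
```

Under these assumptions the long-run number of replies per question tends to `initial_rate * decay_time`, which makes the dependence on reply probability and on how long a question stays visible explicit.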

15
Parameter Impact
16
Possible Improvement

Active or Push-based Question Delivery
Better Webpage Layout, e.g. adding shortcuts
Better Incentive mechanism
Utilize Power of Social Networks

17
Conclusions

Web-QA that leverages grassroots
intelligence and collaboration is hot and getting
hotter
Our measurement and model revealed that the QA system's
QoS depends heavily on three key factors: user
scale, user reply probability, and system design
artifacts, e.g. webpage design.
The current simple Web-QA system achieves
acceptable performance, but there is still
room for improvement

18
Backup
19
Behavior Pattern over Topics

Topic characteristics
P--Popularity (Q) (Zipf-Popularity)

20
Behavior Pattern over Topics

Topic characteristics
P--Popularity (Q), Zipf-Popularity
Q--Question Proneness (Q/U)
R-- Reply Proneness (R/U)

21
Narrow User Interest Scope
22
Reply distribution (measured)
23
Static Performance Formula
Reply-Rate Reply-Number Reply-Latency
24
Dynamic Performance Formula
Define dynamic performance We have,
