Title: Study Goal
2 Study Goal
- Measure the effect that different labels
representing clusters of news documents have on
users browsing collections of news stories.
Michael Cole Measuring label effect 3.22.05
3 Practical Benefit
- The practical benefit of this study is to
validate an instrument for ranking the performance
of automated labeling algorithms.
- Better-performing algorithms produce more
effective labels. If high-performance labeling
algorithms can be identified, the effectiveness
of browsers for very large document collections
may be significantly improved.
Michael Cole Learning cluster labels 12.11.04
4 Basic Approach
- Vary the labels while the other representational
properties are held constant.
5 Interface
7 Labeling is a Representation Problem
8 Theory
- The user recognizes some semantic similarity
between the label and a description of their task
(Polson and Lewis, 1990).
- Here the task understanding includes the user's
general knowledge of words that are germane to
the interest that drives the browsing experience.
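The idea of semantic similarity between a label and a task description can be illustrated with a small sketch. It uses plain bag-of-words cosine similarity as a stand-in; this is an assumption made for illustration, not the measure used in the study, and the example strings are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two short texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Only words shared by both texts contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0

# Hypothetical task description and candidate labels.
task = "find news about the pinochet trial"
print(cosine_similarity("pinochet trial", task))
print(cosine_similarity("stock market", task))
```

A label sharing words with the task description scores higher than one sharing none. Word overlap is a crude proxy: it misses synonyms and related terms that a user would still recognize.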
9 Tasks
- Difficult problem
- Browsing tasks are all quite similar in that they
rely on a user's interest in a topic area, so
browsing for information on sports is a similar
task to browsing for business information.
- Yet the user's familiarity with the topic
is likely to affect the number of
words that might be recognized as semantically
similar.
10 Tasks
The goal, however, is to find good labeling
algorithms across all, or many, classes of users,
so familiarity is ignored in evaluating the main
results of this study. A pretest questionnaire
on general news familiarity is
administered because it may be useful in
interpreting the results later.
11 Learning Effects
Since only the labels are varied, a subject is
likely to remember the labels from the previous
treatment. To mitigate this learning effect, a
Latin square design is used to assign the
treatments (Tague-Sutcliffe, 1997).
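A Latin square assignment can be sketched as follows. This is a minimal illustration assuming three label conditions and a simple cyclic square; the condition names and subject IDs are hypothetical, and the study's actual assignment procedure is not given in the slides.

```python
def latin_square(treatments):
    """Cyclic Latin square: row i is the treatment list rotated left by i."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)]
            for row in range(n)]

def assign_orders(subject_ids, treatments):
    """Give each subject one row of the square, cycling through the rows."""
    square = latin_square(treatments)
    return {sid: square[i % len(square)] for i, sid in enumerate(subject_ids)}

# Hypothetical label conditions and subjects.
labels = ["human", "algorithm_A", "algorithm_B"]
orders = assign_orders(["s1", "s2", "s3", "s4", "s5", "s6"], labels)
for sid, order in orders.items():
    print(sid, order)
```

Each condition appears once in every position across the rows of the square, so order effects are balanced over subjects when the number of subjects is a multiple of the number of conditions.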
12 Measurement
- Label effectiveness is measured by whether the
user selects the clusters that contain, in
aggregate, the highest percentage of relevant
documents obtainable under the test procedure
(selecting three clusters).
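One way to compute such a measure is sketched below. It assumes the score is the number of relevant documents in the selected clusters divided by the best number any three clusters could yield; that normalization, the function names, and the toy data are illustrative assumptions, not the study's code.

```python
from itertools import combinations

def relevant_covered(selected, cluster_docs, relevant):
    """Count relevant documents contained in the selected clusters."""
    covered = set()
    for cluster_id in selected:
        covered |= cluster_docs[cluster_id]
    return len(covered & relevant)

def best_coverage(cluster_docs, relevant, k=3):
    """Best count achievable by any k clusters (needs at least k clusters)."""
    return max(relevant_covered(combo, cluster_docs, relevant)
               for combo in combinations(cluster_docs, k))

def selection_score(selected, cluster_docs, relevant, k=3):
    """Selected coverage as a fraction of the best achievable coverage."""
    best = best_coverage(cluster_docs, relevant, k)
    return relevant_covered(selected, cluster_docs, relevant) / best if best else 0.0

# Toy data: cluster id -> document ids, plus the relevant set for one task.
cluster_docs = {"c1": {1, 2, 3}, "c2": {3, 4}, "c3": {5}, "c4": {6, 7}, "c5": {8}}
relevant = {1, 2, 4, 6, 9}
print(selection_score(["c1", "c2", "c3"], cluster_docs, relevant))
```

The exhaustive search over cluster triples is fine for a small interface; with many clusters a greedy approximation would be needed.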
13 Linear Model
- A linear model is adopted for this study. The
rationale is the assumption that
scanning the interface and selecting cluster
representations are independent of one another.
- No sequential process takes place.
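The slides do not give the model itself. A common additive form for a Latin square experiment, offered here only as an assumed illustration, is:

```latex
% Assumed form; the symbols are not from the slides.
% y_{ijk}: effectiveness score for subject i, in session j, with label set k
y_{ijk} = \mu + \alpha_i + \beta_j + \tau_k + \varepsilon_{ijk}
```

Here \mu is the overall mean, \alpha_i the subject effect, \beta_j the session (order) effect, \tau_k the label-set effect of interest, and \varepsilon_{ijk} the error term. The effects enter additively, with no interaction terms.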
14 Cluster Labeling
- Not much previous work
- All of the work has concentrated on creating
labels strictly from the cluster contents.
- Lin, Chen, and Nunamaker (1999)
- Schweighofer, Rauber, and Dittenbach (2001)
- Popescul and Ungar (2000)
15 Corpus
- NIST TDT-3 collection
- More than 37,000 English documents from six news
sources, including newswires and transcriptions of
radio and TV broadcasts.
- 113 human-coded topic clusters with more than
7,400 stories
- About 8,000 documents explicitly coded as negative
examples for some topic.
16 Corpus Cluster Example
0003. Pinochet Trial
Seminal Event
WHAT: Pinochet, who ruled Chile from 1973-1990, is
arrested on charges of genocide and torture
during his reign.
WHO: Former Chilean dictator General Augusto
Pinochet; Judge Baltasar Garzon ("Superjudge")
WHERE: Pinochet is arrested and held in London,
then later extradited to Spain.
WHEN: The arrest occurs on 10/16/98; court
negotiations last the rest of the year.
Topic Explication: Pinochet was arrested in a London
hospital on a warrant issued by Spanish Judge
Baltasar Garzon. Pinochet appealed his arrest
and a London court agreed, but the decision was
overturned by Britain's highest court. After
much legal wrangling over the site of the trial,
the British Courts ruled that Spain should
proceed with the extradition request; Pinochet
continues to fight it.
On topic: stories covering any angle of the legal
process surrounding this trial (including
Pinochet's initial arrest in October, his appeals,
British Court rulings, reactions of world leaders
and Chilean citizens to the trial, etc.). Stories
about Pinochet's reign or legacy are not on topic
unless they explicitly discuss this trial.
Rule of Interpretation: Rule 3: Legal/Criminal Cases
17 Popescul and Ungar (2000)
- Chi-square test for common terms
- 1. Generate a tree with bags of words at each
node.
- 2. Calculate the confidence that a term in a root
has a similar frequency in all subtree nodes; if
so, it has no distinguishing power.
- Frequent and predictive words
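The chi-square step can be sketched as a Pearson test on a term's counts across the child nodes. This is a minimal illustration, not the authors' implementation: the function names, the toy counts, and the 0.05 significance level are assumptions.

```python
def chi_square_term(term_counts, total_counts):
    """Pearson chi-square statistic for one term across child nodes.

    term_counts[i] is the term's count in child i; total_counts[i] is the
    total number of word tokens in child i.
    """
    grand_term = sum(term_counts)
    grand_total = sum(total_counts)
    grand_other = grand_total - grand_term
    if grand_term == 0 or grand_other == 0:
        return 0.0  # term absent, or nothing but this term: no contrast
    stat = 0.0
    for t, n in zip(term_counts, total_counts):
        expected_term = n * grand_term / grand_total
        expected_other = n * grand_other / grand_total
        stat += (t - expected_term) ** 2 / expected_term
        stat += ((n - t) - expected_other) ** 2 / expected_other
    return stat

def is_common_term(term_counts, total_counts, critical=3.841):
    """True if equal frequency across children cannot be rejected.

    The default 3.841 is the 0.05 critical value for 1 degree of freedom
    (two children); pass another value for more children or another level.
    """
    return chi_square_term(term_counts, total_counts) < critical

print(is_common_term([10, 10], [100, 100]))  # evenly spread term
print(is_common_term([30, 2], [100, 100]))   # term concentrated in one child
```

A term that passes the test is spread evenly over the subtree, so it describes the parent node but does not help tell the children apart.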
18 Popescul and Ungar results
FP: frequent predictive product
least/most FP: blend of highest and lowest FP scores
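The FP score can be sketched as the product of how frequent a word is in the cluster, p(w|c), and how predictive it is of the cluster, p(w|c)/p(w). The slide only names the product, so this exact form is an assumption, and the token lists below are toy data.

```python
from collections import Counter

def fp_scores(cluster_tokens, collection_tokens):
    """Score each cluster word by p(w|c) * (p(w|c) / p(w)).

    collection_tokens must include the cluster's own tokens so that
    p(w) is nonzero for every word in the cluster.
    """
    cluster = Counter(cluster_tokens)
    collection = Counter(collection_tokens)
    n_cluster = sum(cluster.values())
    n_all = sum(collection.values())
    scores = {}
    for word, count in cluster.items():
        p_w_given_c = count / n_cluster
        p_w = collection[word] / n_all
        scores[word] = p_w_given_c * (p_w_given_c / p_w)
    return scores

def top_labels(scores, k=3):
    """Return the k highest-scoring words as candidate labels."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data: one cluster plus the rest of a tiny collection.
cluster = ["pinochet", "arrest", "london", "pinochet", "the", "the"]
collection = cluster + ["the"] * 10 + ["market", "london", "game"]
print(top_labels(fp_scores(cluster, collection)))
```

Words that are frequent everywhere, such as "the", score low because their predictiveness ratio is small even when their in-cluster frequency is high.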
19 Subjects and Procedure
- Drawn from a convenience sample of undergraduate
and graduate students at a large Northeastern
university.
- Fill out pre-test questionnaire.
- Perform assigned interface tasks.
- Fill out intermediate questionnaire asking about
experience with interface.
- Repeat with another interface and task set
(twice).
20 Measuring the Effect of Labels
- Hypothesis 1: Automatically generated labels will
be associated with reduced usability and
effectiveness compared to 'gold standard' labels.
- Hypothesis 2: Different labeling algorithms will
be associated with differences in browsing
effectiveness.
21 Results
22 Results
23 Results
24 Results
25 ANOVA
26 Conclusions
The human-assigned labels that defined the topic
clusters were clearly the most effective
labels. The other two label sets were clearly
distinguishable from the human-assigned labels
and from each other. The instrument is a
promising approach to providing an objective
ranking of the quality of labels and, therefore,
of automated labeling algorithms.
27 References
Popescul, A., and Ungar, L. (2000). Automatic
Labeling of Document Clusters. Unpublished;
retrieved September 23, 2004 from
http://www.citeseer.com/
Tague-Sutcliffe, J. (1997). The Pragmatics of
Information Retrieval Experimentation, Revisited.
In Readings in Information Retrieval, ed. Karen
Sparck Jones and Peter Willett, 205-216. San
Francisco, CA: Morgan Kaufmann. Originally
published in Information Processing & Management
28 (1992), 467-490.
Soto, R. (1999). Learning and performing by
exploration: label quality measured by latent
semantic analysis. Proceedings of the SIGCHI
Conference on Human Factors in Computing Systems:
The CHI is the Limit, 418-425, May 15-20, 1999,
Pittsburgh, Pennsylvania, United States.