1
Usability of Grouping of Retrieval Results
  • Marti Hearst
  • School of Information, UC Berkeley
  • September 1, 2006

2
The Need to Group
  • Interviews with lay users often reveal a desire
    for better organization of retrieval results
  • Useful for suggesting where to look next
  • People prefer links over generating search terms
  • But only when the links are for what they want

Ojakaar and Spool, Users Continue After Category
Links, UIETips Newsletter,
http://world.std.com/uieweb/Articles/, 2001
7
Conundrum
  • Everyone complains about disorganized search
    results.
  • There are lots of ideas about how to organize
    them.
  • Why don't the major search engines do so?
  • What works? What doesn't?

8
Different Types of Grouping
  • Clusters (document-similarity based; polythetic):
    Scatter/Gather, Grouper
  • Keyword Sharing (any doc with the keyword is in
    the group; monothetic): Findex, DisCover
  • Single Category: SWISH, DynaCat
  • Multiple (Faceted) Categories: Flamenco,
    Phlat/Stuff I've Seen

Monothetic vs. Polythetic: after Kummamuru et al.,
2004
9
Clusters
  • Fully automated
  • Potential benefits
  • Find the main themes in a set of documents
  • Potentially useful if the user wants a summary of
    the main themes in the subcollection
  • Potentially harmful if the user is interested in
    less dominant themes
  • More flexible than pre-defined categories
  • There may be important themes that have not been
    anticipated
  • Disambiguate ambiguous terms (e.g., ACL)
  • Clustering retrieved documents tends to group
    those relevant to a complex query together

Hearst, Pedersen, Revisiting the Cluster
Hypothesis, SIGIR96
10
Categories
  • Human-created
  • But often automatically assigned to items
  • Arranged in hierarchy, network, or facets
  • Can assign multiple categories to items
  • Or place items within categories
  • Usually restricted to a fixed set
  • So help reduce the space of concepts
  • Intended to be readily understandable
  • To those who know the underlying domain
  • Provide a novice with a conceptual structure
  • There are many already made up!

11
Cluster-based Grouping
  • Document Self-similarity
  • (Polythetic)

12
Scatter/Gather Clustering
  • Developed at PARC in the late 80s/early 90s
  • Top-down approach
  • Start with k seeds (documents) to represent k
    clusters
  • Each document is assigned to the cluster with the
    most similar seed
  • To choose the seeds
  • Cluster in a bottom-up manner
  • Hierarchical agglomerative clustering
  • Can recluster a cluster to produce a hierarchy of
    clusters

Pedersen, Cutting, Karger, Tukey, Scatter/Gather:
A Cluster-based Approach to Browsing Large
Document Collections, SIGIR 1992
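The seed-then-assign steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PARC implementation: bag-of-words vectors, cosine similarity, centroid-based merging, and the use of the whole document set as the sample for seed selection are all choices made here for brevity, and the function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one document (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Sum of member vectors; cosine ignores scale, so no averaging needed."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def choose_seeds(sample, k):
    """Bottom-up step: merge the two most similar groups until k remain."""
    groups = [[v] for v in sample]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                sim = cosine(centroid(groups[i]), centroid(groups[j]))
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        _, i, j = best
        groups[i].extend(groups.pop(j))  # j > i, so index i stays valid
    return [centroid(g) for g in groups]


def scatter(docs, k):
    """Top-down step: assign every document to its most similar seed."""
    vectors = [vectorize(d) for d in docs]
    seeds = choose_seeds(vectors, k)  # assumption: all docs serve as the sample
    clusters = [[] for _ in seeds]
    for doc, vec in zip(docs, vectors):
        best = max(range(len(seeds)), key=lambda c: cosine(vec, seeds[c]))
        clusters[best].append(doc)
    return clusters
```

Calling scatter() again on the documents of one returned cluster reclusters it, which is how a hierarchy of clusters would be produced in this sketch.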
13
The Scatter/Gather Interface
14
Two Queries: Two Clusterings
AUTO, CAR, ELECTRIC
   8  control drive accident
  25  battery california technology
  48  import j. rate honda toyota
  16  export international unit japan
   3  service employee automatic
AUTO, CAR, SAFETY
   6  control inventory integrate
  10  investigation washington
  12  study fuel death bag air
  61  sale domestic truck import
  11  japan export defect unite
The main differences are the clusters that are
central to the query
15
Scatter/Gather Evaluations
  • Can be slower to find answers than linear search!
  • Difficult to understand the clusters.
  • There is no consistency in results.
  • However, the clusters do group relevant documents
    together.
  • Participants noted that clusters were useful for
    eliminating irrelevant groups.

16
S/G Example: query on "star"
  • Encyclopedia text
  • 14 sports
  • 8 symbols
  • 47 film, tv
  • 68 film, tv (p)
  • 7 music
  • 97 astrophysics
  • 67 astronomy (p)
  • 12 stellar phenomena
  • 10 flora/fauna
  • 49 galaxies, stars
  • 29 constellations
  • 7 miscellaneous
  • Clustering and re-clustering is entirely
    automated

19
S/G Example: query on "star"
  • Newspaper/Magazine text
  • 22 products / business
  • 41 software / computers
  • 35 hollywood
  • 58 restaurants / food (reviews)
  • 54 astronomers/movies
  • 98 movies / tv (reviews)
  • 9 film mini-reviews
  • 31 wall street / finance
  • Topics quite different from encyclopedia text!

20
Visualizing Clustering Results
  • Use clustering to map the entire huge
    multidimensional document space into a huge
    number of small clusters.
  • Use dimension reduction and then project these
    onto a 2D/3D graphical representation

21
Clustering Visualizations (image from Wise et al.
95)
22
Clustering Visualizations (image from Wise et al.
95)
23
Are visual clusters useful?
  • Four Clustering Visualization Usability Studies

24
Clustering for Search: Study 1
  • This study compared
  • a system with 2D graphical clusters
  • a system with 3D graphical clusters
  • a system that shows textual clusters
  • Novice users
  • Only textual clusters were helpful (and they were
    difficult to use well)

Kleiboemer, Lazear, and Pedersen. Tailoring a
retrieval system for naive users. SDAIR 96
25
Clustering Study 2: Kohonen Feature Maps, Chen
et al.
  • Comparison: Kohonen Map and Yahoo
  • Task
  • Window shop for interesting home page
  • Repeat with other interface
  • Results
  • Starting with map: could repeat in Yahoo (8/11)
  • Starting with Yahoo: unable to repeat in map (2/14)

Chen, Houston, Sewell, Schatz, Internet Browsing
and Searching: User Evaluations of Category Map
and Concept Space Techniques. JASIS 49(7):
582-603 (1998)
26
Kohonen Feature Maps (Lin 92, Chen et al. 97)
27
Study 2 (cont.), Chen et al.
  • Participants liked
  • Correspondence of region size to documents
  • Overview (but also wanted zoom)
  • Ease of jumping from one topic to another
  • Multiple routes to topics
  • Use of category and subcategory labels

Chen, Houston, Sewell, Schatz, Internet Browsing
and Searching: User Evaluations of Category Map
and Concept Space Techniques. JASIS 49(7):
582-603 (1998)
28
Study 2 (cont.), Chen et al.
  • Participants wanted
  • hierarchical organization
  • other ordering of concepts (alphabetical)
  • integration of browsing and search
  • correspondence of color to meaning
  • more meaningful labels
  • labels at same level of abstraction
  • fit more labels in the given space
  • combined keyword and category search
  • multiple category assignment (sports + entertainment)
  • (These can all be addressed with faceted
    categories)

Chen, Houston, Sewell, Schatz, Internet Browsing
and Searching: User Evaluations of Category Map
and Concept Space Techniques. JASIS 49(7):
582-603 (1998)
29
Clustering Study 3: Sebrechts et al.
  • Each rectangle is a cluster. Larger clusters
    closer to the pole. Similar clusters near one
    another. Opening a cluster causes a projection
    that shows the titles.

30
Study 3, Sebrechts et al.
  • This study compared
  • 3D graphical clusters
  • 2D graphical clusters
  • textual clusters
  • 15 participants, between-subject design
  • Tasks
  • Locate a particular document
  • Locate and mark a particular document
  • Locate a previously marked document
  • Locate all clusters that discuss some topic
  • List more frequently represented topics

Visualization of search results: a comparative
evaluation of text, 2D, and 3D interfaces.
Sebrechts, Cugini, Laskowski, Vasilakis and
Miller, SIGIR 99.
31
Study 3, Sebrechts et al.
  • Results (time to locate targets)
  • Text clusters fastest
  • 2D next
  • 3D last
  • With practice (6 sessions) 2D neared text
    results; 3D still slower
  • Computer experts were just as fast with 3D
  • Certain tasks equally fast with 2D and text
  • Find particular cluster
  • Find an already-marked document
  • But anything involving text (e.g., find title)
    much faster with text.
  • Spatial location rotated, so users lost context
  • Helpful viz features
  • Color coding (helped text too)
  • Relative vertical locations

32
Clustering Study 4
  • Compared several factors
  • Findings
  • Topic effects dominate (this is a common finding)
  • Strong difference in results based on spatial
    ability
  • No difference between librarians and other people
  • No evidence of usefulness for the cluster
    visualization

Aspect windows, 3-D visualizations, and indirect
comparisons of information retrieval systems,
Swan, Allan, SIGIR 1998.
33
Summary: Visualizing for Search Using Clusters
  • Huge 2D maps may be an inappropriate focus for
    information retrieval
  • cannot see what the documents are about
  • space is difficult to browse for IR purposes
  • (tough to visualize abstract concepts)
  • Perhaps more suited for pattern discovery and
    gist-like overviews.

34
Clustering Algorithm Problems
  • Doesn't work well if data is too homogeneous or
    too heterogeneous
  • Often is difficult to interpret quickly
  • Automatically generated labels are unintuitive
    and occur at different levels of description
  • Often the top-level can be ok, but the subsequent
    levels are very poor
  • Need a better way to handle items that fall into
    more than one cluster

35
Term-based Grouping
  • Single Term from Document Characterizes the Group
  • (Monothetic)

36
Findex, Kaki & Aula
  • Two innovations
  • Used very simple method to create the groupings,
    so that it is not opaque to users
  • Based on frequent keywords
  • Doc is in category if it contains the keyword
  • Allows docs to appear in multiple categories
  • Did a naturalistic, longitudinal study of use
  • Analyzed the results in interesting ways
  • Kaki and Aula, Findex: Search Result Categories
    Help Users when Document Ranking Fails, CHI 05
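The keyword rule described above (a doc is in a category if it contains the keyword, and may appear in several categories) is simple enough to sketch directly. The following is a rough illustration, not the Findex code: the tokenizer, the tiny stopword list, the single-word labels, and the function names are assumptions made here; only the default of 15 categories comes from the study.

```python
import re
from collections import Counter

# Illustrative stopword list only; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokens(text):
    """Distinct lowercase words in a result snippet, minus stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def keyword_categories(results, query, max_categories=15):
    """Map each frequent keyword to the ranked results that contain it."""
    query_terms = tokens(query)
    doc_terms = [tokens(r) for r in results]
    # Document frequency: in how many results does each keyword occur?
    df = Counter(
        t for terms in doc_terms for t in terms if t not in query_terms
    )
    categories = {}
    for keyword, count in df.most_common(max_categories):
        if count < 2:  # a keyword found in one result does not form a group
            continue
        # Monothetic rule: membership depends on this one keyword only,
        # so the same result can land in several categories.
        categories[keyword] = [
            r for r, terms in zip(results, doc_terms) if keyword in terms
        ]
    return categories
```

Because each category is built by filtering the ranked list, results keep their original rank order inside every category.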

38
Study Design
  • 16 academics
  • 8F, 8M
  • No CS
  • Frequent searchers
  • 2 months of use
  • Special Log
  • 3099 queries issued
  • 3232 results accessed
  • Two questionnaires (at start and end)
  • Google as search engine; rank order retained

39
After 1 Week After 2 Months
40
Kaki & Aula: Key Findings (all significant)
  • Category use takes almost 2 times longer than
    linear
  • First doc selected in 24.4 sec vs. 13.7 sec
  • No difference in average number of docs opened
    per search (1.05 vs. 1.04)
  • However, when categories used, users select >1
    doc in 28.6% of the queries (vs. 13.6%)
  • Number of searches with 0 result selections is
    lower when the categories are used
  • Median position of selected doc when
  • Using categories: 22 (sd=38)
  • Just ranking: 2 (sd=8.6)

41
Kaki & Aula: Key Findings
  • Category Selections
  • 1915 category selections in 817 searches
  • Used in 26.4% of the searches
  • During the last 4 weeks of use, the proportion of
    searches using categories stayed above the
    average (27-39%)
  • When categories used, selected 2.3 cats on
    average
  • Labels of selected cats used 1.9 words on average
    (average in general was 1.4 words)
  • Out of 15 cats (default)
  • First quartile at 2nd cat
  • Median at 5th
  • Third quartile at 9th

42
Kaki & Aula: Survey Results
  • Subjective opinions improved over time
  • Realization that categories are useful only some
    of the time
  • Freeform responses indicate that categories are
    useful when queries are vague, broad or ambiguous
  • Second survey indicated that people felt that
    their search habits began to change
  • Consider query formulation less than before (27%)
  • Use less precise search terms (45%)
  • Use less time to evaluate results (36%)
  • Use categories for evaluating results (82%)

43
Conclusions from Kaki Study
  • Simplicity of category assignment made groupings
    understandable
  • (my view, not stated by them)
  • Keyword-based Categories
  • Are beneficial when result ranking fails
  • Find results lower in the ranking
  • Reduce empty results
  • May make it easier to access multiple results
  • Availability changed user querying behavior

44
Highlight, Wu et al.
  • Select terms from document summaries, organize
    into a subsumption hierarchy.
  • Highlight the terms in the retrieved documents.

Wu, Shankar, Chen, Finding More Useful
Information Faster from Web Search Results, CIKM
03
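One way to picture a subsumption hierarchy is the co-occurrence test of Sanderson and Croft: term x subsumes term y when most documents containing y also contain x, but not the reverse. The sketch below uses that test as an assumption; the slides do not say which criterion Wu et al. applied, and the 0.8 threshold and function name are illustrative.

```python
def subsumption_pairs(doc_terms, threshold=0.8):
    """Return sorted (parent, child) pairs where parent subsumes child.

    doc_terms is a list of term sets, one per document summary.
    Assumed test: P(parent | child) >= threshold and P(child | parent) < 1.
    """
    docs_with = {}
    for i, terms in enumerate(doc_terms):
        for t in terms:
            docs_with.setdefault(t, set()).add(i)

    pairs = []
    for x, dx in docs_with.items():
        for y, dy in docs_with.items():
            if x == y:
                continue
            both = len(dx & dy)
            p_x_given_y = both / len(dy)  # how often x appears where y does
            p_y_given_x = both / len(dx)  # how often y appears where x does
            if p_x_given_y >= threshold and p_y_given_x < 1.0:
                pairs.append((x, y))
    return sorted(pairs)
```

Reading each pair as a parent-child edge gives the general terms at the top and the more specific terms beneath them.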
48
Highlight, Wu et al.
  • First study
  • 19 undergraduates
  • Used the system for their own queries
  • Significant preference for the grouping interface
  • Second study
  • 6 participants
  • Their own queries
  • Accesses were sequential in linear interface
  • Accesses went deeper in grouping interface
  • Participants saved more documents per query

49
Category-based Grouping
  • General Categories
  • Domain-Specific Categories

51
SWISH, Chen & Dumais
  • 18 participants, 30 tasks, within subjects
  • Significant (and large, 50%) timing differences
    in favor of categories
  • For queries where the results are in the first
    page, the differences are much smaller.
  • Strong subjective preferences.
  • BUT the baseline was quite poor and the queries
    were very cooked.
  • Very small category set (13 categories)
  • Subhierarchy wasn't used.

Chen, Dumais, Bringing Order to the Web:
Automatically Categorizing Search Results, CHI 2000
52
Test queries, Chen & Dumais
Information Need | Pre-specified Query
giants ridge ski resort | giants
book about "numerical recipes" for computer software | recipes
information about Indian motorcycles | Indian
the home page for the band "They Might be Giants" | giants
the home page for the basketball team, the Washington Wizards | washington
Chen, Dumais, Bringing Order to the Web:
Automatically Categorizing Search Results, CHI
2000
53
Dumais, Cutrell, Chen, Bringing Order to the Web,
Optimizing Search by Showing Results in Context,
CHI 2001
54
Revisiting the Study, Dumais, Cutrell, Chen
55
Revisiting the Study, Dumais, Cutrell, Chen
56
Revisiting the Study, Dumais, Cutrell, Chen
57
Revisiting the Study, Dumais, Cutrell, Chen
  • This followup study reveals that the baseline had
    been unfairly weakened.
  • The speedup isn't so much from the category
    labels as the grouping of similar documents.
  • For queries where the answer is in the first
    page, the category effects are not very strong.

58
DynaCat, Pratt et al.
  • Medical Domain
  • Decide on important question types in advance
  • What are the adverse effects of drug D?
  • What is the prognosis for treatment T?
  • Make use of MeSH categories
  • Retain only those types of categories known to be
    useful for this type of query.

Pratt, W., Hearst, M, and Fagan, L. A
Knowledge-Based Approach to Organizing Retrieved
Documents. AAAI-99
59
DynaCat, Pratt et al.
Pratt, W., Hearst, M, and Fagan, L. A
Knowledge-Based Approach to Organizing Retrieved
Documents. AAAI-99
60
DynaCat Study, Pratt et al.
  • Design
  • Three queries
  • 24 cancer patients
  • Compared three interfaces
  • ranked list, clusters, categories
  • Results
  • Participants strongly preferred categories
  • Participants found more answers using categories
  • Participants took same amount of time with all
    three interfaces

Pratt, W., Hearst, M, and Fagan, L. A
Knowledge-Based Approach to Organizing Retrieved
Documents. AAAI-99
61
DynaCat study, Pratt et al.
62
Faceted Category Grouping
  • Multiple Categories per Document

63
Search Usability Design Goals
  1. Strive for Consistency
  2. Provide Shortcuts
  3. Offer Informative Feedback
  4. Design for Closure
  5. Provide Simple Error Handling
  6. Permit Easy Reversal of Actions
  7. Support User Control
  8. Reduce Short-term Memory Load

From Shneiderman, Byrd, Croft, Clarifying
Search, D-Lib Magazine, Jan 1997. www.dlib.org
64
How to Structure Information for Search and
Browsing?
  • Hierarchy is too rigid
  • KL-One is too complex
  • Hierarchical faceted metadata
  • A useful middle ground

65
The Problem with Hierarchy
  • Inflexible
  • Force the user to start with a particular
    category
  • What if I don't know the animal's diet, but the
    interface makes me start with that category?
  • Wasteful
  • Have to repeat combinations of categories
  • Makes for extra clicking and extra coding
  • Difficult to modify
  • To add a new category type, must duplicate it
    everywhere or change things everywhere

66
The Idea of Facets
  • Facets are a way of labeling data
  • A kind of Metadata (data about data)
  • Can be thought of as properties of items
  • Facets vs. Categories
  • Items are placed INTO a category system
  • Multiple facet labels are ASSIGNED TO items

67
The Idea of Facets
  • Create INDEPENDENT categories (facets)
  • Each facet has labels (sometimes arranged in a
    hierarchy)
  • Assign labels from the facets to every item
  • Example: recipe collection

Ingredient: Chicken, Bell Pepper
Cooking Method: Stir-fry, Curry
Course: Main Course
Cuisine: Thai
68
The Idea of Facets
  • Break out all the important concepts into their
    own facets
  • Sometimes the facets are hierarchical
  • Assign labels to items from any level of the
    hierarchy

Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze
Desserts: Cakes, Cookies, Dairy (Ice Cream, Sorbet, Flan)
Fruits: Cherries, Berries (Blueberries, Strawberries),
Bananas, Pineapple
69
Using Facets
  • Now there are multiple ways to get to each item

Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze
Desserts: Cakes, Cookies, Dairy (Ice Cream, Sherbet, Flan)
Fruits: Cherries, Berries (Blueberries, Strawberries),
Bananas, Pineapple
Fruit > Pineapple; Dessert > Cake; Preparation >
Bake
Dessert > Dairy > Sherbet; Fruit > Berries >
Strawberries; Preparation > Freeze
70
Using Facets
  • The system only shows the labels that correspond
    to the current set of items
  • Start with all items and all facets
  • The user then selects a label within a facet
  • This reduces the set of items (only those that
    have been assigned to the subcategory label are
    displayed)
  • This also eliminates some subcategories from the
    view.
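The interaction described above can be sketched as two small functions: one that narrows the item set when a label is selected, and one that recomputes which labels remain visible. This is a minimal illustration, not the Flamenco implementation; the three recipes, the flat (non-hierarchical) facets, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical recipes, each assigned labels from several independent facets.
ITEMS = {
    "pineapple cake": {
        "Fruit": {"Pineapple"},
        "Dessert": {"Cake"},
        "Preparation": {"Bake"},
    },
    "strawberry sherbet": {
        "Fruit": {"Strawberries"},
        "Dessert": {"Sherbet"},
        "Preparation": {"Freeze"},
    },
    "banana ice cream": {
        "Fruit": {"Bananas"},
        "Dessert": {"Ice Cream"},
        "Preparation": {"Freeze"},
    },
}


def select(items, facet, label):
    """Keep only the items that carry `label` within `facet`."""
    return {
        name: facets
        for name, facets in items.items()
        if label in facets.get(facet, set())
    }


def visible_labels(items):
    """Per facet, count the items under each label in the current set.

    Labels with no remaining items never appear, so they drop out of view.
    """
    counts = {}
    for facets in items.values():
        for facet, labels in facets.items():
            counts.setdefault(facet, Counter()).update(labels)
    return counts
```

Since only labels with at least one matching item are offered, following any visible label in this sketch always leads to a non-empty item set.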

71
The Advantage of Facets
  • Lets the user decide how to start, and how to
    explore and group.
  • After refinement, categories that are not
    relevant to the current results disappear.
  • Seamlessly integrates keyword search with the
    organizational structure.
  • Very easy to expand out (loosen constraints)
  • Very easy to build up complex queries.

72
Advantages of Facets
  • Can't end up with empty result sets
  • (except with keyword search)
  • Helps avoid feelings of being lost.
  • Easier to explore the collection.
  • Helps users infer what kinds of things are in the
    collection.
  • Evokes a feeling of browsing the shelves
  • Is preferred over standard search for collection
    browsing in usability studies.
  • (Interface must be designed properly)

73
Advantages of Facets
  • Seamless to add new facets and subcategories
  • Seamless to add new items.
  • Helps with categorization wars
  • Don't have to agree exactly where to place
    something
  • Interaction can be implemented using a standard
    relational database.
  • May be easier for automatic categorization
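As a rough illustration of the relational-database point, the sketch below stores one row per (item, facet, label) assignment in SQLite and answers the two queries a faceted interface needs: which items match the selected labels, and how many items remain under each label. The table name, column names, sample rows, and function names are assumptions made for this example, not a schema from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_label (item TEXT, facet TEXT, label TEXT);
    INSERT INTO item_label VALUES
        ('pineapple cake', 'Fruit', 'Pineapple'),
        ('pineapple cake', 'Dessert', 'Cake'),
        ('pineapple cake', 'Preparation', 'Bake'),
        ('strawberry sherbet', 'Fruit', 'Strawberries'),
        ('strawberry sherbet', 'Dessert', 'Sherbet'),
        ('strawberry sherbet', 'Preparation', 'Freeze'),
        ('banana ice cream', 'Fruit', 'Bananas'),
        ('banana ice cream', 'Dessert', 'Ice Cream'),
        ('banana ice cream', 'Preparation', 'Freeze');
""")


def items_matching(conn, selections):
    """Items that carry every selected (facet, label) pair."""
    if not selections:
        rows = conn.execute(
            "SELECT DISTINCT item FROM item_label ORDER BY item"
        )
        return [row[0] for row in rows]
    clause = " OR ".join("(facet = ? AND label = ?)" for _ in selections)
    params = [value for pair in selections for value in pair]
    sql = (
        f"SELECT item FROM item_label WHERE {clause} "
        "GROUP BY item "
        "HAVING COUNT(DISTINCT facet || ':' || label) = ? "
        "ORDER BY item"
    )
    return [row[0] for row in conn.execute(sql, params + [len(selections)])]


def label_counts(conn, items):
    """(facet, label, item count) rows for the labels still in play."""
    if not items:
        return []
    marks = ",".join("?" for _ in items)
    sql = (
        "SELECT facet, label, COUNT(DISTINCT item) FROM item_label "
        f"WHERE item IN ({marks}) "
        "GROUP BY facet, label ORDER BY facet, label"
    )
    return conn.execute(sql, list(items)).fetchall()
```

Adding a new facet or a new item in this layout only means inserting rows, with no schema change.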

87
Facets vs. Hierarchy
  • Early Flamenco studies compared allowing multiple
    hierarchical facets vs. just one facet.
  • Multiple facets were preferred and more successful.

88
Limitation of Facets
  • Do not naturally capture MAIN THEMES
  • Facets do not show RELATIONS explicitly

Aquamarine, Red, Orange
Door, Doorway, Wall
  • Which color is associated with which object?

Photo by J. Hearst, jhearst.typepad.com
89
Usability Studies
  • Usability studies done on 3 collections
  • Recipes: 13,000 items
  • Architecture Images: 40,000 items
  • Fine Arts Images: 35,000 items
  • Conclusions
  • Users like and are successful with the dynamic
    faceted hierarchical metadata, especially for
    browsing tasks
  • Very positive results, in contrast with studies
    on earlier iterations.

90
Post-Interface Assessments
All significant at p<.05 except "simple" and
"overwhelming"
91
Post-Test Comparison
Which Interface Preferable For (Faceted vs. Baseline)
  • Find images of roses
  • Find all works from a given period
  • Find pictures by 2 artists in same media
Overall Assessment (Faceted vs. Baseline)
  • More useful for your tasks
  • Easiest to use
  • Most flexible
  • More likely to result in dead ends
  • Helped you learn more
  • Overall preference
92
Phind, Edgar et al.
  • Participants didn't like being restricted to one
    term only.
  • Preferred linear ordering.
  • There are many interface design flaws here.
  • Edgar, Nichols, Paynter, Thomson, Witten, A User
    Evaluation of Hierarchical Phrase Browsing

95
Summary: Evaluation Good Ideas
  • Longitudinal studies of real use
  • Match the participants to the content of the
    collection and the tasks
  • Test against a strong baseline

96
Summary: Evaluation Problems
  • Bias participants towards a system
  • Try our interface versus linear view
  • Tailor tasks unrealistically to benefit the
    target interface
  • Impoverish the baseline relative to the test
    condition
  • Conflate test conditions

97
Conclusions
  • Grouping search results seems beneficial in two
    circumstances
  • General web search, using transparent labeling
    (monothetic terms) or category labels rather than
    cluster centroids.
  • Effects
  • Works primarily on ambiguous queries,
  • (so used a fraction of the time)
  • Promotes relevant results up from below the first
    page of hits
  • So important to group the related items together
    visually
  • Users tend to select more documents than with
    linear search
  • May work even better with meta-search
  • Positive subjective responses (small studies)
  • Visualization does not work.

98
Conclusions
  • Grouping search results seems beneficial in two
    circumstances
  • Collection navigation with faceted categories
  • Multiple angles better than single categories
  • searchers turn into browsers
  • Becoming commonplace in e-commerce, digital
    libraries, and other kinds of collections
  • Extends naturally to tags.
  • Positive subjective responses (small studies)

99
Discussion
  • So why aren't the major web search engines
    doing it?