Organizing Search Results - PowerPoint PPT Presentation

About This Presentation
Title:

Organizing Search Results

Slides: 29
Provided by: susand55
Learn more at: http://vw.indiana.edu

Transcript and Presenter's Notes

Title: Organizing Search Results


1
Organizing Search Results
  • Susan Dumais
  • Microsoft Research

2
Organizing Search Results
  • Algorithms and interfaces that improve the
    effectiveness of search
  • Beyond ranked lists
  • Main goal to support search
  • Also information analysis and discovery
  • Example applications
  • SWISH, results classification
  • GridViz, results summarization
  • SIS, personal landmarks for context

3
Searching with Information Structured
Hierarchically (SWISH)
  • Collaborators
  • Edward Cutrell, Hao Chen (Berkeley)
  • Key Themes
  • Going beyond long lists of results
  • Classification algorithms
  • UI techniques
  • More about it
  • http://research.microsoft.com/sdumais

4
Organizing Search Results
Query: jaguar
5
Web Directory
  • LookSmart Directory Structure
  • 400k pages, 17k categories, 7 levels
  • 13 top-level categories, 150 second-level
    categories
  • Top-level Categories

Automotive; Business & Finance; Computers & Internet;
Entertainment & Media; Health & Fitness; Hobbies & Interests;
Home & Family; People & Chat; Reference & Education;
Shopping & Services; Society & Politics; Sports & Recreation;
Travel & Vacations
6
SWISH System
  • Combines the advantages of
  • Directories - Manually crafted structure but
    small
  • Search engines - Broad coverage but limited
    metadata
  • Project search engine results to category
    structure
  • Two main components
  • Text classification models
  • UI for integrating search results and structure
  • Context (category structure) plus focus (search
    results)
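
A minimal Python sketch of projecting a ranked result list onto a category structure, using the "jaguar" query as the example. The function names, the keyword-based `toy_classify` stand-in, the sample results, and the choice to show only a few results per category are all assumptions made for illustration; the slides do not specify how SWISH does this.

```python
from collections import defaultdict

def group_by_category(results, classify, per_category=3):
    """Group ranked (title, snippet) results under the category chosen by `classify`."""
    groups = defaultdict(list)
    for rank, (title, snippet) in enumerate(results, start=1):
        groups[classify(title + " " + snippet)].append((rank, title))
    # Larger categories first; ties broken by the best-ranked result they contain.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[1][0][0]))
    # Keep only the first few results per category (focus), plus the total count (context).
    return [(cat, items[:per_category], len(items)) for cat, items in ordered]

def toy_classify(text):
    """Hypothetical keyword classifier standing in for a learned text classifier."""
    words = set(text.lower().split())
    if words & {"car", "dealer"}:
        return "Automotive"
    if words & {"nfl", "football"}:
        return "Sports & Recreation"
    return "Other"

results = [
    ("Jaguar Cars", "official car site"),
    ("Jacksonville Jaguars", "nfl football team"),
    ("Jaguar dealer locator", "find a dealer"),
    ("Jaguar facts", "big cat of the americas"),
]
grouped = group_by_category(results, toy_classify)
for category, shown, total in grouped:
    print(category, total, shown)
```

Keeping the original rank with each result lets an interface show both the category grouping and where each item sat in the flat list.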

7
SWISH Architecture
8
Learning Classification
  • Support Vector Machine (SVM)
  • Accurate and efficient for text classification
    (Dumais et al., Joachims)
  • Model = weighted vector of words
  • Automobile: motorcycle, vehicle, parts,
    automobile, harley, car, auto, honda, porsche
  • Computers & Internet: rfc, software, provider,
    windows, user, users, pc, hosting, os, downloads
    ...
  • Hierarchical models for LookSmart (LS) directory
  • 1 model for top level; N models for second level
  • Very useful in conjunction w/ user interaction
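
A rough Python sketch of the "weighted vector of words" idea applied hierarchically: one set of models picks the top-level category, then that category's own models pick the second level. The weights and the second-level category names below are invented for illustration; a trained linear SVM would learn the weights (and a bias term, omitted here) from labeled pages.

```python
from collections import Counter

# Hypothetical weights for illustration only; not taken from the actual models.
TOP_LEVEL = {
    "Automotive": {"car": 1.2, "auto": 1.0, "motorcycle": 0.9, "porsche": 0.8},
    "Computers & Internet": {"software": 1.1, "windows": 0.9, "pc": 0.8, "hosting": 0.7},
}
# Hypothetical second-level categories, one group of models per top-level category.
SECOND_LEVEL = {
    "Automotive": {
        "Motorcycles": {"motorcycle": 1.5, "harley": 1.3},
        "Cars": {"car": 1.2, "porsche": 1.0},
    },
    "Computers & Internet": {
        "Software": {"software": 1.4, "downloads": 1.0},
        "Hosting": {"hosting": 1.5, "provider": 1.0},
    },
}

def score(text, weights):
    """Dot product between the text's word counts and a category's weight vector."""
    counts = Counter(text.lower().split())
    return sum(w * counts[term] for term, w in weights.items())

def best(text, models):
    """Name of the highest-scoring model (the first one listed wins ties)."""
    return max(models, key=lambda name: score(text, models[name]))

def classify(text):
    """Two-stage decision: top-level category first, then its second level."""
    top = best(text, TOP_LEVEL)
    return top, best(text, SECOND_LEVEL[top])

print(classify("harley motorcycle parts"))
print(classify("windows software downloads"))
```

Because each second-level model only has to separate siblings under one parent, the second stage compares far fewer categories than a single flat classifier would.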

9
User Interface Experiments
List Organization
Category Organization
10

11
Effect of Query Difficulty
12
SWISH Summary and Design Implications
  • Text Classification
  • Learn accurate category models
  • Classify new web pages on-the-fly
  • Organize search results
  • User Interface
  • Tightly couple search results with category
    structure
  • User manipulation of presentation of category
    structure

13
Organizing Search Results, other examples
14
GridViz
  • Collaborators
  • George Robertson, Edward Cutrell, Jeremy Goecks
    (Georgia Tech)
  • Key Themes
  • Abstract beyond individual results
  • Highly interactive interface to support
    understanding of trends and relationships
  • More about it
  • http://research.microsoft.com/sdumais

15
GridViz
  • Summarize the results of a search
  • Grid-based design
  • Axes represent topic, time, people
  • Cells encode frequency, recency
  • Supports activities like
  • What newsgroups are active (on topic x)?
  • What people are active, authoritative (on topic
    x)?
  • When did I last interact w/ people?
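
One way to picture the grid is as a table keyed by two of the axes, where each cell stores a count (frequency) and the latest date seen (recency). The Python sketch below assumes a hypothetical `build_grid` helper and made-up sample posts; it is not the GridViz implementation.

```python
from datetime import date

def build_grid(items, row_key, col_key):
    """Summarize items into {(row, col): {"count": n, "latest": most recent date}}."""
    grid = {}
    for item in items:
        key = (item[row_key], item[col_key])
        cell = grid.setdefault(key, {"count": 0, "latest": item["date"]})
        cell["count"] += 1                                   # frequency
        cell["latest"] = max(cell["latest"], item["date"])   # recency
    return grid

# Made-up sample data: who posted on which topic, and when.
posts = [
    {"person": "alice", "topic": "search", "date": date(2003, 1, 5)},
    {"person": "alice", "topic": "search", "date": date(2003, 2, 1)},
    {"person": "bob", "topic": "ui", "date": date(2003, 1, 20)},
]
grid = build_grid(posts, "person", "topic")
for (person, topic), cell in grid.items():
    print(person, topic, cell["count"], cell["latest"])
```

Swapping `row_key` and `col_key` (for example, topic by month) gives the other views the slide lists without changing the summarization step.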

16
GridViz Demo
17
User Interface Experiments
18
GridViz Summary
  • Abstracting beyond individual results
  • Highly interactive interface
  • Grid-based design
  • Axes represent people, topic, time
  • Cells encode frequency, recency
  • Preliminary but promising

19
Stuff I've Seen (SIS)
  • Collaborators
  • Edward Cutrell, Raman Sarin, JJ Cadiz, Gavin
    Jancke, Daniel Robbins, Merrie Ringel (Stanford)
  • Key Themes
  • Your content
  • Information re-use
  • Integration across sources
  • More about it
  • internal for now

20
Search Today
  • Many locations, interfaces for finding things
    (e.g., web, mail, local files, help, history,
    intranet)
  • Often slow

21
Search with SIS
  • Unified index of stuff you've seen
  • Unify access to information regardless of source:
    mail, archives, calendar, files, web pages,
    etc.
  • Full-text index of content plus metadata
    attributes (e.g., creation time, author, title,
    size)
  • Automatic and immediate update of index
  • Rich UI possibilities, since it's your content
  • Architecture
  • Client side indexing and storage
  • Built using MS Search components
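
To make the "full-text index plus metadata" idea concrete, here is a toy in-memory Python sketch of a single index over items from different sources. The `UnifiedIndex` class, its method names, and the sample items are hypothetical; the real system was built on MS Search components, which this sketch does not use or imitate.

```python
from collections import defaultdict
from datetime import date

class UnifiedIndex:
    """Toy inverted index over items from any source, with sortable metadata."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of items containing it
        self.items = {}                   # item id -> metadata attributes

    def add(self, item_id, source, title, author, created, text):
        """Index an item as soon as it is seen (immediate update)."""
        self.items[item_id] = {"source": source, "title": title,
                               "author": author, "created": created}
        for term in (title + " " + author + " " + text).lower().split():
            self.postings[term].add(item_id)

    def search(self, query, sort_by="created"):
        """Return ids matching all query terms; dates sort newest first."""
        terms = query.lower().split()
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(ids, key=lambda i: self.items[i][sort_by],
                      reverse=(sort_by == "created"))

index = UnifiedIndex()
index.add("m1", "mail", "Budget review", "Susan", date(2003, 3, 1), "please review the budget")
index.add("f1", "file", "Budget.xls", "Ed", date(2003, 1, 15), "budget numbers")
index.add("w1", "web", "Jaguar", "Carol", date(2003, 2, 1), "big cat")
print(index.search("budget"))                    # newest first
print(index.search("budget", sort_by="author"))  # alphabetical by author
```

Indexing the author name alongside the text means a person's name works as a query term across mail, files, and web pages alike.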

22
SIS Demo
23
SIS Alpha Observations
  • 800 internal users
  • Usage logs (incl different interfaces), survey
    data
  • File types opened
  • 76% Email
  • 14% Web pages
  • 10% Files
  • Age of items accessed
  • 7% today
  • 22% within the last week
  • 46% within the last month

24
SIS Alpha Observations
  • Use of other search tools
  • Non-SIS search for web, email, and files
    decreases
  • Importance of people
  • 25% of the queries involve people's names
  • Importance of time
  • Date by far the most popular sort field, followed
    by rank, author, title
  • Even when rank is the default

25
SIS UI Innovations: Timeline w/ Landmarks
  • Importance of time
  • Timeline interface
  • Contextualize results using important landmarks
    as pointers into human memory
  • General: holidays, world events
  • Personal: important photos, appointments
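
A small Python sketch of the timeline idea: search results and landmarks are merged into one date-ordered list so that each result appears next to the events around it. The function name, the tuple layout, and the sample dates are assumptions for illustration only.

```python
from datetime import date

def timeline_with_landmarks(results, landmarks):
    """Merge (date, label) results and landmarks into one newest-first timeline."""
    entries = [(d, "result", label) for d, label in results]
    entries += [(d, "landmark", label) for d, label in landmarks]
    return sorted(entries, key=lambda entry: entry[0], reverse=True)

# Made-up sample data: two search results, one general and one personal landmark.
results = [(date(2003, 7, 10), "Trip report.doc"), (date(2003, 6, 28), "Re: budget")]
landmarks = [(date(2003, 7, 4), "Independence Day"),
             (date(2003, 6, 20), "Dentist appointment")]

timeline = timeline_with_landmarks(results, landmarks)
for when, kind, label in timeline:
    print(when.isoformat(), kind, label)
```

Tagging each entry with its kind lets an interface draw landmarks differently from results while keeping a single shared time axis.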

26
Milestones in Time Demo
27
Milestones in Timeline
28
SIS Summary
  • Unified index of stuff you've seen
  • Fast access to full-text and metadata, from
    heterogeneous sources
  • Automatic and immediate update of index
  • Rich UI possibilities
  • Next steps
  • Better support for tagging - flatland
  • Implicit queries for finding related info, and
    identifying Stuff I Should See
  • Integration with richer activity-based info, Eve

29
Organizing Search Results
  • Algorithms and interfaces to improve search
  • Use structure and context
  • Examples and key themes
  • SWISH: grouping
  • GridViz: abstraction
  • SIS: personal content and landmarks
  • Also
  • Important attributes: people, topics, time
  • Interaction
  • Evaluation
  • More information
  • http://research.microsoft.com/sdumais
  • sdumais_at_microsoft.com
  • Christopher Lee of (SIG)IR
  • http://www.cdvp.dcu.ie/SIGIR/index.html