Title: Understanding Web Searching
1. Understanding Web Searching
- Secondary Readings and So On
- Will Meurer for WIRED
- October 7, 2004
2. Introduction
- Why do we care about how people use the Web?
- Today's topics (10/7, not the present age)
- Implicit vs. explicit feedback
- Representation effectiveness
- Browser-based activities
- History mechanisms
- How do we cater to the people?
- Resources
- Research
3. Implicit vs. Explicit Feedback: Reading Time,
Scrolling and (Kelly & Belkin, 2001)
- Implicit feedback (Morita & Shinoda)
- Time spent on a page is directly related to user
interest. Backed by many studies.
- Explicit feedback (this study)
- Time spent on a page is similar for relevant and
irrelevant content.
- Results suggest
- Generalizability is severely affected by
explicit feedback methods.
- Spend time to choose the right feedback type!
4. Implicit vs. Explicit Feedback: Reading Time,
Scrolling and (Kelly & Belkin, 2001)
- Why do the results differ?
- Relevance was difficult to distinguish this time
- Participants were truly interested in the content
in former studies
- Users may have rushed to complete the tasks in this
experimental context
5. Representation Effectiveness: How we really use
the Web (Krug, 2000)
- Three facts of life
- We don't read pages. We scan them.
- Why? Hurry, necessity, habit
- If we are to read a page in its entirety, we save
or print it!
- (ClearType project)
6. Representation Effectiveness: How we really use
the Web (Krug, 2000)
- We don't make optimal choices. We satisfice.
- Why? Hurry, quick access to and fro, less work
than thinking
- Generally, it's more productive to guess.
7. Representation Effectiveness: How we really use
the Web (Krug, 2000)
- We don't figure out how things work.
- Why? Not important; if it ain't broke
(baroque)
- Is it important to us whether the user
understands how it works or not? Why?
8. Representation Effectiveness: Cognitive Strategies
in Web (Navarro-Prieto et al., 1999)
- Users get lost on the Web. Why?
- It is not just interactivity between user and
system, but among user, task, and information
- An analysis structure for browsing behavior is
presented and tested
- The Interactivity Framework, or how we should
analyze cognitive strategies
9. Representation Effectiveness: Cognitive Strategies
in Web (Navarro-Prieto et al., 1999)
- The Interactivity Framework
- User Level: Web experience, cognitive processes,
cognitive style, knowledge (CS majors knew more
about SE processes)
- User Strategies: based on searching structure
(or lack of it) and task nature
10. Representation Effectiveness: Cognitive Strategies
in Web (Navarro-Prieto et al., 1999)
- Information Structure
- Internal (user's) representation
- External (system's) representation
- Computational Offloading: How much work does the
user have to do to understand, and how much does a
representation help?
- Re-representation: How much it makes problem
solving easier or more difficult
- Graphical Constraining: How it constrains
inferences
- Temporal and Spatial Constraining: How it helps
when distributed over time and space
11. Representation Effectiveness: Cognitive Strategies
in Web (Navarro-Prieto et al., 1999)
12. Representation Effectiveness: Cognitive Strategies
in Web (Navarro-Prieto et al., 1999)
- More Results
- Experienced users searched with a plan
- By having a plan you keep a more internal
representation and focus your search
- Inexperienced users were more influenced by
external representations
- Computational Offloading Results
- Must explain
- How have these issues changed?
13. Representation Effectiveness: Cognitive
Strategies in Web (Navarro-Prieto et al., 1999)
- Conclusions
- Cognitive strategies used by the participants
depend on how the information is structured.
- Interaction is a multidimensional concept.
- Search engine interfaces should be designed to
have less restrictive external representation.
14. Browser-based Activities: Characterizing Browsing
(Catledge & Pitkow, 1995)
- User study of browsing events at Georgia Tech
(xMosaic browser)
- Three main browsing strategies identified
- Search browsing: directed search, goal known
- General purpose browsing: consulting highly
likely sources for needed information
(dictionary.com)
- Serendipitous browsing: random
- Most people use a combination of these
15. Browser-based Activities: Characterizing Browsing
(Catledge & Pitkow, 1995)
- Results
- Users were patient 99% of the time during long page
loads
- 1222 unique sites accessed outside of GATech
(16% of Web servers)
- Paths were calculated (sequences of page
navigation)
- Per session, paths of 7 different sites occurred
5 times
- Per user, paths of 8 different sites occurred 9
times
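A minimal sketch of how such paths might be counted, assuming a path is any contiguous run of pages within one session. The function name `count_paths` and the sample session are hypothetical; this is an illustration, not the procedure Catledge & Pitkow actually used.

```python
from collections import Counter

def count_paths(session, length):
    """Count every contiguous run of `length` pages in one session."""
    windows = (
        tuple(session[i:i + length])
        for i in range(len(session) - length + 1)
    )
    return Counter(windows)

# Hypothetical session log: the ordered list of pages one user requested.
session = ["home", "news", "home", "news", "sports", "home", "news"]

path_counts = count_paths(session, 2)
most_common_path, occurrences = path_counts.most_common(1)[0]
print(most_common_path, occurrences)
```

Longer values of `length` find longer repeated paths; sessions shorter than `length` simply produce no paths.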
16. Browser-based Activities: Characterizing Browsing
(Catledge & Pitkow, 1995)
- More Results
- 2% of the retrieved pages were saved or printed
- Based on each user's slope, browsing strategy
categories were applied
- Slope can also categorize usage
patterns of Web documents
- Users tended to operate in one
small area of a site
17. Browser-based Activities: Characterizing
Browsing (Catledge & Pitkow, 1995)
- Design Strategies
- Users averaged 10 pages per server
- Make most important info within 2 or 3 jumps from
the index
- Do not put too many links on one page; it increases
search time (back, forward, back, site map,
etc.)
- Facilitate the likely visitor browsing patterns
- Maybe make more than one version of your page?
- Most work well in a hub and spoke environment
- The Future
- Offer site tour based on most frequently traveled
paths
- Alter page design dynamically based on site trends
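The "within 2 or 3 jumps from the index" guideline above can be checked mechanically. The sketch below assumes the site is available as a dictionary mapping each page to the pages it links to; the function name `click_depths` and the sample site are hypothetical.

```python
from collections import deque

def click_depths(links, index):
    """Breadth-first search: fewest jumps from `index` to each reachable page."""
    depths = {index: 0}
    queue = deque([index])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical link structure: page -> pages it links to.
site = {
    "index": ["products", "about"],
    "products": ["widgets"],
    "widgets": ["widget-specs"],
    "widget-specs": ["widget-manual"],
}

depths = click_depths(site, "index")
too_deep = sorted(page for page, d in depths.items() if d > 3)
print(too_deep)
```

Pages listed in `too_deep` are candidates for a more direct link from the index or a hub page.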
18. History Mechanisms (in browsers): Revisitation
Patterns in (Tauscher & Greenberg, 1997)
- Purpose: Provide empirical data to aid in the
development of effective history mechanisms
- Understand revisitation patterns
- Evaluate current mechanisms and suggest best
practices and methods
- Data Collection
- Altered version of xMosaic to record activity
- Survey of users afterward
19. History Mechanisms (in browsers): Revisitation
Patterns in (Tauscher & Greenberg, 1997)
- Revisitation Results
- 58% recurrence rate (40% are new pages!)
- As people search they build their vocabulary
- 7 browsing strategies
- First-time visits to cluster of pages
- Revisits to pages
- Authoring of pages (high reload percentage)
- Regular use of web-based apps
- Hub-and-spoke (breadth-first approach)
- Guided tour (e.g. next page links)
- Depth-first search (following links deeply before
returning to the index)
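A minimal sketch of the recurrence rate quoted above, assuming it is defined as the share of all page visits that are repeats of an already-visited URL: (total visits - distinct URLs) / total visits. The function name and the sample log are hypothetical.

```python
def recurrence_rate(visits):
    """Percentage of visits that go to a URL already seen in the log."""
    if not visits:
        return 0.0
    repeats = len(visits) - len(set(visits))
    return 100.0 * repeats / len(visits)

# Hypothetical log: 10 visits to 4 distinct URLs.
log = ["a", "b", "a", "c", "a", "b", "d", "a", "b", "a"]
rate = recurrence_rate(log)
print(rate)  # 60.0
```

Under this definition, the remaining share of visits are first-time visits to new pages.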
20. History Mechanisms (in browsers): Revisitation
Patterns in (Tauscher & Greenberg, 1997)
- Revisitation Results
- Visit frequency as a function of distance
- Users mostly revisit recently visited pages
(within about 6 jumps)
- 39% chance that the next URL will match one of
the previous 6 pages visited
- Access frequency
- 60% of pages visited only once
- 19% visited twice
- 8% visited 3 times
- 4% visited 4 times
- Locality (not valuable for predicting next page)
- Most locality sets were small
- Only 2.5 to 4.5 URLs per set
- Only 15% of pages were part of a locality set
- Paths (not valuable for predicting next page)
- Could these be captured and offered in a history
mechanism?
- Time per page could indicate path
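The "next URL matches one of the previous 6 pages" figure above can be estimated from a visit log. This is a rough sketch that assumes distance is counted over the raw sequence of visits; the function name `window_hit_rate` and the sample log are hypothetical.

```python
def window_hit_rate(visits, window):
    """Fraction of visits whose URL appears among the `window` visits before it."""
    if len(visits) < 2:
        return 0.0
    hits = 0
    for i in range(1, len(visits)):
        recent = visits[max(0, i - window):i]
        if visits[i] in recent:
            hits += 1
    return hits / (len(visits) - 1)

# Hypothetical log of visited URLs, oldest first.
log = ["a", "b", "a", "c", "a", "b", "d", "a", "b", "a"]
print(window_hit_rate(log, 2))
print(window_hit_rate(log, 6))
```

Comparing the rate at several window sizes shows how much a longer recency list would actually help.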
21. History Mechanisms (in browsers): Revisitation
Patterns in (Tauscher & Greenberg, 1997)
- Mechanism types
- Recency Ordered
- Sequential order based on time accessed
- Repeated entries for revisitation
- Pruned by keeping only first instance or only
last
- Simple for users to understand (they remember
paths)
- Frequency Ordered
- Most visited at top, least visited at bottom
- User interest changes; the latest URLs must first
build up frequency
- How to break ties: last visited or earliest
visited?
- When few items are on the list, this suffers
- Difficult for users to understand
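The recency-ordered and frequency-ordered lists above can be sketched as follows. The function names are hypothetical, and breaking frequency ties by most recent visit is an assumption (one of the two options named above).

```python
from collections import Counter

def recency_list(visits, keep="last"):
    """Most recent first, with duplicates pruned.

    keep="last":  each URL sits where it was last visited.
    keep="first": each URL sits where it was first visited.
    """
    if keep == "last":
        return list(dict.fromkeys(reversed(visits)))
    return list(dict.fromkeys(visits))[::-1]

def frequency_list(visits):
    """Most visited first; ties go to the most recently visited URL (assumption)."""
    counts = Counter(visits)
    last_seen = {url: i for i, url in enumerate(visits)}
    return sorted(counts, key=lambda url: (-counts[url], -last_seen[url]))

# Hypothetical log of visited URLs, oldest first.
log = ["a", "b", "a", "c", "b", "d"]
print(recency_list(log, keep="last"))
print(recency_list(log, keep="first"))
print(frequency_list(log))
```

With so few items, the frequency list is dominated by ties, which illustrates why this ordering suffers on short lists.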
22. History Mechanisms (in browsers): Revisitation
Patterns in (Tauscher & Greenberg, 1997)
- Stack-based
- Recently visited at top
- Order and availability depend on:
- Loading: causes page to be added to the top
- Recalling: changes pointer to the currently
displayed page
- Revisiting: user reloads the page; has no effect
on the stack
- Keeps duplicates
- Non-persistent vs. persistent (between sessions)
- Better than recency at short distances
- Users have difficulty understanding this model
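A minimal sketch of the stack-based model described above. The class and method names are hypothetical. One behavior is an assumption not stated above: loading a new page after going back discards the entries above the pointer, as typical browser Back/Forward stacks do.

```python
class HistoryStack:
    """Toy model of a stack-based browser history."""

    def __init__(self):
        self.pages = []     # bottom of stack first, top of stack last
        self.pointer = -1   # index of the currently displayed page

    def current(self):
        return self.pages[self.pointer] if self.pages else None

    def load(self, url):
        """Loading: add the page to the top of the stack (duplicates are kept)."""
        # Assumption: entries above the pointer are dropped first.
        del self.pages[self.pointer + 1:]
        self.pages.append(url)
        self.pointer = len(self.pages) - 1

    def back(self):
        """Recalling: only the pointer moves."""
        if self.pointer > 0:
            self.pointer -= 1
        return self.current()

    def forward(self):
        """Recalling: only the pointer moves."""
        if self.pointer < len(self.pages) - 1:
            self.pointer += 1
        return self.current()

    def reload(self):
        """Revisiting: no effect on the stack."""
        return self.current()

history = HistoryStack()
for url in ["a", "b", "c"]:
    history.load(url)
history.back()      # pointer moves to "b"; stack unchanged
history.reload()    # no effect on the stack
history.load("a")   # "c" is dropped (assumption); the duplicate "a" is kept
print(history.pages)
```

The dropped "c" is one reason this model is hard for users to predict: a page they visited is no longer reachable through Back.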
23. History Mechanisms (in browsers): Revisitation
Patterns in (Tauscher & Greenberg, 1997)
- Hierarchically Structured
- Recency ordered hyperlink sublists
- Like recency with latest position saved
- Each URL has its own sublist of links from that
page
- Helps with common linking paths
- Easier to understand
- Context-sensitive web subspace
- Somewhat of a combination of the above-mentioned
and stack-based approaches
- Gives user better understanding of context of
his/her searches
- May be difficult to remember where a certain URL
was
- I THINK this approach would be a great tool
24. History Mechanisms (in browsers): Revisitation
Patterns in (Tauscher & Greenberg, 1997)
- Do users actually use history mechanisms?
- Less than 1% of navigation
- 3% involve favorites
- 30% of navigation was back button usage
25. How do we cater to the people?
- Inter-site browsing strategies are not easy to
tackle. How would you control that?
- Why should we attempt to understand user behavior
and search strategies?
- Formulate general design principles (e.g. 3-level
depth)
- Design for multiple searching personalities
- Understand how to survey your intended users or
get feedback most appropriately
- Identify importance of all aspects of the
development process and allocate resources
accordingly
26. How do we cater to the people?
- Some Bright Ideas
- Personalized search
- Learning systems: "You might also like"
- www.a9.com (history, favorites, personalized
interface)
- But what about changing for different types of
user behavior based on the user's path history on
your server?
- Researched since 1995 and earlier!
- What has resulted?
- Microsoft ASP.NET 2.0 Web Parts
27. What resources are out there?
- xMosaic 2.6 download, for those of you so
excited
- Architecture of the World Wide Web
http://www.w3.org/TR/webarch/
- Sum Sun Sug Gestions
http://www.sun.com/980713/webwriting/
- Jakob Nielsen research on content usability,
http://useit.com/alertbox/9710a.html
28. Research
- Vox Populi: The Public Searching of the Web
(2001)
- Compares statistics from two studies
- Shows how public searching changed from 1997 to
1999
- Usage Patterns of a Web-Based Library Catalog
(2001), Michael D. Cooper
- Real Life, Real Users, and Real Needs: A Study
and Analysis of User Queries on the Web (2000),
Jansen, Spink & Saracevic
- Redefining the Browser History in Hypertext Terms
(), Mark Ollerenshaw