Title: WWW and Internet
1WWW and Internet
- CS 7450 - Information Visualization
- March 3, 2005
- John Stasko
2Internet and WWW
- By nature, abstract, so good target for visualization
- Often described in terms of metaphors
- Information Superhighway
3Agenda
- Two main topics
- Presentations of the Internet and WWW
- Focus on topology and navigation, similar to the graph visualization work
- Visual aids for browsing and using the WWW and the Internet
- Assistive visualizations not focusing on presenting net structure and connectivity
4 1. Internet and WWW Topology
- Fundamentally, the Internet is a graph with some existing physical topology, though that is often not how we want to conceptualize it
- Might think of it as having a structure
- Our discussions from graph visualization are
germane here
5The Problem
Mukherjea & Foley, WWW '95
6The Problem
- Websites simply are too big
- Huge graphs
- Layout is challenging
7Step Back
- Why would someone want to visualize the WWW?
8Some Reasons
- Aid authors and webmasters with production and organization of content
- Assist Web surfers making sense of the information
- Help researchers understand the Web
9Depictions of the Web
- Great web site that presents many different conceptualizations of cyberspace
- Atlas of Cyberspace: http://www.cybergeography.org/atlas/
- Let's take a few minutes to browse...
10Mapping the Internet
- Bill Cheswick at AT&T
- Interesting visualizations plus the data sets are available
- www.cs.bell-labs.com/who/ches/map/index.html
11Internet Traffic Paths
www.caida.org/tools/measurement/skitter/
12MboneMap
www.cs.berkeley.edu/elan/mbone/map.html
13Immersive Systems
www.pnl.gov/remote/projects/starlight/
14View of Web Sites Pages
www.dynamicdiagrams.com/
15Web Site
www.mos.ics.keio.ac.jp/NattoView
16Web Site Visitations
www.inventix.com
17Task Analysis
- Potential web-related tasks
- How and when has info been accessed?
- Where do people enter and spend time?
- How do they move about?
- What paths aren't traversed?
- Where are they coming from?
- What has been added, changed, deleted?
- Do changes affect navigation patterns?
- Do we need to do a redesign?
18Data Set
- Each server request is a data case
- Example variables
- IP Address/Client host
- Timestamp
- URL requested
- HTTP status (success, not found, ...)
- Bytes delivered
- Referencing URL (HTTP Referer header)
- User agent (browser and OS info)
- ...
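The variables above map closely onto one line of a web server access log. As a rough sketch, assuming the server writes the common Apache/NCSA "combined" log format (the slides do not name a format, and the sample line and the `parse_request` helper are made up for illustration), each request could be turned into a data case like this:

```python
import re
from datetime import datetime

# Assumption: Apache/NCSA "combined" log format. Other servers or custom
# configurations may order or quote the fields differently.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_request(line):
    """Turn one log line into a data case (dict), or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    case = m.groupdict()
    # Timestamp such as 03/Mar/2005:10:15:32 -0500
    case["time"] = datetime.strptime(case["time"], "%d/%b/%Y:%H:%M:%S %z")
    case["status"] = int(case["status"])
    # A "-" in the size field means no body was sent.
    case["bytes"] = 0 if case["bytes"] == "-" else int(case["bytes"])
    return case

# Hypothetical sample line, not taken from any real log.
sample = ('192.0.2.10 - - [03/Mar/2005:10:15:32 -0500] '
          '"GET /products/index.html HTTP/1.0" 200 2326 '
          '"http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0)"')
case = parse_request(sample)
```

A list of such dicts is the kind of table that can be loaded into a general-purpose InfoVis tool, one row per request.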
19One Approach
- Use existing InfoVis tool (Eureka, Spotfire, InfoZoom, etc.), load the data set, and analyze it
- Get all the strengths and weaknesses of the InfoVis tool for supporting particular analysis tasks
20Web Ecology
- Problem: Most visualizations of the web fail to present the dynamically changing ecology of users and documents on the web
- What do we mean by ecology metaphor?
Chi et al., CHI '98
21Web Ecology
- By understanding the set of relationships (ecology) among users and their information environment, and its change through time (evolution), individuals can better understand:
- Web content
- Layout of physical and topological space
- Usage through time
22Existing Visualizations
- Despite useful functions, problems
- Difficulty visualizing large number of documents
- Considerable amount of screen real-estate used
- Only permits the visualization of a site at a particular point in time, very difficult to make comparisons across times
- No mechanisms provided that allow differences in usage to be identified
23Techniques
- Disk Tree
- Center-rooted tree that represents the hyperlink structure of a web site
- Time Tube
- Set of disk trees that organizes and visualizes the evolution of web sites
24Task Application
- Visualizations designed to be useful for
- Local - Finding specific content
- Comparison - Comparing info at two places
- Global - Discovering a trend or pattern in the
site
25Analysis Domain
- www.xerox.com, April 97
- 7,588 items across a 30-day period
- 889 new items
- Daily log kept of additions, modifications, and deletions of content
- Base data comes from link info, usage log from web servers
- Topological info from custom hyperlink database
26Disk Trees
- Interested in the fewest hops from one document to another
- Breadth-first traversal transforms the web graph into a tree by placing each node as close to the root node as possible
- After obtaining this tree, we then visualize the structure using the Disk Tree technique
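The graph-to-tree step can be sketched as follows. This is a minimal illustration, not the authors' implementation; the `links` dictionary and the toy page names are assumptions. Each page keeps only the first link that reaches it during a breadth-first traversal, so its depth in the tree equals the fewest hops from the root.

```python
from collections import deque

def bfs_tree(links, root):
    """Reduce a hyperlink graph to a tree by breadth-first traversal.

    links: dict mapping a page to the list of pages it links to.
    Returns (parent, depth) dicts for every page reachable from root.
    """
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in parent:        # first visit is the shortest path
                parent[target] = page
                depth[target] = depth[page] + 1
                queue.append(target)
    return parent, depth

# Hypothetical toy site with cross links and a link back to the home page.
links = {
    "/": ["/products", "/about"],
    "/products": ["/products/printers", "/about"],
    "/about": ["/", "/products/printers"],
}
parent, depth = bfs_tree(links, "/")
```

Links that are not chosen as tree edges (for example "/about" back to "/") are simply dropped from the drawing, which is what keeps the structure a tree.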
27Disk Tree
Lines - tree links
Line size and brightness - page access frequency
Color - page lifecycle stage
new: red
continued: green
deleted: yellow
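The slide gives the visual encoding but not the layout procedure. One plausible way to get a center-rooted layout, shown here only as a sketch under that assumption, is to put each tree level on its own concentric ring and give every subtree an angular wedge proportional to its number of leaves. The function names and the `ring_gap` parameter are made up for illustration.

```python
import math

# Lifecycle colors as listed on the slide.
STAGE_COLOR = {"new": "red", "continued": "green", "deleted": "yellow"}

def count_leaves(children, node, leaves):
    """Post-order pass: record how many leaves sit under each node."""
    kids = children.get(node, [])
    if not kids:
        leaves[node] = 1
    else:
        leaves[node] = sum(count_leaves(children, k, leaves) for k in kids)
    return leaves[node]

def disk_layout(children, root, ring_gap=1.0):
    """Return {node: (x, y)} with the root at the center of the disk."""
    leaves = {}
    count_leaves(children, root, leaves)
    pos = {}

    def place(node, depth, start, end):
        mid = (start + end) / 2.0           # node sits in the middle of its wedge
        radius = depth * ring_gap
        pos[node] = (radius * math.cos(mid), radius * math.sin(mid))
        angle = start
        for kid in children.get(node, []):
            span = (end - start) * leaves[kid] / leaves[node]
            place(kid, depth + 1, angle, angle + span)
            angle += span

    place(root, 0, 0.0, 2.0 * math.pi)
    return pos

# Hypothetical tree: "a" has three leaf children, "b" is a leaf.
children = {"root": ["a", "b"], "a": ["a1", "a2", "a3"]}
pos = disk_layout(children, "root")
```

Line width, brightness, and the `STAGE_COLOR` lookup would then be applied when drawing each parent-child link.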
28Advantages
- Structure is compact, with pattern easily recognizable
- When viewed straight on or at slight angles, no occlusion problems, since entire layout is on a 2-D plane
- Unlike cone trees, this 2-D representation can utilize a third dimension for other information, such as time
- Circularity pleasing to the eye
29Time Tubes
- Time Tubes are multiple disk trees laid out along a spatial axis
- Advantages
- By using a spatial axis to represent time, we see information space-time in a single visualization
- Focus and Context
- Possibility for Animation
30Time Tubes
31Key Point
- Pages present at any time during the studied period are shown in all disk trees for the period, even if they didn't exist yet
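To make the point concrete, here is a small sketch of how per-period slices could share one node set. It is an illustration under assumptions, not the published method: the `"absent"` label for pages that do not exist in a given period is invented here, and pages in the first period are all labeled "new" only because no earlier snapshot is available.

```python
def time_tube_slices(snapshots):
    """snapshots: list of sets of page URLs, one set per time period.

    Returns one dict per period mapping every page seen in ANY period to a
    stage label, so all slices contain the same nodes and can share a layout.
    """
    all_pages = set().union(*snapshots)
    slices = []
    previous = set()
    for current in snapshots:
        stage = {}
        for page in all_pages:
            if page in current and page not in previous:
                stage[page] = "new"
            elif page in current:
                stage[page] = "continued"
            elif page in previous:
                stage[page] = "deleted"
            else:
                stage[page] = "absent"      # not yet created, or removed earlier
        slices.append(stage)
        previous = current
    return slices

# Hypothetical three-week history of a tiny site.
snapshots = [{"/", "/a"}, {"/", "/b"}, {"/", "/b"}]
slices = time_tube_slices(snapshots)
```

Because every slice holds the same pages in the same positions, a page can be followed along the time axis without searching for it in each disk.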
32Real Use
- Time Tube answers the following questions
- What devolved into dead wood? When did it? Was there a correlation with the restructuring of the web site?
- Product safety pages got darker and darker, indicating lower usage
- Doesn't tell why page is less popular, just raises a flag to explore page further
33Real Use
- What evolved into a popular page? When did it? Was there a correlation with the restructuring of the Web site?
- Redesign of site called attention to Fact Book page
- Became more popular and the corresponding Disk Trees became greener and greener in successive weeks
34Real Use
- How was usage affected by items added over time?
- Press release issued for new family of products, shown as red links
- Usage in the third week jumped from 1 access to 871 accesses; this suggests that this was probably a well-received product line
35Real Use
- How was usage affected by items deleted over time?
- Removing the direct link from the home page to the main driver page did not negatively affect the overall use of driver information
- Info stayed green indicating usage, but link from home page was black, showing not much traffic
36E-Commerce Applications
- What if your focus is on understanding user access patterns for web sites selling products to consumers?
- What tasks are important?
37One Approach
- Blue Martini Software
- Aggregate web data and visualize simplified graph of user movements through web site
- Highlight places where people leave before purchasing
- ...
Brainerd & Becker, InfoVis '01
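As a rough sketch of the aggregation idea (not the Blue Martini implementation, whose details the slides do not give), per-visitor page sequences can be collapsed into edge counts, plus a count of where visits ended without a purchase. The `sessions` input, the `purchase_page` name, and the page paths are all hypothetical.

```python
from collections import Counter

def aggregate_paths(sessions, purchase_page="/checkout/done"):
    """sessions: list of page lists, one list per visit, in click order.

    Returns (edges, exits): how often each page-to-page move occurred, and
    the last page of every visit that never reached purchase_page.
    """
    edges = Counter()
    exits = Counter()
    for pages in sessions:
        for src, dst in zip(pages, pages[1:]):
            edges[(src, dst)] += 1
        if pages and purchase_page not in pages:
            exits[pages[-1]] += 1
    return edges, exits

# Hypothetical visits to a small store.
sessions = [
    ["/", "/shoes", "/cart", "/checkout/done"],
    ["/", "/shoes", "/cart"],
    ["/", "/shoes"],
]
edges, exits = aggregate_paths(sessions)
```

Keeping only the highest-count edges gives a simplified graph, and the `exits` counts mark the pages worth highlighting as drop-off points.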
38Different icons represent different kinds of pages
Only show most-used pages
39E-Commerce mimics mall shopping -> Gender differences in purchase paths at websites
40 2. Aiding WWW Browsing
- Can we utilize information visualization techniques to help people interact with the WWW and the Internet?
- Battle lost in hyperspace problem
- Help us know what's there
- Help us find things
41WebBook and Web Forager
- Personal computers viewed as knowledge processors before
- Spreadsheets and calculators
- Now viewed as knowledge sources, portals to vast information worlds
- Networking and WWW
Card, Robertson and York, CHI '96
42WWW Problems
- Pages are hard to find
- Users get lost, can't relocate pages
- Difficulty organizing things once found
- Difficulty doing knowledge processing on found things
- Interacting with web is too slow to incorporate gracefully into other activities
43Information Foraging Theory
- From Ecological Biology
- Idea: user stalks certain types of information
- Users have tendency to interact repeatedly with small clusters of information (locality of reference)
- Information encountered at certain rate
- Users evolve to increase finding rate
- Sources evolve to be more attractive
44Mechanisms Evolved
- 3 mechanisms in the evolution of the web on the server side
- Indexes - Lycos search
- Table of contents - Yahoo
- Home pages provided by users with big lists of
related links
45Assisting People
- To provide insight
- must support sensemaking
- restructuring
- recoding
- Hotlists are one mechanism in this direction
46Improvements
- WebBook and Web Forager try to do two things to foster information sensemaking
- Move away from a single web page, and group and manipulate related pages
- Move from a work environment containing a single element to a workspace in which the page is contained with multiple other entities, including Web Books
47WebBook
48Features
- WebBook allows for rapid interaction with objects at a higher level of aggregation than pages
- 3D book representation, uses animation
- Can ruffle through pages, leave bookmarks
49Applications
- Hot List books
- Topic books
- Search reports
- Book books
- ...
50Web Forager
51Web Forager
- Application that embeds the WebBook and other objects in a hierarchical 3D information workspace
- Workspace is intended to create patches from the web where a high density of relevant pages (grouped together in Web Books) can be combined with rapid access
52Constituents
- Hierarchical Workspace - 3 levels
- Focus Place - full page shown, direct interaction
- Intermediate memory space - books or pages placed when they are in use but not immediate focus
- Tertiary space - Storage (bookcase)
Video
53Discussion
54Data Mountain
- 3D document management system
- Prototype is an alternative to web browser bookmarks or favorites
- Could be used for any kind of document management
Robertson et al., UIST '98
55Make-Up
- 3D inclined plane in which thumbnails of web pages are placed to serve as favorites
- User is responsible for organization
- Uses smooth animation and audio to assist interaction
56Video
57User Study
- Data Mountain versus IE4 Favorites
- Experienced IE4 users
- Stored 100 pages, then retrieved them
- DM fared about as well with title cue
- DM fared better for all other cues
58Leveraging Human Capabilities
- Spatial memory: analogy with paper placed on a pile on your desk
- User is responsible for personal organization
- 3D perception: minimal cognitive load, good utilization of screen space
59Interaction Techniques
- Placing pages: confinement to inclined plane makes normal 2D drag-and-drop sufficient; no unfamiliar 3D navigation needed
- Continuous feedback: both audio and visual feedback are natural; minimizes unexpected interactions/surprises
60Limitations/Future
- Limits number of pages stored
- No explicit support for grouping
- Landmarks/contours as helpers
61Discussion
- Strengths/Weaknesses
- Could it be used elsewhere?
62Upcoming
- Trees & Hierarchies (2 days)
- Reading
- Chapter 8
- Lamping & Rao
63References
- Spence and CMS texts
- All referred to papers and websites
- McNamara Defnet and Craighill, Robeson
Sheridan F 99 slides