Title: WebQuilt: Capturing and Visualizing the Web Experience
1WebQuilt: Capturing and Visualizing the Web Experience
Jason I. Hong, James A. Landay
Group for User Interface Research, EECS Department, University of California at Berkeley
World Wide Web 10
2Motivation
- Many websites have usability problems
- 62% of web shoppers gave up in the past month (Spool)
- 39% failed in buying attempts (Creative Good)
- Two problems all web designers face
- Understanding users' tasks
- Understanding obstacles in completing tasks
- Many methods for understanding tasks
- E.g. interviews, ethnographic observations, surveys, focus groups
- Focus here is on understanding obstacles
3Understanding Obstacles Today
- Traditional usability tests
- Extremely useful qualitative information
- Lots of time, small websites, few people, local
- Server-side logging
- Easy to collect, remote testing, lots of tools
- Restricted access, little on tasks and problems
- Client-side logging
- Can track everything, remote testing
- Installation, platform-dependent, analysis tools
4Streamlining Current Practices
- Fast and easy to deploy on any website
- Compatible with range of OS and browsers
- Better tools for analyzing the data
6WebQuilt Approach
- Fast and easy to deploy on any website
- Compatible with range of OS and browsers
- Better tools for analyzing the data
Figure: Client Browser, Proxy, Web Server, and WebQuilt Log
8WebQuilt Usage
- Set up several tasks, recruit 20-100 people
- Email participants a URL that uses the proxy
- Ask them to complete the predefined tasks
- Collect lots of remote (or local) data
- Aggregate, view, and interact with data
- Find problems, fix, repeat
Figure: iterative cycle of Design, Prototype, Evaluate
9Outline
- Background and Motivation
- WebQuilt Architecture
- Usage Experience and Visualizations
- Summary and Future Work
10Overall Architecture
Log Files
11Proxy
- Lies between browser and server
- http://domain.com/webquilt?replace=http://www.yahoo.com
- One log file per user session
- Currently use Java servlets
- Important part is log file format
12Log File Format
Time (ms)  From TID  To TID  Parent ID  HTTP Response  Frame ID  Link ID  HTTP Method  URL Query
6062       0         1       -1         200            -1        -1       GET          http://www.google.com
11191      1         2       -1         200            -1        -1       GET          http://www.phish.com/index.htm?q=Phish&btnI=I%27m+Feeling+Lucky
167525     2         3       -1         200            -1        1        GET          http://www.phish.com/bios.html
31043      3         4       -1         200            -1        2        GET          https://www.phish.com/bin/catalog.cgi
68772      2         5       -1         200            -1        15       GET          http://www.emusic.com/features/phish
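A minimal sketch of reading one record in this format. It assumes the nine fields are whitespace-separated and appear in the column order shown on the slide; the on-disk delimiter is not stated here, so that is an assumption, and the names `LogEntry` and `parse_line` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    time_ms: int
    from_tid: int
    to_tid: int
    parent_id: int
    http_response: int
    frame_id: int
    link_id: int
    http_method: str
    url: str


def parse_line(line: str) -> LogEntry:
    # Split into at most 9 fields so the URL plus query stays in one piece.
    parts = line.split(None, 8)
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields, got {len(parts)}")
    *nums, method, url = parts
    t, src, dst, parent, status, frame, link = (int(n) for n in nums)
    return LogEntry(t, src, dst, parent, status, frame, link, method, url.strip())


entry = parse_line("167525 2 3 -1 200 -1 1 GET http://www.phish.com/bios.html")
print(entry.from_tid, entry.to_tid, entry.link_id, entry.url)
```

The From TID and To TID fields are the ones the later slides use to reconstruct the user's path.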
15The Proxy at Runtime
Store
16The Proxy at Runtime
1. Process Client Request
Store
17The Proxy at Runtime
2. Retrieve Requested Document
Store
18The Proxy at Runtime
3. Process and return the page
Store
19The Proxy at Runtime
Start with: <A HREF="computers.html">
End up with: <A HREF="http://tasmania.cs.berkeley.edu/webquilt?replace=http://www.yahoo.com/computers.html&tid=1&linkid=12">
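The rewriting step can be sketched roughly as follows. This is an illustration only, not the WebQuilt servlet code: it uses a regular expression that handles only double-quoted HREF values, takes the proxy address and the parameter names `replace`, `tid`, and `linkid` from the slide, and assumes links are numbered from 1 in page order. The name `rewrite_links` is hypothetical.

```python
import re
from urllib.parse import urljoin

PROXY = "http://tasmania.cs.berkeley.edu/webquilt"  # proxy address from the slide

# Matches <a ... href="..."> with a double-quoted value (a simplification).
A_HREF = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)


def rewrite_links(html: str, page_url: str, tid: int) -> str:
    """Point every double-quoted <A HREF> at the proxy, numbering links from 1."""
    counter = 0

    def repl(m: re.Match) -> str:
        nonlocal counter
        counter += 1
        target = urljoin(page_url, m.group(2))  # resolve relative links
        proxied = f"{PROXY}?replace={target}&tid={tid}&linkid={counter}"
        return m.group(1) + proxied + m.group(3)

    return A_HREF.sub(repl, html)


page = '<A HREF="computers.html">Computers</A>'
print(rewrite_links(page, "http://www.yahoo.com/", tid=1))
```

The target URL is left unencoded here to mirror the slide. A target that carries its own query string would make the result ambiguous, so a real proxy would likely need to percent-encode it and use a proper HTML parser.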
20The Proxy at Runtime
4. Store the page
5. Log the transaction
Store
21Additional Proxy Functionality
- Handling Cookies
- Cookies only sent from browser back to web server that put it there
User ID  Domain      Cookie
AAA      yahoo.com   xyzzy
AAA      google.com  asdfg
BBB      yahoo.com   abcde
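One way to read the table: because the browser only ever talks to the proxy, the proxy has to hold each user's cookies itself and replay them to the matching site. The sketch below assumes a store keyed by (user ID, domain) with exact domain matching; real cookie domain matching is more involved, and the class and method names are hypothetical.

```python
class ProxyCookieStore:
    """Cookies kept on the proxy, separated by user and by origin domain."""

    def __init__(self):
        self._jar = {}  # (user_id, domain) -> list of cookie strings

    def remember(self, user_id, domain, cookie):
        # Called when a web server sets a cookie on a proxied response.
        self._jar.setdefault((user_id, domain), []).append(cookie)

    def cookies_for(self, user_id, domain):
        # Called before the proxy forwards a request to that domain.
        return list(self._jar.get((user_id, domain), []))


store = ProxyCookieStore()
store.remember("AAA", "yahoo.com", "xyzzy")
store.remember("AAA", "google.com", "asdfg")
store.remember("BBB", "yahoo.com", "abcde")
print(store.cookies_for("AAA", "yahoo.com"))   # ['xyzzy']
print(store.cookies_for("BBB", "google.com"))  # []
```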
23Additional Proxy Functionality
- Handling Cookies
- Cookies only sent from browser back to web server that put it there
- Handling Secure Socket Layer
- Encrypts page requests and data
- E.g. Shopping Carts, Financials
- Split into two SSL requests
Figure: Client Browser <-SSL-> Proxy <-SSL-> Web Server
24Action Inferencer
- Takes a single log file and converts it into a list of actions
- "Clicked on link" or "Hit back button"
- Inference because it still must guess
- Back and forward actions are local to the browser, so they never reach the proxy
25Re-Assembling User Actions
Time    From TID  To TID  Parent ID  HTTP Response  Frame ID  Link ID  HTTP Method  URL Query
6062    0         1       -1         200            -1        -1       GET          http://www.google.com
11191   1         2       -1         200            -1        -1       GET          http://www.phish.com/index.htm?q=Phish&btnI=I%27m+Feeling+Lucky
167525  2         3       -1         200            -1        1        GET          http://www.phish.com/bios.html
31043   3         4       -1         200            -1        2        GET          https://www.phish.com/bin/catalog.cgi
68772   2         5       -1         200            -1        15       GET          http://www.emusic.com/features/phish
27Re-Assembling User Actions
From TID  To TID  URL Query
0         1       http://www.google.com
1         2       http://www.phish.com/index.htm?q=Phish&btnI=I%27m+Feeling+Lucky
2         3       http://www.phish.com/bios.html
3         4       https://www.phish.com/bin/catalog.cgi
2         5       http://www.emusic.com/features/phish
28Re-Assembling User Actions
Figure: log table (From TID, To TID, URL Query) with graph nodes drawn so far: Start, 1
29Re-Assembling User Actions
Figure: log table (From TID, To TID, URL Query) with graph nodes drawn so far: Start, 1, 2
30Re-Assembling User Actions
Figure: log table (From TID, To TID, URL Query) with graph nodes drawn so far: Start, 1, 2, 3
31Re-Assembling User Actions
Figure: log table (From TID, To TID, URL Query) with graph nodes drawn so far: Start, 1, 2, 3, 4
32Re-Assembling User Actions
Figure: log table (From TID, To TID, URL Query) with graph nodes drawn so far: Start, 1, 2, 3, 4, 5
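33Action Inferencer
Figure: graph with nodes Start, 1, 2, 3, 4, 5
34Action Inferencer
Case 1
Figure: graph with nodes Start, 1, 2, 3, 4, 5
Inferred path: Start, 1, 2, 3, 4, 3, 2, 5 (labels: Link, Back, Link)
35Action Inferencer
Case 2
Figure: graph with nodes Start, 1, 2, 3, 4, 5
Inferred path: Start, 1, 2, 3, 4, 3, 2, 1, 2, 5 (labels: Link, Back, Link, Fwd)
36Action Inferencer
Case 1 by default (shortest path)
Figure: graph with nodes Start, 1, 2, 3, 4, 5
Inferred path: Start, 1, 2, 3, 4, 3, 2, 5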
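The Case 1 default can be sketched as follows. It assumes the only input is the ordered list of (From TID, To TID) pairs from one log file. When the next From TID is not the page the user was last seen on, the sketch assumes the user pressed Back just enough times to reach it. Forward actions (Case 2) are not modeled. The function name `infer_actions` is hypothetical, and this is a simplified reading of the slides rather than WebQuilt's actual inferencer.

```python
def infer_actions(transitions):
    """transitions: list of (from_tid, to_tid) pairs in log order."""
    actions = []
    back_stack = []  # pages reachable with the back button, most recent last
    current = transitions[0][0] if transitions else None
    for from_tid, to_tid in transitions:
        # The proxy never saw how the user got back to from_tid, so guess
        # the shortest explanation: a run of Back presses.
        while current != from_tid:
            if not back_stack:
                raise ValueError(f"cannot reach page {from_tid} by going back")
            current = back_stack.pop()
            actions.append(("back", current))
        back_stack.append(current)
        current = to_tid
        actions.append(("link", to_tid))
    return actions


log = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]
for action in infer_actions(log):
    print(action)
```

For the example log this gives links to 1, 2, 3, and 4, then back to 3, back to 2, and a link to 5, which is the path Start, 1, 2, 3, 4, 3, 2, 5.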
37Merger
- Combines multiple log files into a single directed graph
- Web pages are nodes
- Actions are edges
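A minimal sketch of such a merge. It assumes each session has already been reduced to (source page, destination page, action) triples. The page names and the function name `merge_sessions` are hypothetical, and counting traversals per edge is an added assumption meant to give later stages an edge weight.

```python
from collections import Counter


def merge_sessions(sessions):
    """Combine per-user action lists into one directed graph."""
    nodes = set()
    edges = Counter()  # (src, dst, action) -> number of traversals
    for session in sessions:
        for src, dst, action in session:
            nodes.update((src, dst))
            edges[(src, dst, action)] += 1
    return nodes, edges


user_a = [("home", "bios", "link"), ("bios", "home", "back"), ("home", "catalog", "link")]
user_b = [("home", "bios", "link"), ("bios", "catalog", "link")]
nodes, edges = merge_sessions([user_a, user_b])
print(sorted(nodes))
print(edges[("home", "bios", "link")])
```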
38Graph Layout
- Assign (x,y) to all nodes
- Force-directed placement
- Keep connected nodes close
- Push unconnected nodes far apart
- Edge-weighted depth-first
- Most traffic along top
- Less followed paths below
- Grid to help organize and align
- Plug-in new algorithms here
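One simplified reading of the edge-weighted depth-first idea is sketched below; it is not WebQuilt's layout code. It assumes edges carry a traffic count, walks the graph depth-first following the heaviest outgoing edge first, keeps that heaviest branch on the same grid row, and starts a new row below for each lighter branch. The function name `layout` and the sample weights are hypothetical.

```python
def layout(edges, start):
    """edges: {(src, dst): weight}. Returns {node: (x, y)} on an integer grid."""
    children = {}
    for (src, dst), weight in edges.items():
        children.setdefault(src, []).append((weight, dst))

    pos = {}
    next_row = 0  # highest row index used so far

    def visit(node, depth, row):
        nonlocal next_row
        pos[node] = (depth, row)
        first = True
        # Heaviest edge first, so the most travelled path stays on this row.
        for _, child in sorted(children.get(node, []), key=lambda c: -c[0]):
            if child in pos:
                continue  # already placed (for example via a back edge)
            if first:
                visit(child, depth + 1, row)
                first = False
            else:
                next_row += 1
                visit(child, depth + 1, next_row)

    visit(start, 0, 0)
    return pos


traffic = {("Start", "1"): 10, ("1", "2"): 10, ("2", "3"): 7, ("3", "4"): 5, ("2", "5"): 3}
for node, xy in layout(traffic, "Start").items():
    print(node, xy)
```

With these weights the main path Start, 1, 2, 3, 4 lies along row 0 and the less followed page 5 drops to row 1.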
39Visualization
49Future Work
- More sophisticated logging
- Lower level events (e.g. AT&T WET)
- Personalized web pages
- More sophisticated visualizations
- More use of semantic zooming
- Dynamic filtering
- Continue getting feedback from designers
- Initiated interviews with web designers
- Still need to do evaluations
50Take Home Ideas
- Need more tools for improving web site usability
- Proxy logging
- Logging where task is already known
- Any website, any browser, remote testing
- Visualizing logged data
- Aggregates large data sets
- Interact with in a zooming interface
- Pluggable architecture
51Acknowledgements
- Special thanks to Jeff Heer, Tim Sohn, and Sarah Waterson
- Group for User Interface Research
- EECS Department
- University of California at Berkeley
- Download WebQuilt at
- http://guir.berkeley.edu/webquilt
52Extra Slides
53Berkeley Website A
59Casa de Fruta A
63Casa de Fruta B
69In Case You're Feeling Evil
- URLs can be of the form
- http://userid@domain/page.html
- Most web servers ignore the userid part, but
- http://www.yahoo.com@tasmania.cs/
- Can auto-track people's actions once they hit your page without their knowledge or consent
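To see why such a URL is misleading, it helps to check which part a URL parser treats as the host. The short check below uses Python's standard `urllib.parse`; the helper name `real_host` is hypothetical.

```python
from urllib.parse import urlsplit


def real_host(url: str) -> str:
    """Return the host the request would actually go to."""
    return urlsplit(url).hostname


parts = urlsplit("http://www.yahoo.com@tasmania.cs/")
print(parts.username)  # www.yahoo.com  (the "userid" part)
print(parts.hostname)  # tasmania.cs    (the real destination)
```

Everything before the `@` is user information, so the familiar-looking name at the front has no effect on where the request is sent.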