Title: SBD: Usability Evaluation
1. SBD: Usability Evaluation
2. Scenario-Based Design (process overview)
- ANALYZE: analysis of stakeholders, field studies; claims about current practice -> Problem scenarios
- DESIGN: metaphors, information technology, HCI theory, guidelines -> Activity scenarios -> Information scenarios -> Interaction scenarios
- PROTOTYPE & EVALUATE: iterative analysis of usability claims and re-design -> Usability specifications; formative evaluation; summative evaluation
3. Evaluation
- Formative vs. Summative
- Analytic vs. Empirical
4. Usability Engineering
- Cycle: Requirements Analysis -> Design -> Develop -> Evaluate
- Many iterations
5. Usability Engineering
- Formative evaluation
- Summative evaluation
6. Usability Evaluation
- Analytic Methods
  - Usability inspection, expert review
  - Heuristic Evaluation
  - Cognitive walkthrough
  - GOMS analysis
- Empirical Methods
  - Usability Testing
    - Field or lab
    - Observation, problem identification
  - Controlled Experiment
    - Formal, controlled scientific experiment
    - Comparisons, statistical analysis
7. User Interface Metrics
- Ease of learning
  - learning time
- Ease of use
  - performance time, error rates
- User satisfaction
  - surveys
- NOT "user friendly" (too vague to measure)
8. Usability Testing
9. Usability Testing
- Formative: helps guide design
- Early in design process
  - when the architecture is finalized, it's too late!
- A few users
- Usability problems, incidents
- Qualitative feedback from users
- Quantitative usability specification
10. Usability Specification Table

Scenario task                           | Worst case | Planned target | Best case (expert) | Observed
Find the most expensive house for sale? | 1 min      | 10 sec         | 3 sec              | ??? sec
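A minimal sketch (Python) of how observed times from a test session might be checked against a specification row like the one above; the task name, times, and thresholds are illustrative, not measured data.

    # Hypothetical check of observed task times against a usability spec row.
    spec = {
        "find most expensive house": {"worst": 60.0, "target": 10.0, "best": 3.0},  # seconds
    }
    observed = {"find most expensive house": [14.2, 9.1, 11.8, 8.7, 10.5]}  # per-user times

    for task, times in observed.items():
        avg = sum(times) / len(times)
        s = spec[task]
        status = ("meets target" if avg <= s["target"]
                  else "acceptable" if avg <= s["worst"]
                  else "fails worst case")
        print(f"{task}: avg {avg:.1f}s -> {status}")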
11. Usability Test Setup
- Set of benchmark tasks
  - Easy to hard, specific to open-ended
  - Coverage of different UI features
  - E.g. find the 5 most expensive houses for sale
  - Different types: learnability vs. performance
- Consent forms
  - Not needed unless video-taping users' faces (new rule)
- Experimenters
  - Facilitator: instructs the user
  - Observers: take notes, collect data, video-tape the screen
  - Executor: runs the prototype if it is faked
- Users
  - 3-5 users; quality, not quantity
12. Usability Test Procedure
- Goal: mimic real life
  - Do not cheat by showing them how to use the UI!
- Initial instructions
  - "We are evaluating the system, not you."
- Repeat:
  - Give the user a task
  - Ask the user to think aloud
  - Observe; note mistakes and problems
  - Avoid interfering; hint only if completely stuck
- Interview
  - Verbal feedback
  - Questionnaire
- 1 hour / user
13. Usability Lab
14. Data
- Note taking
  - E.g. user keeps clicking on the wrong button
- Verbal protocol (think aloud)
  - E.g. user thinks that button does something else
- Rough quantitative measures
  - HCI metrics, e.g. task completion time, ...
- Interview feedback and surveys
- Video-tape screen & mouse
- Eye tracking, biometrics?
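A small illustrative sketch of turning the rough notes and timestamps above into quantitative measures; the event log, timestamps, and the "wrong" keyword are invented for the example.

    # Derive rough measures (task completion time, error count) from hand-logged events.
    from datetime import datetime

    events = [
        ("10:02:10", "task 1 start"),
        ("10:02:45", "clicked the wrong button"),
        ("10:03:05", "task 1 done"),
    ]

    def to_time(s):
        return datetime.strptime(s, "%H:%M:%S")

    completion = (to_time(events[-1][0]) - to_time(events[0][0])).seconds
    errors = sum(1 for _, note in events if "wrong" in note)
    print(f"task 1: {completion} s, {errors} error(s)")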
15. Analyze
- Initial reaction
  - "stupid user!", "that's developer X's fault!", "this sucks"
- Mature reaction
  - how can we redesign the UI to solve that usability problem?
  - the user is always right
- Identify usability problems
  - Learning issues, e.g. can't figure out or didn't notice a feature
  - Performance issues, e.g. arduous, tiring to solve tasks
  - Subjective issues, e.g. annoying, ugly
- Problem severity: critical vs. minor
16. Cost-Importance Analysis
- Importance: 1-5 (task effect, frequency)
  - 5: critical, major impact on user, frequent occurrence
  - 3: user can complete the task, but with difficulty
  - 1: minor problem, small speed bump, infrequent
- Ratio: importance / cost
  - Sort by this
  - 3 categories: must fix, next version, ignored

Problem | Importance | Solutions | Cost | Ratio I/C
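The table above can be worked mechanically; the sketch below (Python) computes the importance/cost ratio, sorts by it, and buckets problems into the three categories. The example problems, scores, and the cut-off ratios (2 and 1) are assumptions for illustration.

    # Cost-importance analysis: sort problems by importance/cost, then bucket them.
    problems = [
        {"problem": "zoom feature not noticed", "importance": 4, "cost": 2},
        {"problem": "confusing icon label",     "importance": 2, "cost": 1},
        {"problem": "crash on empty query",     "importance": 5, "cost": 8},
    ]

    for p in problems:
        p["ratio"] = p["importance"] / p["cost"]

    for p in sorted(problems, key=lambda p: p["ratio"], reverse=True):
        # Cut-offs are illustrative, not from the slides.
        category = ("must fix" if p["ratio"] >= 2
                    else "next version" if p["ratio"] >= 1
                    else "ignore")
        print(f'{p["problem"]:<28} I={p["importance"]} C={p["cost"]} '
              f'I/C={p["ratio"]:.1f} -> {category}')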
17. Refine UI
- Simple solutions vs. major redesigns
- Solve problems in order of importance/cost
- Example
  - Problem: user didn't know he could zoom in to see more
  - Potential solutions
    - Better zoom button icon, tooltip
    - Add a zoom bar slider (like Moosburg)
    - Icons for different zoom levels: boundaries, roads, buildings
    - NOT more help documentation!!! You can do better.
- Iterate
  - Test, refine, test, refine, test, refine, ...
  - Until? Meets the usability specification
18. Project Usability Evaluation
- Usability Evaluation
  - > 3 users; not (tainted) HCI students
  - Simple data collection (biometrics optional!)
  - Exploit this opportunity to improve your design
- Report
  - Procedure (users, tasks, specs, data collection)
  - Usability problems identified, specs not met
  - Design modifications
19. Controlled Experiments
20. Usability Test vs. Controlled Experiment
- Usability test
  - Formative: helps guide design
  - Single UI, early in design process
  - Few users
  - Usability problems, incidents
  - Qualitative feedback from users
- Controlled experiment
  - Summative: measures the final result
  - Compare multiple UIs
  - Many users, strict protocol
  - Independent & dependent variables
  - Quantitative results, statistical significance
21. What is Science?
22. Scientific Method
- Form hypothesis
- Collect data
- Analyze
- Accept/reject hypothesis
- How to prove a hypothesis in science?
  - Easier to disprove things, by counterexample
  - Null hypothesis: the opposite of the hypothesis
  - Disprove the null hypothesis
  - Hence, the hypothesis is supported
23. Empirical Experiment
- Typical question
  - Which visualization is better in which situations?
  - Spotfire vs. TableLens
24. Cause and Effect
- Goal: determine cause and effect
  - Cause: visualization tool (Spotfire vs. TableLens)
  - Effect: user performance time on task T
- Procedure
  - Vary the cause
  - Measure the effect
- Problem: random variation
  - Cause: vis tool OR random variation?
- (Diagram: real world -> collected data, with random variation -> uncertain conclusions)
25. Stats to the Rescue
- Goal
  - Show the measured effect is unlikely to result from random variation
- Hypothesis
  - Cause: visualization tool (e.g. Spotfire != TableLens)
- Null hypothesis
  - Visualization tool has no effect (e.g. Spotfire = TableLens)
  - Hence cause = random variation
- Stats
  - If the null hypothesis were true, then the measured effect would occur with probability < 5% (e.g. measured effect >> random variation)
- Hence
  - Null hypothesis unlikely to be true
  - Hence, hypothesis likely to be true
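One way to make the "measured effect vs. random variation" argument concrete is a permutation-style simulation: repeatedly shuffle the tool labels and count how often chance alone produces a difference as large as the observed one. This is only an illustration of the logic; the slides themselves use t-tests and ANOVA (later), and the timing data here is invented.

    # Permutation-style illustration of the null-hypothesis argument.
    import random

    spotfire  = [37, 41, 35, 44, 39, 42, 38, 40]   # invented task times (sec)
    tablelens = [30, 33, 29, 35, 31, 28, 34, 32]
    observed_diff = abs(sum(spotfire) / 8 - sum(tablelens) / 8)

    pooled = spotfire + tablelens
    hits, trials = 0, 10_000
    for _ in range(trials):
        random.shuffle(pooled)                      # pretend the tool labels are arbitrary
        a, b = pooled[:8], pooled[8:]
        if abs(sum(a) / 8 - sum(b) / 8) >= observed_diff:
            hits += 1

    print("p ~", hits / trials)   # small p => difference unlikely to be random variation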
26. Variables
- Independent variables (what you vary) and treatments (the variable values)
  - Visualization tool
    - Spotfire, TableLens, Excel
  - Task type
    - Find, count, pattern, compare
  - Data size (# of items)
    - 100, 1000, 1,000,000
- Dependent variables (what you measure)
  - User performance time
  - Errors
  - Subjective satisfaction (survey)
  - HCI metrics
27. Example: 2 x 3 design

                          Ind Var 2: Task Type
                          Task1    Task2    Task3
Ind Var 1:   Spotfire
Vis. Tool    TableLens

Each cell: measured user performance times (dep var)
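A tiny sketch enumerating the six cells of this 2 x 3 design with itertools.product; the extra independent variables from the previous slide could be added as further factors in the same way.

    # Enumerate the cells of the factorial design above.
    from itertools import product

    vis_tools = ["Spotfire", "TableLens"]
    task_types = ["Task1", "Task2", "Task3"]

    cells = list(product(vis_tools, task_types))   # 2 x 3 = 6 cells
    for tool, task in cells:
        print(tool, task)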
28. Groups
- Between-subjects variable
  - 1 group of users for each variable treatment
  - Group 1: 20 users, Spotfire
  - Group 2: 20 users, TableLens
  - Total: 40 users, 20 per cell
- Within-subjects (repeated) variable
  - All users perform all treatments
  - Counter-balancing order effects
  - Group 1: 20 users, Spotfire then TableLens
  - Group 2: 20 users, TableLens then Spotfire
  - Total: 40 users, 40 per cell
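A sketch of how the two grouping schemes above could be assigned in code, assuming 40 hypothetical users and the two tools; within-subjects order alternates between users to counter-balance order effects.

    # Between-subjects vs. within-subjects assignment (illustrative users and tools).
    users = [f"user{i:02d}" for i in range(40)]

    # Between-subjects: each user sees exactly one tool.
    between = {"Spotfire": users[:20], "TableLens": users[20:]}

    # Within-subjects: every user sees both tools, order counter-balanced.
    within = {u: (["Spotfire", "TableLens"] if i % 2 == 0
                  else ["TableLens", "Spotfire"])
              for i, u in enumerate(users)}

    print(len(between["Spotfire"]), "users per cell (between)")
    print(within["user00"], within["user01"])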
29. Issues
- Eliminate or measure extraneous factors
  - Randomize
  - Fairness
  - Identical procedures
  - Bias
- User privacy, data security
  - IRB (Institutional Review Board)
30. Procedure
- For each user
  - Sign legal forms
  - Pre-survey: demographics
  - Instructions
    - Do not reveal the true purpose of the experiment
  - Training runs
  - Actual runs
    - Give task
    - Measure performance
  - Post-survey: subjective measures
- n users
31. Data
- Measured dependent variables
- Spreadsheet:

User | Spotfire task 1 | Spotfire task 2 | Spotfire task 3 | TableLens task 1 | TableLens task 2 | TableLens task 3
32. Step 1: Visualize it
- Dig out interesting facts
- Qualitative conclusions
- Guide stats
- Guide future experiments
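A possible sketch of this step: load a wide spreadsheet laid out like the previous slide and box-plot every per-user time per condition before running any statistics, so spread and outliers are visible. The file name "results.csv" and its column names are assumptions.

    # Visualize all per-user times before doing stats.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("results.csv")            # columns: user, spotfire_task1, ..., tablelens_task3
    long = df.melt(id_vars="user", var_name="condition", value_name="time_sec")

    long.boxplot(column="time_sec", by="condition")
    plt.ylabel("Task time (sec)")
    plt.show()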
33. Step 2: Stats

                          Ind Var 2: Task Type
                          Task1    Task2    Task3
Ind Var 1:   Spotfire      37.2     54.5    103.7
Vis. Tool    TableLens     29.8     53.2    145.4

Average user performance times in seconds (dep var)
34. TableLens better than Spotfire?
- Problem with averages: lossy
  - Compares only 2 numbers
  - What about the 40 data values? (Show me the data!)
- (Bar chart: average performance time in secs, Spotfire vs. TableLens)
35. The real picture
- Need stats that compare all the data
- (Chart: all per-user performance times in secs, Spotfire vs. TableLens)
36. Statistics
- t-test
  - Compares 1 dep var on 2 treatments of 1 ind var
- ANOVA (Analysis of Variance)
  - Compares 1 dep var on n treatments of m ind vars
- Result
  - p = probability that the difference between treatments is random (null hypothesis)
  - statistical significance level
  - typical cut-off: p < 0.05
  - Hypothesis confidence = 1 - p
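A sketch of both tests using SciPy on invented timing data. Note that scipy.stats.f_oneway is a one-way ANOVA (one independent variable); a factorial ANOVA over several independent variables would need, e.g., statsmodels.

    # t-test and one-way ANOVA on invented task times.
    from scipy import stats

    spotfire  = [37, 41, 35, 44, 39, 42, 38, 40]
    tablelens = [30, 33, 29, 35, 31, 28, 34, 32]

    # t-test: 1 dep var, 2 treatments of 1 ind var
    t, p = stats.ttest_ind(spotfire, tablelens)
    print(f"t-test: p = {p:.4f}")              # p < 0.05 => difference unlikely to be random

    # One-way ANOVA: 1 dep var, n treatments (a third tool added for illustration)
    excel = [45, 50, 48, 52, 47, 49, 51, 46]
    f, p = stats.f_oneway(spotfire, tablelens, excel)
    print(f"ANOVA:  p = {p:.4f}")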
37. In Excel
38. p < 0.05
- Woohoo!
  - Found a statistically significant difference
  - Averages determine which is better
- Conclusion
  - Cause: visualization tool (e.g. Spotfire != TableLens)
  - Vis tool has an effect on user performance for task T
  - 95% confident that TableLens is better than Spotfire
    - NOT: "TableLens beats Spotfire 95% of the time"
  - 5% chance of being wrong!
  - Be careful about generalizing
39. p > 0.05
- Hence, no difference?
  - Vis tool has no effect on user performance for task T?
  - Spotfire = TableLens?
- NOT!
  - Did not detect a difference, but they could still be different
  - A potential real effect did not overcome random variation
  - Provides evidence for Spotfire = TableLens, but not proof
- Boring; basically found nothing
- How?
  - Not enough users
  - Need better tasks, data, ...
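For the "not enough users" case, a power calculation estimates how many users would be needed to detect an effect; the sketch below uses statsmodels with an assumed medium effect size of 0.5, not measured data.

    # How many users per group to detect a medium effect with 80% power at p < 0.05?
    from statsmodels.stats.power import TTestIndPower

    n = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
    print(f"~{n:.0f} users per group")         # roughly 64 for a medium effect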
40. Data Mountain
- Robertson, Data Mountain (Microsoft)
41. Data Mountain Experiment
- Data Mountain vs. IE favorites
- 32 subjects
- Organize 100 pages, then retrieve them based on cues
- Indep. vars
  - UI: Data Mountain (old, new), IE
  - Cue: Title, Summary, Thumbnail, all 3
- Dependent variables
  - User performance time
  - Error rates: wrong pages, failed to find within 2 min
  - Subjective ratings
42. Data Mountain Results
- Spatial Memory!
- Limited scalability?