Formal User Testing - PowerPoint PPT Presentation

About This Presentation

Title:

Formal User Testing

Slides: 35

Provided by: mattth5

Transcript and Presenter's Notes

Title: Formal User Testing

1
Formal User Testing

MIS 441 User Interface Design, Prototyping, and
Evaluation
Class 19 - March 27, 2000

2
Agenda for Today

Administrivia
Milestone 4 due today
Heuristic evaluation assignment due
Essay 1 should be returned next class
Milestone 5 (Hi-Fi and user test plan) due Mon,
Apr 17
Milestone 6 (HE) due Wed, Apr 26
Review heuristic evaluation (HE)
let's talk about the aggregated list of heuristic
violations
Formal user testing

3
...
Multiple evaluators independently produce a list
of usability problems (i.e., identify design
elements that violate one or more heuristics)
Evaluator 1
Evaluator 2
The findings are aggregated into a single list of
problems and the heuristics violated. At this
stage, redundancies are eliminated and
clarifications are made
Problem Heur Violated Description
...
The aggregated list is then sent back out to each
evaluator, who independently reviews the list
and assigns a severity rating to each problem.
Apply severity ratings
Apply severity ratings
The lists are collected and a summary report is
created that includes the average severity rating
for each problem. The evaluators and design team
then go through a debriefing session, discuss
the problems and potential fixes, and add fix
ratings to the summary report
Summary Report
Final Report
UI Redesign / Next Prototype
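
The averaging step behind the summary report can be sketched in Python. This is only an illustration: the problem names, the three evaluators, and the 0 to 4 severity scale are made up here and do not come from the class data.

```python
from statistics import mean

# Hypothetical severity ratings: one dict per evaluator, keyed by problem.
ratings = [
    {"No undo on delete": 4, "Inconsistent button labels": 2},
    {"No undo on delete": 3, "Inconsistent button labels": 1},
    {"No undo on delete": 4, "Inconsistent button labels": 3},
]


def average_severity(all_ratings):
    """Return {problem: mean severity} across every evaluator who rated it."""
    by_problem = {}
    for evaluator in all_ratings:
        for problem, severity in evaluator.items():
            by_problem.setdefault(problem, []).append(severity)
    return {problem: mean(values) for problem, values in by_problem.items()}


summary = average_severity(ratings)

# List the most severe problems first, as a summary report might.
for problem, avg in sorted(summary.items(), key=lambda item: item[1], reverse=True):
    print(f"{avg:.2f}  {problem}")
```

Sorting by average severity gives the debriefing session a natural order for discussing fixes.
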
4
What is User Testing?

Participants are real (or representative) users
Participants perform real tasks in a real work
context
The administrator
observes / records what participants do and say
needs to decide what to measure and how to measure
it
quantitative and qualitative performance and
preference measures
analyzes the data
diagnoses the problems
recommends changes to fix those problems

5
Why Do User Testing?

Can't tell how good or bad a UI is until
people use it!
preference vs. choice
e.g., surveys, interviews, focus groups, beta
testing
Other methods are based on evaluators who
may know too much (about the intent of the
design) or
may not know enough (about tasks, etc.)
e.g., cognitive walkthroughs, heuristic
evaluations
Hard to predict what real users will do until
they do it

6
Observation: A Critical Difference

Observing seems easy but is very complicated
Requires careful consideration and skill
Types of observation
direct observation
video recording
data logging software
Disadvantages of observation??
experiment effect
Hawthorne effect (1939)

7
Who Should Be On a User Testing Team?

Human factors specialist
Product marketing specialist
Software / hardware engineer
System designers and programmers
Technical communicators
Job training specialists
Customer service representatives
And many more...

8
Planning a User Test: User Test Proposal

Problem statement or test objective
Participant profile
Scenarios
Measures to collect
Data collection methods
Testing environment
Roles of design team members

9
Test Objectives
Test Objective | User profile | Scenarios | Measures to collect | Data collection methods | Testing environment

What is the focus of each user test (evaluation)?
easy to learn, easy to remember, efficient to
use, few errors, aesthetically pleasing
General objective example
will new users be able to navigate through the
menus quickly and easily? (learnability)
Specific objective example
will new users be able to find the right menu
path to read, write, send, respond to, forward,
save, and delete a message?
What you want to learn from the test determines
who the participants are, what tasks they will
perform during the evaluation, and what measures
to collect

10
Measures to Collect

Two types of data
process data
observations of what users are doing and thinking
bottom-line data (i.e., performance measures)
counts of actions / behaviors that you see
time, errors, successes

11
Using the Results of Process Data (Think Aloud)

Summarize the data
make a list of all critical incidents (CI)
positive: something they liked or that worked
well
negative: difficulties with the UI
include references back to the original data
try to judge why each difficulty occurred
What does the data tell you?
does the UI work the way you thought it would?
is your model consistent with the users'
conceptual model?
great way to better understand the users' conceptual
model
something missing?

12
Using the Results (Think Aloud)

Update task analysis and rethink design
rate severity and ease of fixing critical
incidents
fix severe problems and make the easy fixes
Will thinking aloud give the right answers?
not always
if you ask a question, people will always give an
answer, even when it has nothing to do with the
facts

13
Measuring Bottom-Line Usability

Situations in which numbers are useful
time requirements for task completion
number of successful completions
number of errors made by users
compare 2 designs on speed or number of errors
Do not combine with think aloud protocol
talking can affect speed and accuracy (neg. and
pos.)
your project is an exception to this general rule
Time is easy to record
Error or successful completion is harder
define in advance what this means
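
As a rough illustration of these bottom-line measures, the Python sketch below summarizes time, errors, and successful completions for one task. The session records and field names are invented for the example. The `completed` flag stands in for whatever definition of success the team agreed on before the test.

```python
from statistics import mean, median

# Hypothetical session records for one task; all values are invented.
sessions = [
    {"user": "P1", "seconds": 95, "errors": 1, "completed": True},
    {"user": "P2", "seconds": 140, "errors": 3, "completed": True},
    {"user": "P3", "seconds": 300, "errors": 5, "completed": False},
    {"user": "P4", "seconds": 80, "errors": 0, "completed": True},
]


def summarize(records):
    """Compute simple bottom-line measures for one task."""
    finished = [r for r in records if r["completed"]]
    return {
        "completion_rate": len(finished) / len(records),
        # Time is reported only for users who actually finished the task.
        "mean_seconds": mean(r["seconds"] for r in finished),
        "median_seconds": median(r["seconds"] for r in finished),
        "total_errors": sum(r["errors"] for r in records),
    }


result = summarize(sessions)
print(result)
```

Reporting time only for completed attempts is one possible choice. Whichever rule is used should be fixed before the sessions are run.
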

14
Bottom-Line Data

Typical Performance Measures
time to finish a task
time spent navigating menus
time spent in the online help
time to find information in the manual
time spent recovering from errors
number of wrong menu choices
number of incorrect choices in the dialog boxes
number of wrong icon choices
number of repeated errors (the same error more
than once)
number of calls to the help desk or for aid
number of screens of on-line help looked at
number of repeated looks at the same help screen
number of times turned to the manual
number of pages looked at in each visit to the
manual

Typical Subjective User Preference Measures
Ratings of
ease of learning
ease of using the product
ease of doing a particular task
ease of installing the product
helpfulness of the on-line help
ease of finding information in the manual
ease of understanding the information
usefulness of the examples in the help
Preferences and reasons for the preferences
over a previous version
over a competitor's product
over the way they are doing their tasks now
Predictions
Would you buy this product?
Would you pay extra for the manual?
How much would you pay for this product?
Spontaneous Comments
"I don't understand this message!"
15
Statistical Analysis of Bottom-Line Data

Example: trying to get task time < 30 min.
test gives 20, 15, 40, 90, 10, 5
Sample mean = 30, median = 17.5. Looks good!
wrong answer, not certain of anything
Factors contributing to our uncertainty
small number of test users (n = 6)
results are very variable (standard deviation ≈ 32)
general rule: 95% confident that true mean lies
within 2 standard deviations of the sample mean
Confidence interval is about -34 minutes to 94
minutes
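
The arithmetic on this slide can be checked with a short Python sketch that uses only the standard library. The variable names are illustrative. The sketch reproduces the slide's rough "mean plus or minus 2 standard deviations" range. For comparison it also computes an interval based on the standard error of the mean (standard deviation divided by the square root of n). That second interval is an addition here, not something stated on the slide.

```python
import math
import statistics

# Task times in minutes, taken from the slide.
times = [20, 15, 40, 90, 10, 5]

n = len(times)
mean = statistics.mean(times)
median = statistics.median(times)
sd = statistics.stdev(times)  # sample standard deviation (n - 1 denominator)

# The slide's rule of thumb: mean plus or minus 2 standard deviations.
slide_low, slide_high = mean - 2 * sd, mean + 2 * sd

# Assumption (not on the slide): an interval based on the standard error.
sem = sd / math.sqrt(n)
sem_low, sem_high = mean - 2 * sem, mean + 2 * sem

print(f"n={n} mean={mean} median={median} sd={sd:.1f}")
print(f"mean +/- 2 sd:  {slide_low:.0f} to {slide_high:.0f} minutes")
print(f"mean +/- 2 sem: {sem_low:.0f} to {sem_high:.0f} minutes")
```

Under either calculation the upper bound is well above the 30-minute target, so six users are not enough to conclude that the target has been met.
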

16
Measuring User Preferences

How much users like or dislike the system
Likert scale
Semantic differential scale
can ask users to rate on a scale of 1 to 10
can have them choose among statements
"Best UI I've ever used", "better than
average", ...
If you get many low ratings, you are in trouble
Can get some useful data by asking open-ended
questions about
what they liked, disliked, where they had
trouble, best part, worst part, etc.
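
A minimal Python sketch of how ratings from a single Likert item might be summarized is shown below. The question wording, the 1 to 5 scale, and the responses are hypothetical. Treating ratings of 1 or 2 as "low" is an assumption made for the example.

```python
from collections import Counter
from statistics import median

# Hypothetical responses to "The system was easy to use"
# (1 = strongly disagree, 5 = strongly agree).
responses = [4, 5, 2, 4, 3, 1, 4, 2]

counts = Counter(responses)
# Assumption: ratings of 1 or 2 count as "low".
low_share = sum(1 for r in responses if r <= 2) / len(responses)

# Simple text histogram of the rating distribution.
for point in range(1, 6):
    print(point, "#" * counts[point])

print("median rating:", median(responses))
print(f"share of low ratings: {low_share:.0%}")
```

Looking at the full distribution and the share of low ratings, not only the average, makes it easier to spot the "many low ratings" case the slide warns about.
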

17
Simple Single-Room Setup

Advantages
test monitor can see what is going on with the
participant
verbal cues, facial expressions, mannerisms
allows interaction with participant in early,
exploratory tests
may be more natural to think aloud with someone
in the room
Disadvantages
test monitor's behavior may affect the
participant's behavior
there is limited space for observers

18
Modified Single-Room Setup

Advantages
Test monitor can be less concerned about
controlling body language, mannerisms, taking
notes, etc.
Participant does not feel isolated since monitor
is still in the room
Participant more likely to think aloud
Disadvantages
Monitor can't see subtle facial expressions /
mannerisms as well
Monitor location may make user feel
self-conscious or uneasy

19
Electronic Observation-Room Setup

Advantages
Same as single-room setup
Observers don't interfere with or bias the users
Disadvantages
Monitor behavior can bias user
Requires the use of 2 rooms at a time

20
Classic Testing Laboratory Setup

Advantages
Unobtrusive data collection (but user still knows
she is being videotaped)
Monitor and observers can talk to each other and
discuss how to solve problems that come up
Setup can accommodate many observers
Disadvantages
Requires lots of money, resources, and commitment
to testing

21
Testing Environment Trade-Offs

Test monitor access to participant
Accommodations for the observers
location
number of observers allowed
Cost
equipment: video cameras, data-logging
equipment, one-way mirrors, etc.
space: number and size of rooms occupied during
testing

22
Roles of the Design Team Members During Evaluation

Test monitor / administrator
greets, interacts with, and debriefs the test
users
accumulates and communicates test results
Timers
keep track of beginning, ending, and elapsed time
of test activities
Video recording operators
record comments by test users, instructions by
monitor, and all interactions between monitor,
participant, and prototype
camera angles to maximize user/product visibility

23
Roles of the Design Team Members During
Evaluation (Continued)

Product / technical experts
make sure system does not malfunction during the
test
Other testing roles
play a customer role in the test
simulate help calls on a hotline
Test observers
development team: leads to better appreciation
for user-centered design perspective and the
problems users will have
do not let managers of test users be observers at
the test
members of other project development teams

24
Characteristics of an Effective Test Monitor

Grounding in basic usability engineering
cognitive/information processing, user-centered
design, human factors expertise
Quick learner
understand / interpret the comments / actions of
test users
able to probe users and ask effective follow-up
questions
Instant rapport with participants
make friends, put user at ease, develop trust
Excellent memory

25
Characteristics of an Effective Test Monitor
(Continued)

Good listener
listen with new ears each time
Comfortable with ambiguity
Flexibility
know when to deviate from the test plan
Long attention span
There is no predicting when a gem of a discovery
will arise during a test session
Usually 10-20 sessions, 2-3 hours each, watching
the same tasks repeatedly

26
Characteristics of an Effective Test Monitor
(Continued)

Empathetic people person
Good communicator
presenting information to design team
making recommendations
writing skills for written report
presentation skills for convincing team members
of changes that need to be made
Good organizer

27
Preparing Test Materials

Recruiting letter and pretest questionnaire
Test / orientation script (sample in Rubin, page
150)
read verbatim usually
tells users what will happen during the test
intended to put them at ease
product is being evaluated, not the user
Nondisclosure agreement and tape consent form

28
Preparing Test Materials (Continued)

Task scenarios
List of measures / data to be collected
performance and preference data
Posttest questionnaire
preference information (opinions and feelings)
from the user
usually lots of Likert and semantic differential
scales
Debriefing topics (issues)
get open-ended feedback and clarifications from
the user

29
Usability Testing Services

Usability Sciences
http://www.usabilitysciences.com/
seeking users to usability test software products
and get paid!
Users are videotaped and asked for feedback as
they perform a set of tasks with the product(s)
being tested. Your feedback is turned into
recommendations for the client. In most studies,
tests last roughly 2-3 hours. All users are
compensated for their time. In most cases, all
testing is conducted in Usability Sciences'
testing labs in Las Colinas in the Dallas/Fort
Worth metroplex. If you would like to
participate in a usability test, please contact
Stephanie Farley at testing_at_usabilitysciences.com,
or call us at (972) 550-1599.

30
Usability Testing Services (Continued)

Interface Analysis Associates
http://www.interface-analysis.com/home.shtml
Ergosoft Laboratories Incorporated
http://www.ergolabs.com/
Check out Ergosoft's links and downloads page
http://www.ergolabs.com/links_and_downloads/links_and_downloads.htm
There are many others...

31
Usability Testing Services (Continued)

Human Factors International, Inc.
Design and UT consultants (colors / layout /
wording)
http://www.humanfactors.com/
Siemens Usability Center
http://www.aut.sea.siemens.com/usability/testing.htm

32
On-line PC Magazine Article

Making Software Easier Through Usability
Testing
http://www.zdnet.com/pcmag/pctech/content/17/17/tu1717.001.html
Microsoft's usability lab
Talks about the setup of Microsoft's usability
testing labs
User testing of Windows 95, Windows 98, and
Office 97
http://www.microsoft.com/usability
And there are job openings at Microsoft for
usability groups
IBM (user-centered design)
http://www-3.ibm.com/ibm/easy/eou_ext.nsf/Publish/17

33
Milestone 5

Milestone 5a
develop the revised lo-fi storyboards
develop the hi-fi prototype based on these
storyboards
create the hi-fi storyboards with screen shots
<Print Screen> captures the entire monitor screen
<Alt><Print Screen> captures just the active
window
Milestone 5b
develop the formal user test proposal
from test objectives to roles of the design team
members
prepare the test materials to be used in the test
however, you should not perform the user test at
this point

34

Next Class

More specifics on conducting a user test
continue reading through the assigned readings
from Rubin
you can skim through sections with which you are
familiar (e.g., discussions about the user
profile, task analysis, scenarios, etc.)
