Title: Usability Evaluation
1 Usability Evaluation
- Dr. Yan Liu
- Department of Biomedical, Industrial and Human Factors Engineering, Wright State University
2 Introduction
- What Is Usability Evaluation?
- Assess the extent to which the product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use
- Usability Evaluation in the Design Process
- Should occur throughout the design life cycle, with the results of the evaluation feeding back into modifications to the design
- Usability Evaluation Methods
- Expert analysis is based on expert evaluation, without direct user involvement
- Particularly useful for assessing early designs and prototypes
- User participation methods involve users to study actual use of the system
- Usually require a working prototype or implementation
- Users may also be involved in assessing early design ideas
- e.g. focus groups, in which a group of people are asked about their opinions of the system
3 Goals of Usability Evaluation
- Assess the System's Functionality
- The system's functionality must accord with the user's requirements
- Making the appropriate functionality available within the system
- Making the functionality clearly reachable by the user in terms of the actions that the user needs to take to perform the tasks
- Assess the User's Experience of the Interaction
- Aspects such as how easy the system is to learn and use, and the user's satisfaction with it
- The user's enjoyment and emotional response (particularly in systems aimed at entertainment)
- Identify Specific Problems with the Design
- Aspects of the design which, when used in their intended context, cause unexpected results or confusion among users
4 Evaluation Through Expert Analysis
- Overview
- The basic intention is to identify any areas that are likely to cause difficulties because they violate known design rules or ignore accepted empirical results
- Flexible and can be used at any stage in the development process
- Design specifications, storyboards and prototypes, full implementations
- Relatively cheap
- Does not assess actual use of the system
- Approaches
- Cognitive walkthrough
- Heuristic evaluation
- Use of models
- Use of previous work
5 Cognitive Walkthrough
- Overview
- Proposed in Polson et al. (1992)
- Main focus is usually to establish how well a system supports exploratory learning
- Phase One: Collect Information about the System, Users, and Tasks
- A fairly detailed specification or prototype of the system
- An indication of who the users are and what kind of experience and knowledge the evaluators can assume about them
- A description of the representative tasks most users would want to perform on the system
- A complete, written list of the actions needed to complete the tasks with the proposed system
6 Cognitive Walkthrough
- Phase Two
- The evaluators step through the action sequence identified earlier to critique the system and tell a believable story about its usability, asking themselves a set of questions for each step
- Q1: Is the effect of the action the same as the user's goal at that point?
- e.g. If the effect of an action is to save a document, is saving a document what the user wants to do?
- Q2: Will the user see that the action is available? (visibility of the action)
- e.g. Will the user see the control that is used to save a document?
- Q3: Once the user has found the correct action, will he/she know it is the one he/she needs? (meaning and effect of the action)
- e.g. Even if the user can see the control, will the user recognize that it is the one he/she is looking for to complete the task?
- Q4: After the action is taken, will the user understand the feedback he/she gets?
- Appropriate feedback should be provided to inform the user of what has happened
7 Cognitive Walkthrough
- Phase Two (Cont.)
- Document the cognitive walkthrough to keep a record of the evaluators' results
- Pros and cons of the system
- It is a good idea to produce some standard evaluation forms for the walkthrough
- The cover form would list the four questions asked during the walkthrough process, as well as the date and time of the walkthrough and the names of the evaluators
- For each action, a separate standard form is filled out that answers each of the four questions
- Any negative answer for a particular action should be documented in detail on a separate usability problem report sheet, including the severity of the problem
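The per-action forms described above can be sketched as a small record type. This is only an illustration of how the paperwork might be structured; the class and field names are hypothetical, not part of the published method:

```python
from dataclasses import dataclass

# One form per action in the walkthrough; field and class names are
# hypothetical, chosen to mirror the four questions Q1-Q4.
@dataclass
class ActionForm:
    action: str                   # e.g. "Press the time-record button"
    q1_right_effect: bool         # Q1: does the effect match the user's goal?
    q2_action_visible: bool       # Q2: will the user see the action?
    q3_recognized: bool           # Q3: will the user know it is the right one?
    q4_feedback_understood: bool  # Q4: will the user understand the feedback?
    notes: str = ""

def problem_reports(forms):
    """Collect forms with any negative answer, for the problem report sheet."""
    return [f for f in forms
            if not (f.q1_right_effect and f.q2_action_visible
                    and f.q3_recognized and f.q4_feedback_understood)]

forms = [
    ActionForm("Press the time-record button", True, True, False, True,
               notes="Unclear which button is the time-record button"),
    ActionForm("Press digits 1 8 0 0", True, True, True, True),
]
print(len(problem_reports(forms)))  # 1 problem to document in detail
```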
8 Suppose we are designing a remote control for a video recorder (VCR) and are interested in the task of programming the VCR to do timed recordings. Our initial design is shown in the following figures. This VCR allows the user to program up to three timed recordings in different streams. The next available stream number is automatically assigned. We want to evaluate the design using the cognitive walkthrough method.
After the Time-Record Button Has Been Pressed
9
- Collect information about the system, users, and tasks
- We can assume that the user is familiar with VCRs but not with this particular design
- Identify a representative task: programming the video to time-record a program starting at 1800 and finishing at 1915 on channel 4 on Oct. 16, 2008
- Specify the action sequence for the task in terms of the user's action (UA) and the system's display or response (SD)
- UA1: Press the time-record button
- SD1: Display moves to timer mode. Flashing cursor appears after Start
- UA2: Press digits 1 8 0 0
- SD2: Each digit is displayed as typed and the flashing cursor moves to the next position
- UA3: Press the time-record button
- SD3: Flashing cursor moves to after End
- UA4: Press digits 1 9 1 5
- SD4: Each digit is displayed as typed and the flashing cursor moves to the next position
- UA5: Press the time-record button
- SD5: Flashing cursor moves to after Channel
- UA6: Press digit 4
- SD6: Digit is displayed as typed and the flashing cursor moves to the next position
- UA7: Press the time-record button
- SD7: Flashing cursor moves to after Date
- UA8: Press digits 1 6 1 0 0 8
After the Time-Record Button Has Been Pressed
10
- Step through the action sequence; for each action, we must answer the four questions and tell a story about the usability of the system.
- UA1: Press the time-record button
- Q1: Is the effect of the action the same as the user's goal at that point?
- The time-record button initiates timer programming. It is reasonable to assume that a user who is familiar with VCRs would be trying to do this as his/her first goal
- Q2: Will the user see that the action is available?
- The time-record button is visible on the remote control
- Q3: Once the user has found the correct action, will he/she know it is the one he/she needs?
- It is not clear which button is the time-record button. The clock icon is a possible candidate, but this could be interpreted as a button to change the time. Other possible candidates might be the button with a filled circle or the leftmost button in the 4th row. The correct choice is the clock icon, but it is quite possible that the user could fail at this point. This identifies a potential usability problem
- Q4: After the action is taken, will the user understand the feedback he/she gets?
- Once the action is taken, the display changes to the time-record mode and shows familiar headings (Start, End, Channel, and Date). Therefore, it is reasonable to assume the user would recognize these as indicating successful completion of the first action

We have found a potential usability problem regarding recognizing the time-record button. Therefore, we may have to check whether our target user group could correctly distinguish the time-record button from the others on the remote control.
The same procedure is followed for each action in the action sequence.
11 Heuristic Evaluation
- Overview
- Proposed in Molich and Nielsen (1990)
- A method for structuring the critique of a system using a set of relatively simple and general heuristics
- Heuristics are guidelines, general principles, or rules of thumb that can guide a design decision or be used to critique a decision that has been made
- A flexible and cheap approach
- Can be performed on a design specification for evaluating early design, on storyboards and prototypes, and on fully functioning systems
- Often considered a discount usability technique
- The general idea is that several evaluators independently critique a system to come up with potential usability problems
- Between three and five evaluators is sufficient, with five usually resulting in about 75% of the overall usability problems being discovered
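The "about 75% with five evaluators" figure follows from a problem-discovery model reported by Nielsen and Landauer, Found(i) = N(1 - (1 - L)^i), where L is the chance that a single evaluator detects a given problem. A sketch, with L chosen for illustration (reported values vary by study, roughly 0.2 to 0.5):

```python
# Expected proportion of usability problems found by i independent evaluators,
# following the Nielsen-Landauer model Found(i) = N * (1 - (1 - L)**i).
# L = 0.24 is an illustrative single-evaluator detection rate that
# reproduces the "~75% with five evaluators" figure quoted above.
def proportion_found(i, detection_rate=0.24):
    return 1 - (1 - detection_rate) ** i

for i in (1, 3, 5):
    print(i, round(proportion_found(i), 2))
```

The curve flattens quickly, which is why adding evaluators beyond five yields diminishing returns.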
12 Heuristic Evaluation
- Nielsen's Ten Heuristics (Nielsen, 1994)
- A set of ten heuristics is provided to aid the evaluators in discovering usability problems
- Related to design principles and guidelines; can be supplemented where required by heuristics that are specific to the particular domain
- Each evaluator assesses the system and notes violations of any of these heuristics that would indicate a potential usability problem
- Each evaluator assesses the severity of each usability problem, based on four factors
- How common the problem is
- How easy it is for the user to overcome
- Whether it will be a one-off problem or a persistent one
- How seriously the problem will be perceived
- Once each evaluator has completed his/her separate assessment, all the problems are collected and the mean severity ratings are calculated to help the designers determine the most important problems
13 Overall severity rating on a scale of 0 to 4 in heuristic evaluation
- 0: I don't agree that this is a usability problem at all
- 1: Cosmetic problem only; need not be fixed unless extra time is available on the project
- 2: Minor usability problem; fixing this should be given low priority
- 3: Major usability problem; important to fix, so should be given high priority
- 4: Usability catastrophe; imperative to fix this before the product can be released
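The pooling step described above (collecting each evaluator's ratings and computing mean severity per problem) can be sketched as follows; the problem names and ratings are invented for illustration:

```python
# Mean severity (0-4 scale) per problem across independent evaluators;
# problems sorted so the most severe are addressed first.
ratings = {  # problem -> one rating per evaluator (illustrative data)
    "time-record button hard to identify": [3, 4, 3],
    "no confirmation after programming":   [2, 1, 2],
}

def ranked_problems(ratings):
    means = {p: sum(r) / len(r) for p, r in ratings.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)

for problem, mean in ranked_problems(ratings):
    print(f"{mean:.2f}  {problem}")
```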
14 Heuristic Evaluation
- Nielsen's Ten Heuristics (Cont.)
- Heuristic 1. Visibility of system status
- Always keep users informed about what is going on, through appropriate feedback within reasonable time
- If an operation will take some time, give an indication of how long and how much is complete
- Heuristic 2. Match between the system and the real world
- The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms
- Follow real-world conventions, making information appear in a natural and logical order
- Heuristic 3. User control and freedom
- Users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue
- Support undo and redo
15 Heuristic Evaluation
- Nielsen's Ten Heuristics (Cont.)
- Heuristic 4. Consistency and standards
- Users should not have to wonder whether different words, situations, or actions mean the same thing
- Follow platform conventions and accepted standards
- Heuristic 5. Error prevention
- Make it difficult to make errors
- Even better than good error messages is a careful design which prevents a problem from occurring in the first place
- Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action
- Heuristic 6. Recognition rather than recall
- Minimize the user's memory load by making objects, actions, and options visible
- The user should not have to remember information from one part of the dialogue to another
- Instructions for use of the system should be visible or easily retrievable whenever appropriate
16 Heuristic Evaluation
- Nielsen's Ten Heuristics (Cont.)
- Heuristic 7. Flexibility and efficiency of use
- Allow users to tailor frequent actions
- Accelerators, unseen by the novice user, may often speed up the interaction for the expert user, so that the system can cater to both experienced and inexperienced users
- Heuristic 8. Aesthetic and minimalist design
- Dialogues should not contain information which is irrelevant or rarely needed
- Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility
- Heuristic 9. Help users recognize, diagnose, and recover from errors
- Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution
- Heuristic 10. Help and documentation
- Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation
- Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large
17 Model-Based Approach
- Cognitive and Design Models
- Provide a means of combining design specification and evaluation into the same framework
- GOMS (goals, operators, methods and selection) model
- Predicts user performance with a particular interface and can be used to filter particular design options
- Low-level modeling techniques
- Provide predictions of the time users will take to perform low-level physical tasks
- e.g. the keystroke-level model
18 Use of Previous Studies
- Experimental Results and Empirical Evidence from Previous Studies
- Can be used to support or refute some aspects of the design
- Some are specific to particular domains, but many deal with more generic issues and can be applied in a variety of situations
- Review Previous Studies Carefully
- Examine the experimental design, participants, data analyses, and assumptions
- e.g. An experiment testing the usability of a particular style of help system using novice participants may not be applicable to the evaluation of a help system designed for expert users
19 Evaluation Through User Participation
- Overview
- User participation in evaluation tends to occur in the later stages of development
- Tested on a working prototype
- Prototypes range from a simulation of the system's interactive capabilities without its underlying functionality, through a basic functional prototype, to a fully implemented system
- Observing and surveying users can contribute to earlier design stages
- Design specification and requirements capture
- Evaluation Styles
- Laboratory studies
- Field studies
20 Evaluation Through User Participation
- Laboratory Studies
- Users are taken out of their normal work environment to take part in controlled tests (often in a specialist usability laboratory)
- Advantages
- Allow manipulation of the situation in order to uncover problems or observe less-used procedures
- Allow comparison of alternative designs within a controlled context, which reduces ambiguity in interpreting results regarding cause and effect
- The possible influence of extraneous factors is reduced
- Laboratory observation is the only option in some situations
- e.g. the system is located in a dangerous or remote location (such as a space station)
21 Evaluation Through User Participation
- Laboratory Studies (Cont.)
- Disadvantages
- Artificiality of laboratory experiments
- A well-equipped usability laboratory may contain sophisticated equipment (e.g. audio/visual recording and analysis facilities) that cannot be replicated in the work environment
- Participants usually operate in an interruption-free environment in the laboratory setting, which is seldom the case in the real world
- It is especially difficult to observe several people cooperating on a task in a laboratory situation, because interpersonal communication is so heavily dependent on context
22 Evaluation Through User Participation
- Field Studies
- The designer or evaluator goes into the users' work environment in order to observe the system in action
- Advantages
- Users are observed in their natural environment
- Allow observation of interactions between systems and between individuals that would have been missed in laboratory studies
- Disadvantages
- Lack of control over many aspects of the situation makes it difficult to draw cause-and-effect relationships when interpreting results
- High levels of ambient noise, greater levels of movement and constant interruptions (e.g. phone calls) make field observation difficult
23 Experimental Evaluation
- Overview
- One of the most powerful methods to compare alternative designs
- Provides empirical evidence to support particular hypotheses
- Important Elements
- A hypothesis to test
- e.g. icons with naturalistic images are easier to remember than icons with abstract images
- Independent variable(s)
- The variable(s) that will be manipulated by the experimenter to study its (their) impact on the dependent variable
- e.g. the type of icons (naturalistic images vs. abstract images)
- Dependent variable(s)
- The variable(s) that will be measured to describe the outcome of experimental runs
- e.g. the number of mistakes made in using the icons
24 Experimental Evaluation
- Important Elements (Cont.)
- Experimental method
- Depends on the available resources and the tasks performed in the experiment
- Between-subject design
- Participants are randomly assigned to the various conditions so that each participates in only one group
- Within-subject design
- Each participant participates in all conditions
- Mixed design
- A combination of between-subject and within-subject designs
- Research participants
- How to recruit the research participants, their characteristics, how many, etc.
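Random assignment for a between-subject design like the icon experiment above can be sketched as follows; the participant IDs and condition names are invented for illustration:

```python
import random

# Randomly assign participants to conditions for a between-subject design;
# each participant experiences exactly one condition, with equal group sizes.
def assign_between_subjects(participants, conditions, seed=0):
    rng = random.Random(seed)  # fixed seed for a reproducible assignment
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    for i, p in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(p)
    return groups

groups = assign_between_subjects([f"P{i}" for i in range(12)],
                                 ["naturalistic icons", "abstract icons"])
print({c: len(ps) for c, ps in groups.items()})  # 6 participants per condition
```

A within-subject design would instead present every condition to every participant, typically counterbalancing the order to control for learning effects.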
25 Observational Techniques
- Overview
- Gather information about actual use of a system by observing users interacting with it
- Users are usually asked to complete a set of predetermined tasks
- Users may be observed going about their normal duties if the observation is carried out in their place of work
- Think Aloud
- A form of observation during which the user is asked to say aloud what he/she is doing while being observed
- Advantages
- Simple; requires little expertise to perform
- Can provide useful insight into problems with an interface
- Can be used for evaluation throughout the design process
- Disadvantages
- The information provided is often subjective and may be selective, depending on the tasks provided
- The very act of describing what he/she is doing often changes the way the user does it
26 Observational Techniques
- Cooperative Evaluation
- A more relaxed variation of the think aloud process
- The user is encouraged to see himself/herself as a collaborator in the evaluation and not simply as an experimental participant
- The evaluator can ask the user questions if his/her behavior is unclear
- The user can ask the evaluator for clarification if a problem arises
- Advantages
- The process is less constrained and therefore easier for the evaluator to learn to use
- The user is encouraged to criticize the system
- The evaluator can clarify points of confusion at the time they occur and so maximize the effectiveness of the approach for identifying problem areas
27 Observational Techniques
- Post-Task Walkthrough (Retrospective Recall)
- Transcripts of the participant's actions are played back to the participant, who is invited to comment or is directly questioned by the evaluator
- Usually done straightaway
- May be done after a delay
- The evaluator has some time to frame suitable questions and focus on specific incidents
- The answers are more likely to be the participant's post hoc interpretation
- Useful to identify reasons for actions and alternatives considered
- Necessary in cases where think aloud is not possible
- e.g. during a critical task or when the task is too intensive
28 Observational Techniques
- Protocol Recording
- A protocol refers to the record of an evaluation session
- Paper and pencil
- Primitive and cheap
- Allows the evaluator to note interpretations and extraneous events as they occur
- Hard to get detailed information; limited by the evaluator's writing speed
- Coding schemes for frequent activities can improve the rate of recording substantially, but can take some time to develop
- A variation is to use a notebook computer for direct entry
- Limited by the evaluator's typing speed
- Loses the flexibility of paper for writing styles, quick diagrams and spatial layout
- A dedicated note-taker, separate from the evaluator, is recommended if this is the only recording facility available
29 Observational Techniques
- Protocol Recording (Cont.)
- Audio recording
- Useful if the user is actively thinking aloud
- May be difficult to record sufficient information to identify exact actions in later analysis
- Can be difficult to match an audio recording to some other form of protocol (e.g. a handwritten script)
- Video recording
- Allows us to see what the participant is doing
- Choosing suitable camera positions and viewing angles to get sufficient detail can be difficult when the user may move out of the view of the camera
- For single-user computer-based tasks, two video cameras are typically used
- One camera looks at the computer screen (may not be necessary if the computer system is being logged)
- One camera with a wider focus records the user's face and hands
30 Observational Techniques
- Protocol Recording (Cont.)
- Computer logging
- Advantages
- Relatively easy and cheap method to record user actions at a keystroke level
- One of the most popular recording methods that observe users without interrupting their plans and actions
- Can be used for longitudinal studies where we observe users over periods of weeks or months
- Disadvantages
- Keystroke data only tell us about the lowest-level actions, not why they are performed or how they are structured
- The sheer volume of data collected can become unmanageable without automatic analysis
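As a first step toward the automatic analysis the slide calls for, action frequencies can be counted from a log. The (timestamp, event type, action) log format below is an assumption for illustration, not any particular logging tool's output:

```python
from collections import Counter

# Summarize a keystroke-level log: the raw events say nothing about intent,
# but frequency counts are a first step toward automatic analysis.
log = [  # (timestamp, event type, action) -- illustrative format
    ("10:02:01", "key", "Ctrl+S"),
    ("10:02:05", "key", "a"),
    ("10:02:05", "key", "Ctrl+S"),
    ("10:02:09", "click", "Save button"),
]

def action_frequencies(log):
    return Counter(action for _, _, action in log)

print(action_frequencies(log).most_common(1))  # [('Ctrl+S', 2)]
```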
31 Observational Techniques
- Protocol Recording (Cont.)
- User notebooks
- The participants are asked to keep logs of their activities or problems
- Records are at a very coarse level
- Entries every few minutes or hourly
- Especially useful in longitudinal studies and when we want a log of unusual or infrequent tasks and problems
- Mixture of recording methods
- Different methods can complement one another
- e.g. keep a paper note of special events as well as use more sophisticated audio/visual recording
- Synchronization problems arise when using a collection of different sources
32 Observational Techniques
- Automatic Protocol Analysis Tools
- Very important as evaluation tools, offering a means of handling the large volumes of data collected in observational studies and allowing a systematic approach to data analysis
- Noldus Observer XT (http://www.noldus.com)
- Select data for analysis
- Visualize data
- Analyze data with different techniques
- Multi-level analysis
- Statistical analysis
- Compare results from different analyses
- Calculate inter- and intra-rater reliability
- etc.
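Inter-rater reliability, listed above among the tool's analyses, is commonly quantified with Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch with invented codings (this is a generic illustration, not the Noldus implementation):

```python
from collections import Counter

# Cohen's kappa: agreement between two raters coding the same events,
# corrected for the agreement expected by chance.
def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum(ca[c] * cb[c] for c in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["look", "type", "look", "pause", "type", "type"]
b = ["look", "type", "pause", "pause", "type", "look"]
print(round(cohens_kappa(a, b), 2))  # 0.5
```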
33 Query Techniques
- Overview
- Directly ask the user about the interface
- Useful in eliciting detail of the user's view of a system
- Advantages
- Get the user's viewpoint directly
- Reveal issues that have not been considered by the designer
- Relatively simple and cheap to administer
- Disadvantages
- The information gathered is necessarily subjective
- The information may be a rationalized account of events rather than a wholly accurate one
- Difficult to get accurate feedback about alternative designs if the user has not experienced them
- Provide useful supplementary material to other methods
34 Query Techniques
- Interviews
- A direct and structured way of gathering information
- Advantages
- The level of questions can be varied to suit the context
- Can be effective for high-level evaluation, particularly in eliciting information about user preferences, impressions and attitudes
- May also reveal problems that have not been anticipated by the designer or that have not occurred under observation
- The evaluator can probe the user more deeply on interesting issues as they arise
- Interviews should be planned in advance
- Interviews are structured around a set of prepared central questions
- Helps to focus the purpose of the interview
- Ensures a base of consistency between the interviews of different users
35 Query Techniques
- Questionnaires
- Disadvantages
- Less flexible than interviews
- Questions are fixed in advance
- Questions are less probing
- Advantages
- Can reach a wider participant group
- Take less time to administer
- Can be analyzed more rigorously
- Types of questions
- General questions
- Help to establish the background of the user and his/her place within the user population
- e.g. age, gender, occupation, previous experience with computers, etc.
36 Query Techniques
- Questionnaires
- Types of questions (Cont.)
- Open-ended questions
- Ask the user to provide his/her unprompted opinion on a question
- e.g. Can you suggest any improvements to the interface?
- Useful for gathering subjective information, but difficult to analyze in any rigorous way
- Responses may identify errors or make suggestions that have not been considered by the designer
- Ranking
- Ask the user to judge a specific statement on a numeric scale, usually corresponding to a measure of agreement or disagreement with the statement
- e.g. It is easy to recover from mistakes. Disagree 1 2 3 4 5 Agree
- Multi-choice
- Offer the user a choice of explicit responses
- The user may select only one response or as many as apply
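Responses to a ranking question like the 1-to-5 agreement scale above can be summarized numerically; the ratings below are invented for illustration:

```python
from statistics import mean, median

# 1-5 agreement ratings for "It is easy to recover from mistakes"
# collected from ten respondents (illustrative data).
responses = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]

# Ratings are ordinal data, so the median and the response distribution
# are often safer summaries than the mean, but both are commonly reported.
print("mean:", mean(responses))      # 3.9
print("median:", median(responses))  # 4.0
print("agree (4 or 5):", sum(r >= 4 for r in responses), "of", len(responses))
```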
37 Monitoring Physiological Responses
- Eye Tracking for Usability Evaluation
- Eye movements are believed to reflect the amount of cognitive processing a display requires, and thus how easy or difficult it is to process
- Measuring not only where people look but also their patterns of eye movement may tell us which areas of a screen they are finding easy or difficult to understand
- Possible measurements
- Number of fixations (periods where the eye retains a stable position)
- The more fixations, the less efficient the search strategy
- Fixation duration
- Longer fixations may indicate difficulty with a display
- Scan path
- Indicates areas of interest, search strategy and cognitive load
- Plotting scan paths and fixations can indicate what people look at, how often and for how long
- Eye tracking for usability is still very new and the equipment is prohibitively expensive for everyday use
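The fixation and scan-path measures listed above can be computed from an eye tracker's fixation output. The (x, y, duration) record format here is an assumption for illustration, not a particular vendor's format:

```python
from math import hypot

# Each fixation: (x, y, duration in ms); illustrative data and format.
fixations = [(120, 80, 210), (450, 95, 640), (460, 300, 180), (130, 310, 400)]

n_fixations = len(fixations)
mean_duration = sum(d for _, _, d in fixations) / n_fixations

# Scan path length: summed distance between consecutive fixations (pixels);
# a longer path for the same task suggests a less efficient search.
scan_path = sum(hypot(x2 - x1, y2 - y1)
                for (x1, y1, _), (x2, y2, _) in zip(fixations, fixations[1:]))

print(n_fixations, round(mean_duration), round(scan_path))
```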
38 Monitoring Physiological Responses
- Physiological Measures
- Recordings of responses of the body which may reflect the user's emotional response to the system
- Galvanic skin response (GSR)
- A measure of general emotional arousal and anxiety
- Measures the electrical conductance of the skin, which changes when sweating occurs
- Electromyogram (EMG)
- A measure of tension or stress
- Measures muscle tension
- Electroencephalogram (EEG)
- A measure of the electrical activity of brain cells
- Records general brain arousal as a response to different situations, activity in different parts of the brain as learning occurs, etc.
39 Monitoring Physiological Responses
- Physiological Measures (Cont.)
- Magnetic resonance imaging (MRI)
- Provides an image of the brain structure of an individual
- Allows comparing the brain structure of individuals with a particular condition (e.g. cognitive impairment) with the brain structure of those without the condition
- Functional MRI (fMRI) can be used to scan areas of the brain while a participant performs a physical or cognitive task
- Provides evidence for what brain processes are involved in these tasks
- Other physiological measures
- Body temperature, heart rate, etc.
40 Choosing an Evaluation Method
- The Stage in the Cycle at Which the Evaluation Is Carried Out
- Evaluation at the early design stage needs to be quick and cheap, hence it might involve design experts only and be analytic
- Evaluation of the implementation needs to be more comprehensive and thus brings in users as participants
- There are exceptions
- Participatory design involves users throughout the design process
- Cognitive walkthrough is expert-based and analytic but can be used to evaluate implementations as well as designs
- Laboratory vs. Field Studies
- Laboratory studies allow controlled experimentation and observation while losing some naturalness of the user's environment
- Field studies retain the naturalness of the user's environment but do not allow control over user activity
41 Choosing an Evaluation Method
- Subjective vs. Objective
- The more subjective techniques rely to a large extent on the knowledge and expertise of the evaluator, who must recognize problems and understand what the user is doing
- Can be powerful if used correctly and provide information that may not be available from more objective methods
- The problem of evaluator bias should be recognized and avoided
- One way to decrease the possibility of bias is to use more than one evaluator
- Objective techniques can produce repeatable results which are not dependent on the persuasion of the particular evaluator
- Avoid bias and provide comparable results
- May not reveal unexpected problems or give detailed feedback on user experiences
- Both objective and subjective approaches should be used
42 Choosing an Evaluation Method
- Quantitative vs. Qualitative Measures
- Quantitative measures are usually numeric and can be easily analyzed using statistical techniques
- Qualitative measures are non-numeric and therefore more difficult to analyze, but can provide important detail that cannot be determined from numbers
- Information Provided
- The information provided by an evaluation at any stage of the design process may range from low-level information that enables a design decision to be made, to high-level information
- Controlled experiments are excellent at providing low-level information
- An experiment can be designed to measure a particular aspect of the interface
- Higher-level information can be gathered using questionnaire and interview questions
- These provide a more general impression of the user's view of the system
43 Choosing an Evaluation Method
- Immediacy of Response
- Some methods (e.g. post-task walkthrough) rely on the user's recollection of events
- Recollection is liable to suffer from bias in recall and reconstruction, with users interpreting events according to their preconceptions
- Recall may also be incomplete
- Some methods (e.g. think aloud) record the user's behavior at the time of the interaction itself
- The process of measurement can actually alter the way the user works
- Intrusiveness of Response
- Related to the immediacy of response
- Most immediate evaluation techniques are intrusive to the user during the interaction and thus run the risk of influencing the way the user behaves
44 Choosing an Evaluation Method
- Resources
- Resources to consider include equipment, time, money, participants, evaluator expertise and context
- e.g. It is impossible to produce a video protocol without access to a video camera; cognitive walkthrough relies more on evaluator expertise than laboratory studies do

Tables 9.4 to 9.6 in the textbook show the classification of evaluation techniques, which can help you choose the techniques that most closely fit your evaluation requirements.