Title: Evaluation of usability tests
1 Evaluation of usability tests
2 Why evaluate?
- Choose the most suitable data-collection techniques
- Identify methodological strengths and weaknesses of a user test
3 Evaluation criteria for data-collection techniques
- Utility
  - How useful are the data?
- Costs
  - What resources are needed?
- Objectivity
  - How much subjective judgement is involved?
- Level of detail
  - Is the amount and resolution of the data suitable?
- Intrusiveness
  - Does the method interfere with the user's performance?
4 Observations in real time
- Strengths
  - Level of detail: Allows you to experience the context in which performance takes place
- Weaknesses
  - Level of detail: Difficult to keep up with the pace of the user
  - Objectivity: Based on your own subjective judgement as an observer
5 Observations from video
- Strengths
  - Utility: Allows you to conduct detailed analysis of various usability attributes
  - Utility: Can obtain data about the user's reasoning (think-aloud)
- Weaknesses
  - Costs: Time-consuming
  - Utility: Much of the data goes unused
  - Intrusiveness: Think-aloud may disturb the user
6 Observations: real time or video?
[Figure: real-time and video observation compared by level of detail for Context and Product]
7 Event logs
- Strengths
  - Objectivity: The data are collected automatically
  - Costs: Automated data collection requires little effort from the test team
- Weaknesses
  - Level of detail: Both the amount of data and the resolution can be too high
  - Utility: It can be difficult to create useful measures
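As a minimal sketch of how a high-resolution event log might be reduced to a few usable measures, assume a hypothetical log in which each entry is a (timestamp in seconds, event type) pair. The log format, the event names, and the `summarize_log` helper are illustrative assumptions, not the output of any particular logging tool.

```python
from collections import Counter


def summarize_log(events):
    """Reduce a raw event log to a few task-level measures.

    `events` is a list of (timestamp_seconds, event_type) tuples.
    This format is an assumption made for the example.
    """
    if not events:
        return {"task_time": 0.0, "event_count": 0, "errors": 0, "by_type": {}}
    timestamps = [t for t, _ in events]
    counts = Counter(kind for _, kind in events)
    return {
        "task_time": max(timestamps) - min(timestamps),  # elapsed seconds
        "event_count": len(events),                      # total logged events
        "errors": counts.get("error", 0),                # events of type "error"
        "by_type": dict(counts),                         # breakdown per event type
    }


# Hypothetical log for one user and one task.
log = [(0.0, "click"), (1.5, "keypress"), (2.0, "error"),
       (4.5, "click"), (9.0, "submit")]
summary = summarize_log(log)
print(summary["task_time"], summary["event_count"], summary["errors"])
```

Which aggregates count as useful still depends on the purpose of the test; the three chosen here are only examples.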
8 http://zing.ncsl.nist.gov/WebTools/VisVIP/overview.html
9 Questionnaire, self-made
- Strengths
  - Level of detail: Can be tailored to fit the purpose of the test
  - Utility: Can be used in several settings with different products
  - Costs: It doesn't take long to develop
- Weaknesses
  - Objectivity: Based on subjective judgement
  - Utility: Difficult to construct good items
10 Questionnaire, validated
- Strengths
  - Utility: Can be used in several settings with different products
  - Costs: The data are typically easy to transform into measures
- Weaknesses
  - Level of detail: Validated questionnaires may not address the features of the interface you are interested in
  - Objectivity: Based on subjective judgement
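To illustrate what transforming questionnaire data into a measure can look like, here is a minimal sketch that averages ratings after reverse-coding negatively worded items. The 5-point scale, the item ids, and the `scale_score` helper are assumptions for the example; a real validated questionnaire comes with its own scoring rules, which should be followed instead.

```python
def scale_score(responses, reversed_items=(), scale_max=5):
    """Mean item score on a 1..scale_max scale after reverse-coding.

    `responses` maps item id -> rating; `reversed_items` lists the
    negatively worded items. Item ids and the 5-point scale are
    assumptions made for this sketch.
    """
    if not responses:
        raise ValueError("no responses given")
    total = 0
    for item, rating in responses.items():
        if not 1 <= rating <= scale_max:
            raise ValueError(f"rating out of range for item {item!r}")
        # A reversed item scored 1 counts as scale_max, 2 as scale_max - 1, etc.
        total += (scale_max + 1 - rating) if item in reversed_items else rating
    return total / len(responses)


# Hypothetical answers from one user; q2 and q4 are negatively worded.
answers = {"q1": 4, "q2": 2, "q3": 5, "q4": 1}
score = scale_score(answers, reversed_items={"q2", "q4"})
print(score)  # 4.5
```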
11 Summary: data-collection techniques
The assessment concerns MEASURES, not use/problem descriptions.
[Table not recovered. Rating legend: very good, good, - not so good, -- poor]
12 Use/problem descriptions
- Observation and interviews are the most suitable data-collection techniques for use/problem descriptions
13 Evaluation of measures
- The evaluation criteria of the data-collection techniques
- Validity
- Reliability
14 Validity
- Do you measure what you believe you measure?
15 Reliability
- Do you obtain the same results when you measure the same thing under similar conditions at different points in time?
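One common way to put a number on this question is a test-retest correlation: measure the same users twice under similar conditions and correlate the two sets of results. The sketch below computes Pearson's r by hand; the session data are hypothetical, and other reliability statistics may suit a given measure better.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical task times (seconds) for the same five users in two sessions.
session_1 = [30, 45, 50, 62, 80]
session_2 = [32, 44, 55, 60, 78]
r = pearson_r(session_1, session_2)
print(round(r, 2))
```

A value of r close to 1 suggests the measure ranks users consistently across sessions. It says nothing about whether the measure is valid.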
16 Relationship between validity and reliability
- Evaluating the validity of a measure is primarily based on subjective judgement, while reliability is typically evaluated by means of statistics
- It is possible to obtain reliable results that are invalid, but not unreliable results that are valid!
17 How can you avoid invalid results?
- Use several measures!
- Triangulation
- Multiple operationalism
18 Ethical issues
- Be well prepared - act professionally!
- Create a script
  - Introduction
  - During test
  - Debriefing
- Create a consent form
19 Ethical issues
- The product is being tested, not the user!
- Respectful treatment: preserve integrity
- Informed consent
  - Inform the user what will happen, how the collected data will be used, etc.
  - Make sure the user understands and agrees
  - The user may leave whenever she/he wants
- Confidentiality
20 Types of measures
- Experience-attitude
- Performance
- Cognitive
21 Experience-attitude
- Strengths
  - Utility: Can address most usability attributes
  - Validity: User-centered; we ask for the users' opinions
- Weaknesses
  - Validity/Objectivity: Based on the users' subjective judgement
22 Performance: completeness
- Strengths
  - Utility: Can be used for most tasks and in different settings
  - Cost-effective: Quite easy to create a list of activities
- Weaknesses
  - Validity/reliability: The user may choose a solution path you didn't think of, but that is nevertheless satisfactory
  - Validity (sensitivity): Ceiling or floor effects, i.e. the task is too easy or too difficult
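A completeness measure built on a list of activities can be sketched as the share of listed activities each user finished, together with a simple check for ceiling or floor effects. The activity names, the `completeness` and `ceiling_or_floor` helpers, and the 0.8 threshold are all assumptions made for illustration.

```python
def completeness(required_activities, completed_activities):
    """Share of the required activities that the user completed (0.0 to 1.0)."""
    required = set(required_activities)
    if not required:
        raise ValueError("the activity list must not be empty")
    return len(required & set(completed_activities)) / len(required)


def ceiling_or_floor(scores, threshold=0.8):
    """Flag a task whose scores pile up at the top or bottom of the scale.

    The 0.8 threshold is an arbitrary assumption for this sketch.
    """
    if not scores:
        return None
    at_top = sum(s == 1.0 for s in scores) / len(scores)
    at_bottom = sum(s == 0.0 for s in scores) / len(scores)
    if at_top >= threshold:
        return "ceiling"  # nearly everyone completes everything: task too easy
    if at_bottom >= threshold:
        return "floor"    # nearly everyone completes nothing: task too difficult
    return None


# Hypothetical checklist and the activities three users completed.
activities = ["open form", "enter address", "choose delivery", "confirm order"]
users = [
    ["open form", "enter address", "choose delivery", "confirm order"],
    ["open form", "enter address", "confirm order"],
    ["open form"],
]
scores = [completeness(activities, done) for done in users]
print(scores)                    # [1.0, 0.75, 0.25]
print(ceiling_or_floor(scores))  # None
```

A checklist like this under-scores a user who reaches a satisfactory result by a path that is not on the list, which is the validity weakness noted above.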
23 Summary of measures
[Table not recovered. Rating legend: very good, good, - not so good, -- poor]
24 Relation between data-collection techniques and measures
[Table not recovered. Rating legend: very good, good, - not so good, -- poor]
25 Relation between data-collection techniques and measures
[Figure: diagram relating Measure, Practical limitations, Purpose of test, and Data-collection technique]