Title: The evolution of evaluation
The evolution of evaluation
- Joseph Jofish Kaye
- Microsoft Research, Cambridge
- Cornell University, Ithaca, NY
- jofish _at_ cornell.edu
What is evaluation?
- Something you do at the end of a project to show
it works - so you can publish it.
- A tradition in a field
- A way of defining a field
- A process that changes over time
- A reason papers get rejected
HCI Evaluation Validity
- Methods for establishing validity vary depending
on the nature of the contribution. They may
involve empirical work in the laboratory or the
field, the description of rationales for design
decisions and approaches, applications of
analytical techniques, or proof of concept
system implementations - CHI 2007 Website
So
- How did we get to where we are today?
- Why did we end up with the system(s) we use today?
- How can our current approaches to evaluation deal with novel concepts of HCI, such as experience-focused (rather than task-focused) HCI?
Experience-focused HCI
- (a question to think about during this talk)
- What does it mean when this is your evaluation
method?
A Brief History and plan for the talk
- Evaluation by Engineers
- Evaluation by Computer Scientists
- Evaluation by Experimental Psychologists & Cognitive Scientists
- Evaluation by HCI Professionals
- Evaluation in CSCW
- Evaluation for Experience
A Brief History and plan for the talk
- Evaluation by Engineers
- Evaluation by Computer Scientists
- Evaluation by Experimental Psychologists & Cognitive Scientists
- Case Study: Evaluation of Text Editors
- Evaluation by HCI Professionals
- Case Study: The Damaged Merchandise Debate
- Evaluation in CSCW
- Evaluation for Experience
3 Questions to ask about an era
- Who are the users?
- Who are the evaluators?
- What are the limiting factors?
Evaluation by Engineers
- Users are engineers & mathematicians
- Evaluators are engineers
- The limiting factor is reliability
Evaluation by Computer Scientists
- Users are programmers
- Evaluators are programmers
- The speed of the machine is the limiting factor
Evaluation by Experimental Psychologists & Cognitive Scientists
- Users are users: the computer is a tool, not an end result
- Evaluators are cognitive scientists and experimental psychologists: they're used to measuring things through experiment
- The limiting factor is what the human can do
Evaluation by Experimental Psychologists & Cognitive Scientists
- Perceptual issues such as print legibility and motor issues arose in designing displays, keyboards and other input devices. New interface developments created opportunities for cognitive psychologists to contribute in such areas as motor learning, concept formation, semantic memory and action.
- In a sense, this marks the emergence of the distinct discipline of human-computer interaction. (Grudin 2006)
Case Study: Text Editors
- Roberts & Moran, 1982, 1983.
- Their methodology for evaluating text editors had three criteria:
- objectivity
- thoroughness
- ease-of-use
Case Study: Text Editors
- objectivity
  - implies that the methodology not be biased in favor of any particular editor's conceptual structure
- thoroughness
  - implies that multiple aspects of editor use be considered
- ease-of-use (of the method, not the editor itself)
  - the methodology should be usable by editor designers, managers of word processing centers, or other nonpsychologists who need this kind of evaluative information but who have limited time and equipment resources
Case Study: Text Editors
- Text editors are "the white rats of HCI"
- Thomas Green, 1984, in Grudin, 1990.
Case Study: Text Editors
- Text editors are "the white rats of HCI"
- Thomas Green, 1984, in Grudin, 1990.
- which tells us more about HCI than it does about text editors.
Evaluation by HCI Professionals
- Usability professionals
- They believe in expertise (e.g. Nielsen 1984)
- They've made a decision to focus on better results, regardless of whether they were experimentally provable or not.
Case Study: The Damaged Merchandise Debate
Damaged Merchandise: Setup
- Early eighties
- usability evaluation methods (UEMs):
  - heuristics (Nielsen)
  - cognitive walkthrough
  - GOMS
Damaged Merchandise: Comparison Studies
- Jeffries, Miller, Wharton and Uyeda (1991)
- Karat, Campbell and Fiegel (1992)
- Nielsen (1992)
- Desurvire, Kondziela, and Atwood (1992)
- Nielsen and Phillips (1993)
Damaged Merchandise: Panel
- Wayne D. Gray, panel at CHI '95:
- "Discount or Disservice? Discount Usability Analysis at a Bargain Price or Simply Damaged Merchandise?"
Damaged Merchandise: Paper
- Wayne D. Gray & Marilyn Salzman
- Special issue of HCI
- "Experimental Comparisons of Usability Evaluation Methods"
Damaged Merchandise: Response
- Commentary on Damaged Merchandise:
- Karat: experiment in context
- Jeffries & Miller: real-world
- Lund & McClelland: practical
- John: case studies
- Monk: broad questions
- Oviatt: field-wide science
- MacKay: triangulate
- Newman: simulation modelling
Damaged Merchandise: What's going on?
- Gray & Salzman, p. 19:
- "There is a tradition in the human factors literature of providing advice to practitioners on issues related to, but not investigated in, an experiment. This tradition includes the clear and explicit separation of experiment-based claims from experience-based advice. Our complaint is not against experimenters who attempt to offer good advice; the advice may be understood as research findings rather than the researcher's opinion."
Damaged Merchandise: Clash of Paradigms
- Experimental Psychologists & Cognitive Scientists
- (who believe in experimentation)
- vs.
- HCI Professionals
- (who believe in experience and expertise, even if unprovable)
- (and who were trying to present their work in the terms of the dominant paradigm of the field)
Evaluation in CSCW
- A story I'm not telling
- CSCW vs. HCI
- Not just groups, but philosophy (ideology!)
- Member-created, dynamic, not cognitive, modelable
- Follows failure of workplace studies to characterize
- i.e. Plans and Situated Actions vs. The Psychology of Human-Computer Interaction
Evaluation of Experience-Focused HCI
- A possibly emerging sub-field
- Gaver et al.
- Isbister et al.
- Höök et al.
- Sengers et al.
- Etc.
- How to evaluate?
Epistemology
- How does a field know what it knows?
- How does a field know that it knows it?
- Science: experiment
- But literature? Anthropology? Sociology? Therapy?
Art? Theatre? Design?
Epistemology
- Formally
- The aim of this work is to recognize the ways in
which multiple epistemologies, not just the
experimental paradigm of science, can and do
inform the hybrid discipline of human-computer
interaction.
Shouts To My Homies
- Maria Håkansson
- Lars Erik Holmquist
- Alex Taylor & MS Research
- Phoebe Sengers & CEmCom
- Cornell STS Department
- Many discussions over the last year and this one
to come.