Title: The evolution of evaluation
The evolution of evaluation
- Joseph Jofish Kaye
- Microsoft Research, Cambridge
- Cornell University, Ithaca, NY
- jofish _at_ cornell.edu
What is evaluation?
- Something you do at the end of a project to show
it works - so you can publish it.
- A tradition in a field
- A way of defining a field
- A process that changes over time
- A reason papers get rejected
HCI Evaluation Validity
- Methods for establishing validity vary depending
on the nature of the contribution. They may
involve empirical work in the laboratory or the
field, the description of rationales for design
decisions and approaches, applications of
analytical techniques, or proof of concept
system implementations - CHI 2007 Website
So
- How did we get to where we are today?
- Why did we end up with the system(s) we use today?
- How can our current approaches to evaluation deal with novel concepts of HCI, such as experience-focused (rather than task-focused) HCI?
Experience-focused HCI
- (a question to think about during this talk)
- What does it mean when this is your evaluation
method?
A Brief History and plan for the talk
- Evaluation by Engineers
- Evaluation by Computer Scientists
- Evaluation by Experimental Psychologists & Cognitive Scientists
- Evaluation by HCI Professionals
- Evaluation in CSCW
- Evaluation for Experience
A Brief History and plan for the talk
- Evaluation by Engineers
- Evaluation by Computer Scientists
- Evaluation by Experimental Psychologists & Cognitive Scientists
- Case Study: Evaluation of Text Editors
- Evaluation by HCI Professionals
- Case Study: The Damaged Merchandise Debate
- Evaluation in CSCW
- Evaluation for Experience
3 Questions to ask about an era
- Who are the users?
- Who are the evaluators?
- What are the limiting factors?
Evaluation by Engineers
- Users are engineers & mathematicians
- Evaluators are engineers
- The limiting factor is reliability
Evaluation by Computer Scientists
- Users are programmers
- Evaluators are programmers
- The speed of the machine is the limiting factor
Evaluation by Experimental Psychologists & Cognitive Scientists
- Users are users: the computer is a tool, not an end result
- Evaluators are cognitive scientists and experimental psychologists: they're used to measuring things through experiment
- The limiting factor is what the human can do
Evaluation by Experimental Psychologists & Cognitive Scientists
- Perceptual issues such as print legibility and motor issues arose in designing displays, keyboards and other input devices. New interface developments created opportunities for cognitive psychologists to contribute in such areas as motor learning, concept formation, semantic memory and action.
- In a sense, this marks the emergence of the distinct discipline of human-computer interaction. (Grudin 2006)
Case Study: Text Editors
- Roberts & Moran, 1982, 1983.
- Their methodology for evaluating text editors had three criteria:
- objectivity
- thoroughness
- ease-of-use
Case Study: Text Editors
- objectivity
  - implies that the methodology not be biased in favor of any particular editor's conceptual structure
- thoroughness
  - implies that multiple aspects of editor use be considered
- ease-of-use (of the method, not the editor itself)
  - the methodology should be usable by editor designers, managers of word processing centers, or other nonpsychologists who need this kind of evaluative information but who have limited time and equipment resources
Case Study: Text Editors
- Text editors are "the white rats of HCI"
- Thomas Green, 1984, in Grudin, 1990.
Case Study: Text Editors
- Text editors are "the white rats of HCI"
- Thomas Green, 1984, in Grudin, 1990.
- which tells us more about HCI than it does about text editors.
Evaluation by HCI Professionals
- Usability professionals
- They believe in expertise (e.g. Nielsen 1984)
- They've made a decision to focus on better results, regardless of whether they were experimentally provable or not.
Case Study: The Damaged Merchandise Debate
Damaged Merchandise: Setup
- Early eighties
- usability evaluation methods (UEMs):
  - heuristics (Nielsen)
  - cognitive walkthrough
  - GOMS
Damaged Merchandise: Comparison Studies
- Jeffries, Miller, Wharton and Uyeda (1991)
- Karat, Campbell and Fiegel (1992)
- Nielsen (1992)
- Desurvire, Kondziela, and Atwood (1992)
- Nielsen and Phillips (1993)
Damaged Merchandise: Panel
- Wayne D. Gray, panel at CHI '95:
- "Discount or Disservice? Discount Usability Analysis at a Bargain Price or Simply Damaged Merchandise?"
Damaged Merchandise: Paper
- Wayne D. Gray & Marilyn Salzman
- Special issue of HCI
- "Experimental Comparisons of Usability Evaluation Methods"
Damaged Merchandise: Response
- Commentary on Damaged Merchandise:
- Karat: experiment in context
- Jeffries & Miller: real-world
- Lund & McClelland: practical
- John: case studies
- Monk: broad questions
- Oviatt: field-wide science
- MacKay: triangulate
- Newman: simulation modelling
Damaged Merchandise: What's going on?
- Gray & Salzman, p. 19:
- "There is a tradition in the human factors literature of providing advice to practitioners on issues related to, but not investigated in, an experiment. This tradition includes the clear and explicit separation of experiment-based claims from experience-based advice. Our complaint is not against experimenters who attempt to offer good advice; the advice may be understood as research findings rather than the researcher's opinion."
Damaged Merchandise: Clash of Paradigms
- Experimental Psychologists & Cognitive Scientists
- (who believe in experimentation)
- vs.
- HCI Professionals
- (who believe in experience and expertise, even if unprovable)
- (and who were trying to present their work in the terms of the dominant paradigm of the field)
Evaluation in CSCW
- A story I'm not telling
- CSCW vs. HCI
- Not just groups, but philosophy (ideology!)
- Member-created, dynamic, not cognitive, modelable
- Follows failure of workplace studies to characterize
- i.e. Plans and Situated Actions vs. The Psychology of Human-Computer Interaction
Evaluation of Experience-Focused HCI
- A possibly emerging sub-field
- Gaver et al.
- Isbister et al.
- Höök et al.
- Sengers et al.
- Etc.
- How to evaluate?
Epistemology
- How does a field know what it knows?
- How does a field know that it knows it?
- Science: experiment
- But literature? Anthropology? Sociology? Therapy?
Art? Theatre? Design?
Epistemology
- Formally
- The aim of this work is to recognize the ways in
which multiple epistemologies, not just the
experimental paradigm of science, can and do
inform the hybrid discipline of human-computer
interaction.
Shouts To My Homies
- Maria Håkansson
- Lars Erik Holmquist
- Alex Taylor & MS Research
- Phoebe Sengers & CEmCom
- Cornell STS Department
- Many discussions over the last year and this one
to come.