The evolution of evaluation

Transcript and Presenter's Notes



1
The evolution of evaluation
  • Joseph Jofish Kaye
  • Microsoft Research, Cambridge
  • Cornell University, Ithaca, NY
  • jofish _at_ cornell.edu

2
What is evaluation?
  • Something you do at the end of a project to show
    it works...
  • ...so you can publish it
  • Part of the design-build-evaluate iterative
    design cycle
  • A tradition in a field
  • A way of defining a field
  • A way a discipline validates the knowledge it
    creates
  • A reason papers get rejected

3
HCI Evaluation Validity
  • "Methods for establishing validity vary depending
    on the nature of the contribution. They may
    involve empirical work in the laboratory or the
    field, the description of rationales for design
    decisions and approaches, applications of
    analytical techniques, or proof-of-concept
    system implementations."
  • CHI 2007 Website

4
So
  • How did we get to where we are today?
  • Why did we end up with the system(s) we use
    today?
  • How can our current approaches to evaluation deal
    with novel concepts of HCI, such as
    experience-focused (rather than task-focused)
    HCI?
  • And in particular...

5
Evaluation of the VIO
  • A device for couples in long-distance
    relationships to communicate intimacy
  • It's about the experience; it's not about the
    task
  • www.intimateobjects.org
  • Kaye, Levitt, Nevins, Golden & Schmidt.
    Communicating Intimacy One Bit at a Time. Ext.
    Abs. CHI 2005.
  • Kaye. I just clicked to say I love you.
    alt.chi, Ext. Abs. CHI 2006.

6
Experience-focused HCI
  • What does it mean when this is your evaluation
    method?
  • Isbister, Höök, Sharp & Laaksolahti. The Sensual
    Evaluation Instrument: Developing an Affective
    Evaluation Tool. Proc. CHI '06.

7
A Brief History and plan for the talk
  1. Evaluation by Engineers
  2. Evaluation by Computer Scientists
  3. Evaluation by Experimental Psychologists &
    Cognitive Scientists
  4. Evaluation by HCI Professionals
  5. Evaluation in CSCW
  6. Evaluation for Experience

8
A Brief History and plan for the talk
  • Evaluation by Engineers
  • Evaluation by Computer Scientists
  • Evaluation by Experimental Psychologists &
    Cognitive Scientists
  • Case study: Evaluation of Text Editors
  • Evaluation by HCI Professionals
  • Case Study: The Damaged Merchandise Debate
  • Evaluation in CSCW
  • Evaluation for Experience

9
3 Questions to ask about an era
  • Who are the users?
  • Who are the evaluators?
  • What are the limiting factors?

10
Evaluation by Engineers
  • Users are engineers & mathematicians
  • Evaluators are engineers
  • The limiting factor is reliability

11
Evaluation by Computer Scientists
  • Users are programmers
  • Evaluators are programmers
  • The speed of the machine is the limiting factor

12
Evaluation by Experimental Psychologists &
Cognitive Scientists
  • Users are users; the computer is a tool, not an
    end result
  • Evaluators are cognitive scientists and
    experimental psychologists; they're used to
    measuring things through experiment
  • The limiting factor is what the human can do

13
Evaluation by Experimental Psychologists &
Cognitive Scientists
  • "Perceptual issues such as print legibility and
    motor issues arose in designing displays,
    keyboards and other input devices; new interface
    developments created opportunities for cognitive
    psychologists to contribute in such areas as
    motor learning, concept formation, semantic
    memory and action."
  • "In a sense, this marks the emergence of the
    distinct discipline of human-computer
    interaction." (Grudin 2006)

14
Case Study: Text Editors
  • Roberts & Moran, 1982, 1983.
  • Their methodology for evaluating text editors had
    three criteria:
  • objectivity
  • thoroughness
  • ease-of-use

15
Case Study: Text Editors
  • objectivity
  • implies that the methodology not be biased in
    favor of any particular editor's conceptual
    structure
  • thoroughness
  • implies that multiple aspects of editor use be
    considered
  • ease-of-use (of the method, not the editor
    itself)
  • the methodology should be usable by editor
    designers, managers of word processing centers,
    or other nonpsychologists who need this kind of
    evaluative information but who have limited time
    and equipment resources

18
Case Study: Text Editors
  • Text editors are
  • the white rats of HCI
  • Thomas Green, 1984,
  • in Grudin, 1990.
  • which tells us more about HCI than it does about
    text editors.

19
Evaluation by HCI Professionals
  • Usability professionals
  • They believe in expertise (e.g. Nielsen 1984)
  • They've made a decision to focus on better
    results, regardless of whether they were
    experimentally provable or not.

20
Case Study: The Damaged Merchandise Debate
21
Damaged Merchandise: Setup
  • Early eighties
  • usability evaluation methods (UEMs)
  • - heuristics (Nielsen)
  • - cognitive walkthrough
  • - GOMS
  • -

22
Damaged Merchandise: Comparison Studies
  • Jeffries, Miller, Wharton and Uyeda (1991)
  • Karat, Campbell and Fiegel (1992)
  • Nielsen (1992)
  • Desurvire, Kondziela, and Atwood (1992)
  • Nielsen and Phillips (1993)

23
Damaged Merchandise: Panel
  • Wayne D. Gray, Panel at CHI '95
  • Discount or Disservice? Discount Usability
    Analysis at a Bargain Price or Simply Damaged
    Merchandise?

24
Damaged Merchandise: Paper
  • Wayne D. Gray & Marilyn Salzman
  • Special issue of HCI
  • Experimental Comparisons of Usability Evaluation
    Methods

25
Damaged Merchandise: Response
  • Commentary on Damaged Merchandise
  • Karat: experiment in context
  • Jeffries & Miller: real-world
  • Lund & McClelland: practical
  • John: case studies
  • Monk: broad questions
  • Oviatt: field-wide science
  • Mackay: triangulate
  • Newman: simulation modelling

26
Damaged Merchandise: What's going on?
  • Gray & Salzman, p. 19
  • "There is a tradition in the human factors
    literature of providing advice to practitioners
    on issues related to, but not investigated in, an
    experiment. This tradition includes the clear
    and explicit separation of experiment-based
    claims from experience-based advice. Our
    complaint is not against experimenters who
    attempt to offer good advice; it is that the
    advice may be understood as research findings
    rather than the researcher's opinion."

28
Damaged Merchandise: Clash of Paradigms
  • Experimental Psychologists & Cognitive Scientists
  • (who believe in experimentation)
  • vs.
  • HCI Professionals
  • (who believe in experience and expertise, even if
    unprovable) (and who were trying to present
    their work in the terms of the dominant paradigm
    of the field)

29
Evaluation in CSCW
  • Briefly
  • CSCW vs. HCI
  • Not just groups, but philosophy (ideology!)
  • Member-created and dynamic, not cognitive and modelable
  • Follows failure of workplace studies to
    characterize
  • Plans and Situated Actions vs. The Psychology of
    Human-Computer Interaction
  • Garfinkel vs. Fitts

30
Experience-Focused HCI
  • A possibly emerging sub-field, drawing from
    traditions and disciplines outside the field
  • Gaver: cultural commentators
  • Isbister: game design, arts
  • Höök: arts, design
  • Sengers: critical theory, arts
  • Blythe: literature
  • But how to evaluate?

31
Epistemology
  • How does a field know what it knows?
  • How does a field know that it knows it?
  • Science: experiment
  • But literature? Anthropology? Sociology? Therapy?
    Art? Theatre? Design?

32
Epistemology
  • The aim of this work is to recognize the ways in
    which multiple epistemologies, not just the
    experimental paradigm of science, can, do and
    need to inform the hybrid discipline of
    human-computer interaction.

33
Shouts To My Homies
  • Andy Warr
  • Alex Taylor & MS Research
  • Phoebe Sengers & CEmCom
  • Cornell STS Department
  • Maria Håkansson & the IT University, Göteborg