Title: How bad is Human Judgment?
1How bad is Human Judgment?
- Peter Ayton
- Department of Psychology
- City University, London
2How do psychologists study human judgment?
- Psychological experiments often compare the
actual with the ideal. The actual can be
measured by monitoring human decision making. The
ideal is usually determined from the laws of logic
or statistics.
- Discrepancies show that the human brain doesn't
seem to solve problems by applying the laws of
logic or statistics. So how does it work?
- Because people can't use vast amounts of
information, the brain uses heuristics (simple
rules of thumb) to make judgments and decisions
quickly.
3How bad is Human Judgment?
- But psychological research has undermined
confidence in the quality of human judgment.
- E.g. psychologist Daniel Kahneman, awarded the
2002 Economics Nobel, discovered how human
judgment may take heuristic shortcuts that
systematically depart from basic principles of
probability.
4How do psychologists study human judgment?
- Reflecting on the fact that their intent in
studying heuristic errors was akin to the use
of optical illusions, forgetfulness, or tongue
twisters to understand sight, memory,
and language, the researchers wrote:
- "Although errors of judgment are but a method by
which some cognitive processes are studied, the
method has become a significant part of the
message." (Kahneman & Tversky, 1982, p. 492).
5Illusions
6Is the blue on the inner left back or the outer
left front?
7Is the left centre circle bigger?
No, they're both the same size
8It's a spiral, right?
No, these are a set of independent circles
9Count the black dots
10How many legs does this elephant have?
11Are the horizontal lines parallel or do they
slope?
13How do people consider risks?
- 1) Relative insensitivity to probability
information.
- 2) Driven by evaluation of qualities of outcomes
(Risk as Emotions).
15Judgement and Description: Effects of unpacking
hypotheses.
- E.g.
- p (death from unnatural causes) = 32%.
- But,
- p (death from accident) = 32%
- p (death by homicide) = 10%
- p (other unnatural causes) = 11%
- SUM = 53%
- Experts (stockbrokers' stock forecasts; oil
engineers' safety assessments) show similar
effects.
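The incoherence in the example above is simple arithmetic, and a few lines make it explicit. This is a minimal sketch that uses only the judged values from the slide, read here as percentages; the variable names are illustrative.

```python
# Judged probabilities from the slide, read as percentages.
packed = 32  # "death from unnatural causes", judged as a single hypothesis

# The same hypothesis unpacked into its components, each judged separately.
unpacked = {
    "accident": 32,
    "homicide": 10,
    "other unnatural causes": 11,
}

unpacked_total = sum(unpacked.values())
excess = unpacked_total - packed  # how far the parts overshoot the whole

print(f"packed: {packed}%  unpacked sum: {unpacked_total}%  excess: {excess} points")
```

A coherent judge would give the components a total equal to the packed estimate, so any positive excess marks the unpacking effect.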
16Judgement and Description: Effects of unpacking
hypotheses.
- How to Be Incoherent and Seductive: Bookmakers'
Odds and Support Theory
18The Planning fallacy
- Why does everything take longer to finish and
cost more than we think it will?
- The Channel Tunnel was supposed to cost £2.6
billion. In fact, the final bill came to £15
billion. The Jubilee Line extension to the London
Underground cost £3.5 billion, about four times
the original estimate. There are many other
examples: the London Eye, the Channel Tunnel rail
link, the Dome.
- This is not an exclusively British disease. In
1957, engineers forecast that the Sydney Opera
House would be finished in 1963 at a cost of A$7
million. A scaled-down version costing A$102
million finally opened in 1973. In 1969, the
mayor of Montreal announced that the 1976
Olympics would cost C$120 million and "can no
more have a deficit than a man can have a baby".
Yet the stadium roof alone, which was not finished
until 13 years after the games, cost C$120
million.
- Is gross incompetence behind such fiascos? Or a
Machiavellian plot to secure approval for
projects that once started cannot easily be
cancelled?
- Research carried out by psychologist Roger
Buehler suggests that the main cause may lie
deeper. Buehler found that students consistently
underestimated how long it would take them to
finish their assignments. They seemed to have an
over-idealised vision of a smooth future and
rarely anticipated more than trivial impediments.
19Partition Dependence: How you frame a question
affects the answer
- Case prime: "Will Sunday be the hottest day of
the week?"
- A two-fold partition of the sample space is
evoked: Sunday versus the rest of the week (1/2).
- Class prime: "Will the hottest day of the week
be Sunday?"
- A seven-fold partition is evoked.
- Sunday is one of 7 possible options (1/7).
20Overconfidence
Typical experiments have presented a series of
two-alternative general knowledge questions to
subjects and asked them to indicate the correct
answer and state their subjective probability,
expressed as a percentage, that they have selected
the correct answer.
E.g. Which is longer? (a) Panama Canal (b) Suez
Canal. ___% sure.
Responses vary from 50% (completely uncertain) to
100% (completely certain).
21Early general knowledge experiments suggested
that people's confidence judgments are poorly
calibrated. Points below the diagonal
represent overconfident responses: the expressed
confidence is higher than the proportion correct.
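A calibration curve of the kind described above can be computed by grouping answers by stated confidence and comparing each group's confidence with its proportion correct. The following is a minimal sketch; the function name and the response data are hypothetical and are not taken from any of the studies mentioned.

```python
from collections import defaultdict

def calibration_table(responses):
    """Group (confidence_percent, was_correct) pairs by stated confidence.

    Returns {confidence_percent: (proportion_correct, number_of_answers)}.
    """
    buckets = defaultdict(list)
    for confidence, was_correct in responses:
        buckets[confidence].append(1 if was_correct else 0)
    return {c: (sum(v) / len(v), len(v)) for c, v in sorted(buckets.items())}

# Hypothetical answers: (stated confidence in %, whether the answer was right).
responses = [
    (50, True), (50, False),
    (70, True), (70, False), (70, True), (70, False),
    (90, True), (90, True), (90, False), (90, False),
    (100, True), (100, True), (100, True), (100, False),
]

table = calibration_table(responses)
for confidence, (hit_rate, n) in table.items():
    gap = confidence / 100 - hit_rate  # positive gap = overconfidence
    if gap > 0:
        label = "overconfident"
    elif gap < 0:
        label = "underconfident"
    else:
        label = "calibrated"
    print(f"{confidence:3d}% sure: {hit_rate:.0%} correct over {n} answers ({label})")
```

Plotting proportion correct against stated confidence gives the calibration diagram; perfectly calibrated judges fall on the diagonal.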
22But some experts (e.g. weather forecasters)
produce very well calibrated subjective
likelihood judgments in the domain of their
expertise.
23But not all experts are well calibrated.
Experienced physicians' probabilistic diagnoses
of pneumonia are poorly calibrated. What makes
experts well calibrated? Some experts get prompt,
unambiguous feedback (e.g. weather forecasters);
others (e.g. doctors) may not.
25The hot-hand fallacy and the gambler's fallacy:
Two faces of Subjective Randomness?
29The hot-hand fallacy and the gambler's fallacy:
Two faces of Subjective Randomness?
- Pinker (1997) is critical of the presumption of
faulty reasoning typically accompanying
observations of the gambler's fallacy:
- "It would not surprise me if a week of clouds
really did predict that the trailing edge was
near and the sun was about to be unmasked, just
as the hundredth railroad car on a passing train
portends the caboose with greater likelihood than
the third car. Many events work like that. An
astute observer should commit the gambler's
fallacy. A gambling device is by definition a
machine designed to defeat our intuitive
predictions. It's like calling our hands badly
designed because they fail to get out of
handcuffs." (p. 346).
30The hot-hand fallacy and the gambler's fallacy:
Two faces of Subjective Randomness?
- Gilden and Wilson (1995, 1996) have shown that
for golf putting, dart throwing and auditory and
visual signal detection there are streaks in
performance; Adams (1995) reports momentum in
the performance of pocket billiards players; and
Smith (in press) reports that horseshoe pitchers
have modest hot and cold spells.
- Thus, belief in the hot hand is not always
fallacious. Perhaps, then, people have learned to
expect the hot hand from observing human
performances where it occurs.
31Gains and losses
Samuelson's paradox: Offers a bet on a coin
toss. Heads you win 200; tails you lose 100.
No one takes it, but they would play ten times.
Loss aversion: Losses are weighted more heavily
than gains. Insurance and extended warranties.
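One way to see why ten plays feel different from one is to compute the chance of ending up behind. This is a minimal sketch that assumes a fair coin and independent tosses, with the stakes from the slide; the function names are illustrative.

```python
from math import comb

WIN, LOSS = 200, 100  # stakes from the slide: heads wins 200, tails loses 100

def prob_net_loss(n_plays):
    """Probability of finishing below zero after n_plays fair coin tosses."""
    total = 0.0
    for heads in range(n_plays + 1):
        net = heads * WIN - (n_plays - heads) * LOSS
        if net < 0:
            total += comb(n_plays, heads) / 2 ** n_plays
    return total

def expected_value(n_plays):
    """Expected total winnings over n_plays tosses."""
    return n_plays * (0.5 * WIN - 0.5 * LOSS)

for n in (1, 10):
    print(f"{n:2d} plays: expected value {expected_value(n):.0f}, "
          f"P(net loss) = {prob_net_loss(n):.3f}")
```

A single play loses half the time, whereas ten plays end in a net loss only when three or fewer tosses come up heads, which is much less likely.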
34Gains and losses
- Q1. Imagine that you face the following pair of
concurrent decisions. First examine both
decisions and then indicate the options that you
prefer.
- Decision I: Choose between
- A. A sure gain of 2,400
- B. A 25% chance to gain 10,000, and a 75%
chance to gain nothing
- Decision II: Choose between
- C. A sure loss of 7,500
- D. A 75% chance to lose 10,000, and a 25%
chance to lose nothing
Most people choose A and D; hardly anyone
prefers B and C. They like the sure gain in
Decision I and dislike the certain loss in
Decision II. But the pair of choices B and C is
much better than A and D. If you combine the
outcomes of the two choices, you can add the sure
gain of 2,400 to the risky outcomes in D. So,
A and D gives you:
A and D. 25% chance to gain 2,400, and
75% chance to lose 7,600
Similarly, B and C can be combined: the sure loss
of 7,500 in C can be subtracted from the risky
outcomes of B:
B and C. 25% chance to gain 2,500, and
75% chance to lose 7,500
With B and C the chances of winning
and losing are the same as with A and D, but the
amount you might win is more and the amount you
might lose is less.
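The combination of the two decisions can be checked mechanically. This is a minimal sketch that represents each option as a list of (probability, payoff) pairs and assumes the two decisions are resolved independently; the helper name `combine` is illustrative.

```python
from itertools import product

def combine(gamble_1, gamble_2):
    """Combine two independent gambles given as lists of (probability, payoff).

    Returns a list of (total_payoff, probability), best outcome first.
    """
    outcomes = {}
    for (p1, x1), (p2, x2) in product(gamble_1, gamble_2):
        outcomes[x1 + x2] = outcomes.get(x1 + x2, 0) + p1 * p2
    return sorted(outcomes.items(), reverse=True)

A = [(1.0, 2400)]                  # sure gain
B = [(0.25, 10000), (0.75, 0)]     # risky gain
C = [(1.0, -7500)]                 # sure loss
D = [(0.75, -10000), (0.25, 0)]    # risky loss

a_and_d = combine(A, D)
b_and_c = combine(B, C)

print("A and D:", a_and_d)
print("B and C:", b_and_c)
```

Both pairs have the same probabilities attached to their outcomes, but each outcome of B and C is 100 better than the corresponding outcome of A and D.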
35Gains and losses
- The same notions of loss aversion and certainty
weighting can explain the sunk cost effect.
- Mindful of their investment, people can't quit
(but animals can and do).
36Gains and losses
- The mental accounting of wine cellars.
- You purchase several cases of wine at 20 a
bottle and, after several years, it has
increased in value. You have been offered 75 a
bottle.
- You decide to drink a bottle to help you decide
about the offer. How much does this cost you?
- Possible mental accounts, with the percentage of
each group choosing them:

Account                                                       Students   Experts (wine collectors)
(a) Nothing (I already own it)                                   30         30
(b) 20 (what I paid)                                             10         18
(c) 20 + interest (what I paid + interest)                        1          7
(d) 75 (what I am offered)                                       37         20
(e) A gain of 55 (I drank a 75 bottle and it only cost 20)       22         25
38- A patient with severe chest pains is rushed to
the emergency department in a hospital. The
physicians must (quickly) decide: should the
patient be sent to the coronary care unit or to a
regular bed with ECG telemetry?
- In two Michigan hospitals, emergency physicians
sent 90% of all patients to the care unit. Such
defensive decision-making led to overcrowding,
decreased quality of care, and greater health
risks for patients.
- Researchers taught the physicians to use the
Heart Disease Predictive Instrument, an expert
system consisting of a chart with some 50
probabilities and a logistic formula with which
the physician, aided by a pocket calculator,
computes the probability that each patient
requires the coronary care unit. If the
probability is higher than a certain value, then
the patient is sent to the care unit, otherwise
not.
- Physicians don't like using this and similar
systems. They don't understand it - it does not
conform to their intuitive thinking - and so
avoid using it.
39The researchers tried a third alternative: a
heuristic that has the structure of physicians'
intuitions, but is based on empirical evidence.
This fast and frugal tree (Figure 2) asks only a
few yes-no questions. If a patient has a certain
anomaly in his electrocardiogram (the so-called
ST segment), he is immediately sent to the
coronary care unit. No other information is
required. If that is not the case, a second cue
is considered: whether the patient's primary
complaint was chest pain. If this is not the
case, he is immediately assigned to a regular
nursing bed. No further information is sought. If
the answer is yes, then a third question is asked
to finally classify the patient.
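The structure of such a fast and frugal tree can be written as a short chain of early exits. This is a minimal sketch of the three-question structure described above, for illustration only and not a clinical tool. The slide does not say what the third question is, so `third_cue_positive` is a hypothetical placeholder, and the assumption that a "yes" to it sends the patient to the coronary care unit is also mine.

```python
CCU = "coronary care unit"
REGULAR_BED = "regular nursing bed"

def fast_frugal_triage(st_segment_anomaly, chest_pain_is_primary_complaint,
                       third_cue_positive):
    """Classify a patient with at most three yes-no questions.

    third_cue_positive is a hypothetical stand-in for the unspecified
    third question; a positive answer is assumed to mean the care unit.
    """
    # Question 1: an ST segment anomaly decides immediately.
    if st_segment_anomaly:
        return CCU
    # Question 2: without chest pain as the primary complaint, stop here.
    if not chest_pain_is_primary_complaint:
        return REGULAR_BED
    # Question 3: the final cue classifies the remaining patients.
    return CCU if third_cue_positive else REGULAR_BED

print(fast_frugal_triage(True, False, False))
print(fast_frugal_triage(False, False, True))
print(fast_frugal_triage(False, True, True))
```

Each question can end the search, so later cues are never looked up for patients who were already classified, which is what makes the tree frugal.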
41Gaudi's Stereostatic Model
Between the inverted rope-and-weight model and
these painted photographs, Gaudi obtained an
unorthodox, but architecturally flawless set of
plans for his famous chapel, one that no engineer
could have derived using traditional methods.
42Gaudi's Stereostatic Model
"Since the plan of the church was so
complicated - towers and arcs emerging from
unexpected places, leaning on other arcs and
towers - it is practically impossible to solve the
set of equations which corresponds to the
requirement of equilibrium in this complex. But
through Gaudi's model all the computation was
instantaneously done by gravity! The set of arcs
arranged itself such that the whole complex is in
equilibrium, but upside down."
Dorit Aharonov, Quantum Computation, Annual
Reviews of Computational Physics VI (Dietrich
Stauffer, ed., 1998).
43How Dogs Navigate to Catch Frisbees
44 According to the notion of bounded rationality
(Simon, 1956, 1992), the computational limits of
cognition and the structure of the environment
may foster the use of "satisficing" rather than
optimal strategies. Thus for many of our
decisions "fast and frugal" heuristics would be a
serviceable substitute for the proper
rule. But not always. E.g. U.K. magistrates'
bail decisions are well modelled by one-reason
decision models (despite their insistence that
they look at all the information).
45Human Judgment and choice: Rational or
irrational?
How can anyone be perfectly rational in a world
where knowledge is limited, time is pressing, and
deep thought is often an unattainable luxury?
Traditional models of unbounded rationality and
optimization in cognitive science, economics, and
animal behavior have tended to view
decision-makers as possessing supernatural powers
of reason, limitless knowledge, and endless time.
But understanding judgment and decisions in the
real world requires a more psychologically
plausible notion of bounded rationality.
46How bad is Human Judgment?
- The good news is that, counter to some views,
human judgment can be very accurate, though it
may not always be.
- However, we are closer to understanding the
conditions where judgement may be more reliable
(formats of information; learning conditions with
feedback).
- Understanding judgement means understanding not
just the mind but how it interacts with its
environment.
48The Beauty Contest
- The game is called a beauty contest after a
famous passage in Keynes's (1936) General Theory
of Employment, Interest and Money.
Keynes remarked that the stock market is like a
beauty contest. He had in mind contests that were
popular in England at the time, where a newspaper
would print 100 photographs, and people would
write in and say which six faces they liked most.
Everyone who picked the most popular face was
automatically entered in a raffle, where they
could win a prize. Keynes wrote, "It is not a
case of choosing those faces which, to the best
of one's judgment, are really the prettiest, nor
even those which average opinion genuinely thinks
the prettiest. We have reached the third degree
where we devote our intelligences to anticipating
what average opinion expects the average opinion
to be. And there are some, I believe, who
practise the fourth, fifth and higher degrees."
49The Beauty Contest
If you played this game repeatedly, your
thoughts might run as follows. You might assume
that the starting average would probably be 50,
so you'd guess 33. But then you'd say, hmmm, if
other people are as clever as I am, they will all
pick 33, so I should pick 22. But if everyone
else does that, too, I should pick two-thirds of
22. And if you carry this through infinitely many
levels of reasoning to the logical end, you'll
wind up picking zero. Zero is what game theory
predicts for this situation. Game theory is the
branch of social science that analyzes strategic
interactions in mathematical terms. It was
founded quite a long time ago, but it's had a
slow fuse: only in the last 10 or 15 years has it
come to the fore in reasoning about economics and
political science.
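The chain of reasoning above is a repeated multiplication by two-thirds. This is a minimal sketch, assuming the target is two-thirds of the average guess and that the first step starts from an assumed average of 50, as in the text; the function name is illustrative.

```python
def iterated_guesses(start=50.0, factor=2 / 3, levels=5):
    """Return the guesses produced by 1, 2, ..., `levels` steps of reasoning."""
    guesses = []
    guess = start
    for _ in range(levels):
        guess *= factor  # best reply if everyone else stops one step earlier
        guesses.append(guess)
    return guesses

for level, guess in enumerate(iterated_guesses(), start=1):
    print(f"step {level}: guess {guess:.2f}")

# With many steps the guess shrinks towards the game-theoretic answer of zero.
print(f"after 100 steps: {iterated_guesses(levels=100)[-1]:.2e}")
```

The first two steps reproduce the 33 and 22 in the text, and each further step shrinks the guess by another third.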
50So how do people actually behave? Do they pick
zero? The data here are from undergrads from
Singapore, Germany, the Wharton School of
Business at the University of Pennsylvania, and
Caltech. The average choice across all these
experiments was around 40, so if you guessed
about two-thirds of 40, or 27, you'd probably
win. If we use these data to gauge how many
steps of reasoning people are doing about other
people's reasoning, something from one to three
seems reasonable. It's clearly not the infinite
number of steps that game theory predicts, but it
clearly shows at least one step of reasoning.
52Three Newspaper studies
The most popular numbers in all three experiments
are two-thirds of 50 (about 33), two-thirds of
this number (about 22) and the equilibria of the
game (0 and 1 in The FT, 1 in Expansion and 0 in
Spektrum). The iterated-dominance interpretation
claims that in the beauty-contest game people
reason in steps. Step 0, which would
be the preliminary step of any reasoning,
translates into numbers that are arbitrarily
distributed over the interval. Level-1 reasoning
is (2/3) × 50 ≈ 33.33. Level-2 reasoning is
(2/3) × 33.33 ≈ 22.22, and so on.
53University of Chicago Economics PhDs; other
Economics PhDs; CEOs; the Caltech Board (eminent
in various fields)