Title: Probability, Bayes' Theorem and the Monty Hall Problem
1 Probability, Bayes' Theorem and the Monty Hall Problem
2 Probability Distributions
- A random variable is a variable whose value is uncertain.
- For example, the height of a randomly selected person in this class is a random variable: I won't know its value until the person is selected.
- Note that we are not completely uncertain about most random variables.
- For example, we know that height will probably be in the 5'-6' range.
- In addition, 5'6" is more likely than 5'0" or 6'0" (for women).
- The function that describes the probability of each possible value of the random variable is called a probability distribution.
3 Probability Distributions
- Probability distributions are closely related to frequency distributions.
4 Probability Distributions
- Dividing each frequency by the total number of scores and multiplying by 100 yields a percentage distribution.
5 Probability Distributions
- Dividing each frequency by the total number of scores yields a probability distribution.
6 Probability Distributions
- For a discrete distribution, the probabilities over all possible values of the random variable must sum to 1.
7 Probability Distributions
- For a discrete distribution, we can talk about the probability of a particular score occurring, e.g., p(Province = Ontario) = 0.36.
- We can also talk about the probability of any one of a subset of scores occurring, e.g., p(Province = Ontario or Quebec) = 0.50.
- In general, we refer to these occurrences as events.
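The step from frequencies to percentages, probabilities, and event probabilities can be sketched in a few lines of Python. The counts below are hypothetical: they are chosen only so that the results match the 0.36 and 0.50 figures quoted above, and they are not the data behind the original slides.

```python
from fractions import Fraction

# Hypothetical frequency counts (assumed for illustration, not from the slides).
frequencies = {"Ontario": 36, "Quebec": 14, "Other": 50}

total = sum(frequencies.values())

# Percentage distribution: frequency / total * 100.
percentages = {k: 100 * v / total for k, v in frequencies.items()}

# Probability distribution: frequency / total (kept exact with Fraction).
probabilities = {k: Fraction(v, total) for k, v in frequencies.items()}

# For a discrete distribution the probabilities must sum to 1.
total_probability = sum(probabilities.values())

# Probability of an event made up of several values: Ontario or Quebec.
p_ontario_or_quebec = probabilities["Ontario"] + probabilities["Quebec"]

print(percentages)
print(float(probabilities["Ontario"]), float(p_ontario_or_quebec))
```

Because the values of a random variable are mutually exclusive, the probability of the combined event is just the sum of the individual probabilities.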
8 Probability Distributions
- For a continuous distribution, the probabilities over all possible values of the random variable must integrate to 1 (i.e., the area under the curve must be 1).
- Note that the height of a continuous distribution can exceed 1!
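A quick numerical check makes the last point concrete. The sketch below uses a normal density with a deliberately small standard deviation (the parameters are assumptions chosen for illustration) and approximates the area under the curve with a midpoint sum.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical parameters: a small sigma gives a narrow, tall curve.
mu, sigma = 0.0, 0.1

# Height of the curve at its mode.
peak = normal_pdf(mu, mu, sigma)

# Approximate the area under the curve with a midpoint sum over +/- 8 sigma.
n = 10_000
lo, hi = mu - 8 * sigma, mu + 8 * sigma
dx = (hi - lo) / n
area = sum(normal_pdf(lo + (i + 0.5) * dx, mu, sigma) for i in range(n)) * dx

print(f"peak density = {peak:.3f}")
print(f"area under curve = {area:.6f}")
```

The peak density is well above 1 even though the total area stays at 1: a density is a probability per unit of the variable, not a probability.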
9 Continuous Distributions
- For continuous distributions, it does not make sense to talk about the probability of an exact score.
- e.g., what is the probability that your height is exactly 65.485948467 inches?
Figure: Normal approximation to the probability distribution for height of Canadian females (parameters from General Social Survey, 1991)
10 Continuous Distributions
- It does make sense to talk about the probability of observing a score that falls within a certain range.
- e.g., what is the probability that you are between 5'3" and 5'7"?
- e.g., what is the probability that you are less than 5'10"?
Figure: Normal approximation to the probability distribution for height of Canadian females (parameters from General Social Survey, 1991)
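Range probabilities like these are areas under the normal curve, which can be computed from the cumulative distribution function. The slides do not give the General Social Survey parameters, so the mean and standard deviation below are assumed placeholder values; the numerical answers are illustrative only.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def to_inches(feet, inches):
    """Convert a height in feet and inches to inches."""
    return 12 * feet + inches

# ASSUMED parameters (not from the slides): mean 64 in, standard deviation 2.5 in.
MU, SIGMA = 64.0, 2.5

# P(5'3" <= height <= 5'7") is the difference of two CDF values.
p_between = normal_cdf(to_inches(5, 7), MU, SIGMA) - normal_cdf(to_inches(5, 3), MU, SIGMA)

# P(height < 5'10") is a single CDF value.
p_below = normal_cdf(to_inches(5, 10), MU, SIGMA)

# An exact score corresponds to an interval of zero width, hence zero probability.
p_exact = normal_cdf(65.485948467, MU, SIGMA) - normal_cdf(65.485948467, MU, SIGMA)

print(f"P(5'3\" to 5'7\") = {p_between:.4f}")
print(f"P(< 5'10\")      = {p_below:.4f}")
print(f"P(exact score)  = {p_exact}")
```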
11 Probability of Combined Events
12 Probability of Combined Events
13 Exhaustive Events
- Two or more events are said to be exhaustive if at least one of them must occur.
- For example, if A is the event that the respondent sleeps less than 6 hours per night and B is the event that the respondent sleeps at least 6 hours per night, then A and B are exhaustive.
- (Although A is probably the more exhausted!!)
14 Independence
15 An Example: The Monty Hall Problem
16 Problem History
- When the problem first appeared in Parade, approximately 10,000 readers, including 1,000 PhDs, wrote claiming the solution was wrong.
- In a study of 228 subjects, only 13% chose to switch.
17 Intuition
- Before Monty opens any doors, there is a 1/3 probability that the car lies behind the door you selected (Door 1), and a 2/3 probability it lies behind one of the other two doors.
- Thus with 2/3 probability, Monty will be forced to open a specific door (e.g., the car lies behind Door 2, so Monty must open Door 3).
- This concentrates all of the 2/3 probability in the remaining door (e.g., Door 2).
19 Analysis
Player initially picks Door 1.
- Car hidden behind Door 1: Host opens either Door 2 or Door 3. Switching loses with probability 1/6 in each case.
- Car hidden behind Door 2: Host must open Door 3. Switching wins with probability 1/3.
- Car hidden behind Door 3: Host must open Door 2. Switching wins with probability 1/3.
Overall: switching wins with probability 2/3 and loses with probability 1/3.
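The branches of this analysis can be enumerated exactly. The sketch below assumes, as is standard for this problem, that the car is equally likely to be behind each door and that Monty chooses with equal probability when two doors are available to him.

```python
from fractions import Fraction

DOORS = (1, 2, 3)
pick = 1  # the player's initial choice

switch_wins = Fraction(0)
switch_loses = Fraction(0)

for car in DOORS:
    p_car = Fraction(1, 3)  # assumed uniform prior over car locations
    # Monty may open any door that is neither the player's pick nor the car.
    options = [d for d in DOORS if d != pick and d != car]
    for opened in options:
        # Assumption: Monty picks uniformly among the doors available to him.
        p_branch = p_car * Fraction(1, len(options))
        # The door the player ends up with after switching.
        switched_to = next(d for d in DOORS if d not in (pick, opened))
        if switched_to == car:
            switch_wins += p_branch
        else:
            switch_loses += p_branch

print("switching wins: ", switch_wins)
print("switching loses:", switch_loses)
```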
20 Notes
- It is important that:
  - Monty must open a door that reveals a goat
  - Monty cannot open the door you selected
- These rules mean that your choice may constrain what Monty does.
- If you initially selected a door concealing a goat, then there is only one door Monty can open.
- One can rigorously account for the Monty Hall problem using a Bayesian analysis.
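Before turning to the formal analysis, the two rules can be checked empirically with a small simulation. This is a minimal sketch: the function names, the number of trials, and the fixed random seed are illustrative choices, and Monty is assumed to pick at random when both unchosen doors hide goats.

```python
import random

def play(switch, rng):
    """Play one game; return True if the player ends up with the car."""
    doors = [1, 2, 3]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Rules: Monty must reveal a goat and cannot open the player's door.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither the pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the winning probability of a strategy by simulation."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

switch_rate = win_rate(switch=True)
stay_rate = win_rate(switch=False)
print(f"switch: {switch_rate:.3f}  stay: {stay_rate:.3f}")
```

The estimated rates should land close to 2/3 for switching and 1/3 for staying.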
21 End of Lecture 2
22 Conditional Probability
- To understand Bayesian inference, we first need to understand the concept of conditional probability.
- What is the probability I will roll a 12 with a pair of (fair) dice?
- What if I first roll one die and get a 6? What now is the probability that when I roll the second die they will sum to 12?
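Both questions can be answered by counting equally likely outcomes. In the sketch below, `prob` is a hypothetical helper that restricts the sample space to the outcomes satisfying the condition and then counts the favourable ones.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for a pair of fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda o: True):
    """P(event | given), computed by counting equally likely outcomes."""
    conditioning = [o for o in outcomes if given(o)]
    favourable = [o for o in conditioning if event(o)]
    return Fraction(len(favourable), len(conditioning))

# Unconditional probability of rolling a 12.
p_twelve = prob(lambda o: sum(o) == 12)

# Probability of a 12 given that the first die shows a 6.
p_twelve_given_six = prob(lambda o: sum(o) == 12, given=lambda o: o[0] == 6)

print(p_twelve, p_twelve_given_six)
```

Knowing the first die is a 6 shrinks the sample space from 36 outcomes to 6, which raises the probability from 1/36 to 1/6.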
23 Conditional Probability
- The conditional probability of A given B is the joint probability of A and B, divided by the marginal probability of B: p(A|B) = p(A, B) / p(B).
- Thus if A and B are statistically independent, p(A|B) = p(A) p(B) / p(B) = p(A).
- However, if A and B are statistically dependent, then p(A|B) ≠ p(A).
24 Bayes' Theorem
- Bayes' Theorem is simply a consequence of the definition of conditional probabilities: p(A|B) = p(B|A) p(A) / p(B).
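The derivation takes only a few lines. The equation on the original slide is missing from this transcript, so what follows is the standard argument, written with the same A and B as the conditional probability definition:

```latex
\begin{align}
p(A \mid B) &= \frac{p(A, B)}{p(B)}, &
p(B \mid A) &= \frac{p(A, B)}{p(A)} \\
\Rightarrow \quad p(A, B) &= p(B \mid A)\, p(A) \\
\Rightarrow \quad p(A \mid B) &= \frac{p(B \mid A)\, p(A)}{p(B)}
\end{align}
```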
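25 Bayes' Theorem
- Bayes' theorem is most commonly used to estimate the state of a hidden, causal variable H based on the measured state of an observable variable D: p(H|D) = p(D|H) p(H) / p(D).
- Here p(H|D) is the posterior, p(D|H) is the likelihood, p(H) is the prior and p(D) is the evidence.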
26 Bayesian Inference
- Whereas the posterior p(H|D) is often difficult to estimate directly, reasonable models of the likelihood p(D|H) can often be formed. This is typically because H is causal on D.
- Thus Bayes' theorem provides a means for estimating the posterior probability of the causal variable H based on observations D.
27 Marginalizing
- To calculate the evidence p(D) in Bayes' equation, we typically have to marginalize over all possible states of the causal variable H.
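For a discrete H, marginalizing means summing p(D|H) p(H) over every state of H. A minimal sketch follows; the function name `posterior` and the two-state example values are assumptions made for illustration.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return (posterior over H, evidence p(D)) for a discrete hidden variable H.

    priors:      dict mapping each state h to p(h)
    likelihoods: dict mapping each state h to p(D | h)
    """
    # Evidence: marginalize the joint p(D, h) = p(D | h) p(h) over all states h.
    evidence = sum(likelihoods[h] * priors[h] for h in priors)
    # Bayes' theorem applied to each state.
    post = {h: likelihoods[h] * priors[h] / evidence for h in priors}
    return post, evidence

# Hypothetical two-state example (values assumed, not from the slides).
priors = {"H1": Fraction(1, 4), "H2": Fraction(3, 4)}
likelihoods = {"H1": Fraction(4, 5), "H2": Fraction(1, 5)}

post, evidence = posterior(priors, likelihoods)
print(evidence, post)
```

Dividing by the evidence is what makes the posterior probabilities sum to 1.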
28 The Full Monty
- Let's get back to The Monty Hall Problem.
- Let's assume you initially select Door 1.
- Suppose that Monty then opens Door 2 to reveal a goat.
- We want to calculate the posterior probability that a car lies behind Door 1 after Monty has provided these new data.
29 The Full Monty
30 The Full Monty
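The worked calculation on these slides did not survive in the transcript. The sketch below is one way to carry it out under the usual assumptions: the car is equally likely to be behind each door a priori, and Monty chooses at random when he has two goat doors to pick from. Here H is the car's location and D is the observation that Monty opens Door 2 after you select Door 1.

```python
from fractions import Fraction

# Prior p(H): the car is equally likely to be behind each door (assumed).
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Likelihood p(D | H) of Monty opening Door 2, given the player picked Door 1.
likelihood = {
    1: Fraction(1, 2),  # car behind Door 1: Monty picks Door 2 or 3 at random (assumed)
    2: Fraction(0),     # car behind Door 2: Monty never reveals the car
    3: Fraction(1),     # car behind Door 3: Monty is forced to open Door 2
}

# Evidence p(D): marginalize over the three possible car locations.
evidence = sum(likelihood[h] * prior[h] for h in prior)

# Posterior p(H | D) from Bayes' theorem.
post = {h: likelihood[h] * prior[h] / evidence for h in prior}

for door, p in post.items():
    print(f"p(car behind Door {door} | Monty opens Door 2) = {p}")
```

The posterior for Door 1 stays at 1/3, while Door 3 rises to 2/3, in agreement with the intuitive argument.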
31 But we're not on Let's Make a Deal!
- Why is the Monty Hall Problem interesting?
- It reveals limitations in human cognitive processing of uncertainty.
- It provides a good illustration of many concepts of probability.
- It gets us to think more carefully about how we deal with and express uncertainty as scientists.
- What else is Bayes' theorem good for?
32 Clinical Example
- Christiansen et al. (2000) studied the mammogram results of 2,227 women at health centers of Harvard Pilgrim Health Care, a large HMO in the Boston metropolitan area.
- The women received a total of 9,747 mammograms over 10 years. Their ages ranged from 40 to 80. Ninety-three different radiologists read the mammograms, and overall they diagnosed 634 mammograms as suspicious that turned out to be false positives.
- This is a false positive rate of 6.5%.
- The false negative rate has been estimated at 10%.
33 Clinical Example
- There are about 58,500,000 women between the ages of 40 and 80 in the US.
- The incidence of breast cancer in the US is about 184,200 per year, i.e., roughly 1 in 318.
34 Clinical Example
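The content of this slide is missing from the transcript. One plausible way to combine the figures above with Bayes' theorem is sketched below. It rests on a strong assumption: the annual incidence (1 in 318) is treated as the prior probability that a screened woman has cancer. Under that assumption, H is "has cancer" and D is "mammogram is positive".

```python
# Figures quoted on the preceding slides.
p_cancer = 1 / 318          # prior p(H); ASSUMES annual incidence serves as the prior
false_positive_rate = 0.065 # p(positive | no cancer)
false_negative_rate = 0.10  # p(negative | cancer)

# Sensitivity: p(positive | cancer).
sensitivity = 1 - false_negative_rate

# Evidence p(D): marginalize over both states (cancer, no cancer).
p_positive = sensitivity * p_cancer + false_positive_rate * (1 - p_cancer)

# Posterior p(cancer | positive mammogram) from Bayes' theorem.
p_cancer_given_positive = sensitivity * p_cancer / p_positive

print(f"p(positive)          = {p_positive:.4f}")
print(f"p(cancer | positive) = {p_cancer_given_positive:.4f}")
```

Under these assumptions the posterior is only about 4%: because the prior is so small, most positive results come from the much larger group of women without cancer.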