Title: A Practical Course in Graphical Bayesian Modeling; Class 1
Slide 1: A Practical Course in Graphical Bayesian Modeling, Class 1
Eric-Jan Wagenmakers
Slide 2: Outline
- A bit of probability theory
- Bayesian foundations
- Parameter estimation: a simple example
- WinBUGS and R2WinBUGS
Slide 3: Probability Theory (Wasserman, 2004)
- The sample space Ω is the set of possible outcomes of an experiment.
- If we toss a coin twice, then Ω = {HH, HT, TH, TT}.
- The event that the first toss is heads is A = {HH, HT}.
Slide 4: Probability Theory (Wasserman, 2004)
- A ∩ B denotes intersection: "A and B".
- A ∪ B denotes union: "A or B".
Slide 5: Probability Theory (Wasserman, 2004)
P is a probability measure when the following axioms are satisfied:
1. Probabilities are never negative: P(A) ≥ 0.
2. Probabilities add to 1: P(Ω) = 1.
3. The probability of the union of non-overlapping (disjoint) events is the sum of their probabilities: P(A ∪ B) = P(A) + P(B) when A ∩ B = ∅.
Slide 6: Probability Theory (Wasserman, 2004)
For any events A and B:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
[Figure: Venn diagram of overlapping events A and B within the sample space Ω]
Slide 7: Conditional Probability
The conditional probability of A given B is
P(A | B) = P(A ∩ B) / P(B)
[Figure: Venn diagram; conditioning on B restricts the sample space Ω to B]
Slide 8: Conditional Probability
You will often encounter this as
P(A ∩ B) = P(A | B) P(B)
[Figure: Venn diagram of overlapping events A and B within Ω]
Slide 9: Conditional Probability
From
P(A | B) = P(A ∩ B) / P(B)
and
P(B | A) = P(A ∩ B) / P(A),
Bayes' rule follows.
Slide 10: Bayes' Rule
P(A | B) = P(B | A) P(A) / P(B)
Slide 11: The Law of Total Probability
Let A1, ..., Ak be a partition of Ω. Then, for any event B:
P(B) = Σi P(B | Ai) P(Ai)
Slide 12: The Law of Total Probability
This is just a weighted average of P(B | Ai) over the disjoint sets A1, ..., Ak. For instance, when all P(Ai) are equal (i.e., P(Ai) = 1/k), the equation becomes
P(B) = (1/k) Σi P(B | Ai)
Slide 13: Bayes' Rule Revisited
P(Ai | B) = P(B | Ai) P(Ai) / Σj P(B | Aj) P(Aj)
Slide 14: Example (Wasserman, 2004)
- I divide my email into three categories: spam, low priority, and high priority.
- Previous experience suggests that the a priori probabilities of a random email belonging to these categories are .7, .2, and .1, respectively.
Slide 15: Example (Wasserman, 2004)
- The probabilities of the word "free" occurring in the three categories are .9, .01, and .01, respectively.
- I receive an email with the word "free". What is the probability that it is spam?
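As a worked answer, here is a minimal R sketch (R is the analysis language used later in this course) that combines the law of total probability with Bayes' rule:

    # Prior probabilities of the three email categories
    prior <- c(spam = 0.7, low = 0.2, high = 0.1)

    # Probability of the word "free" in each category: P(free | category)
    lik <- c(spam = 0.9, low = 0.01, high = 0.01)

    # Law of total probability: P(free) = sum over categories
    p_free <- sum(lik * prior)            # 0.633

    # Bayes' rule: P(category | free)
    posterior <- lik * prior / p_free
    round(posterior, 3)
    #  spam   low  high
    # 0.995 0.003 0.002

So an email containing "free" is spam with probability of about .995.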
Slide 16: Outline
- A bit of probability theory
- Bayesian foundations
- Parameter estimation: a simple example
- WinBUGS and R2WinBUGS
Slide 17: The Bayesian Agenda
- Bayesians use probability to quantify uncertainty or degree of belief about parameters and hypotheses.
- Prior knowledge for a parameter θ is updated through the data to yield the posterior knowledge.
Slide 18: The Bayesian Agenda
P(θ | D) = P(D | θ) P(θ) / P(D)
Also note that this equation allows one to learn, from the probability of what is observed, something about what is not observed.
Slide 19: The Bayesian Agenda
- But why would one measure degree of belief by means of probability? Couldn't we choose something else that makes sense?
- Yes, perhaps we could, but the choice of probability is anything but ad hoc.
Slide 20: The Bayesian Agenda
- Assume degree of belief can be measured by a single number.
- Assume you are rational, that is, not self-contradictory or obviously silly.
- Then degree of belief can be shown to follow the same rules as the probability calculus.
Slide 21: The Bayesian Agenda
- For instance, a rational agent would not hold intransitive beliefs, such as judging A more likely than B, and B more likely than C, yet C more likely than A.
Slide 22: The Bayesian Agenda
- When you use a single number to measure uncertainty or quantify evidence, and these numbers do not follow the rules of probability calculus, you can (almost certainly?) be shown to be silly or incoherent.
- One of the theoretical attractions of the Bayesian paradigm is that it ensures coherence right from the start.
Slide 23: Coherence Example à la De Finetti
- There exists a ticket that says "If the French national soccer team wins the 2010 World Cup, this ticket pays 1."
- You must determine the fair price for this ticket.
- After you set the price, I can choose to either sell the ticket to you, or to buy the ticket from you. This is similar to how you would divide a pie according to the rule "you cut, I choose".
- Please write this number down; you are not allowed to change it later!
Slide 24: Coherence Example à la De Finetti
- There exists another ticket that says "If the Spanish national soccer team wins the 2010 World Cup, this ticket pays 1."
- You must again determine the fair price for this ticket.
Slide 25: Coherence Example à la De Finetti
- There exists a third ticket that says "If either the French or the Spanish national soccer team wins the 2010 World Cup, this ticket pays 1."
- What is the fair price for this ticket? (See the sketch below.)
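The point of the exercise: because the two events are disjoint, coherent prices must add up. A minimal R sketch (with hypothetical, deliberately incoherent prices) of how I could otherwise guarantee myself a profit, a so-called Dutch book:

    # Hypothetical incoherent prices: the "either" ticket is priced
    # below the sum of the two single-event tickets
    price_fr     <- 0.20
    price_es     <- 0.30
    price_either <- 0.40   # incoherent: less than 0.20 + 0.30

    # I sell you the two single tickets and buy the "either" ticket
    cash_now <- price_fr + price_es - price_either   # 0.10 up front

    # Ticket payoffs (1 = pays out) in each possible outcome
    pay_fr     <- c(france = 1, spain = 0, neither = 0)
    pay_es     <- c(france = 0, spain = 1, neither = 0)
    pay_either <- c(france = 1, spain = 1, neither = 0)

    # My profit per outcome: cash now, plus what the "either" ticket
    # pays me, minus what the single tickets pay you
    profit <- cash_now + pay_either - pay_fr - pay_es
    profit
    # france  spain neither
    #    0.1    0.1     0.1   <- you lose 0.10 no matter what happens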
Slide 26: Bayesian Foundations
- Bayesians use probability to quantify uncertainty or degree of belief about parameters and hypotheses.
- Prior knowledge for a parameter θ is updated through the data to yield posterior knowledge.
- This happens through the use of probability calculus.
Slide 27: Bayes' Rule
P(θ | D) = P(D | θ) P(θ) / P(D)
- P(θ): prior distribution
- P(D | θ): likelihood
- P(θ | D): posterior distribution
- P(D): marginal probability of the data
Slide 28: Bayesian Foundations
This equation allows one to learn, from the probability of what is observed, something about what is not observed. Bayesian statistics was long known as "inverse probability".
Slide 29: Nuisance Variables
- Suppose θ is the mean of a normal distribution, and σ is the standard deviation.
- You are interested in θ, but not in σ.
- Using the Bayesian paradigm, how can you go from P(θ, σ | x) to P(θ | x)? That is, how can you get rid of the nuisance parameter σ? Show how this involves P(σ | x).
Slide 30: Nuisance Variables
P(θ | x) = ∫ P(θ, σ | x) dσ = ∫ P(θ | σ, x) P(σ | x) dσ
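In sampling-based inference this integration is automatic: given joint posterior draws of (θ, σ), the θ column alone is a sample from the marginal P(θ | x). A minimal R illustration, with simulated draws standing in for real MCMC output:

    # Stand-in for MCMC output: joint posterior draws of (theta, sigma).
    # In practice these would come from the sampler, not from rnorm/rgamma.
    set.seed(1)
    joint <- data.frame(theta = rnorm(10000, mean = 0.5, sd = 0.1),
                        sigma = rgamma(10000, shape = 2, rate = 4))

    # Marginalizing over sigma = simply ignoring the sigma column
    theta_marginal <- joint$theta
    hist(theta_marginal, main = "Marginal posterior of theta")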
Slide 31: Predictions
- Suppose you observe data x, and you use a model with parameter θ.
- What is your prediction for new data y, given that you've observed x? In other words, show how you can obtain P(y | x).
Slide 32: Predictions
P(y | x) = ∫ P(y | θ, x) P(θ | x) dθ
(When y and x are conditionally independent given θ, P(y | θ, x) = P(y | θ).)
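As a concrete R sketch, using the binomial example that appears later in this class (uniform prior, 9 successes in 10 trials, so a Beta(10, 2) posterior): draw θ from its posterior, then draw new data given each θ.

    set.seed(1)
    theta <- rbeta(10000, shape1 = 10, shape2 = 2)   # draws from P(theta | x)

    # For each theta draw, simulate successes in 10 new trials
    y_new <- rbinom(10000, size = 10, prob = theta)  # draws from P(y | x)

    table(y_new) / length(y_new)   # approximate posterior predictive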
Slide 33: Want to Know More?
Slide 34: Outline
- A bit of probability theory
- Bayesian foundations
- Parameter estimation: a simple example
- WinBUGS and R2WinBUGS
Slide 35: Bayesian Parameter Estimation Example
- We prepare for you a series of 10 factual true/false questions of equal difficulty.
- You answer 9 out of 10 questions correctly.
- What is your latent probability θ of answering any one question correctly?
Slide 36: Bayesian Parameter Estimation Example
- We start with a prior distribution for θ. This reflects all we know about θ prior to the experiment. Here we make a standard choice and assume that all values of θ are equally likely a priori.
Slide 37: Bayesian Parameter Estimation Example
- We then update the prior distribution by means of the data (technically, the likelihood) to arrive at a posterior distribution.
Slide 38: The Likelihood
- We use the binomial model, in which P(D | θ) is given by
P(D | θ) = (n choose s) θ^s (1 - θ)^(n - s),
where n = 10 is the number of trials, and s = 9 is the number of successes.
Slide 39: Bayesian Parameter Estimation Example
- The posterior distribution is a compromise between what we knew before the experiment (i.e., the prior) and what we have learned from the experiment (i.e., the likelihood). The posterior distribution reflects all that we know about θ.
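For this example the posterior is available in closed form: a uniform Beta(1, 1) prior combined with the binomial likelihood yields a Beta(s + 1, n - s + 1) = Beta(10, 2) posterior. A minimal R check that reproduces the numbers on the next slide:

    n <- 10; s <- 9

    # Uniform prior Beta(1, 1) + binomial likelihood -> Beta(s+1, n-s+1)
    a <- s + 1       # 10
    b <- n - s + 1   # 2

    # Posterior mode: (a - 1) / (a + b - 2)
    (a - 1) / (a + b - 2)               # 0.9

    # Central 95% credible interval
    qbeta(c(0.025, 0.975), a, b)        # approximately (0.59, 0.98)

    # Plot the posterior density
    curve(dbeta(x, a, b), from = 0, to = 1,
          xlab = "theta", ylab = "posterior density")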
Slide 40: [Figure: posterior distribution of θ] Mode = 0.9, 95% credible interval: (0.59, 0.98)
Slide 41: Bayesian Parameter Estimation Example
- Sometimes it is difficult or impossible to obtain the posterior distribution analytically.
- In this case, we can use Markov chain Monte Carlo (MCMC) algorithms to sample from the posterior. As the number of samples increases, the approximation error relative to the analytical posterior becomes arbitrarily small.
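To make this concrete (as an illustration only; this is not the sampler WinBUGS uses internally), here is a minimal random-walk Metropolis sampler in R for the binomial example:

    set.seed(1)
    n <- 10; s <- 9

    # Unnormalized log posterior under a uniform prior on (0, 1)
    log_post <- function(theta) {
      if (theta <= 0 || theta >= 1) return(-Inf)
      dbinom(s, n, theta, log = TRUE)
    }

    n_iter <- 9000
    theta  <- numeric(n_iter)
    theta[1] <- 0.5
    for (i in 2:n_iter) {
      prop <- theta[i - 1] + rnorm(1, 0, 0.1)    # random-walk proposal
      accept_prob <- exp(log_post(prop) - log_post(theta[i - 1]))
      theta[i] <- if (runif(1) < accept_prob) prop else theta[i - 1]
    }

    quantile(theta, c(0.025, 0.975))  # close to the analytical (0.59, 0.98)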
Slide 42: (no transcript; figure-only slide)
Slide 43: [Figure: MCMC-based posterior for θ] Mode = 0.89, 95% credible interval: (0.59, 0.98). With 9,000 samples, almost identical to the analytical result.
Slide 44: (no transcript; figure-only slide)
Slide 45: Outline
- A bit of probability theory
- Bayesian foundations
- Parameter estimation: a simple example
- WinBUGS and R2WinBUGS
Slide 46: WinBUGS
Bayesian inference Using Gibbs Sampling.
You want to have this installed (plus the registration key).
Slide 47: WinBUGS
- Knows many probability distributions (likelihoods).
- Allows you to specify a model.
- Allows you to specify priors.
- Will then automatically run the MCMC sampling routines and produce output.
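As a sketch, the model for this class's binomial example could be specified in the BUGS language as follows (written to a file from R, which is how R2WinBUGS is typically used; the file name is illustrative):

    # BUGS model for the rate example: uniform prior on theta,
    # binomial likelihood for s successes out of n trials
    model_string <- "
    model {
      theta ~ dbeta(1, 1)     # uniform prior on the rate
      s ~ dbin(theta, n)      # binomial likelihood
    }
    "
    writeLines(model_string, con = "rate_model.txt")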
Slide 48: Want to Know More About MCMC?
Slide 49: Models in WinBUGS
- The models you can specify in WinBUGS are directed acyclic graphs (DAGs).
Slide 50: Models in WinBUGS (Spiegelhalter, 1998)
[Figure: DAG in which A and B are parents of C and D, and C is the parent of E] Below, E depends only on C.
Slide 51: Models in WinBUGS (Spiegelhalter, 1998)
If the nodes are stochastic, the joint distribution factorizes.
Slide 52: Models in WinBUGS (Spiegelhalter, 1998)
P(A, B, C, D, E) = P(A) P(B) P(C | A, B) P(D | A, B) P(E | C)
Slide 53: Models in WinBUGS (Spiegelhalter, 1998)
This means we can sometimes perform local computations to get what we want.
Slide 54: Models in WinBUGS (Spiegelhalter, 1998)
What is P(C | A, B, D, E)?
Slide 55: Models in WinBUGS (Spiegelhalter, 1998)
P(C | A, B, D, E) is proportional to P(C | A, B) P(E | C), so D is irrelevant.
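A quick numeric sanity check of this local computation in R, using arbitrary (made-up) conditional probability tables over binary nodes that respect the factorization above:

    # A, B root nodes; C and D depend on (A, B); E depends on C
    pA <- 0.3; pB <- 0.6
    pC <- function(a, b) 0.1 + 0.4 * a + 0.3 * b   # P(C = 1 | A, B)
    pD <- function(a, b) 0.2 + 0.5 * a * b         # P(D = 1 | A, B)
    pE <- function(c) 0.15 + 0.7 * c               # P(E = 1 | C)

    # Joint probability from the factorization
    joint <- function(a, b, c, d, e) {
      dbinom(a, 1, pA) * dbinom(b, 1, pB) * dbinom(c, 1, pC(a, b)) *
        dbinom(d, 1, pD(a, b)) * dbinom(e, 1, pE(c))
    }

    # Full conditional P(C = 1 | A = 1, B = 0, D = 1, E = 1)
    num  <- joint(1, 0, 1, 1, 1)
    full <- num / (joint(1, 0, 0, 1, 1) + num)

    # Local computation: proportional to P(C | A, B) * P(E | C)
    w1    <- pC(1, 0) * pE(1)
    w0    <- (1 - pC(1, 0)) * pE(0)
    local <- w1 / (w0 + w1)

    all.equal(full, local)   # TRUE: D indeed drops out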
Slide 56: WinBUGS and R
- WinBUGS produces MCMC samples.
- We want to analyze the output in a nice program, such as R.
- This can be accomplished using the R package R2WinBUGS; a usage sketch follows.
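A minimal R2WinBUGS sketch for the rate example, assuming the model file written earlier; the installation path is just the common default and must match your own machine:

    library(R2WinBUGS)

    data  <- list(s = 9, n = 10)
    inits <- function() list(theta = 0.5)

    # Run WinBUGS from R; bugs.directory must point to your local
    # WinBUGS installation (plus registration key, as noted above)
    samples <- bugs(data, inits,
                    parameters.to.save = "theta",
                    model.file = "rate_model.txt",
                    n.chains = 3, n.iter = 3000,
                    bugs.directory = "c:/Program Files/WinBUGS14/")

    print(samples)                   # summary statistics for theta
    hist(samples$sims.list$theta)    # posterior histogram in R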
Slide 57: End of Class 1