Transcript and Presenter's Notes

Title: A Practical Course in Graphical Bayesian Modeling; Class 1


1
A Practical Course in Graphical Bayesian Modeling; Class 1
Eric-Jan Wagenmakers
2
Outline
  • A bit of probability theory
  • Bayesian foundations
  • Parameter estimation: a simple example
  • WinBUGS and R2WinBUGS

3
Probability Theory (Wasserman, 2004)
  • The sample space Ω is the set of possible
    outcomes of an experiment.
  • If we toss a coin twice then Ω = {HH, HT, TH,
    TT}.
  • The event that the first toss is heads is
    A = {HH, HT}.

4
Probability Theory (Wasserman, 2004)
  • ∩ denotes intersection: A ∩ B means A and B.
  • ∪ denotes union: A ∪ B means A or B.

5
Probability Theory (Wasserman, 2004)
P is a probability measure when the following
axioms are satisfied:
1. Probabilities are never negative: P(A) ≥ 0.
2. Probabilities add to 1: P(Ω) = 1.
3. The probability of a union of non-overlapping
(disjoint) events is the sum of their probabilities:
P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ...
6
Probability Theory (Wasserman, 2004)
For any events A and B:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
(Figure: Venn diagram of events A and B within the sample space Ω.)
7
Conditional Probability
The conditional probability of A given B is
P(A | B) = P(A ∩ B) / P(B)
(Figure: Venn diagram of events A and B within the sample space Ω.)
8
Conditional Probability
You will often encounter this as
P(A ∩ B) = P(A | B) P(B)
(Figure: Venn diagram of events A and B within the sample space Ω.)
9
Conditional Probability
From
P(A | B) = P(A ∩ B) / P(B)
and
P(A ∩ B) = P(B | A) P(A)
follows Bayes' rule.
10
Bayes' Rule
P(A | B) = P(B | A) P(A) / P(B)
11
The Law of Total Probability
Let A1, ..., Ak be a partition of Ω. Then, for any
event B:
P(B) = Σ_i P(B | Ai) P(Ai)
12
The Law of Total Probability
This is just a weighted average of P(B | Ai) over the
disjoint sets A1, ..., Ak. For instance, when all
P(Ai) are equal, the equation becomes
P(B) = (1/k) Σ_i P(B | Ai)
13
Bayes' Rule Revisited
P(Ai | B) = P(B | Ai) P(Ai) / Σ_j P(B | Aj) P(Aj)
14
Example (Wasserman, 2004)
  • I divide my email into three categories: spam,
    low priority, and high priority.
  • Previous experience suggests that the a priori
    probabilities of a random email belonging to
    these categories are .7, .2, and .1,
    respectively.

15
Example (Wasserman, 2004)
  • The probabilities of the word "free" occurring in
    the three categories are .9, .01, and .01,
    respectively.
  • I receive an email with the word "free". What is
    the probability that it is spam? (A quick check is
    worked out below.)

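As a quick check (not part of the slides), Bayes' rule combined with the law of total probability gives

    P(spam | "free") = (.9)(.7) / [ (.9)(.7) + (.01)(.2) + (.01)(.1) ]
                     = .630 / .633
                     ≈ .995,

so an email containing the word "free" is almost certainly spam under these assumptions.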
16
Outline
  • A bit of probability theory
  • Bayesian foundations
  • Parameter estimation: a simple example
  • WinBUGS and R2WinBUGS

17
The Bayesian Agenda
  • Bayesians use probability to quantify uncertainty
    or degree of belief about parameters and
    hypotheses.
  • Prior knowledge for a parameter θ is updated
    through the data to yield the posterior
    knowledge.

18
The Bayesian Agenda
Also note that Bayes' rule allows one to learn,
from the probability of what is observed,
something about what is not observed.
19
The Bayesian Agenda
  • But why would one measure degree of belief by
    means of probability? Couldn't we choose
    something else that makes sense?
  • Yes, perhaps we could, but the choice of
    probability is anything but ad hoc.

20
The Bayesian Agenda
  • Assume degree of belief can be measured by a
    single number.
  • Assume you are rational, that is, not
    self-contradictory or obviously silly.
  • Then degree of belief can be shown to follow the
    same rules as the probability calculus.

21
The Bayesian Agenda
  • For instance, a rational agent would not hold
    intransitive beliefs, such as believing A more
    likely than B, B more likely than C, and C more
    likely than A.

22
The Bayesian Agenda
  • When you use a single number to measure
    uncertainty or quantify evidence, and these
    numbers do not follow the rules of probability
    calculus, you can (almost certainly?) be shown to
    be silly or incoherent.
  • One of the theoretical attractions of the
    Bayesian paradigm is that it ensures coherence
    right from the start.

23
Coherence Example à la De Finetti
  • There exists a ticket that says "If the French
    national soccer team wins the 2010 World Cup,
    this ticket pays 1."
  • You must determine the fair price for this
    ticket.
  • After you set the price, I can choose to either
    sell the ticket to you, or to buy the ticket from
    you. This is similar to how you would divide a
    pie according to the rule "you cut, I choose".
  • Please write this number down; you are not
    allowed to change it later!

24
Coherence Example à la De Finetti
  • There exists another ticket that says "If the
    Spanish national soccer team wins the 2010 World
    Cup, this ticket pays 1."
  • You must again determine the fair price for this
    ticket.

25
Coherence Example à la De Finetti
  • There exists a third ticket that says "If either
    the French or the Spanish national soccer team
    wins the 2010 World Cup, this ticket pays 1."
  • What is the fair price for this ticket?
  • Coherence requires this price to equal the sum of
    the two prices you set before; if it does not, I
    can buy and sell the tickets in a combination that
    guarantees you lose money (a Dutch book).

26
Bayesian Foundations
  • Bayesians use probability to quantify uncertainty
    or degree of belief about parameters and
    hypotheses.
  • Prior knowledge for a parameter θ is updated
    through the data to yield posterior knowledge.
  • This happens through the use of probability
    calculus.

27
Bayes' Rule
P(θ | D) = P(D | θ) P(θ) / P(D),
where P(θ) is the prior distribution, P(D | θ) the
likelihood, P(θ | D) the posterior distribution, and
P(D) the marginal probability of the data.
28
Bayesian Foundations
This equation allows one to learn, from the
probability of what is observed, something about
what is not observed. Bayesian statistics was
long known as inverse probability.
29
Nuisance Variables
  • Suppose θ is the mean of a normal distribution,
    and σ is the standard deviation.
  • You are interested in θ, but not in σ.
  • Using the Bayesian paradigm, how can you go from
    P(θ, σ | x) to P(θ | x)? That is, how can you get
    rid of the nuisance parameter σ? Show how this
    involves P(σ).

30
Nuisance Variables
P(θ | x) = ∫ P(θ, σ | x) dσ = ∫ P(θ | σ, x) P(σ | x) dσ
31
Predictions
  • Suppose you observe data x, and you use a model
    with parameter θ.
  • What is your prediction for new data y, given
    that you've observed x? In other words, show how
    you can obtain P(y | x).

32
Predictions
P(y | x) = ∫ P(y | θ) P(θ | x) dθ
33
Want to Know More?
34
Outline
  • A bit of probability theory
  • Bayesian foundations
  • Parameter estimation: a simple example
  • WinBUGS and R2WinBUGS

35
Bayesian Parameter Estimation Example
  • We prepare for you a series of 10 factual
    true/false questions of equal difficulty.
  • You answer 9 out of 10 questions correctly.
  • What is your latent probability θ of answering
    any one question correctly?

36
Bayesian Parameter Estimation Example
  • We start with a prior distribution for θ. This
    reflects all we know about θ prior to the
    experiment. Here we make a standard choice and
    assume that all values of θ are equally likely a
    priori.

37
Bayesian Parameter Estimation Example
  • We then update the prior distribution by means of
    the data (technically, the likelihood) to arrive
    at a posterior distribution.

38
The Likelihood
  • We use the binomial model, in which P(D | θ) is
    given by
    P(D | θ) = (n choose s) θ^s (1 - θ)^(n - s),
    where n = 10 is the number of trials and s = 9 is
    the number of successes.

39
Bayesian Parameter Estimation Example
  • The posterior distribution is a compromise
    between what we knew before the experiment (i.e.,
    the prior) and what we have learned from the
    experiment (i.e., the likelihood). The posterior
    distribution reflects all that we know about θ.

40
(Figure: analytical posterior distribution for θ.
Mode = 0.9, 95% credible interval (0.59, 0.98).)
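A minimal R sketch (not from the slides) that reproduces these numbers, assuming a uniform Beta(1, 1) prior, under which the posterior for 9 successes out of 10 trials is Beta(10, 2):

    s <- 9; n <- 10                       # 9 correct answers out of 10 questions
    a <- s + 1; b <- n - s + 1            # posterior is Beta(10, 2) under a uniform prior
    theta <- seq(0, 1, length.out = 1001)
    plot(theta, dbeta(theta, a, b), type = "l",
         xlab = "theta", ylab = "posterior density")
    (a - 1) / (a + b - 2)                 # posterior mode: 0.9
    qbeta(c(0.025, 0.975), a, b)          # central 95% credible interval: roughly (0.59, 0.98)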
41
Bayesian Parameter Estimation Example
  • Sometimes it is difficult or impossible to obtain
    the posterior distribution analytically.
  • In this case, we can use Markov chain Monte Carlo
    (MCMC) algorithms to sample from the posterior. As
    the number of samples increases, the approximation
    to the analytical posterior becomes arbitrarily
    good (a minimal sampler is sketched below).

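For illustration, a minimal random-walk Metropolis sampler in R for this example (a sketch only; it is not the sampling scheme WinBUGS itself uses):

    set.seed(1)
    s <- 9; n <- 10
    log_post <- function(theta) {
      if (theta <= 0 || theta >= 1) return(-Inf)
      dbinom(s, n, theta, log = TRUE)        # uniform prior only adds a constant
    }
    n_iter <- 9000
    samples <- numeric(n_iter)
    theta <- 0.5                             # starting value
    for (i in 1:n_iter) {
      proposal <- theta + rnorm(1, sd = 0.1) # random-walk proposal
      if (log(runif(1)) < log_post(proposal) - log_post(theta))
        theta <- proposal                    # accept with the Metropolis probability
      samples[i] <- theta
    }
    hist(samples, breaks = 50, freq = FALSE) # compare with dbeta(x, 10, 2)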
42
(No Transcript)
43
(Figure: MCMC-based posterior distribution for θ.
Mode = 0.89, 95% credible interval (0.59, 0.98).
With 9000 samples, almost identical to the
analytical result.)
44
(No Transcript)
45
Outline
  • A bit of probability theory
  • Bayesian foundations
  • Parameter estimation: a simple example
  • WinBUGS and R2WinBUGS

46
WinBUGS
Bayesian inference Using Gibbs Sampling
You want to have this installed (plus the
registration key).
47
WinBUGS
  • Knows many probability distributions
    (likelihoods)
  • Allows you to specify a model
  • Allows you to specify priors
  • Will then automatically run the MCMC sampling
    routines and produce output.

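For example, a minimal WinBUGS model specification for the rate example above might look as follows (a sketch; the variable names are illustrative):

    model {
      s ~ dbin(theta, n)      # binomial likelihood for s successes in n trials
      theta ~ dbeta(1, 1)     # uniform prior on the rate theta
    }

The data would be supplied separately, e.g. as list(s = 9, n = 10).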
48
Want to Know More About MCMC?
49
Models in WinBUGS
  • The models you can specify in WinBUGS are
    directed acyclic graphs (DAGs).

50
Models in WinBUGS (Spiegelhalter, 1998)
Below, E depends only on C.
(Figure: DAG with nodes A, B, C, D, and E.)
51
Models in WinBUGS (Spiegelhalter, 1998)
If the nodes are stochastic, the joint
distribution factorizes:
52
Models in WinBUGS (Spiegelhalter, 1998)
P(A, B, C, D, E) = P(A) P(B) P(C | A, B) P(D | A, B) P(E | C)
53
Models in WinBUGS (Spiegelhalter, 1998)
This means we can sometimes perform local
computations to get what we want.
54
Models in WinBUGS (Spiegelhalter, 1998)
What is P(C | A, B, D, E)?
55
Models in WinBUGS (Spiegelhalter, 1998)
P(C | A, B, D, E) is proportional to P(C | A, B) P(E | C),
so D is irrelevant.
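To spell out this local computation (not shown on the slide), use the factorization above:

    P(C | A, B, D, E) = P(A, B, C, D, E) / P(A, B, D, E)
                      ∝ P(A) P(B) P(C | A, B) P(D | A, B) P(E | C)
                      ∝ P(C | A, B) P(E | C),

because the factors P(A), P(B), and P(D | A, B) do not involve C and can be absorbed into the normalizing constant.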
56
WinBUGS and R
  • WinBUGS produces MCMC samples.
  • We want to analyze the output in a nice program,
    such as R.
  • This can be accomplished using the R package
    R2WinBUGS (see the sketch below).

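A minimal R2WinBUGS sketch (not from the slides; the file name and MCMC settings are illustrative) that fits the rate model shown earlier and makes the samples available in R:

    library(R2WinBUGS)

    # Write the WinBUGS model to a file (same model as in the earlier sketch).
    writeLines("
    model {
      s ~ dbin(theta, n)
      theta ~ dbeta(1, 1)
    }", "rate_model.txt")

    data  <- list(s = 9, n = 10)
    inits <- function() list(theta = runif(1))

    fit <- bugs(data, inits, parameters.to.save = "theta",
                model.file = "rate_model.txt",
                n.chains = 3, n.iter = 10000, n.burnin = 1000)

    print(fit)                         # posterior summary for theta
    hist(fit$sims.list$theta)          # MCMC samples, now analyzed in R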
57
End of Class 1