Title: Bayesian inference
1. Bayesian inference
- Based on "Bayesian inference using Markov chain Monte Carlo in phylogenetic studies" by Torbjörn Karfunkel
- Presented by Amir Hadadi, Bioinformatics seminar, spring 2005
2. What is Bayesian inference?
- Definition: "an approach to statistics in which all forms of uncertainty are expressed in terms of probability" (Radford M. Neal)
3. Probability reminder
- Conditional probability:
  - P(D ∩ T) = P(D|T) · P(T)
  - P(D ∩ T) = P(T|D) · P(D)
- Bayes' theorem: P(T|D) = P(D|T) · P(T) / P(D)
- P(T|D) is called the posterior probability of T
- P(T) is the prior probability, that is, the probability assigned to T before seeing the data
- P(D|T) is the likelihood of T, which is what we try to maximize in ML
- P(D) is the probability of observing the data D, disregarding which tree is correct (a small numeric sketch of the theorem follows)
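To make the theorem concrete, here is a minimal Python sketch; the two hypotheses and their numbers are hypothetical, not taken from the talk.

```python
# Bayes' theorem: P(T|D) = P(D|T) * P(T) / P(D), where
# P(D) = sum over all hypotheses T_i of P(D|T_i) * P(T_i).
def posterior(priors, likelihoods):
    """priors, likelihoods: dicts mapping hypothesis -> probability."""
    p_data = sum(priors[t] * likelihoods[t] for t in priors)  # P(D)
    return {t: priors[t] * likelihoods[t] / p_data for t in priors}

# Hypothetical two-hypothesis example:
print(posterior({"T1": 0.5, "T2": 0.5}, {"T1": 0.8, "T2": 0.2}))
# -> {'T1': 0.8, 'T2': 0.2}
```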
4. Posterior vs. likelihood probabilities: Bayesian inference vs. maximum likelihood
- A box contains 100 dice: some fair, some biased
- Probability of each observed face for the two die types:

  Observation   Fair   Biased
  1             1/6    1/21
  2             1/6    2/21
  3             1/6    3/21
  4             1/6    4/21
  5             1/6    5/21
  6             1/6    6/21
5. Example continued
- A die is drawn at random from the box
- Rolling the die twice gives us a 4 and a 6
- Using the ML approach we get:
  - P(4, 6 | Fair) = 1/6 × 1/6 ≈ 0.028
  - P(4, 6 | Biased) = 4/21 × 6/21 ≈ 0.054
- ML conclusion: the die is biased
6. Example continued further
- Assume we have prior knowledge about the distribution of dice inside the box
- We know that the box contains 90 fair dice and 10 biased dice
7. Example conclusion
- Prior probabilities: P(Fair) = 0.9, P(Biased) = 0.1
- Rolling the die twice gives us a 4 and a 6
- Using the Bayesian approach we get:
  - P(Biased | 4, 6) = P(4, 6 | Biased) · P(Biased) / P(4, 6) ≈ 0.179
- B.I. conclusion: the die is fair
- Conclusion: ML and BI do not necessarily agree
- How closely BI results resemble ML results depends on the strength of the prior assumptions we introduce (the full calculation is reproduced in code below)
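The whole dice example can be checked in a few lines of Python; the numbers reproduce the 0.028, 0.054 and 0.179 quoted on the slides.

```python
# Likelihoods of rolling a 4 and then a 6 under the two die models.
lik_fair = (1 / 6) * (1 / 6)      # ~0.028
lik_biased = (4 / 21) * (6 / 21)  # ~0.054 -> ML picks "biased"

# Bayesian update with the prior from the box: 90 fair dice, 10 biased dice.
prior_fair, prior_biased = 0.9, 0.1
p_data = lik_fair * prior_fair + lik_biased * prior_biased  # P(4, 6)
post_biased = lik_biased * prior_biased / p_data
print(round(lik_fair, 3), round(lik_biased, 3), round(post_biased, 3))
# -> 0.028 0.054 0.179 -> P(Biased | data) < 0.5, so BI picks "fair"
```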
8. Steps in B.I.
- Formulate a model of the problem
- Formulate a prior distribution which captures your beliefs before seeing the data
- Obtain the posterior distribution of the model parameters
9. B.I. in phylogenetic reconstruction
- Phylogenetic reconstruction: finding an evolutionary tree which explains the data (observed species)
- Methods of phylogenetic reconstruction:
  - Using a model of sequence evolution, e.g. maximum likelihood
  - Not using a model of sequence evolution, e.g. maximum parsimony, neighbor joining, etc.
- Bayesian inference belongs to the first category
10. Bayesian inference vs. maximum likelihood
- The basic question in Bayesian inference:
  - What is the probability that this model (T) is correct, given the data (D) that we have observed?
- Maximum likelihood asks a different question:
  - What is the probability of seeing the observed data (D) given that a certain model (T) is true?
- B.I. seeks P(T|D), while ML maximizes P(D|T)
11. Which priors should we assume?
- Knowledge about a parameter can be used to approximate its prior distribution
- Usually we don't have prior knowledge about a parameter's distribution. In this case a flat or vague prior is assumed.
12. Flat and vague priors
- [Figure: example plots of a flat prior and a vague prior]
13. How to find the posterior probability P(T|D)?
- P(T) is the assumed prior
- P(D|T) is the likelihood
- Finding P(D) directly is infeasible: we would need to sum P(D|T) · P(T) over the entire tree space
- Markov chain Monte Carlo (MCMC) gives us an indirect way of finding P(T|D) without having to calculate P(D)
14. MCMC example
- [Figure: a small two-state Markov chain with transition probabilities P = 1/2, run for seven steps; the sampled states give the empirical frequencies P(Palestine) = 3/7, P(Tree) = 4/7]
- A rough simulation of the same idea is sketched below
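A sketch simulating this kind of two-state chain; the 1/2 switching probability comes from the slide, while the chain structure, state labels and run lengths are assumptions.

```python
import random

def visit_frequencies(steps, p_switch=0.5, seed=0):
    """Run a two-state Markov chain and return the fraction of time in each state."""
    rng = random.Random(seed)
    state, visits = 0, [0, 0]
    for _ in range(steps):
        if rng.random() < p_switch:  # jump to the other state with prob. 1/2
            state = 1 - state
        visits[state] += 1
    return [v / steps for v in visits]

print(visit_frequencies(7))        # a short run is noisy, like the 3/7 vs 4/7 above
print(visit_frequencies(100_000))  # a long run approaches the stationary (1/2, 1/2)
```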
15. Symmetric simple random walk
- Definition: a sequence of steps in ℤ, starting at 0 and moving one step left or right, each with probability ½
- Properties:
  - After n steps the average distance from 0 is of magnitude √n (checked empirically below)
  - A random walk in one or two dimensions is recurrent
  - A random walk in three dimensions or more is transient
  - Brownian motion is a limit of random walks
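A quick empirical check of the √n growth; the walk lengths and number of trials below are arbitrary choices.

```python
import random

def mean_distance(n_steps, n_walks=2000, seed=0):
    """Average |position| of a simple symmetric random walk after n_steps."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_walks):
        pos = 0
        for _ in range(n_steps):
            pos += 1 if rng.random() < 0.5 else -1
        total += abs(pos)
    return total / n_walks

for n in (100, 400, 1600):
    print(n, round(mean_distance(n), 1), round(n ** 0.5, 1))
# Quadrupling n roughly doubles the mean distance, i.e. growth of order sqrt(n)
# (the exact constant is sqrt(2/pi) ~ 0.8).
```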
16. Definition of a Markov chain
- A special type of stochastic process
- A sequence of random variables X_0, X_1, X_2, … such that:
  - Each X_i takes values in a state space S = {s_1, s_2, …}
  - If x_0, x_1, …, x_{n+1} are elements of S, then
    P(X_{n+1} = x_{n+1} | X_n = x_n, X_{n-1} = x_{n-1}, …, X_0 = x_0) = P(X_{n+1} = x_{n+1} | X_n = x_n)
- (a concrete transition-matrix example follows)
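To make the definition concrete, here is a sketch with a hypothetical two-state transition matrix, showing how a distribution pushed repeatedly through the chain settles into a stationary distribution.

```python
# Hypothetical transition matrix: M[i][j] = P(X_{n+1} = j | X_n = i).
M = [[0.9, 0.1],
     [0.3, 0.7]]

# Push an initial distribution through the chain until it stabilizes.
dist = [1.0, 0.0]
for _ in range(100):
    dist = [sum(dist[i] * M[i][j] for i in range(2)) for j in range(2)]
print([round(p, 3) for p in dist])  # -> [0.75, 0.25], the stationary distribution
```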
17. Using MCMC to calculate posterior probabilities
- Set S = the set of parameters (e.g. tree topology, mutation probability, branch lengths, etc.)
- Construct an MCMC with a stationary distribution equal to the posterior probability of the parameters
- Run the chain for a long time and sample from it regularly
- Use the samples to find the stationary distribution
18. Constructing our MCMC
- The state space S is defined as the parameter space
- Start with a random tree and random parameters
- In each new generation, randomly propose either:
  - A new tree topology, or
  - A new value for a model parameter
- If the proposed tree has higher posterior probability, π(proposed), than the current tree, π(current), the transition is accepted
- Otherwise the transition is accepted with probability π(proposed) / π(current) (see the sketch below)
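A minimal sketch of this accept/reject rule (the Metropolis rule) on a toy one-dimensional "posterior"; in real phylogenetic MCMC the state is a tree plus model parameters and the proposals are tree rearrangements, both of which are only stand-ins here.

```python
import math, random

def metropolis_step(current, propose, log_posterior, rng):
    """One generation: propose a new state, accept it or keep the current one."""
    candidate = propose(current, rng)
    # Accept if better; otherwise accept with probability pi(proposed)/pi(current).
    log_ratio = log_posterior(candidate) - log_posterior(current)
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return candidate
    return current

# Toy usage: sample a standard normal "posterior" with random-walk proposals.
rng = random.Random(0)
log_post = lambda x: -0.5 * x * x
propose = lambda x, r: x + r.uniform(-1, 1)
x, samples = 0.0, []
for _ in range(10_000):
    x = metropolis_step(x, propose, log_post, rng)
    samples.append(x)
print(sum(samples) / len(samples))  # close to 0, the mean of the target
```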
19. Algorithm visualization
- [Figure: step-by-step visualization of the MCMC algorithm]
20. Convergence issues
- An MCMC might run for a long time until its sampled distribution is close to the stationary distribution
- The initial convergence phase is called the burn-in phase
- We wish to minimize the burn-in time (handling of burn-in is sketched below)
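A small sketch of how burn-in and the regular sampling from slide 17 might be applied to a recorded chain; the cutoff and sampling interval are illustrative, not values from the talk.

```python
def usable_samples(chain, burn_in=1000, sample_every=100):
    """Drop the burn-in prefix, then keep every sample_every-th state."""
    return chain[burn_in::sample_every]

chain = list(range(10_000))        # stand-in for the recorded MCMC states
print(len(usable_samples(chain)))  # -> 90 retained samples
```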
21. Avoiding getting stuck on local maxima
- Assume our landscape looks like this:
- [Figure: a posterior landscape with several local maxima]
22. Avoiding local maxima (cont'd)
- Descending from a maximum can take a long time
- MCMCMC (Metropolis-coupled MCMC) speeds up the chain's mixing rate
- Instead of running a single chain, multiple chains are run simultaneously
- The chains are heated to different degrees
23. Chain heating
- The cold chain has stationary distribution P(T|D)
- Heated chain number i has stationary distribution P(T|D)^(1/i)
24. The MC3 algorithm
- Run multiple heated chains
- At each generation, attempt a swap between two chains
- If the swap is accepted, the hotter and cooler chains swap states
- Sample only from the cold chain (a sketch of the swap step follows)
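A sketch of one swap attempt under the heating scheme from slide 23 (chain i targets P(T|D)^(1/i)); the Metropolis-style swap acceptance ratio used here is a standard choice but an assumption, since the talk does not spell it out.

```python
import math, random

def try_swap(states, betas, log_post, rng):
    """Attempt to swap the states of two randomly chosen tempered chains."""
    i, j = rng.sample(range(len(states)), 2)
    # Chain k targets pi(x)^betas[k]; the swap acceptance ratio in log form:
    log_ratio = (betas[i] - betas[j]) * (log_post(states[j]) - log_post(states[i]))
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        states[i], states[j] = states[j], states[i]
    return states

rng = random.Random(0)
log_post = lambda x: -0.5 * x * x   # toy posterior
betas = [1.0, 1 / 2, 1 / 3]         # slide's scheme: chain i is heated by 1/i
states = [0.0, 2.0, -3.0]
states = try_swap(states, betas, log_post, rng)
print(states)  # only the cold chain (states[0]) is used for inference
```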
25. Drawing conclusions
- To decide the value of a parameter:
  - Draw a histogram showing the number of trees in each interval and calculate the mean, mode, credibility intervals, etc.
- To find the most likely tree topologies:
  - Sort all sampled trees according to their posterior probabilities
  - Pick the most probable trees until the cumulative probability is 0.95
- To check whether a certain group of organisms is monophyletic:
  - Find the number of sampled trees in which it is monophyletic
  - If it is monophyletic in 74% of the trees, it has a 74% probability of being monophyletic (these summaries are sketched in code below)
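These three summaries are easy to express in code; a sketch on hypothetical sampled topologies, with Newick-like strings standing in for trees.

```python
from collections import Counter

# Hypothetical sampled topologies from the cold chain.
samples = ["((A,B),C)"] * 60 + ["(A,(B,C))"] * 30 + ["((A,C),B)"] * 10

# Posterior probability of each topology = its sampling frequency.
post = {t: n / len(samples) for t, n in Counter(samples).items()}

# Credible set: most probable trees until cumulative probability reaches 0.95.
credible, cum = [], 0.0
for tree, p in sorted(post.items(), key=lambda kv: -kv[1]):
    credible.append(tree)
    cum += p
    if cum >= 0.95:
        break

# Monophyly: fraction of sampled trees in which A and B form a clade.
p_mono = sum("(A,B)" in t for t in samples) / len(samples)
print(post, credible, p_mono)  # p_mono = 0.6 -> 60% probability of monophyly
```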
26. Summary
- Bayesian inference is very popular in many fields requiring statistical observations
- The advent of fast computers gave rise to the use of MCMC in B.I., enabling multi-parameter analysis
- Fields of genomics using Bayesian methods:
  - Identification of SNPs
  - Inferring levels of gene expression and regulation
  - Association mapping
  - Etc.
27. THE END
28. A sample histogram