An introduction to maximum likelihood - PowerPoint PPT Presentation

Slides: 34

Provided by: David51

Transcript and Presenter's Notes

1
An introduction to maximum likelihood
2
What does parsimony assume?

Traditional view: just character independence.
With enough characters you should converge to the
true phylogeny
Felsenstein (1978) used a simple example to show
that parsimony assumes more than this

3
Four taxon case

Two states: 0 and 1
Changes in each direction equally probable
Probability of a change of state on a branch: P
or Q
Which data patterns favor the true tree?

A B C D
0 0 1 1
1 1 0 0
4
How can the 1100 pattern arise?

Change on the branches to A and B (PQ) and either
No change on the other three: (1-Q)^2 (1-P)
Change on the other three: P Q^2
Total: PQ[(1-Q)^2 (1-P) + Q^2 P]
5
What is the probability of those outcomes?

Prob(1100) = PQ[(1-Q)^2 (1-P) + Q^2 P]
Prob(0011) = (1-P)(1-Q)[Q(1-Q)(1-P) + (1-Q)QP]
6
Consider the probability of data favoring the
tree (A,C)(B,D)

Prob(1010) = P(1-Q)[Q^2 (1-P) + (1-Q)^2 P]
Prob(0101) = (1-P)Q[Q(1-Q)P + Q(1-Q)(1-P)]
7
Probability of consistency

Parsimony will be consistent if
Prob(1010) + Prob(0101) < Prob(1100) + Prob(0011)
If we assume Q is less than 0.5, consistency
requires that P^2 < Q(1-Q)
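
The condition above can be checked numerically. The following is a minimal Python sketch, assuming the branch layout implied by the pattern formulas on the preceding slides: the branches to A and C change with probability P, the branches to B and D and the internal branch change with probability Q, and the node joining A and B is fixed at state 0. The function names are illustrative, not from the presentation.

```python
def pattern_probs(P, Q):
    """Pattern probabilities on the true tree ((A,B),(C,D)).

    Assumed layout: A and C change with probability P; B, D and the
    internal branch change with probability Q. The node joining A and B
    is fixed at state 0.
    """
    p1100 = P * Q * ((1 - Q) ** 2 * (1 - P) + Q ** 2 * P)
    p0011 = (1 - P) * (1 - Q) * (Q * (1 - Q) * (1 - P) + (1 - Q) * Q * P)
    p1010 = P * (1 - Q) * (Q ** 2 * (1 - P) + (1 - Q) ** 2 * P)
    p0101 = (1 - P) * Q * (Q * (1 - Q) * P + Q * (1 - Q) * (1 - P))
    return p1100, p0011, p1010, p0101


def parsimony_consistent(P, Q):
    """True if patterns favoring the true tree outweigh those favoring (A,C)(B,D)."""
    p1100, p0011, p1010, p0101 = pattern_probs(P, Q)
    return p1100 + p0011 > p1010 + p0101


# Compare the pattern-based test with the shortcut P^2 < Q(1-Q).
for P, Q in [(0.1, 0.1), (0.3, 0.2), (0.35, 0.1), (0.5, 0.2)]:
    print(P, Q, parsimony_consistent(P, Q), P ** 2 < Q * (1 - Q))
```

Under these assumptions the difference between the two sides factors as (1-2Q)[Q(1-Q) - P^2], which is why the sign of Q(1-Q) - P^2 decides consistency whenever Q is below 0.5.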

8
Consistency is not guaranteed
9
Inconsistency

When a method is inconsistent, adding more data
makes it converge on the wrong tree
Long branch attraction (LBA)

10
Possible responses

It only applies to four taxa and two states
Still applies to 4-state data
Gets worse with more taxa
Consistency is not so important
Real data are not in the Felsenstein zone

11
Maximum likelihood

A general approach to estimating parameters in
statistics
Has many desirable statistical properties
Felsenstein suggested it could be applied to
phylogenetic inference and that it should avoid
LBA

12
The maximum likelihood criterion

The best estimate of a parameter is the value
that would be most likely to generate the
observed data
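
As a small illustration of the criterion outside phylogenetics, here is a hedged Python sketch that estimates the heads probability of a coin by scanning a grid of candidate values. The data (7 heads in 10 flips) and the function name are made up for the example.

```python
import math


def log_likelihood(p, heads, flips):
    """Log-likelihood of heads probability p (binomial constant omitted)."""
    return heads * math.log(p) + (flips - heads) * math.log(1 - p)


heads, flips = 7, 10  # made-up data for illustration

# Evaluate the likelihood on a grid and keep the value that scores best.
grid = [k / 1000 for k in range(1, 1000)]
p_hat = max(grid, key=lambda p: log_likelihood(p, heads, flips))
print(p_hat)
```

The maximum falls at the observed proportion heads/flips, which is the familiar ML estimate for this simple model.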

13
Application to phylogeny

Assume a model of evolution
Find the tree that would be most likely to give
the observed data given the model
Branch lengths are taken into account
Uses all data (variant and invariant)

14
An example (from Swofford et al. 1996)

What can we say about the placement of another
taxon with state C?

15
An example (from Swofford et al. 1996)

Parsimony: the new taxon could attach in several
places

16
An example (from Swofford et al. 1996)

ML: one place is favored
The state at the node marked ? is most likely A

17
An outline of the ML approach
Consider one character, i
(It is useful to arbitrarily root the tree)
18
Sum across all possible histories for i
There are 4^(n-2) arrangements for n taxa
19
For each tree we calculate the likelihood of
getting the observed states L(i)
[Figure: example tree with states A and G at its nodes and branch lengths t1 to t5]
L(i) = P_A x P_A-A(t1) x P_A-G(t1) x P_A-G(t1) x
P_A-A(t1) x P_A-G(t1)
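
The sum over histories for one character can be sketched in Python. This is a minimal illustration, not the presentation's own code: it assumes a four-taxon tree ((a,b),(c,d)), the Jukes-Cantor model with branch lengths in expected substitutions per site, and equal root frequencies. The names `jc_prob` and `site_likelihood` are hypothetical.

```python
import math
from itertools import product

STATES = "ACGT"


def jc_prob(i, j, t):
    """Jukes-Cantor probability of going from state i to state j on a
    branch of length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def site_likelihood(tips, t):
    """Likelihood of one site on the tree ((a,b),(c,d)).

    tips: observed states of a, b, c, d.
    t: branch lengths for a, b, c, d and the internal branch.
    The tree is rooted at the node joining a and b; the sum runs over
    the 4**2 = 16 joint states of the two internal nodes.
    """
    a, b, c, d = tips
    ta, tb, tc, td, t_int = t
    total = 0.0
    for x, y in product(STATES, repeat=2):
        total += (0.25  # probability of the root state (equiprobable)
                  * jc_prob(x, a, ta) * jc_prob(x, b, tb)
                  * jc_prob(x, y, t_int)
                  * jc_prob(y, c, tc) * jc_prob(y, d, td))
    return total


branches = (0.1, 0.1, 0.1, 0.1, 0.1)  # made-up branch lengths
print(site_likelihood("AAGG", branches))
print(site_likelihood("AGAG", branches))
```

Each term of the loop is one history: a root probability multiplied by one transition probability per branch. Enumeration like this grows as a power of 4 with the number of internal nodes, so it is only practical for very small trees.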
20
Multiply across all sites (assume independence)
L will be very small
lnL will be a large negative number
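
A short Python sketch shows why the log scale is used in practice. The per-site likelihoods here are made-up values chosen only to illustrate the numerical effect.

```python
import math

site_likelihoods = [1e-3] * 1000  # made-up per-site likelihoods

# Multiplying across sites: the product is too small for a float.
product_L = 1.0
for L_i in site_likelihoods:
    product_L *= L_i

# Summing logs across sites gives the same information without underflow.
lnL = sum(math.log(L_i) for L_i in site_likelihoods)

print(product_L)
print(lnL)
```

The product underflows to zero long before the last site, while the sum of logs stays an ordinary negative number.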
21
Tree searching

Search for the set of branch-lengths that
maximizes L (that is, gives the lowest -lnL score)
Record that score
Search for tree topologies with the best score

Time consuming
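
The branch-length step can be illustrated in the simplest possible case: a single branch separating two sequences. The Python sketch below assumes the Jukes-Cantor model and uses a golden-section search as a stand-in optimizer; it is not the procedure of any particular program, and the counts (10 differences in 100 sites) are made up.

```python
import math


def jc_pair_lnL(d, n_diff, n_sites):
    """lnL of distance d (substitutions per site) between two sequences
    under Jukes-Cantor, given the number of differing sites."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * e)  # one specific matching pair, e.g. A/A
    p_diff = 0.25 * (0.25 - 0.25 * e)  # one specific mismatching pair, e.g. A/G
    return (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)


def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximum of a unimodal function."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return (a + b) / 2


n_diff, n_sites = 10, 100  # made-up counts
d_hat = golden_max(lambda d: jc_pair_lnL(d, n_diff, n_sites), 1e-6, 5.0)
print(d_hat, jc_pair_lnL(d_hat, n_diff, n_sites))
```

For this two-sequence case the search should agree with the closed-form Jukes-Cantor distance, -3/4 ln(1 - 4p/3), where p is the observed proportion of differing sites. On a real tree every branch length has to be optimized jointly, and the whole optimization is repeated for each candidate topology, which is where the time goes.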
22
Critical issues glossed over

Where do we get Pn - the probability of state n
at the arbitrary root node?
Equiprobable (25% each)
Empirical (frequency in the entire matrix)
Estimated (optimized by ML on each tree)
Where do we get Pi-j(t) - the probability of
going from state i to state j in time t?
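
The empirical option for the root-state probabilities can be sketched as follows. This is a minimal Python example with a made-up alignment; the function name is hypothetical, and skipping gaps and ambiguity codes is an assumption of the sketch.

```python
from collections import Counter


def empirical_frequencies(alignment):
    """Base frequencies pooled over the whole data matrix.

    Characters other than A, C, G, T (gaps, ambiguity codes) are skipped.
    """
    counts = Counter(
        base for seq in alignment for base in seq.upper() if base in "ACGT"
    )
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}


alignment = ["ACGTAC", "ACGTTC", "ACG-AC"]  # made-up alignment
freqs = empirical_frequencies(alignment)
print(freqs)
```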

23
Typical Simplifying Assumptions

Stationarity
Reversibility
Site independence
Markovian process (no memory)

24
The simplest model of molecular evolution
Jukes-Cantor
Instantaneous rate matrix (Q-matrix)
25
The simplest model of molecular evolution
Jukes-Cantor
Instantaneous rate matrix (Q-matrix)
26
Calculating probabilities of change

To convert the Q matrix into a matrix giving the
probability of starting in state i and ending in
state j t time units later, use the formula

P(t) = e^(Qt)
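
For the Jukes-Cantor model the matrix exponential can be computed numerically and compared with the well-known closed form. The Python sketch below uses a truncated Taylor series, which is adequate only for small, well-scaled matrices like this one; production code would normally call a library routine. The rate alpha and time t are made-up values.

```python
import math


def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(M, terms=30):
    """exp(M) as a truncated Taylor series: I + M + M^2/2! + ..."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, M)
        term = [[v / k for v in row] for row in term]  # now M^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


alpha, t = 1.0 / 3.0, 0.5  # made-up rate and time
# Jukes-Cantor Q matrix: alpha off the diagonal, -3*alpha on it.
Q = [[-3 * alpha if i == j else alpha for j in range(4)] for i in range(4)]
P = mat_exp([[q * t for q in row] for row in Q])

# Closed-form Jukes-Cantor probabilities for comparison.
e = math.exp(-4 * alpha * t)
p_same, p_diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
print(P[0][0], p_same)
print(P[0][1], p_diff)
```

Each row of P(t) sums to 1, and as t grows every entry approaches 0.25, the equilibrium frequency under this model.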
27
The simplest model of molecular evolution
Jukes-Cantor
Substitution probability matrix (P-matrix)
28
More complicated (realistic) models for DNA

Allow deviation from equiprobable base
frequencies: F81, HKY85, GTR
Allow two substitution types (ti and tv):
K2P, HKY85
Allow for six substitution types:
GTR
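
How these models nest can be seen by building one rate matrix and restricting its parameters. The Python sketch below constructs an unscaled HKY85-style Q matrix from base frequencies and a transition/transversion rate ratio kappa. The function names and the parameter values are illustrative assumptions, and no rescaling to one expected substitution per unit time is applied.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def is_transition(i, j):
    """True for purine-purine or pyrimidine-pyrimidine changes."""
    return (i in PURINES) == (j in PURINES)


def hky85_rate_matrix(freqs, kappa):
    """Unscaled HKY85 rate matrix as a dict of dicts.

    The rate from i to j is freqs[j], multiplied by kappa for transitions.
    Diagonal entries make each row sum to zero.
    """
    Q = {}
    for i in BASES:
        row = {}
        for j in BASES:
            if i != j:
                row[j] = freqs[j] * (kappa if is_transition(i, j) else 1.0)
        row[i] = -sum(row.values())
        Q[i] = row
    return Q


equal = {b: 0.25 for b in BASES}
skewed = {"A": 0.4, "C": 0.1, "G": 0.2, "T": 0.3}  # made-up frequencies

jc_like = hky85_rate_matrix(equal, kappa=1.0)    # equal frequencies, one rate
k2p_like = hky85_rate_matrix(equal, kappa=2.0)   # equal frequencies, ti and tv
f81_like = hky85_rate_matrix(skewed, kappa=1.0)  # unequal frequencies, one rate
hky = hky85_rate_matrix(skewed, kappa=2.0)       # unequal frequencies, ti and tv
print(hky["A"])
```

GTR generalizes this further by giving each of the six unordered pairs of bases its own rate parameter in place of the single kappa.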

29
Relationship among models
30
Relationship between MP and ML

One argument: MP is inherently nonparametric, so
no direct comparison is possible
Another: MP is an ML model that makes particular
assumptions

31
The Goldman (1990) model (see Lewis 1998 for more)

We force all branch lengths to be equal
The likelihood for a character only considers the
set of ancestral states that maximizes the
likelihood

32
Why use MP

The model is clearly less realistic, but
We can do more thorough searches and data
exploration (computational efficiency)
Robust results will usually still be supported

33
Why use ML

The model and its assumptions are explicit
We can statistically compare alternative models
We can conduct parametric statistical tests
(under the assumption that we have used the
correct model)
But, even the most complex model is still
unrealistically simple
