An introduction to maximum likelihood - PowerPoint PPT Presentation

Slides: 34
Provided by: David51
Transcript and Presenter's Notes

1
An introduction to maximum likelihood
2
What does parsimony assume?
  • Traditional view: just character independence;
    with enough characters you should converge on the
    true phylogeny
  • Felsenstein (1978) used a simple example to show
    that parsimony assumes more

3
Four taxon case
  • Two states: 0 and 1
  • Changes in each direction equally probable
  • Probability of a change of state on a branch: P
    or Q
  • Which data patterns favor the true tree?

A B C D
0 0 1 1
1 1 0 0
4
How can the 1100 pattern arise?
  • Change on the branches to A and B (PQ), and either
      • No change on the other three: (1-Q)^2 (1-P)
      • Change on the other three: P Q^2
  • Prob(1100) = PQ[(1-Q)^2 (1-P) + Q^2 P]

[Figure: ancestral state 0]
5
What is the probability of those outcomes?
  • Prob(1100) = PQ[(1-Q)^2 (1-P) + Q^2 P]
  • Prob(0011) = (1-P)(1-Q)[Q(1-Q)(1-P) + (1-Q)QP]

6
Consider the probability of data favoring the
tree (A,C)(B,D)
  • Prob(1010) = P(1-Q)[Q^2 (1-P) + (1-Q)^2 P]
  • Prob(0101) = (1-P)Q[Q(1-Q)P + Q(1-Q)(1-P)]

7
Probability of consistency
  • Parsimony will be consistent if
  • Prob(1010) + Prob(0101) < Prob(1100) + Prob(0011)
  • If we assume Q is less than 0.5, consistency
    requires that P^2 < Q(1-Q)
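
The pattern probabilities on slides 4 to 7 can be checked numerically. This is a minimal sketch, assuming the reading in which the ancestor of A and B has state 0, the branches to A and C change with probability P, and the branches to B and D and the internal branch change with probability Q. The function names are illustrative and do not come from the presentation.

```python
def pattern_probs(p, q):
    """Probabilities of four site patterns (A, B, C, D), given ancestral state 0."""
    return {
        "1100": p * q * ((1 - q) ** 2 * (1 - p) + q ** 2 * p),
        "0011": (1 - p) * (1 - q) * (q * (1 - q) * (1 - p) + (1 - q) * q * p),
        "1010": p * (1 - q) * (q ** 2 * (1 - p) + (1 - q) ** 2 * p),
        "0101": (1 - p) * q * (q * (1 - q) * p + q * (1 - q) * (1 - p)),
    }


def parsimony_margin(p, q):
    """Support for the true tree (A,B)(C,D) minus support for (A,C)(B,D)."""
    pr = pattern_probs(p, q)
    return (pr["1100"] + pr["0011"]) - (pr["1010"] + pr["0101"])


# Compare the sign of the margin with the condition P^2 < Q(1-Q).
for p, q in [(0.1, 0.1), (0.4, 0.05)]:
    print(p, q, parsimony_margin(p, q) > 0, p ** 2 < q * (1 - q))
```

Expanding the margin algebraically gives (1-2Q)[Q(1-Q) - P^2], which is positive exactly when P^2 < Q(1-Q) for Q below 0.5.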

8
Consistency is not guaranteed
9
Inconsistency
  • When a method is inconsistent, it converges on
    the wrong tree as you add more data
  • Long branch attraction (LBA)
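
Long branch attraction can be illustrated with a small simulation. The sketch below assumes the four-taxon, two-state setting of the earlier slides (long branches to A and C with change probability P, short branches elsewhere with probability Q) and treats parsimony as picking the grouping supported by the most informative sites. All names and parameter values are illustrative.

```python
import random


def simulate_counts(p, q, n_sites, seed=0):
    """Count sites supporting each grouping on the true tree (A,B)(C,D)."""
    rng = random.Random(seed)

    def evolve(state, prob):
        # Flip the binary state with the given probability of change.
        return state ^ 1 if rng.random() < prob else state

    counts = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for _ in range(n_sites):
        ab = rng.randint(0, 1)        # state at the ancestor of A and B
        cd = evolve(ab, q)            # internal branch (short)
        a, b = evolve(ab, p), evolve(ab, q)
        c, d = evolve(cd, p), evolve(cd, q)
        if a == b and c == d and a != c:
            counts["AB|CD"] += 1
        elif a == c and b == d and a != b:
            counts["AC|BD"] += 1
        elif a == d and b == c and a != b:
            counts["AD|BC"] += 1
    return counts


def parsimony_choice(counts):
    """Grouping supported by the largest number of informative sites."""
    return max(counts, key=counts.get)


for n_sites in (100, 1000, 10000):
    print(n_sites, simulate_counts(0.4, 0.05, n_sites))
```

With P = 0.4 and Q = 0.05 the condition P^2 < Q(1-Q) fails, so the excess of sites grouping the two long branches (A with C) is expected to grow with the number of sites.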

10
Possible responses
  • It only applies to four taxa and two states
      • Still applies to 4-state data
      • Gets worse with more taxa
  • Consistency is not so important
  • Real data are not in the Felsenstein zone

11
Maximum likelihood
  • A general approach to estimating parameters in
    statistics
  • Has many desirable statistical properties
  • Felsenstein suggested it could be applied to
    phylogenetic inference and that it should avoid
    LBA

12
The maximum likelihood criterion
  • The best estimate of a parameter is the value
    that would be most likely to generate the
    observed data
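
As a minimal illustration of the criterion, suppose k of n sites show a change and we want the change probability p that makes those data most likely. The sketch below uses a simple grid search; the function names and the data (12 changes in 100 sites) are hypothetical.

```python
import math


def log_likelihood(p, k, n):
    """Log-likelihood of change probability p when k of n sites changed."""
    return k * math.log(p) + (n - k) * math.log(1 - p)


def ml_estimate(k, n, steps=999):
    """Grid value of p in (0, 1) with the highest log-likelihood."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda p: log_likelihood(p, k, n))


print(ml_estimate(12, 100))
```

For this simple model the maximum sits at the observed proportion k/n, which the grid search recovers to within its step size.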

13
Application to phylogeny
  • Assume a model of evolution
  • Find the tree that would be most likely to give
    the observed data given the model
  • Branch lengths are taken into account
  • Uses all data (variant and invariant)

14
An example (from Swofford et al. 1996)
  • What can we say about the placement of another
    taxon with state C?

15
An example (from Swofford et al. 1996)
  • Parsimony: the new taxon could attach in several
    places

16
An example (from Swofford et al. 1996)
  • ML: one place is favored
  • The state at the node marked ? is most likely A

17
An outline of the ML approach
Consider one character, i
(It is useful to arbitrarily root the tree)
18
Sum across all possible histories for i
There are 4^(n-2) arrangements of ancestral states for n taxa
19
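For each tree we calculate the likelihood of
getting the observed states, L(i)
[Figure: rooted tree with tip states A, G, G, G, ancestral states A and A, and branch lengths t1 to t5]
L(i) = P_A × P_{A→A}(t1) × P_{A→G}(t1) × P_{A→G}(t1) ×
P_{A→A}(t1) × P_{A→G}(t1)
(in general, each factor uses the length of its own branch, t1 to t5)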
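
The likelihood of one history, and the sum over all 4^(n-2) histories, can be sketched for the four-taxon case. The code below makes several assumptions that are not in the slides: Jukes-Cantor transition probabilities, an equiprobable root state, tips 1 and 2 attached to the root node u, tips 3 and 4 attached to the other internal node v, and arbitrary branch lengths.

```python
import math
from itertools import product

STATES = "ACGT"


def jc_prob(i, j, t):
    """Jukes-Cantor probability of going from state i to state j in time t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def history_likelihood(u, v, tips, t):
    """Likelihood of one history: ancestral states u (root) and v."""
    t1, t2, t3, t4, t5 = t
    return (0.25                        # equiprobable root state (assumption)
            * jc_prob(u, v, t1)         # internal branch
            * jc_prob(u, tips[0], t2)
            * jc_prob(u, tips[1], t3)
            * jc_prob(v, tips[2], t4)
            * jc_prob(v, tips[3], t5))


def site_likelihood(tips, t):
    """Sum over all 4^(n-2) = 16 histories for four taxa."""
    return sum(history_likelihood(u, v, tips, t)
               for u, v in product(STATES, repeat=2))


branch_lengths = (0.1, 0.2, 0.3, 0.4, 0.5)   # hypothetical values
print(history_likelihood("A", "A", "AGGG", branch_lengths))
print(site_likelihood("AGGG", branch_lengths))
```

The first printed value is the contribution of the single history with both ancestral states equal to A; the second is L(i) summed over every history.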
20
Multiply across all sites (assume independence)
L will be very small
lnL will be a large negative number
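
The reason for working with lnL can be shown directly: multiplying many small per-site likelihoods underflows in floating point, while summing their logarithms does not. The per-site values below are hypothetical.

```python
import math

site_likelihoods = [1e-3] * 2000    # hypothetical per-site likelihoods

raw_product = 1.0
for value in site_likelihoods:
    raw_product *= value            # underflows to 0.0 well before the end

log_likelihood = sum(math.log(value) for value in site_likelihoods)

print(raw_product)                  # 0.0: the product is not representable
print(log_likelihood)               # a large negative number
```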
21
Tree searching
  • Search for the set of branch lengths that
    maximizes L (i.e., gives the lowest -lnL score)
  • Record that score
  • Search for tree topologies with the best score

Time consuming
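
The branch-length step can be sketched for the simplest case: a single branch separating two sequences. The example assumes the Jukes-Cantor model and uses a golden-section search, which is one simple optimizer and not necessarily what any phylogenetics program uses. The data (30 differences in 200 sites) are hypothetical. Searching over topologies is not shown.

```python
import math


def neg_log_likelihood(t, k, n):
    """-lnL of branch length t for two sequences differing at k of n sites."""
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e
    p_diff = 0.25 - 0.25 * e
    # Each site contributes (1/4) * P(i -> j)(t) under equiprobable bases.
    return -(k * math.log(0.25 * p_diff) + (n - k) * math.log(0.25 * p_same))


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0


k, n = 30, 200
t_hat = golden_section_min(lambda t: neg_log_likelihood(t, k, n), 1e-6, 5.0)
print(t_hat, neg_log_likelihood(t_hat, k, n))
```

For this two-sequence case the result can be compared with the closed-form Jukes-Cantor distance, -(3/4) ln(1 - 4k/(3n)).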
22
Critical issues glossed over
  • Where do we get P_n, the probability of state n
    at the arbitrary root node?
      • Equiprobable (25%)
      • Empirical (frequency in the entire matrix)
      • Estimated (optimized by ML on each tree)
  • Where do we get P_{i→j}(t), the probability of
    going from state i to state j in time t?
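
The first two options for the root-state probabilities can be sketched in a few lines. The alignment below is a hypothetical toy example and the function name is illustrative; the third option (estimating the frequencies by ML) would need an optimizer and is not shown.

```python
from collections import Counter

BASES = "ACGT"


def empirical_frequencies(alignment):
    """Base frequencies counted over the entire data matrix."""
    counts = Counter(base for seq in alignment for base in seq if base in BASES)
    total = sum(counts.values())
    return {base: counts[base] / total for base in BASES}


equiprobable = {base: 0.25 for base in BASES}

alignment = ["ACGTAC", "ACGTAA", "ACGGAC"]   # hypothetical toy alignment
print(equiprobable)
print(empirical_frequencies(alignment))
```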

23
Typical Simplifying Assumptions
  • Stationarity
  • Reversibility
  • Site independence
  • Markovian process (no memory)

24
The simplest model of molecular evolution
Jukes-Cantor
Instantaneous rate matrix (Q-matrix)
26
Calculating probabilities of change
  • To convert the Q-matrix into a matrix giving the
    probability of starting in state i and ending in
    state j, t time units later, use the formula

P(t) = e^{Qt}
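
The matrix exponential can be checked against the known Jukes-Cantor result. This sketch assumes a Q-matrix in which every off-diagonal rate is alpha and each diagonal entry is -3 alpha, and it approximates e^{Qt} with a truncated Taylor series, which is adequate only when Qt is small. Production code would normally call a library routine such as `scipy.linalg.expm` instead.

```python
import math


def jc_rate_matrix(alpha):
    """Jukes-Cantor Q-matrix: off-diagonal rates alpha, rows summing to zero."""
    return [[-3.0 * alpha if i == j else alpha for j in range(4)]
            for i in range(4)]


def mat_mul(x, y):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transition_matrix(q, t, terms=40):
    """P(t) = e^{Qt}, approximated by the first `terms` Taylor terms."""
    qt = [[q[i][j] * t for j in range(4)] for i in range(4)]
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    term = [row[:] for row in result]           # (Qt)^0 / 0! = identity
    for n in range(1, terms):
        term = mat_mul(term, qt)
        term = [[value / n for value in row] for row in term]   # (Qt)^n / n!
        result = [[result[i][j] + term[i][j] for j in range(4)]
                  for i in range(4)]
    return result


p = transition_matrix(jc_rate_matrix(1.0), 0.5)
print(p[0][0], 0.25 + 0.75 * math.exp(-4 * 1.0 * 0.5))   # numeric vs analytic
print(p[0][1], 0.25 - 0.25 * math.exp(-4 * 1.0 * 0.5))
```

The analytic Jukes-Cantor probabilities used for comparison are 1/4 + (3/4)e^{-4 alpha t} for no net change and 1/4 - (1/4)e^{-4 alpha t} for each specific change.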
27
The simplest model of molecular evolution
Jukes-Cantor
Substitution probability matrix (P-matrix)
28
More complicated (realistic) models for DNA
  • Allow deviation from equiprobable base
    frequencies
      • HKY85, F81, GTR
  • Allow two substitution types (transitions, ti,
    and transversions, tv)
      • K2P, HKY85
  • Allow for six substitution types
      • GTR
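
One way to see how these models relate is to build each rate matrix from two ingredients: base frequencies and relative rates for each pair of bases. The sketch below uses the common parameterization Q[i][j] = r(i, j) × freq[j] for i ≠ j, without rescaling to one expected substitution per unit time. The frequencies, the ti/tv ratio kappa, and the six GTR rates are hypothetical values chosen only for illustration.

```python
BASES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def rate_matrix(freqs, exchange):
    """Q[i][j] = exchange(i, j) * freqs[j] for i != j; rows sum to zero."""
    q = {i: {j: 0.0 for j in BASES} for i in BASES}
    for i in BASES:
        for j in BASES:
            if i != j:
                q[i][j] = exchange(i, j) * freqs[j]
        q[i][i] = -sum(q[i][j] for j in BASES if j != i)
    return q


equal = {b: 0.25 for b in BASES}
unequal = {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4}    # hypothetical
kappa = 2.0                                            # hypothetical ti/tv ratio
six = {frozenset("AC"): 1.0, frozenset("AG"): 4.0, frozenset("AT"): 0.5,
       frozenset("CG"): 0.8, frozenset("CT"): 5.0, frozenset("GT"): 1.0}


def one_type(i, j):
    return 1.0


def two_types(i, j):
    return kappa if (i, j) in TRANSITIONS else 1.0


def six_types(i, j):
    return six[frozenset((i, j))]


jc = rate_matrix(equal, one_type)        # equal frequencies, one rate
k2p = rate_matrix(equal, two_types)      # equal frequencies, ti and tv
f81 = rate_matrix(unequal, one_type)     # unequal frequencies, one rate
hky85 = rate_matrix(unequal, two_types)  # unequal frequencies, ti and tv
gtr = rate_matrix(unequal, six_types)    # unequal frequencies, six rates
print(hky85["A"])
```

Because the relative rates are symmetric in i and j, every matrix built this way satisfies freq[i] × Q[i][j] = freq[j] × Q[j][i], which is the reversibility assumption listed earlier.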

29
Relationship among models
30
Relationship between MP and ML
  • One argument: MP is inherently nonparametric, so
    no direct comparison is possible
  • Another: MP is an ML model that makes particular
    assumptions

31
The Goldman (1990) model (see Lewis 1998 for more)
  • We force all branch lengths to be equal
  • The Likelihood for a character only considers the
    set of ancestral states that maximizes the
    likelihood
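
The link between these two conditions and parsimony can be illustrated on a four-taxon tree. This is a simplified sketch, not Goldman's full model: it assumes Jukes-Cantor probabilities with one shared branch length, tips 1 and 2 attached to internal node u and tips 3 and 4 attached to internal node v, and it compares the single best history with the minimum number of changes.

```python
import math
from itertools import product

STATES = "ACGT"


def jc_prob(i, j, t):
    """Jukes-Cantor probability of going from state i to state j in time t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def branches(u, v, tips):
    """(parent, child) state pairs for the five branches."""
    return [(u, v), (u, tips[0]), (u, tips[1]), (v, tips[2]), (v, tips[3])]


def best_history(tips, t):
    """Ancestral states (u, v) maximizing the likelihood, all branches length t."""
    def likelihood(history):
        u, v = history
        return 0.25 * math.prod(jc_prob(a, b, t) for a, b in branches(u, v, tips))
    return max(product(STATES, repeat=2), key=likelihood)


def changes(history, tips):
    """Number of branches on which the state changes."""
    u, v = history
    return sum(a != b for a, b in branches(u, v, tips))


def parsimony_length(tips):
    """Minimum number of changes over all ancestral-state assignments."""
    return min(changes(h, tips) for h in product(STATES, repeat=2))


tips = "AGGG"
history = best_history(tips, 0.1)
print(history, changes(history, tips), parsimony_length(tips))
```

With equal branch lengths, each history's likelihood depends only on how many branches carry a change, so the maximizing history is one that requires the fewest changes.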

32
Why use MP
  • The model is clearly less realistic, but
  • We can do more thorough searches and data
    exploration (computational efficiency)
  • Robust results will usually still be supported

33
Why use ML
  • The model and its assumptions are explicit
  • We can statistically compare alternative models
  • We can conduct parametric statistical tests
    (under the assumption that we have used the
    correct model)
  • But, even the most complex model is still
    unrealistically simple