Phylogenetic inference

Provided by: Guille83

1
Phylogenetic inference
Minicourse "Molecular Phylogenetics: an update of
new methodological developments", given during the
48th Annual Meeting of the Sociedade Brasileira de
Genetica, Aguas de Lindoia, Sao Paulo, Brazil
(Sept 2002)

Many methods available, using different
techniques, many software packages
For molecular data, the trend is towards methods
that use explicit models built on realistic
assumptions
New improved methods and tests appear in the
literature constantly

2
Phylogenetic inference

This minicourse will review some of the widely
used (traditional) approaches and introduce two
recent developments:
Bayesian inference
Genetic algorithms (MetaGA)
Review:
Algorithmic vs. optimality criteria approaches
(parsimony, distance methods and ML)
Tree searching (heuristic search)
Models (using Modeltest to choose one)

3
Classification of phylogenetic methods
4
Distance and discrete data
5
Algorithms versus optimality criteria

Phylogenetic inference is an estimation procedure
(best estimate)
Only have information about the contemporary
molecules (and organisms)
How do we choose a tree from the set of all
possible trees?
Two basic approaches:
Algorithmic: just follow a sequence of steps
Optimality criterion: how to compare trees

6
Algorithmic methods

Combine tree inference and the definition of a
preferred tree into a single statement
Include UPGMA and all forms of pair-group cluster
analysis, and neighbor joining
Computationally fast because they go straight to
the final solution
The task of finding an optimal tree cannot be
separated from that of evaluating a specific tree

7
Optimality criteria

Two logical steps:
1. Define an optimality criterion (objective
function for evaluating trees)
2. Find the tree(s) with the best value for the
objective function (may use algorithms)
Evolutionary assumptions made in the first step
are decoupled from the computations involved in
the second step
The price for this logical clarity is that these
methods can be very slow

8
Use of algorithms

Different use in the two approaches
In purely algorithmic methods, the algorithm
defines the tree selection criterion and is
fundamental
In criterion-based methods, algorithms are merely
tools used in evaluating and searching for
optimal trees
Reliability of the tree?

9
Optimality criteria

Parsimony: select the tree that minimizes the
total tree length (number of steps or character
transformations required to explain a given set
of data)
Some methods are based on models of evolutionary
change: assumptions are made explicit.
Is parsimony's model-free nature an advantage
or a disadvantage?
Parsimony does make assumptions (consistency)

10
Optimality methods (cont.)

Maximum likelihood: evaluates the probability
that a proposed model of evolution and the
hypothesized history could give rise to the
observed data (attempts to estimate the actual
amount of change)
Usually gives more consistent estimates with lower
variance than other methods, and is robust to
violations of assumptions

11
Optimality criteria (cont.)

Pairwise distance methods also minimize the
effect of multiple hits when an appropriate
model is used to estimate the true evolutionary
distance between two sequences (less desirable
than full ML)
Additive and ultrametric distances can be fitted
to a tree such that all pairwise distances are
equal to the sum of the branches along the path
connecting them in the tree

Observed distances are obtained directly from the
sequences themselves and patristic distances
from a tree
For additive and ultrametric distances, the
observed and tree distances match exactly

Additive
Ultrametric
For real data this is rarely the case, indicating
that observed distances cannot be completely
accurately represented by a tree.
13
Classification of phylogenetic methods
14
UPGMA: an algorithmic method

Cluster analysis: Unweighted Pair Group Method
using Arithmetic averages (Sneath and Sokal 1973)
Assumes ultrametricity

15
UPGMA example

1. Given a matrix of pairwise distances, find the
clusters (taxa) i and j such that dij is the
minimum value in the table
2. Define the depth of the branching between i and j
(lij) to be dij/2
3. If i and j were the last two clusters, the tree
is complete. Otherwise, create a new cluster
called u
4. Define a distance from u to each other cluster
(k, with k ≠ i or j) to be the average of the
distances dki and dkj (weighted by the number of
taxa in i and j when the clusters differ in size)
5. Go back to step 1 with one less cluster: clusters
i and j have been eliminated, and cluster u has
been added
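
The clustering loop above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular package: the function name `upgma`, the nested-tuple tree and the `frozenset`-keyed distance dictionary are all assumptions made for the example. The distance from the new cluster u to each other cluster is averaged over all taxa in i and j (that is, weighted by cluster size), which is the usual reading of "arithmetic averages" in UPGMA.

```python
from itertools import combinations

def upgma(names, dist):
    """Cluster taxa with UPGMA.

    names: list of taxon labels (strings).
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    Returns (tree, height): the tree as nested tuples and the depth of its root.
    """
    # Each cluster maps to (number of taxa it contains, depth of its node).
    clusters = {name: (1, 0.0) for name in names}
    d = dict(dist)  # work on a copy so the caller's matrix is untouched
    while len(clusters) > 1:
        # Step 1: find the pair of clusters with the minimum distance.
        i, j = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        # Step 2: the branching depth is half of that distance.
        depth = d[frozenset((i, j))] / 2
        ni, nj = clusters[i][0], clusters[j][0]
        u = (i, j)  # Step 3: the new cluster
        # Step 4: size-weighted average distance from u to every other cluster.
        for k in clusters:
            if k not in (i, j):
                d[frozenset((u, k))] = (
                    ni * d[frozenset((i, k))] + nj * d[frozenset((j, k))]
                ) / (ni + nj)
        # Step 5: i and j are eliminated, u is added.
        del clusters[i], clusters[j]
        clusters[u] = (ni + nj, depth)
    tree, (_, height) = next(iter(clusters.items()))
    return tree, height
```

For example, with hypothetical ultrametric distances d(A,B) = 2, d(C,D) = 4 and 6 for every other pair, the sketch first joins A and B, then C and D, and finally the two pairs.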

16
Distance Matrices and phenogram
17
Classification of phylogenetic methods
18
Parsimony methods

The most widely used method, based on a notion
familiar in science (simplicity)
Shared attributes among taxa are inherited from
common ancestors
When character conflicts occur, ad hoc hypotheses
cannot be avoided if you want to explain all the
data, and assumptions of homoplasy must be
invoked

19
Parsimony

From the set of all possible trees, find all trees
\tau such that L(\tau) is minimal, where

L(\tau) = \sum_{k=1}^{B} \sum_{j=1}^{N} w_j \cdot \mathrm{diff}(x_{k'j}, x_{k''j})

B is the number of branches
N is the number of characters
k' and k'' are the two nodes incident to each
branch k
x_{k'j} and x_{k''j} represent either elements of
the input matrix or optimal character assignments
made to internal nodes
diff(y,z) is a function specifying the cost of
transformation from y to z along any branch; for
unrooted trees diff(y,z) = diff(z,y). diff may be
defined by a cost matrix
The coefficient w_j is the weight assigned to each
character (a priori or a posteriori)

20
Other parsimony variants

Dollo parsimony: every derived character must be
uniquely derived (originate only once in the
tree)
Homoplasy: only reversals are allowed (no
parallelism or convergence)
In practice, Dollo parsimony does not require
inclusion of hypothetical ancestors, just
character polarity (unrooted Dollo)
Convenient for restriction-site characters
(easier to lose than to gain a site)

21
Dollo parsimony and RFLP data
A relaxed Dollo criterion may be applied using
generalized parsimony
22
Generalized Parsimony

All parsimony variants can be subsumed into a
generalized method that assigns a cost for each
possible transformation
Costs are represented in an m-by-m cost matrix S,
where each element Sij represents the increase in
tree length due to a transformation from state i
to j
The cost of each transformation (weight) can be
determined a priori (e.g. for RFLPs or for
transition/transversion changes) or a posteriori
(using the same data, e.g. successive
approximations method)
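
One way to see how a cost matrix turns into a tree length is Sankoff's dynamic-programming algorithm, which the slides do not name but which is a standard way to evaluate generalized parsimony for one character on a fixed rooted tree. The sketch below is illustrative only: the function name `sankoff_length`, the nested-tuple tree and the dict-of-dicts cost matrix are assumptions.

```python
def sankoff_length(tree, states, tip_state, cost):
    """Minimum length of one character on a fixed tree under a cost matrix.

    tree:      nested tuples whose leaves are tip labels (strings).
    states:    iterable of the m possible character states.
    tip_state: dict mapping tip label -> observed state.
    cost:      cost[i][j] is the cost S_ij of a change from state i to j.
    """
    inf = float("inf")

    def node_costs(node):
        # For a tip, only the observed state is allowed (cost 0).
        if isinstance(node, str):
            return {s: (0 if s == tip_state[node] else inf) for s in states}
        child_costs = [node_costs(child) for child in node]
        # For an internal node, each state s costs the sum over children of
        # the cheapest (change cost + child subtree cost).
        return {
            s: sum(min(cost[s][t] + c[t] for t in states) for c in child_costs)
            for s in states
        }

    return min(node_costs(tree).values())


# Example cost matrices (hypothetical weights): equal costs, and a scheme
# where transversions cost twice as much as transitions.
STATES = "ACGT"
UNIT = {s: {t: 0 if s == t else 1 for t in STATES} for s in STATES}
PURINES = {"A", "G"}
TI_TV = {
    s: {
        t: 0 if s == t else (1 if (s in PURINES) == (t in PURINES) else 2)
        for t in STATES
    }
    for s in STATES
}
```

With `UNIT` the result is the ordinary (Fitch) number of steps; with `TI_TV` the same tree and data can receive a larger length when the required change is a transversion.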

23
Generalized Parsimony Cost matrices
24
Protein parsimony

A 20x20 matrix specifies the cost for each
possible transformation
The matrix may be based on the genetic code
(PROTPARS matrix) and/or the biochemical
properties of the amino acids themselves (Dayhoff
matrices)

25
Difference in perspective: MP and ML

Parsimony seeks solutions that minimize the
amount of change required to explain the data
(underestimates superimposed changes)
ML attempts to estimate the actual amount of
change (by specifying the evolutionary model that
will account for the data with the highest
likelihood)
Methods that incorporate models of evolutionary
change can make more efficient use of the data

26
Classification of phylogenetic methods
27
Distance methods

Experimentally derived distances are assumed to
be estimates of true distances
We want to fit them to a mathematical model
(additive tree) and find the optimal value for
the adjustable parameters
Branching pattern
Branch lengths
Some methods: Fitch-Margoliash, minimum evolution
(ME)

28
Distance Methods

Alternative approach to ML for minimizing the
impact of the underestimation problem if
corrected distances are used
Corrected distances are assumed to be estimates
of the true evolutionary distance (between a pair
of sequences)
Distance methods are less desirable
approximations to a full ML approach, but much
faster
But some drawbacks of character data-to-distance
transformations are information loss and
difficulty for combination of two or more data
sets
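
As a small illustration of what a "corrected distance" is, the sketch below computes the observed proportion of differing sites (p) and its Jukes-Cantor (JC69) correction, d = -(3/4) ln(1 - 4p/3). JC69 is used only because it is the simplest model; the function names are assumptions made for the example.

```python
import math

def p_distance(seq1, seq2):
    """Observed proportion of sites that differ between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)

def jc69_distance(p):
    """Jukes-Cantor estimate of substitutions per site from a p-distance."""
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        raise ValueError("p-distance too large for the JC69 correction")
    return -0.75 * math.log(1 - 4 * p / 3)
```

The corrected value is always at least as large as p, and the gap widens as sequences diverge: this is the compensation for multiple hits that the observed distance misses.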

29
The problem...

We have uncertain data (distance estimates) that
we want to fit to a particular mathematical model
(an additive tree) and find optimal values for
the adjustable parameters
The branching pattern
The branch lengths

30
An additive distance measure defines a tree...
For any 2 sequences, the value in the distance
matrix should correspond to the sum of the branch
lengths along the path between the 2 sequences on
the tree.
31
When distances are not ultrametric but only
metric they can still be represented by a tree
An additive tree
Additive trees also represent additive distances
exactly...
32

While this tree is additive, it is not
ultrametric
Notice that sequences b and c are the most
similar (3), but ARE NOT the most closely related
Similarity and evolutionary relationship will
only coincide exactly if the distances are
ultrametric
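
Whether a distance matrix is additive or ultrametric can be checked directly with the four-point and three-point conditions. The sketch below is a minimal illustration under assumptions: the function names and the `frozenset`-keyed dictionary are invented for the example, and the distances are hypothetical values chosen so that b and c are the most similar pair (3) without being sister taxa; they are not the numbers from the slide's figure.

```python
from itertools import combinations

def two_largest_equal(values, tol=1e-9):
    """True if the two largest numbers in `values` are equal within `tol`."""
    second, first = sorted(values)[-2:]
    return abs(first - second) <= tol

def is_ultrametric(taxa, d):
    """Three-point condition: in every triple the two largest distances tie."""
    return all(
        two_largest_equal(
            [d[frozenset((x, y))], d[frozenset((x, z))], d[frozenset((y, z))]]
        )
        for x, y, z in combinations(taxa, 3)
    )

def is_additive(taxa, d):
    """Four-point condition: in every quartet the two largest pair-sums tie."""
    return all(
        two_largest_equal(
            [
                d[frozenset((w, x))] + d[frozenset((y, z))],
                d[frozenset((w, y))] + d[frozenset((x, z))],
                d[frozenset((w, z))] + d[frozenset((x, y))],
            ]
        )
        for w, x, y, z in combinations(taxa, 4)
    )

# Hypothetical path-length distances from the unrooted tree ((a,b),(c,d))
# with terminal branches a=5, b=1, c=1, d=5 and an internal branch of 1.
taxa = ["a", "b", "c", "d"]
d = {
    frozenset(pair): value
    for pair, value in {
        ("a", "b"): 6, ("a", "c"): 7, ("a", "d"): 11,
        ("b", "c"): 3, ("b", "d"): 7, ("c", "d"): 6,
    }.items()
}
```

Here `is_additive(taxa, d)` holds while `is_ultrametric(taxa, d)` does not, because the unequal branch lengths make the most similar pair differ from the most closely related pair.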

33
Additive-tree methods

Due to the finite amount of available data,
stochastic variation will cause deviations of the
estimated evolutionary distances from perfect
tree additivity...
even when evolution proceeds according to the
model used for distance correction (JC, K2P,
HKY85, etc)
Many methods exist that derive a tree (with branch
lengths) from a distance matrix so that it comes
as close as possible to being additive
The discrepancy (distortion) between observed
and tree distances can be used as an indicator
(optimality criterion) of how well observed
distances fit a tree-like representation (but
beware of confusing the criterion with the
algorithm)

34
Fitch-Margoliash and related methods

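E: definition of the disagreement between data and
tree

E = \sum_{i<j} w_{ij} \, |d_{ij} - p_{ij}|^{\alpha}

where d_{ij} are the observed distances and p_{ij}
the path-length distances on the tree
\alpha and the weights w_{ij} must be defined
If \alpha = 1 then this is an absolute difference
criterion
If \alpha = 2 then this is a least squares criterion
The weighting schemes (w) most commonly used are 1,
1/d_{ij}, 1/d_{ij}^2, and 1/variance(d_{ij})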

35
Minimum Evolution Method

Use the unweighted least squares criterion (w = 1,
\alpha = 2) to fit branch lengths, but a different
criterion to evaluate and compare trees

Optimality criterion: the sum of the absolute values
of the branch lengths (v_k) that minimize the
squared deviations between observed and
path-length distances

S = \sum_{k=1}^{2T-3} |v_k|

2T-3 is the number of independent branches in an
unrooted tree with T taxa
36
Computation and tree-search problems

Sometimes negative branch lengths are needed to
optimize the fit (E in the equation). Some
solutions:
Outright rejection of the tree with negative
branch lengths (too drastic)
Constrain the optimization process to disallow
negative branch lengths (set them to zero)
Least-square and minimum-abs-deviation methods
assume that each pairwise distance is independent
(not generally true because of common
evolutionary history of the molecules)
Also remember loss of information when
summarizing discrete data as a distance matrix

37
Classification of phylogenetic methods
38
Maximum Likelihood methods

Evaluates a hypothesis about evolutionary history
in terms of the probability that a proposed model
of the evolutionary process and the hypothesized
history would give rise to the observed data
L = Pr(D | H)
A history with a higher probability of giving
rise to the current state of affairs is preferred
Cavalli-Sforza and Edwards (1967) and Felsenstein
(1981, 1993) and others.

39
ML Objective

Data are observed sequences (DNA or prot)
Unknowns are branching order (topology) and
branch lengths of a tree
A concrete model of evolution that transforms one
sequence into another needs to be specified
(fully defined or with uncertain parameters that
need to be estimated from the data)
L = Pr(D | H)
Trees with higher likelihoods are preferred

40
Calculating L for a tree

Aligned sequences for 4 taxa
We want to evaluate the tree shown
What is the prob that this tree generated the
data?

41
Calculating L for a tree

Root the tree at any internal node (models are
time-reversible)
The assumption of independence among sites allows
us to calculate L for each site separately
Then combine the likelihoods into a total value
at the end

42
Calculating L for a tree

To calculate L for some site j, we must consider
all possible scenarios by which the tip sequences
could have evolved
Specifically, the root (6) may have had A, C, T,
or G
For each of these possibilities, the other
internal node (5) also might have possessed any
of the 4 nucleotides

43
Calculating L for a tree

Thus, there are 4 x 4 = 16 possibilities to consider

44
Calculating L for a tree

Calculate the probability of each and sum them to
obtain the total probability for site j
Assume that the changes along each branch are
independent (Markov model)
Thus, the Pr of any single scenario is equal to
the product of the Pr of the changes required by
that scenario

45
Calculating L for a tree

Because the Pr of any single observation is an
extremely small number, we evaluate the log of
the likelihood instead
Probabilities are accumulated as the sum of logs
of the single-site likelihoods
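
The whole calculation for a four-taxon tree fits in a short sketch. Several things here are assumptions, since the slide's figure is not available: tips 1 and 2 are taken to attach to internal node 5, tips 3 and 4 to the root (node 6), the model is Jukes-Cantor (JC69) with equal base frequencies, and the dictionary `t` maps each non-root node to the length of the branch above it.

```python
import math
from itertools import product

BASES = "ACGT"

def jc69_prob(x, y, t):
    """Probability that base x is replaced by base y along a branch of length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e

def site_likelihood(column, t):
    """Likelihood of one alignment column (bases of tips 1-4) on the tree."""
    s1, s2, s3, s4 = column
    total = 0.0
    # Sum over the 4 x 4 = 16 assignments of bases to the root (6) and node 5.
    for x6, x5 in product(BASES, repeat=2):
        total += (
            0.25                          # prior probability of the root base
            * jc69_prob(x6, x5, t[5])     # root -> node 5
            * jc69_prob(x5, s1, t[1])     # node 5 -> tip 1
            * jc69_prob(x5, s2, t[2])     # node 5 -> tip 2
            * jc69_prob(x6, s3, t[3])     # root -> tip 3
            * jc69_prob(x6, s4, t[4])     # root -> tip 4
        )
    return total

def log_likelihood(alignment, t):
    """Sum of the logs of the single-site likelihoods over all columns."""
    return sum(math.log(site_likelihood(col, t)) for col in zip(*alignment))
```

Called as `log_likelihood(["ACGT", "ACGA", "ACTT", "ACTA"], {1: 0.1, 2: 0.2, 3: 0.1, 4: 0.3, 5: 0.05})`, it returns a negative number; comparing that value across topologies and branch lengths is what an ML search does. Real programs avoid the explicit sum over scenarios with Felsenstein's pruning algorithm, which this sketch does not implement.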

46
Difference in perspective: MP and ML

Parsimony seeks solutions that minimize the
amount of change required to explain the data
(underestimates superimposed changes)
ML attempts to estimate the actual amount of
change (by specifying the evolutionary model that
will account for the data with the highest
likelihood)
Methods that incorporate models of evolutionary
change can make more efficient use of the data

47
Difference in perspective: ML vs MP
48
Parsimony and likelihood
ML and MP scores for all 15 unrooted trees for
mtDNA sequence data
49
MP and Inconsistency
50
Long Branch Attraction

The Felsenstein Zone
What are the assumptions for MP?
How can we tell if there's LBA in our data?

51
Searching for optimal trees

Methods with explicit optimality criteria
Parsimony
Maximum likelihood
Additive-tree distance
Separate the problem of
evaluating the tree
finding the optimal tree(s)
Can we evaluate all possible trees for a
particular problem?

52
Searching for optimal trees

For small to moderate data sets, with as many as
8-20 taxa, we can use exact methods
Exact methods guarantee the discovery of all
optimal trees
Exact methods include
Exhaustive search
Branch-and-bound search

53
How many trees?
54
And for more than 10 taxa?
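
The growth in the number of trees can be computed directly. For n taxa there are (2n-5)!! = 1 x 3 x 5 x ... x (2n-5) distinct unrooted, strictly bifurcating trees, and (2n-3)!! rooted ones; the function names below are illustrative.

```python
def count_unrooted_trees(n):
    """Number of unrooted bifurcating trees for n >= 3 labelled taxa."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Adding the k-th taxon to a tree of k-1 taxa can be done on any of
    # its 2(k-1)-3 = 2k-5 branches.
    for k in range(4, n + 1):
        count *= 2 * k - 5
    return count

def count_rooted_trees(n):
    """Number of rooted bifurcating trees for n >= 2 labelled taxa."""
    # Rooting is equivalent to adding one extra taxon to an unrooted tree.
    return count_unrooted_trees(n + 1)

for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

Five taxa already give 15 unrooted trees and ten taxa give more than two million, which is why exhaustive evaluation stops being practical so quickly.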
55
Exhaustive search: enumerate all possible trees
56
Branch-and-bound: does not require an exhaustive
search and yet provides an exact solution.
1. Traverse a search tree in a depth-first
sequence
2. Select an upper bound (L) on the optimal
value of the chosen criterion
3. Move along a path towards the tips and evaluate
trees. If the score of a tree exceeds L, dispense
with the rest of that path
57
Approximate methods

For larger data sets computing time becomes
prohibitive and we only explore some subset of
all possible trees (hoping that the optimal trees
will be found in the subset explored)
Heuristic approaches sacrifice the guarantee of
optimality in favor of reduced computer time
Use "hill climbing" methods. An initial tree starts
the process, then we seek to improve its score
When we can find no way to further improve the
score, we stop. We don't know whether we have
reached a local or a global optimum
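
The logic of hill climbing does not depend on trees at all, so it can be sketched generically. In the illustration below, `neighbors` and `score` are placeholder callables (in a real search they would be a branch-swapping routine and a tree-length or likelihood function), and the score is minimized, as with parsimony tree length. The integer "landscape" is a toy stand-in for tree space.

```python
def hill_climb(start, neighbors, score):
    """Greedy search: move to the best neighbour until none improves the score.

    Lower scores are better. Returns (solution, score) for the optimum
    reached, which may be only a local optimum.
    """
    current, best = start, score(start)
    while True:
        candidate = min(neighbors(current), key=score, default=None)
        if candidate is None or score(candidate) >= best:
            return current, best  # no rearrangement improves the score
        current, best = candidate, score(candidate)


# Toy landscape: positions 0..6 with these scores; the global optimum
# (score 1) is at position 5, and position 2 is a local optimum.
LANDSCAPE = [5, 4, 3, 4, 2, 1, 2]

def toy_neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]

def toy_score(i):
    return LANDSCAPE[i]
```

Starting from position 0 the search stops at position 2, while starting from position 6 it reaches position 5: the outcome depends on the starting point, which is why searches are usually repeated from several initial trees.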

58
Initial trees

May be obtained by stepwise addition, the most
commonly used method
Similar to exhaustive search but evaluate trees
at every step, each time you add a new taxon and
only follow the path derived from the optimal
tree
Which taxa do you choose first? Which do you
connect next?
These are greedy algorithms

59
Stepwise addition
60

Initial trees also may be obtained by star
decomposition, another greedy algorithm

61
Branch swapping

To improve the initial estimate we can perform
sets of predefined rearrangements on the tree
Any of these rearrangements amounts to a stab in
the dark
Globally optimal trees may be several
rearrangements away from the starting tree
If a better tree is found, a new round of
rearrangements is then performed in the new tree
Several branch-swapping algorithms are available

62
Branch swapping by tree bisection and
reconnection (TBR)
1. The tree is bisected along a branch, yielding
two disjoint subtrees
2. The subtrees are reconnected by joining a pair
of branches, one from each subtree
3. All possible bisections and pairwise
reconnections are evaluated
63
Branch swapping by subtree pruning and
regrafting (SPR)
1. A subtree is pruned from the tree (e.g. A,B)
2. The subtree is then regrafted to a different
location on the tree
3. All possible subtree removals and reattachment
points are evaluated
64
Branch swapping by nearest-neighbor interchanges
(NNI)
1. Each interior branch of the tree defines a
local region of four subtrees
2. Interchanging a subtree on one side of the
branch with one from the other constitutes an NNI
3. Two such rearrangements are possible for each
interior branch (all interior branches are
swapped)
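
The local rearrangement around a single interior branch is simple enough to write out. In the sketch below the four subtrees a, b, c, d are arbitrary objects (tip labels or nested tuples), the interior branch separates (a, b) from (c, d), and the function names are assumptions for illustration.

```python
def nni_rearrangements(a, b, c, d):
    """Two NNI alternatives for the interior branch separating (a, b) from (c, d)."""
    return [
        ((a, c), (b, d)),  # subtree b exchanged with subtree c
        ((a, d), (b, c)),  # subtree b exchanged with subtree d
    ]

def nni_neighbor_count(n_tips):
    """Number of NNI neighbours of an unrooted bifurcating tree with n_tips tips.

    Such a tree has n_tips - 3 interior branches, each allowing 2 swaps.
    """
    return 2 * (n_tips - 3)
```

For four tips, `nni_rearrangements("A", "B", "C", "D")` returns the two other possible unrooted topologies, so NNI on a quartet already explores all three trees. For larger trees the NNI neighbourhood grows only linearly with the number of tips, which makes it the least thorough of the three swapping schemes.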
65
Landscapes and the problem of islands of trees
