Molecular phylogenetics 1

1
Molecular phylogenetics 1
  • Level 3 Molecular Evolution and Bioinformatics
  • Jim Provan

Page and Holmes, Sections 6.1-6.2
2
Distances vs. discrete characters
  • This division is based on how the data are
    treated
  • Distance methods first convert aligned sequences
    into a pairwise distance matrix, then input that
    matrix into a tree building method
  • Discrete methods consider each nucleotide site
    (or function of each site) separately
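
As a rough illustration of the first step of a distance method, the sketch below converts an alignment into a pairwise distance matrix. It assumes the simplest possible measure, the proportion of differing sites (p-distance), with no correction for multiple hits and no gap handling. The toy alignment and the function names are hypothetical and do not come from the slides.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def distance_matrix(alignment):
    """Return {(name1, name2): distance} for every pair of sequences."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }


# Hypothetical toy alignment (assumption, not the slide's data).
alignment = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACCTACGA",
    "D": "TCCTACGA",
}

matrix = distance_matrix(alignment)
for pair, d in matrix.items():
    print(pair, d)
```

The resulting matrix is what a distance-based tree-building method would take as input. A discrete method would instead work on the columns of `alignment` directly.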

3
Distances vs. discrete characters
4
Distances vs. discrete characters
5
Distances vs. discrete characters
  • In this example, the trees obtained by parsimony
    (a discrete method) and minimum evolution (a
    distance method) are identical in topology and
    branch lengths
  • Parsimony analysis identifies seven substitutions
    and places them on the five branches of the tree
  • Distance tree apportions observed distances
    between sequences over branches of the tree
  • Under parsimony, each of the seven variable sites
    requires one change, which gives a total of seven
    changes
  • Summing the branch lengths of the distance tree
    gives the same value: 2 + 1 + 2 + 1 + 1 = 7
  • Parsimony tree gives additional information:
    which site contributes to which branch, plus
    ancestral states
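
One way to see how parsimony counts changes site by site is Fitch's algorithm, sketched below. This is an assumption about how the counting could be done, not necessarily the procedure used for the slide's example. The tree, the alignment, and the function names are hypothetical. Trees are written as nested pairs of leaf names.

```python
def fitch_site(tree, states):
    """Return (state_set, changes) for one site on a rooted binary tree.

    `tree` is either a leaf name (str) or a pair (left, right);
    `states` maps each leaf name to its nucleotide at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch_site(tree[0], states)
    right_set, right_changes = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can share a state: no extra change is needed here.
        return common, left_changes + right_changes
    # The children disagree: one change is needed on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1


def changes_per_site(tree, alignment):
    """List the minimum number of changes required at each site."""
    n_sites = len(next(iter(alignment.values())))
    result = []
    for i in range(n_sites):
        states = {name: seq[i] for name, seq in alignment.items()}
        result.append(fitch_site(tree, states)[1])
    return result


# Hypothetical toy data (assumption, not the slide's data).
alignment = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACCTACGA",
    "D": "TCCTACGA",
}
tree = (("A", "B"), ("C", "D"))

per_site = changes_per_site(tree, alignment)
print(per_site, "total:", sum(per_site))
```

In this toy case three sites vary and each needs one change, so the tree length is three. This is the same kind of per-site bookkeeping that gives seven changes in the slide's example.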

6
Clustering methods vs. search methods
  • Cluster methods follow a set of steps (an
    algorithm) and arrive at a tree
  • Advantages
  • Easy to implement, resulting in very fast
    computer programs
  • Always produce a single tree
  • Disadvantages
  • Results obtained from simple clustering
    algorithms often depend on the order in which
    sequences are added to the growing tree
  • Do not allow evaluation of competing hypotheses:
    two different trees could explain the data
    equally well, but there is no way of measuring
    the fit between tree and data
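
A minimal sketch of one clustering method is given below, assuming UPGMA with average linkage. UPGMA is named later in these slides, but the slides do not say that it is the method illustrated here. The distance values and function name are hypothetical.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster `names` by UPGMA and return the tree as nested pairs.

    `dist` maps frozenset({a, b}) to the distance between a and b.
    """
    dist = dict(dist)                     # work on a copy
    sizes = {name: 1 for name in names}   # cluster -> number of leaves
    while len(sizes) > 1:
        # Step 1: find the closest pair of current clusters
        # (ties go to the pair encountered first).
        a, b = min(combinations(sizes, 2), key=lambda p: dist[frozenset(p)])
        merged = (a, b)
        # Step 2: the distance from the merged cluster to every other
        # cluster is the size-weighted average of the two old distances.
        for other in sizes:
            if other in (a, b):
                continue
            total = (sizes[a] * dist[frozenset((a, other))]
                     + sizes[b] * dist[frozenset((b, other))])
            dist[frozenset((merged, other))] = total / (sizes[a] + sizes[b])
        # Step 3: replace the two clusters with the merged one.
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Hypothetical distances (assumption, not the slide's data).
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2, frozenset(("A", "C")): 4,
    frozenset(("A", "D")): 6, frozenset(("B", "C")): 4,
    frozenset(("B", "D")): 6, frozenset(("C", "D")): 6,
}

tree = upgma(names, dist)
print(tree)
```

The function always returns exactly one tree and gives no score for it, which matches the advantage and the disadvantage listed above. Ties in step 1 are broken by input order, one simple way in which the result can depend on the order of the sequences.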

7
A clustering method
(Figure: Round 1 and Round 2 of the clustering method)
8
Search methods
  • Tree-building methods in this class use
    optimality criteria to choose among the set of
    all possible trees
  • The criterion assigns each tree a score or rank
    that is a function of the relationship between
    the tree and the data
  • Require an explicit function relating tree and
    data (e.g. a model of how sequences evolve)
  • Allow comparison of how well competing hypotheses
    of evolutionary relationships fit the data
  • Major disadvantage is that optimality methods are
    computationally very expensive
  • For a given data set and tree, what is the
    optimality value?
  • Which of all possible trees has the best
    optimality value?

9
An optimality method
10
Non-deterministic polynomial-completeness problems
  • Non-deterministic polynomial-complete
    (NP-complete) problems are a set of problems for
    which no efficient algorithm is known to exist
  • Problem of finding the optimal evolutionary tree
    for a variety of criteria (e.g. minimum
    evolution, maximum parsimony) is NP-complete
  • For even a moderate number of sequences (e.g.
    20) it is generally not feasible to guarantee
    that the optimal tree has been found
  • In such cases, we must rely on heuristics to find
    something approaching the best tree, but this may
    be far from optimal
  • Human mitochondrial DNA - different researchers
    obtained quite different trees using different
    heuristic searches
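
To give a sense of scale, the sketch below counts the candidate trees. It relies on a standard result that is not stated on the slide: n sequences have (2n - 5)!! = 1 x 3 x 5 x ... x (2n - 5) distinct unrooted, fully bifurcating trees. The function name is hypothetical.

```python
def count_unrooted_trees(n_taxa):
    """Number of unrooted, fully bifurcating trees for n_taxa sequences."""
    if n_taxa < 3:
        raise ValueError("need at least three sequences")
    count = 1
    # Multiply the odd numbers 3, 5, ..., (2n - 5).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

For 20 sequences the count is about 2.2 x 10^20, which is why scoring every tree is out of reach and heuristics are used instead.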

11
An heuristic method
12
Subtree methods
  • The effectiveness of an heuristic search depends
    in part on the number of trees examined, which
    can be computationally demanding
  • An alternative approach is to divide the set of
    sequences into smaller sets and find optimal
    trees for these subsets
  • Smallest unrooted tree is a quartet
  • Each quartet has three possible unrooted trees
  • Quartet puzzling follows these two steps
  • For each quartet, identify the optimal tree
  • Take all four-sequence trees from step 1 and
    assemble them into a tree
  • Due to homoplasy, the quartet trees will not all
    agree, so the best tree will usually be the one
    which contains the most quartets (but finding it
    is an NP-complete problem as well)
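
Step 1 of the procedure above can be sketched as follows. This version assumes parsimony (via Fitch's algorithm) as the optimality criterion purely to keep the example short; the slides do not specify the criterion, and step 2 (assembling the quartet trees) is not attempted. The alignment and function names are hypothetical.

```python
from itertools import combinations


def site_changes(tree, states):
    """Fitch's algorithm for one site: return (state_set, changes)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_n = site_changes(tree[0], states)
    right_set, right_n = site_changes(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_n + right_n
    return left_set | right_set, left_n + right_n + 1


def tree_length(tree, alignment):
    """Total number of changes the tree requires over all sites."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        site_changes(tree, {n: s[i] for n, s in alignment.items()})[1]
        for i in range(n_sites)
    )


def quartet_topologies(a, b, c, d):
    """The three possible unrooted trees for four sequences."""
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


def best_quartet_trees(alignment):
    """Map each quartet of names to its shortest topology."""
    best = {}
    for quartet in combinations(sorted(alignment), 4):
        scored = [(tree_length(t, alignment), t)
                  for t in quartet_topologies(*quartet)]
        best[quartet] = min(scored)[1]   # ties go to the first in sort order
    return best


# Hypothetical toy alignment (assumption, not the slide's data).
alignment = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACCTACGA",
    "D": "TCCTACGA",
    "E": "TCCTTCGA",
}

best = best_quartet_trees(alignment)
for quartet, tree in best.items():
    print(quartet, "->", tree)
```

Five sequences give five quartets, each scored against only three trees, so step 1 is cheap. The expensive part is step 2, combining quartet trees that may conflict with one another.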

13
Comparing tree-building methods
UPGMA
Neighbour joining
Maximum parsimony
Minimum evolution
Maximum likelihood
14
Comparing tree-building methods
  • Efficiency
  • Effectively the time in which a computer program
    can find a tree
  • Since virtually all optimality methods are
    NP-complete, efficient tree searching algorithms
    that guarantee the best tree are unlikely
  • Some optimality criteria can be evaluated more
    quickly than others: heuristic searches using
    parsimony can explore a much larger number of
    trees than a search using likelihood
  • Power
  • Measure of how much data are needed before we can
    be reasonably sure of arriving at the correct
    result
  • A method may be theoretically appealing, but if
    it requires huge numbers of sites it is not
    practical

15
Comparing tree-building methods
  • Consistency
  • Will the method converge on the true tree as data
    are added?
  • Inconsistent methods will fail even if data are
    continually added
  • Robustness
  • All tree-building methods make (implicit or
    explicit) assumptions about evolutionary
    processes
  • A method that is sensitive to violations of the
    underlying model (e.g. the assumption of a
    molecular clock) will return poor estimates of
    phylogeny
  • Falsifiability
  • The ability to tell whether these assumptions
    have been violated, i.e. whether we should be
    using the method at all