Dan Graur - PowerPoint PPT Presentation

Slides: 62
Provided by: dan
Learn more at: http://nsmn1.uh.edu
1
Methods of Tree Reconstruction
  • Dan Graur

6
Molecular phylogenetic approaches
1. distance-matrix (based on distance measures)
2. character-state (based on character states)
3. maximum likelihood (based on both character
   states and distances)
7
DISTANCE-MATRIX METHODS
In distance-matrix methods, evolutionary distances
(usually the number of nucleotide substitutions or
amino-acid replacements between two taxonomic
units) are computed for all pairs of taxa, and a
phylogenetic tree is constructed by using an
algorithm based on some functional relationships
among the distance values.
8
Multiple Alignment
9
Compute pairwise distances by correcting for
multiple hits at single sites
Number of differences → Number of changes
(e.g., number of nucleotide substitutions, number
of amino acid replacements)
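One common way to turn an observed number of differences into an estimated number of changes is the Jukes and Cantor one-parameter formula, d = -(3/4) ln(1 - 4p/3), where p is the proportion of sites that differ. The slide does not say which correction it has in mind, so the choice of model and the function name `jc69_distance` are assumptions made for illustration.

```python
import math

def jc69_distance(seq1, seq2):
    """Estimated substitutions per site between two aligned sequences.

    Assumes the Jukes-Cantor one-parameter model (an assumption; the
    slide does not name a specific correction).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(x != y for x, y in zip(seq1, seq2))
    p = differences / len(seq1)          # observed proportion of differences
    if p >= 0.75:
        raise ValueError("sequences are saturated; distance is undefined")
    return -0.75 * math.log(1 - 4 * p / 3)

# One difference in ten sites: the corrected distance is slightly above 0.1.
print(jc69_distance("AAAAAAAAAA", "AAAAAAAAAC"))
```

The corrected value exceeds the raw proportion because some sites may have changed more than once.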
10
Distance Matrix
Units: numbers of nucleotide substitutions per
1,000 nucleotide sites
11
Distance Methods
  • UPGMA
  • Neighbors-relation
  • Neighbor joining
12
UPGMA: Unweighted pair-group method with
arithmetic means
13
UPGMA employs a sequential clustering algorithm,
in which local topological relationships are
identified in order of decreasing similarity, and
the tree is built in a stepwise manner.
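The sequential clustering described above can be sketched in a few lines. This is a minimal illustration that returns only the topology (no branch lengths). The function name `upgma`, the dictionary layout, and the example distances are all hypothetical.

```python
def upgma(dist):
    """Cluster OTUs by UPGMA and return the topology as nested tuples.

    dist maps frozenset({x, y}) -> distance for every pair of OTU names.
    """
    d = dict(dist)
    size = {otu: 1 for pair in dist for otu in pair}  # OTUs per cluster
    while len(size) > 1:
        pair = min(d, key=d.get)         # most similar pair of clusters
        a, b = sorted(pair, key=str)     # fixed order for readable output
        na, nb = size.pop(a), size.pop(b)
        del d[pair]
        merged = (a, b)                  # the new composite OTU
        for c in size:
            dac = d.pop(frozenset((a, c)))
            dbc = d.pop(frozenset((b, c)))
            # arithmetic mean over all simple-OTU pairs between the clusters
            d[frozenset((merged, c))] = (na * dac + nb * dbc) / (na + nb)
        size[merged] = na + nb
    (tree,) = size
    return tree

# Hypothetical ultrametric distances among four OTUs.
example = {
    frozenset("AB"): 2, frozenset("AC"): 4, frozenset("BC"): 4,
    frozenset("AD"): 6, frozenset("BD"): 6, frozenset("CD"): 6,
}
print(upgma(example))  # ((('A', 'B'), 'C'), 'D')
```

Weighting each average by cluster size is what makes every simple OTU count equally, which is the "unweighted" part of the name.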
14
simple OTUs
15
composite OTU
18
UPGMA yields the correct answer only if the
distances are ultrametric!
Q: What happens if the distances are only additive?
Q: What happens if the distances are not even
additive?
19
Neighborliness methods
  • The neighbors-relation method (Sattath & Tversky)
  • The neighbor-joining method (Saitou & Nei)
20
In an unrooted bifurcating tree, two OTUs are
said to be neighbors if they are connected
through a single internal node.
Neighbors ≠ Sister Taxa
21
If we combine OTUs A and B into one composite
OTU, then the composite OTU (AB) and the simple
OTU C become neighbors.
22
Four-Point Condition
(Figure: four OTUs A, B, C, and D, with a "<" relation.)
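The usual statement of the four-point condition is that, if A and B are neighbors (and so are C and D), then d(A,B) + d(C,D) is smaller than both d(A,C) + d(B,D) and d(A,D) + d(B,C). The slide's formula did not survive in the transcript, so this statement is assumed here. The sketch below finds the pairing with the smallest sum. The function name `best_pairing` and the example distances are hypothetical.

```python
def best_pairing(dist, a, b, c, d):
    """Return the neighbor pairing with the smallest distance sum.

    dist maps frozenset({x, y}) -> distance for each pair of the four OTUs.
    """
    def D(x, y):
        return dist[frozenset((x, y))]

    sums = {
        ((a, b), (c, d)): D(a, b) + D(c, d),
        ((a, c), (b, d)): D(a, c) + D(b, d),
        ((a, d), (b, c)): D(a, d) + D(b, c),
    }
    return min(sums, key=sums.get), sums

# Additive distances from a hypothetical tree with terminal branches
# A=1, B=2, C=3, D=4 and an internal branch of length 5.
dist = {
    frozenset("AB"): 3, frozenset("CD"): 7,
    frozenset("AC"): 9, frozenset("BD"): 11,
    frozenset("AD"): 10, frozenset("BC"): 10,
}
pairing, sums = best_pairing(dist, "A", "B", "C", "D")
print(pairing)  # (('A', 'B'), ('C', 'D'))
```

With additive distances the two larger sums are equal (20 and 20 here), and the smallest sum (10) identifies the neighbors.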
23
The Neighbor Joining Method
24
In distance-matrix methods, it is assumed that
similarity reflects kinship.
26
From Similarity to Relationship
  • Similarities among OTUs can be due to:
    • Ancestry
      • Shared ancestral characters (symplesiomorphies)
      • Shared derived characters (synapomorphies)
    • Homoplasy
      • Convergent events
      • Parallel events
      • Reversals

27
Parsimony Methods
Willi Hennig 1913-1976
28
"Entities must not be multiplied beyond necessity."
William of Occam (ca. 1285-1349), English
philosopher and Franciscan monk
William of Occam was solemnly excommunicated by
Pope John XXII.
29
MAXIMUM PARSIMONY METHODS
Maximum parsimony involves the identification of a
topology that requires the smallest number of
evolutionary changes to explain the observed
differences among the OTUs under study.
In maximum parsimony methods, we use discrete
character states, and the shortest pathway leading
to these character states is chosen as the "best"
or maximum parsimony tree.
Often two or more trees with the same minimum
number of changes are found, so that no unique
tree can be inferred. Such trees are said to be
equally parsimonious.
32
uninformative
33
informative
38
In the case of four OTUs, an informative site can
only favor one of the three possible alternative
trees. Thus, the tree supported by the largest
number of informative sites is the most
parsimonious tree.
39
Inferring the maximum parsimony tree
1. Identify all the informative sites.
2. For each possible tree, calculate the minimum
   number of substitutions at each informative site.
3. Sum up the number of changes over all the
   informative sites for each possible tree.
4. Choose the tree associated with the smallest
   number of changes as the maximum parsimony tree.
40
  • Maximum parsimony (Practice)
  • Data
  • TGCA
  • TACC
  • AGGT
  • AAGT
  • Step 1. Identify all the informative sites.
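Step 1 can be sketched as follows, assuming the usual definition of an informative site: at least two different character states, each present in at least two sequences. The transcript does not spell this definition out, so it is an assumption, as is the function name `informative_sites`.

```python
from collections import Counter

def informative_sites(seqs):
    """Return 0-based indices of parsimony-informative alignment columns."""
    sites = []
    for index, column in enumerate(zip(*seqs)):
        counts = Counter(column)
        # informative: two or more states that each occur at least twice
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            sites.append(index)
    return sites

data = ["TGCA", "TACC", "AGGT", "AAGT"]
print(informative_sites(data))  # [0, 1, 2]
```

The fourth column (A, C, T, T) has only one state that occurs twice, so it is uninformative. That is consistent with the three-column data shown on the following slides.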


41
  • Maximum parsimony (Practice)
  • Data
  • TGC
  • TAC
  • AGG
  • AAG
  • Step 2. For each possible tree, calculate the
    minimum number of substitutions at each
    informative site.

42
  • Maximum parsimony (Practice)
  • Data
  • TGC
  • TAC
  • AGG
  • AAG
  • Step 3. Sum up the number of changes over all the
    informative sites for each possible tree.

Tree lengths of the three possible trees: 4, 5, 6
43
  • Maximum parsimony (Practice)
  • Data
  • TGC
  • TAC
  • AGG
  • AAG
  • Step 4. Choose the tree associated with the
    smallest number of changes as the maximum
    parsimony tree.

Tree lengths of the three possible trees: 4, 5, 6
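Steps 2 to 4 can be checked for the practice data with a small sketch. For four OTUs there are only two internal nodes, so the minimum number of changes at a site can be found by trying every pair of internal states. The sequence numbering 1 to 4 (top to bottom in the data) and the function names are illustrative assumptions.

```python
from itertools import product

def quartet_site_cost(a, b, c, d, states="ACGT"):
    """Minimum changes at one site for the unrooted tree ((a,b),(c,d))."""
    return min(
        (x != a) + (x != b) + (x != y) + (y != c) + (y != d)
        for x, y in product(states, repeat=2)   # states at the two internal nodes
    )

def quartet_lengths(seqs):
    """Tree length of each of the three possible trees for four sequences."""
    s1, s2, s3, s4 = seqs
    topologies = {
        "((1,2),(3,4))": (s1, s2, s3, s4),
        "((1,3),(2,4))": (s1, s3, s2, s4),
        "((1,4),(2,3))": (s1, s4, s2, s3),
    }
    return {
        name: sum(quartet_site_cost(*column) for column in zip(*order))
        for name, order in topologies.items()
    }

lengths = quartet_lengths(["TGC", "TAC", "AGG", "AAG"])
print(lengths)  # {'((1,2),(3,4))': 4, '((1,3),(2,4))': 5, '((1,4),(2,3))': 6}
print(min(lengths, key=lengths.get))  # ((1,2),(3,4))
```

The tree grouping sequences 1 and 2 against 3 and 4 requires the fewest changes and is therefore the maximum parsimony tree for these data.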
44
Problem (exaggerated)
45
Fitch's (1971) method for inferring nucleotides
at internal nodes
The set at an internal node is the intersection
(∩) of the two sets at its immediate descendant
nodes if the intersection is not empty.
The set at an internal node is the union (∪) of
the two sets at its immediate descendant nodes if
the intersection is empty.
When a union is required to form a nodal set, a
nucleotide substitution at this position must be
assumed to have occurred.
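The two set rules translate almost directly into a recursive function. This is a minimal sketch for a single site on a rooted binary tree written as nested tuples. The representation, the leaf names, and the function name `fitch` are assumptions made for illustration.

```python
def fitch(tree, leaf_state):
    """Return (nodal set, substitutions) for one site on a rooted binary tree.

    tree is a leaf name or a pair (left, right);
    leaf_state maps each leaf name to its nucleotide.
    """
    if not isinstance(tree, tuple):
        return {leaf_state[tree]}, 0
    left_set, left_cost = fitch(tree[0], leaf_state)
    right_set, right_cost = fitch(tree[1], leaf_state)
    common = left_set & right_set
    if common:
        # non-empty intersection: no substitution needs to be assumed
        return common, left_cost + right_cost
    # empty intersection: take the union and count one substitution
    return left_set | right_set, left_cost + right_cost + 1

tree = (("a", "b"), ("c", "d"))
nodal_set, changes = fitch(tree, {"a": "T", "b": "T", "c": "A", "d": "A"})
print(nodal_set, changes)
```

In this example the two descendant sets are {T} and {A}, so the root set is their union and one substitution is counted.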
46
Fitch's (1971) method for inferring nucleotides
at internal nodes
47
Testing properties of ancestral proteins
The ability to infer in silico the sequence of
ancestral proteins, in conjunction with some
astounding developments in synthetic biology,
allows us to resurrect putative ancestral
proteins in the laboratory and test their
properties. These properties, in turn, can be
used to test hypotheses concerning the physical
environment which the ancestral organism
inhabited (its paleoenvironment).
48
Testing properties of ancestral proteins
Gaucher et al. (2003) used EF-Tu
(Elongation Factor thermo-unstable) gene sequences
from completely sequenced mesophilic eubacteria to
reconstruct candidate ancestral sequences at
nodes throughout the bacterial tree. These
inferred ancestral proteins were then
synthesized in the laboratory, and their
activities and thermal stabilities were measured
and compared to those of extant organisms.
Thermostability curves
The temperature profile of the inferred ancestral
protein was 55°C, suggesting that the ancestor
of extant mesophiles was a thermophile.
49
Ancestral reconstruction is not possible with
morphological data.
50
The impossibility of exhaustively searching for
the maximum-parsimony tree when the number of
OTUs is large
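To see why exhaustive search becomes impossible, it helps to count the trees. The number of unrooted bifurcating trees for n OTUs (n ≥ 3) is 1 × 3 × 5 × ... × (2n - 5). The function name `n_unrooted_trees` is a hypothetical helper for illustration.

```python
def n_unrooted_trees(n):
    """Number of unrooted bifurcating trees for n OTUs (n >= 3)."""
    if n < 3:
        raise ValueError("need at least three OTUs")
    count = 1
    for k in range(3, 2 * n - 4, 2):   # multiplies 3 * 5 * ... * (2n - 5)
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(n, n_unrooted_trees(n))
```

Four OTUs give 3 trees and ten OTUs already give 2,027,025. The count keeps growing faster than exponentially with n.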
51
  • Exhaustive: Examine all trees, get the best tree
    (guaranteed).
  • Branch-and-Bound: Examine some trees, get the
    best tree (guaranteed).
  • Heuristic: Examine some trees, get a tree that
    may or may not be the best tree.
52
Exhaustive
53
Branch-and-Bound
Rationale: The length of a tree with n + 1 OTUs is
either equal to or larger than the length of a
tree with n OTUs.
Reminder: The total number of substitutions in a
tree = tree length
54
Branch-and-Bound
  • Obtain a tree by a fast method (e.g., the
    neighbor-joining method).
  • Compute the number of substitutions (L) for
    this tree.
  • Turn L into an upper bound value.
Rationale: The maximum parsimony tree must be
either equal in length to L or shorter.
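A minimal sketch of the idea, under several assumptions: trees are nested tuples of sequence indices, OTUs are added one at a time to every branch of a growing tree, and tree length is computed with Fitch-style counting. The upper bound here comes from one arbitrary tree rather than from neighbor joining, purely to keep the example short. All function names are hypothetical.

```python
def fitch_length(tree, seqs):
    """Total number of substitutions required by a tree over all sites."""
    total = 0
    for site in zip(*seqs):
        def nodal_set(node):
            nonlocal total
            if not isinstance(node, tuple):
                return {site[node]}
            left, right = nodal_set(node[0]), nodal_set(node[1])
            if left & right:
                return left & right
            total += 1                     # union needed: one substitution
            return left | right
        nodal_set(tree)
    return total

def insertions(tree, new_leaf):
    """Yield every tree obtained by attaching new_leaf to one branch of tree."""
    yield (tree, new_leaf)
    if isinstance(tree, tuple):
        left, right = tree
        for t in insertions(left, new_leaf):
            yield (t, right)
        for t in insertions(right, new_leaf):
            yield (left, t)

def branch_and_bound(seqs, upper_bound):
    """Return (shortest length, list of shortest trees); needs >= 3 sequences."""
    n = len(seqs)
    best = {"length": upper_bound, "trees": []}

    def extend(rest, next_leaf):
        length = fitch_length((0, rest), seqs)
        if length > best["length"]:
            return                         # adding OTUs cannot shorten a tree
        if next_leaf == n:
            if length < best["length"]:
                best["length"], best["trees"] = length, []
            best["trees"].append((0, rest))
            return
        for bigger in insertions(rest, next_leaf):
            extend(bigger, next_leaf + 1)

    extend((1, 2), 3)
    return best["length"], best["trees"]

seqs = ["TGC", "TAC", "AGG", "AAG"]
bound = fitch_length((0, ((1, 2), 3)), seqs)   # length of an arbitrary tree
print(branch_and_bound(seqs, bound))  # (4, [(0, (1, (2, 3)))])
```

Sequence 0 is kept at the root only as a bookkeeping device. The parsimony length does not depend on where the tree is rooted.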
55
Branch-and-Bound
The magnitude of the search will depend on the
data (i.e., luck).
56
Heuristic
58
Likelihood
  • Example: Coin tossing
  • Data: 10 tosses, 6 heads, 4 tails
  • Hypothesis: Binomial distribution
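For the coin example, the likelihood of a heads probability p given the data is L(p) = C(10, 6) p^6 (1 - p)^4. The sketch below evaluates it on a grid of candidate values. The function name and the grid are illustrative choices.

```python
from math import comb

def binomial_likelihood(p, heads=6, tails=4):
    """Probability of the observed tosses if heads has probability p."""
    n = heads + tails
    return comb(n, heads) * p**heads * (1 - p)**tails

grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=binomial_likelihood)
print(best_p)                      # 0.6
print(binomial_likelihood(0.5))    # likelihood of a fair coin
print(binomial_likelihood(best_p)) # likelihood at the maximum
```

The likelihood peaks at p = 0.6, the observed proportion of heads. A fair coin (p = 0.5) is less likely but far from excluded by ten tosses.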

59
LIKELIHOOD IN MOLECULAR PHYLOGENETICS
  • The data are the aligned sequences.
  • The model is the probability of change from one
    character state to another (e.g., the Jukes &
    Cantor 1-parameter model).
  • The parameters to be estimated are topology and
    branch lengths.
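As a minimal illustration of these three ingredients, the sketch below computes the log-likelihood of two aligned sequences under the Jukes and Cantor model as a function of the distance d between them (in expected substitutions per site). With only two sequences there is no topology to choose, so the distance is the only parameter. The function name and the example sequences are hypothetical.

```python
import math

def jc69_log_likelihood(seq1, seq2, d):
    """Log-likelihood of two aligned sequences separated by distance d > 0."""
    decay = math.exp(-4 * d / 3)
    p_same = 0.25 + 0.75 * decay       # probability a site is unchanged
    p_diff = 0.25 - 0.25 * decay       # probability of one specific change
    loglik = 0.0
    for x, y in zip(seq1, seq2):
        # 0.25 is the equilibrium frequency of the nucleotide in seq1
        loglik += math.log(0.25 * (p_same if x == y else p_diff))
    return loglik

s1, s2 = "AAAAAAAAAA", "AAAAAAAAAC"
candidates = [i / 1000 for i in range(1, 1000)]
best_d = max(candidates, key=lambda d: jc69_log_likelihood(s1, s2, d))
print(best_d)
```

The maximizing distance lies close to the Jukes and Cantor corrected distance for one difference in ten sites (about 0.107). In a full analysis the same kind of search runs over tree topologies and all branch lengths.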

61
Bayesian Phylogenetics
Based on Bayes' Theorem
  • A: a proposition, a hypothesis.
  • B: the evidence.
  • P(A): the prior, the initial degree of belief
    in A.
  • P(A|B): the posterior, the new degree of belief
    in A given B (the evidence).
  • P(B|A)/P(B): represents the support B provides
    for A.
Thomas Bayes (1701-1761)
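Bayes' theorem combines these quantities as P(A|B) = P(A) × P(B|A) / P(B). The sketch below applies it to a small, made-up set of competing hypotheses: two candidate coins, scored on 6 heads and 4 tails. In phylogenetics the hypotheses would be trees. The function name, priors, and hypotheses are illustrative assumptions.

```python
from math import comb

def posterior(prior, likelihood):
    """Posterior P(A|B) for each hypothesis A.

    prior maps hypothesis -> P(A); likelihood maps hypothesis -> P(B|A).
    """
    evidence = sum(prior[h] * likelihood[h] for h in prior)   # P(B)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}

# Two hypothetical coins, equally probable a priori.
prior = {"p=0.5": 0.5, "p=0.6": 0.5}
likelihood = {
    "p=0.5": comb(10, 6) * 0.5**6 * 0.5**4,
    "p=0.6": comb(10, 6) * 0.6**6 * 0.4**4,
}
print(posterior(prior, likelihood))
```

With equal priors the posterior simply follows the likelihoods, so the p = 0.6 coin ends up with a little more than half of the probability.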