Title: Dan Graur
1Molecular Phylogenetics
6Molecular phylogenetic approaches
1. distance-matrix (based on distance measures)
2. character-state (based on character states)
3. maximum likelihood (based on both character states and distances)
7DISTANCE-MATRIX METHODS
In distance-matrix methods, evolutionary distances (usually the
number of nucleotide substitutions or amino-acid
replacements between two taxonomic units) are
computed for all pairs of taxa, and a
phylogenetic tree is constructed by using an
algorithm based on some functional relationships
among the distance values.
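The first step described above, computing a distance for every pair of taxa, can be sketched as follows. This is a minimal illustration that assumes the simplest possible measure, the uncorrected proportion of differing sites (p-distance); the function names and the toy alignment are hypothetical and not taken from the slides.

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)

def distance_matrix(alignment):
    """Return {(name_i, name_j): distance} for every pair of taxa."""
    return {
        (i, j): p_distance(alignment[i], alignment[j])
        for i, j in combinations(sorted(alignment), 2)
    }

# Hypothetical aligned sequences, for illustration only.
toy_alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTTC",
    "C": "ACGAACGTTC",
}
matrix = distance_matrix(toy_alignment)
for pair, d in matrix.items():
    print(pair, d)
```

In practice a corrected distance (one that accounts for multiple substitutions at the same site) would usually replace the raw proportion used here.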
8Multiple Alignment
9Distance Matrix
Units: numbers of nucleotide substitutions per 1,000 nucleotide sites
10Distance Methods
- UPGMA
- Neighbor-relations
- Neighbor joining
11UPGMA: Unweighted pair-group method with arithmetic means
12UPGMA employs a sequential clustering algorithm,
in which local topological relationships are
identified in order of decreasing similarity, and
the tree is built in a stepwise manner.
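The sequential clustering can be sketched as below. This is a minimal illustration rather than a reference implementation: the nested-tuple tree representation, the `upgma` function name, and the toy distances are assumptions made for the example. At each step the two clusters with the smallest mean pairwise distance are merged into a composite OTU.

```python
def upgma(names, dist):
    """Cluster OTUs by UPGMA.

    names: list of OTU labels.
    dist:  dict mapping frozenset({a, b}) to the distance between OTUs a and b.
    Returns (tree, heights): a nested-tuple tree and the node height
    (half the mean inter-cluster distance) recorded at each merge.
    """
    clusters = [(name, [name]) for name in names]  # (subtree, member OTUs)
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                members_i, members_j = clusters[i][1], clusters[j][1]
                # Arithmetic mean over all pairs of simple OTUs.
                total = sum(dist[frozenset((a, b))]
                            for a in members_i for b in members_j)
                mean = total / (len(members_i) * len(members_j))
                if best is None or mean < best[0]:
                    best = (mean, i, j)
        mean, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        heights.append(mean / 2)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0], heights

# Hypothetical ultrametric distances among four OTUs.
toy = {("A", "B"): 2, ("A", "C"): 4, ("B", "C"): 4,
       ("A", "D"): 6, ("B", "D"): 6, ("C", "D"): 6}
dist = {frozenset(pair): d for pair, d in toy.items()}
tree, heights = upgma(["A", "B", "C", "D"], dist)
print(tree)     # ('D', ('C', ('A', 'B')))
print(heights)  # [1.0, 2.0, 3.0]
```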
13simple OTUs
14composite OTU
17UPGMA only works if the distances are strictly
ultrametric.
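Whether a distance matrix is ultrametric can be checked with the three-point condition: for every triplet of OTUs, the two largest of the three pairwise distances must be equal. The sketch below assumes that standard definition; the function name, the tolerance, and the toy distances are hypothetical.

```python
from itertools import combinations

def is_ultrametric(names, dist, tol=1e-9):
    """True if, for every triplet, the two largest distances are equal."""
    for a, b, c in combinations(names, 3):
        d = sorted([dist[frozenset((a, b))],
                    dist[frozenset((a, c))],
                    dist[frozenset((b, c))]])
        if abs(d[2] - d[1]) > tol:
            return False
    return True

names = ["A", "B", "C"]
# Hypothetical distances: the first set is ultrametric, the second is not.
clocklike = {frozenset(("A", "B")): 2, frozenset(("A", "C")): 4,
             frozenset(("B", "C")): 4}
unequal_rates = {frozenset(("A", "B")): 2, frozenset(("A", "C")): 4,
                 frozenset(("B", "C")): 5}
print(is_ultrametric(names, clocklike))      # True
print(is_ultrametric(names, unequal_rates))  # False
```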
18Neighborliness methods
- The neighbors-relation method (Sattath & Tversky)
- The neighbor-joining method (Saitou & Nei)
19In an unrooted bifurcating tree, two OTUs are
said to be neighbors if they are connected
through a single internal node.
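21[Figure: unrooted tree of four OTUs A, B, C, and D]
Four-Point Condition: if A and B are neighbors (and so are C and D), then
$d_{AB} + d_{CD} < d_{AC} + d_{BD}$ and $d_{AB} + d_{CD} < d_{AC} + d_{BD} = d_{AD} + d_{BC}$ on an additive tree.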
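The four-point condition can be applied numerically: among the three ways of splitting four OTUs into two pairs, the pairing with the smallest sum of within-pair distances identifies the neighbors. The sketch below assumes additive distances; the function name and the branch lengths used to build the toy distances are hypothetical.

```python
def neighbor_pairing(d):
    """Return the pairing of A, B, C, D with the smallest distance sum.

    d maps frozenset({x, y}) to the distance between OTUs x and y.
    """
    pairings = [(("A", "B"), ("C", "D")),
                (("A", "C"), ("B", "D")),
                (("A", "D"), ("B", "C"))]
    sums = {p: d[frozenset(p[0])] + d[frozenset(p[1])] for p in pairings}
    return min(sums, key=sums.get), sums

# Hypothetical additive tree: terminal branches A=1, B=2, C=3, D=4,
# internal branch = 1, with A and B on one side and C and D on the other.
toy = {("A", "B"): 3, ("C", "D"): 7, ("A", "C"): 5,
       ("B", "D"): 7, ("A", "D"): 6, ("B", "C"): 6}
d = {frozenset(pair): value for pair, value in toy.items()}
best, sums = neighbor_pairing(d)
print(best)  # (('A', 'B'), ('C', 'D'))
print(sums[best], sorted(sums.values()))  # 10 [10, 12, 12]
```

The two larger sums are equal and exceed the smallest by twice the internal branch length, as expected for additive distances.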
24In distance-matrix methods, it is assumed that
Similarity = Kinship
25From Similarity to Relationship
- Similarity = Relationship only if genetic
distances increase with divergence times
(monotonic distances).
26From Similarity to Relationship
- Similarities among OTUs can be due to:
  - Ancestry
    - Shared ancestral characters (plesiomorphies)
    - Shared derived characters (synapomorphies)
  - Homoplasy
    - Convergent events
    - Parallel events
    - Reversals
28Parsimony Methods
Willi Hennig 1913-1976
29Pluralitas non est ponenda sine necessitate.
(Plurality should not be posited without
necessity.)
Occam's razor
William of Occam or Ockham (ca. 1285-1349), English philosopher and Franciscan monk.
Excommunicated by Pope John XXII in 1328.
Officially rehabilitated by Pope Innocent VI in 1359.
30MAXIMUM PARSIMONY METHODS
Maximum parsimony
involves the identification of a topology that
requires the smallest number of evolutionary
changes to explain the observed differences among
the OTUs under study. In maximum parsimony
methods, we use discrete character states, and
the shortest pathway leading to these character
states is chosen as the best or maximum parsimony
tree. Often two or more trees with the same
minimum number of changes are found, so that no
unique tree can be inferred. Such trees are said
to be equally parsimonious.
33uninformative
34informative
39Inferring the maximum parsimony tree
1. Identify all the informative sites.
2. For each possible tree, calculate the minimum number of substitutions at each informative site.
3. Sum up the number of changes over all the informative sites for each possible tree.
4. Choose the tree associated with the smallest number of changes as the maximum parsimony tree.
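Step 1 can be sketched in a few lines. The sketch assumes the usual definition, which the transcript does not spell out: a site is informative if at least two different character states each occur in at least two OTUs. The function names and the four toy sequences are hypothetical.

```python
from collections import Counter

def is_informative(column):
    """A site is informative if two or more states each appear at least twice."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def informative_sites(sequences):
    """Return the 0-based indices of the informative sites in an alignment."""
    return [i for i, column in enumerate(zip(*sequences))
            if is_informative(column)]

# Hypothetical aligned sequences for four OTUs.
sequences = [
    "AAGAGTGCA",
    "AGCCGTGCG",
    "AGATATCCA",
    "AGAGATCCG",
]
sites = informative_sites(sequences)
print(sites)  # [4, 6, 8]
```

Invariant sites and sites where only one state is shared (for example A, G, G, G) are skipped because they require the same number of changes on every tree.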
40In the case of four OTUs, an informative site can
only favor one of the three possible alternative
trees. Thus, the tree supported by the largest
number of informative sites is the most
parsimonious tree.
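For four OTUs this counting rule can be written directly. The sketch below is illustrative only: the tree labels, the `supported_tree` function, and the toy sequences are assumptions. Each informative site has exactly two states, each present in two OTUs, so it groups the OTUs into two pairs and supports the matching tree.

```python
from collections import Counter

def supported_tree(column):
    """Return the four-OTU tree favored by a site, or None if uninformative."""
    a, b, c, d = column
    if a == b and c == d and a != c:
        return "((1,2),(3,4))"
    if a == c and b == d and a != b:
        return "((1,3),(2,4))"
    if a == d and b == c and a != b:
        return "((1,4),(2,3))"
    return None

# Hypothetical aligned sequences for OTUs 1 to 4.
sequences = [
    "AAGAGTGCA",
    "AGCCGTGCG",
    "AGATATCCA",
    "AGAGATCCG",
]
support = Counter(
    tree for tree in (supported_tree(col) for col in zip(*sequences))
    if tree is not None
)
best_tree = max(support, key=support.get)
print(dict(support))  # {'((1,2),(3,4))': 2, '((1,3),(2,4))': 1}
print(best_tree)      # ((1,2),(3,4))
```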
41With more than 4 OTUs, an informative site may
favor more than one tree, and the maximum
parsimony tree may not necessarily be the one
supported by the largest number of informative
sites.
42The informative sites that support the internal
branches in the inferred tree are deemed to be
synapomorphies. All other informative sites are
deemed to be homoplasies.
44Parsimony is based solely on synapomorphies
46Variants of Parsimony
- Wagner-Fitch: Unordered. Character state changes are symmetric and can occur as often as necessary.
- Camin-Sokal: Complete irreversibility.
- Dollo: Partial irreversibility. Once a derived character is lost, it cannot be regained.
- Weighted: Some changes are more likely than others.
- Transversion: A type of weighted parsimony, in which transitions are ignored.
47Fitch's (1971) method for inferring nucleotides
at internal nodes
48Fitch's (1971) method for inferring nucleotides
at internal nodes
- The set at an internal node is the intersection (∩) of the two sets at its immediate descendant nodes if the intersection is not empty.
- The set at an internal node is the union (∪) of the two sets at its immediate descendant nodes if the intersection is empty.
- When a union is required to form a nodal set, a nucleotide substitution at this position must be assumed to have occurred.
- number of unions = minimum number of substitutions
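The intersection/union rule lends itself to a short recursive sketch for a single site. The nested-tuple tree representation, the `fitch` function name, and the toy states are assumptions made for illustration; the tree is treated as rooted at an arbitrary point, which does not affect the count.

```python
def fitch(tree, states):
    """Return (nodal set, number of unions) for one site.

    tree:   a leaf name (str) or a 2-tuple of subtrees.
    states: dict mapping each leaf name to its nucleotide.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Non-empty intersection: no substitution needs to be assumed.
        return common, left_cost + right_cost
    # Empty intersection: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1

# Hypothetical states at one site for four OTUs.
states = {"1": "G", "2": "G", "3": "A", "4": "A"}
tree_a = (("1", "2"), ("3", "4"))
tree_b = (("1", "3"), ("2", "4"))
root_a, cost_a = fitch(tree_a, states)
root_b, cost_b = fitch(tree_b, states)
print(cost_a, cost_b)  # 1 2
```

Summing the union counts over all sites gives the length of a tree, so the same function can be used to compare candidate topologies.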
49Fitch's (1971) method for inferring nucleotides
at internal nodes
51total number of substitutions in a tree = tree length
52Searching for the maximum-parsimony tree
53Exhaustive: Examine all trees, get the best tree (guaranteed).
Branch-and-Bound: Examine some trees, get the best tree (guaranteed).
Heuristic: Examine some trees, get a tree that may or may not be the best tree.
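Why an exhaustive search quickly becomes impractical can be illustrated by counting the candidate trees. The sketch below relies on a standard result that is not stated in the transcript: the number of unrooted bifurcating trees for n OTUs is 3 x 5 x ... x (2n - 5). The function name is hypothetical.

```python
def unrooted_tree_count(n_otus):
    """Number of unrooted bifurcating trees for n_otus >= 3 OTUs."""
    if n_otus < 3:
        raise ValueError("at least three OTUs are required")
    count = 1
    # Product of the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_otus - 4, 2):
        count *= k
    return count

for n in (4, 5, 10):
    print(n, unrooted_tree_count(n))
# 4 3
# 5 15
# 10 2027025
```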
54Exhaustive
55Branch-and-Bound
56Branch-and-Bound
- Obtain a tree by a fast method (e.g., the neighbor-joining method).
- Compute its minimum number of substitutions (L).
- Turn L into an upper bound value.
Rationale:
(1) The maximum parsimony tree must be either equal in length to L or shorter.
(2) A descendant tree is either equal in length to or longer than the ascendant tree.
57Branch-and-Bound
58Heuristic
61Likelihood
- Example: Coin tossing
- Data: Outcome of 10 tosses: 6 heads and 4 tails
- Hypothesis: Binomial distribution
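The coin-tossing example can be worked through numerically. Under the binomial hypothesis the likelihood of the data is L(p) = C(10, 6) p^6 (1 - p)^4, where p is the probability of heads. The sketch below evaluates L(p) on a grid of candidate values; the grid search and the function name are illustrative choices, not part of the slides.

```python
from math import comb

def binomial_likelihood(p, heads=6, tosses=10):
    """Probability of observing `heads` heads in `tosses` tosses given p."""
    return comb(tosses, heads) * p**heads * (1 - p)**(tosses - heads)

# Evaluate the likelihood for p = 0.01, 0.02, ..., 0.99 and keep the maximum.
grid = [i / 100 for i in range(1, 100)]
p_hat = max(grid, key=binomial_likelihood)

print(p_hat)                                 # 0.6
print(round(binomial_likelihood(0.5), 4))    # 0.2051
print(round(binomial_likelihood(p_hat), 4))  # 0.2508
```

The maximum falls at the observed proportion of heads, 6/10, so the data are more probable under p = 0.6 than under a fair coin.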
62LIKELIHOOD IN MOLECULAR PHYLOGENETICS
- The data are the aligned sequences
- The model is the probability of change from one character state to another (e.g., the Jukes & Cantor one-parameter model).
- The parameters to be estimated are:
  - Topology
  - Branch Lengths
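As an illustration of what such a model supplies, the sketch below computes the Jukes and Cantor one-parameter transition probabilities: the probability that a site shows the same nucleotide after time t is 1/4 + (3/4)e^(-4αt), and the probability that it shows one particular different nucleotide is 1/4 - (1/4)e^(-4αt). The function name and the parameter values are hypothetical; α is assumed to be the rate of substitution to each of the three other nucleotides.

```python
from math import exp

def jc_probability(same, alpha, t):
    """Jukes-Cantor probability of ending in the same (or one specific other) state.

    alpha: rate of substitution to each of the three other nucleotides.
    t:     elapsed time (branch length in time units).
    """
    decay = exp(-4 * alpha * t)
    if same:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay

alpha, t = 0.01, 10.0  # hypothetical values
p_same = jc_probability(True, alpha, t)
p_diff = jc_probability(False, alpha, t)
print(p_same, p_diff)
print(p_same + 3 * p_diff)  # the four probabilities sum to 1 (up to rounding)
```

At t = 0 the nucleotide is unchanged with probability 1, and for very long branches all four probabilities approach 1/4.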
64Background Maximum Likelihood
How to calculate ML score for a tree
        1 ... j ... N
...
Seq x   C...GGACGTTTA...C
Seq y   C...AGATCTCTA...C
...
65Background Maximum Likelihood
Calculate likelihood for a single site j given
tree
where