Patterns in Evolution

Slides: 79

Provided by: Compu2

Learn more at: http://eweb.furman.edu

1
Patterns in Evolution
I. Phylogenetic
II. Morphological
III. Historical (later)
IV. Biogeographical
2
Patterns in Evolution
I. Phylogenetic - determining the genealogical,
familial patterns among organisms, populations,
species, and higher taxa ("family trees")
5

Patterns in Evolution
I. Phylogenetic
A. Systematics: Taxonomy and Classification
1. Taxonomy - the naming of taxa (singular "taxon")
a. Rules for naming species
Latin binomen (Drosophila melanogaster)
italicized or underlined
author recognized in some groups (insects)
genus name and species epithet agree in gender
unambiguous within a kingdom
if a species is named twice, priority counts
based on a 'holotype' or 'type' specimen
'paratypes' show range of variation
'species' is both singular and plural; genus (s.),
genera (pl.)

b. Rules for renaming species
if assigned to a new genus, the epithet stays
the original author's name is placed in parentheses

c. Rules for higher taxa
Animal families end in "-idae" (Felidae)
Animal sub-families end in "-inae" (Homininae)
These are often derived from the same stem as the
'type genus' - the first genus described for the
family. (Felis)
Plant families end in "-aceae" (Betulaceae)
Higher taxa are capitalized, but not italicized
(as above)
adjectives are not capitalized ("hominids")

2. Classification - determining the hierarchical
position of each species within higher taxa.
a. The Hierarchy....
b. Issues
Cladogenesis: you want the branching ("clade")
pattern of taxa to reflect phylogenetic
relationships (Archosaurs for crocodilians and
birds).
Anagenesis: however, some evolutionary changes
are so profound that we might honor the degree
of difference ("Class Aves").

c. Terms
Monophyletic taxon: includes all (and only) the
species descended from a common ancestor. Aves is
a good example.
Paraphyletic taxon: includes all descendants of a
common ancestor, except for those placed in
another taxon. So, Reptilia is a paraphyletic
group, as it includes all diapsids and anapsids
EXCEPT birds. OR, it includes all amniotes EXCEPT
mammals and birds (this gets the synapsids).
Polyphyletic taxon: includes organisms that do not
share a common ancestor that is in the group. To
be avoided. Example: "fliers" (birds, pterosaurs).
18

d. Philosophy of Cladistics
Approach developed by Willi Hennig, who suggested
that classification should only include
monophyletic groups, and that phylogeny should be
inferred from the analysis of shared derived
traits. This gives strong preference to
cladogenesis over anagenesis, such that birds
should really be classified as a derived group of
dinosaurs or reptiles, not as separate from them.

http://palaeos.com/vertebrates/theropoda/dinosaurs-birds.html
19
Linnaean Classification of Apes
Pongidae
Hylobatidae
Hominidae
Apes: primates (grasping hands, binocular
vision) with no tails
22

B. Reconstructing Phylogenies
1. Characters
morphological
behavioral
cellular (structural or chemical)
genetic - nitrogenous base sequence, amino acid
sequence
can be quantitative measurements, or qualitative
"presence/absence"

2. Trees
a. Unrooted trees show patterns among groups
without specifying ancestral relationships

         A  B  C  D
Trait 1  0  0  1  1
Trait 2  0  0  1  1
Trait 3  0  1  1  0
Trait 4  1  1  0  0
Trait 5  1  1  1  1
26

So, A and B share the same state for three traits
(1, 2, and 4) on which they differ from C and D,
so they are more similar to one another than they
are to C and D. The same is true for C and D.

b. Rooted Trees: hypothetical patterns of descent
that could produce this pattern. You might
suppose there is only one possible rooted tree,
but there could easily be two, depending on
whether state 0 or state 1 for traits 1 and 2
was ancestral (figure panels: "1 derived" and
"0 derived").
31
b. Rooted Trees
So, in order to assess ancestry, we need to
compare the groups in question to an "outgroup".
An outgroup is a sister taxon that should share
only ancestral traits with the group in question.
So reptiles would be the outgroup for comparisons
among diverse mammals, for example, or a crocodile
or dinosaur would be the outgroup for a comparison
among diverse birds.
32

Now, we assume that species E (the outgroup)
expresses ANCESTRAL characters (plesiomorphies).
Any different character state must have evolved
FROM this ancestral state - and this evolved
state is called DERIVED (apomorphy).

         A  B  C  D  E
Trait 1  0  0  1  1  1
Trait 2  0  0  1  1  1
Trait 3  0  1  1  0  0
Trait 4  1  1  0  0  0
Trait 5  1  1  1  1  0
33

Now, all species in a clade might share
plesiomorphies, because they are all ultimately
derived from the same ancestor. So shared
ancestral traits tell us nothing about patterns
of relationship within the group. But DERIVED
traits will only be shared by species that share
a more recent common ancestor...

34

So, to reconstruct phylogenies and build a rooted
tree, we don't just count shared traits... we
count SHARED, DERIVED traits (synapomorphies)

35

So, A and B share 4 synapomorphies: traits 1, 2,
4, and 5 (they share these traits, and their
state is different from the outgroup). B and C
share 2 synapomorphies (3 and 5). Every other
pair shares only trait 5.

         A  B  C  D  E
Trait 1  0  0  1  1  1
Trait 2  0  0  1  1  1
Trait 3  0  1  1  0  0
Trait 4  1  1  0  0  0
Trait 5  1  1  1  1  0

Number of synapomorphies
     A  B  C
B    4  -  -
C    1  2  -
D    1  1  1
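
The synapomorphy counts can be checked mechanically. Here is a minimal Python sketch, assuming the five-trait table with species E as the outgroup. The names `traits` and `shared_derived` are illustrative and do not come from the slides.

```python
from itertools import combinations

# Character states from the trait table; E is treated as the outgroup.
traits = {
    "A": [0, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 1],
    "C": [1, 1, 1, 0, 1],
    "D": [1, 1, 0, 0, 1],
    "E": [1, 1, 0, 0, 0],
}

def shared_derived(x, y, outgroup="E"):
    """Trait numbers where x and y agree and differ from the outgroup."""
    rows = zip(traits[x], traits[y], traits[outgroup])
    return [i + 1 for i, (sx, sy, so) in enumerate(rows) if sx == sy != so]

# Print the shared derived traits for every pair of ingroup species.
for x, y in combinations("ABCD", 2):
    print(x, y, shared_derived(x, y))
```

Counting the length of each list gives the lower-triangle matrix of synapomorphies.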
36

Now, there are a couple of rooted trees that fit
these data equally well.
First, our assumed tree:
In this case, the shared trait between B and C
(trait 3) must be interpreted as an instance of
"convergent/parallel evolution (CE)", in which
the trait evolved independently in both species
(not inherited from a common ancestor).
[Figure: rooted tree with changes marked for
trait 5, for traits 1, 2, and 4, and for trait 3]
37

But there is another:
In this case, the discrepancy among A, B, and C
is explained as an evolutionary "reversal" in A,
which has re-expressed the ancestral state of
trait 3.
[Figure: rooted tree with changes marked for
trait 5, for traits 1, 2, and 4, and twice for
trait 3]
38

In both cases, species share traits for reasons
OTHER than inheritance from an immediate common
ancestor. These are called homoplasies, and they
obviously can confound the reconstruction of
phylogenies. Both trees require 6 evolutionary
events, so they are equally "parsimonious"
(simple). We could envision lots of other trees,
but they would require more reversions and
convergent events. We apply Occam's Razor - a
philosophical dictum that we will accept (and
subsequently test) the simplest trees that
express "maximum parsimony". So these two trees
are our phylogenetic hypotheses to be tested by
more data that explicitly addresses their
differences.
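
The event counts can be reproduced with a small parsimony counter. This sketch uses Fitch's counting rule, which the slides do not name, so treat it as a swapped-in technique. The three tree topologies are assumptions based on the text (the tree figures did not survive): one tree needing convergence in trait 3, one needing a reversal in A, and one deliberately poor tree for contrast.

```python
# Character states from the trait table; E is the outgroup.
traits = {
    "A": [0, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 1],
    "C": [1, 1, 1, 0, 1],
    "D": [1, 1, 0, 0, 1],
    "E": [1, 1, 0, 0, 0],
}

def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one character.

    A tree is a species name or a 2-tuple of subtrees.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:  # children can agree: no extra change needed
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def tree_length(tree):
    """Total number of changes over all five traits."""
    total = 0
    for i in range(len(traits["A"])):
        states = {species: row[i] for species, row in traits.items()}
        total += fitch(tree, states)[1]
    return total

# Assumed topologies (hypothetical reconstructions of the missing figures).
convergence_tree = ((("A", "B"), ("C", "D")), "E")
reversal_tree = (((("A", "B"), "C"), "D"), "E")
poor_tree = ((("A", "C"), ("B", "D")), "E")

for name, tree in [("convergence", convergence_tree),
                   ("reversal", reversal_tree),
                   ("poor", poor_tree)]:
    print(name, tree_length(tree))
```

Under these assumptions the first two trees tie, and the poor tree needs more changes, which is the sense in which the first two are the most parsimonious.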

The only trait we did not define was an
autapomorphy - this is a derived trait unique to
a species. In our examples above, each trait has
only two character states. But consider
nucleotides, where each trait (position) has 4
possibilities. We can envision that a species
might have a T whereas all other species in the
tree have A, C, or G. This would be an
autapomorphy, and it obviously doesn't help us
in phylogeny reconstruction because the species
doesn't share this trait with anything else.

40
3. Molecular Evolution and Algorithms
DNA, RNA, and protein sequence data - thousands of
characters - multiple parsimonious trees
41

3. Molecular Evolution and Algorithms
a. Synapomorphies and parsimony

Are cetaceans artiodactyls, or a sister group to
the Artiodactyla?
42

Exon 7 from the gene that encodes β-casein, a
protein in milk. Shared derived traits with
cetaceans occur at positions 162, 166, and 177.
43

One tree: 6 changes required at these positions;
41 over the entire 60-base sequence.
Other tree: 9 changes required at these
positions; 47 over the entire 60-base sequence.
45
PROBLEMS WITH BASE DATA

Scoring characters - it's easy if the character is
categorical (A, C, T, G), but very difficult if it
is continuous. Characters need to be independent,
so that they are weighted evenly.
Homoplasies are common - both as convergence and
as reversal.
Ancient changes are obscured by more recent
ones... A to G, then G to C, looks like it could
be one change A to C.
Rapid radiations mean that branches/subgroups may
not have had time to evolve their own unique
synapomorphies... so we have lots of species
with autapomorphies (which are thus distinct),
but it is difficult to group them.
Trees of single genes may not "map" onto the
phylogenetic tree among species. The loss of
particular alleles may not parallel patterns of
relationships.
Hybridization and gene transfer - this can make
populations look more similar at these loci than
they really are across the whole genome.
Rates of evolution of different characters and
states differ... Some are "highly conserved" and
don't change much... others change dramatically.
This is called mosaic evolution. This affects the
"branch lengths" that are used to represent the
degree of departure (or the quantified number of
genetic changes) in that unique lineage.

46
b. UPGMA (unweighted pair group method with
arithmetic mean)
47

- UPGMA assumes constant mutation rates, and so
is the simplest model.
The matrix (not transcribed) gives the number of
differences in AA sequences between species pairs.
48

The most similar sequences are those of human
and monkey (1 difference).
This difference accumulated over TWO lineages
since their divergence (constant mutation).
So, the branch length of each is 1 difference / 2
branches = 0.5.

49
1. So, we join taxa B (human) and F (monkey).
2. Then, we AVERAGE the differences between these
taxa and each other taxon and reduce the matrix.
B differs from A by 19 AAs, and F differs from A
by 18 AAs, so the average difference between A
and the new taxon 'BF' = 18.5 (fusion of two
orange boxes into one orange box in the new,
reduced matrix). That's why this is called UPGMA
- unweighted pair-group method using arithmetic
averages.
3. Now, in the reduced matrix, we look for the
most similar pair (which is A and D = 8 diffs).
We halve the difference to calculate each unique
branch length (4.0).
4. Now repeat the averaging process with the
other taxa to reduce the matrix.
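
Steps 1 and 2 can be sketched in a few lines of Python. This uses only the three distances quoted in the text (A-B = 19, A-F = 18, B-F = 1), because the rest of the matrix was not transcribed. The function name `merge_pair` is an illustrative assumption.

```python
def merge_pair(dist, x, y):
    """Join taxa x and y.

    Returns the reduced distance table and the branch length from the
    new node to each of x and y (half their distance, because a
    constant rate is assumed).
    """
    merged = x + y
    branch = dist[frozenset((x, y))] / 2
    others = {t for pair in dist for t in pair} - {x, y}
    # Keep the distances that do not involve x or y.
    reduced = {p: d for p, d in dist.items() if x not in p and y not in p}
    # Simple mean is right when x and y are single taxa; full UPGMA
    # weights later merges by the number of taxa in each cluster.
    for k in others:
        reduced[frozenset((merged, k))] = (
            dist[frozenset((x, k))] + dist[frozenset((y, k))]
        ) / 2
    return reduced, branch

dist = {
    frozenset(("A", "B")): 19,
    frozenset(("A", "F")): 18,
    frozenset(("B", "F")): 1,
}
reduced, branch = merge_pair(dist, "B", "F")
print(branch, reduced[frozenset(("BF", "A"))])
```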


Here, branch lengths are equal (and additive)
because averaging and constant mutation are
assumed. In other models, branch lengths vary
reflecting more complex models which accept
different substitution rates.
53

c. Branch Length Units
1) In the UPGMA example, the branch length is the
mean number of AA substitutions in cytochrome c.
This protein has 104 AA in animals.
2) Typically, these raw data are converted to
substitutions per site by dividing by the sequence
length, or to % change by multiplying this by 100.
Example: 18 differences.
18/104 AA = 0.173 substitutions per site
0.173 x 100 = 17.3% difference


4) Evolutionary Modeling
The relationship between difference and
evolutionary divergence (substitution rate) may
not be linear.
- Not all differences are indicative of change.
Even 2 random sequences will only differ by 75%
(just by chance there will be the same base at
25% of sites).
- Some changes are more likely than others.
Transition mutations (A to G, C to T) are more
likely than transversions (A to C or T). So,
models incorporate a transition/transversion
ratio (2.0 in the figure).

- Our ability to detect change depends on the
existing degree of similarity. We are more likely
to detect changes in sequences that are identical
than in sequences that are only 50% similar,
because many changes in that case will make the
sequences MORE SIMILAR. So a change in difference
from 10% to 12% probably represents fewer
mutations, and less genetic distance, than an
observed change from 60% to 62%. If sequences are
60% different, a lot of mutations in one sequence
will make it more similar to the other; thus the
same NET change of 2% represents MORE evolutionary
change (distance).
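
One way to see this non-linearity is with a standard distance correction. The sketch below uses the Jukes-Cantor formula, which the slides do not name, so treat it as an assumed stand-in for whatever model the original figure used. It converts an observed proportion of differing sites into an estimated number of substitutions per site.

```python
import math

def jc_distance(p):
    """Jukes-Cantor estimate of substitutions per site.

    p is the observed proportion of sites that differ; the formula is
    undefined at 0.75, the expected difference between random sequences.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must be at least 0 and below 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)

# The same 2-point rise in observed difference, at two starting points.
low = jc_distance(0.12) - jc_distance(0.10)
high = jc_distance(0.62) - jc_distance(0.60)
print(low, high)
```

Under this assumed model, the step from 60% to 62% corresponds to several times more estimated change than the step from 10% to 12%.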

d. Calculating Branch Lengths

Hypothetical sequence differences:
     A   B   C
A    -   22  39
B        -   41
C            -

Three-point tree: branches a, b, and c lead from
a central node to A, B, and C.
1) a + b = 22
2) a + c = 39
3) b + c = 41
4) (2) - (3): a - b = -2
5) (1) + (4): 2a = 20, so a = 10.
6) The distance from A to B = 22, so b = 12, and
c = 29.

OR: a = ((AC - BC) + AB) / 2
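
The three-point solution is short enough to write as a function. This is a minimal sketch; the name `three_point` and its argument order are illustrative choices, not from the slides.

```python
def three_point(ab, ac, bc):
    """Branch lengths a, b, c from a central node to taxa A, B, C,
    given the three pairwise distances AB, AC, and BC."""
    a = ((ac - bc) + ab) / 2
    b = ab - a
    c = ac - a
    return a, b, c

print(three_point(22, 39, 41))  # (10.0, 12.0, 29.0)
```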
58

Hypothetical sequence differences among 5 taxa:
     A   B   C   D   E
A    -   22  39  39  41
B        -   41  41  43
C            -   18  20
D                -   10
E                    -

D and E are most similar.
Calculate the average distance from D and from E
to A, B, and C (this reduces it to a 3-point
problem): D = 32.6, E = 34.6.
So, E is 2 units farther away from the node than
D, and the distance between them is 10, so a = 4
and b = 6 (a leads to D, b leads to E).

61
D and E are the closest sequences.
Three-point problem: a leads to D, b leads to E,
and c leads to the group A-C.

       A-C   D     E
A-C    -     32.6  34.6
D            -     10
E                  -

a = 4, b = 6
(from a = ((AC - BC) + AB) / 2 with the three
points relabeled: a = ((32.6 - 34.6) + 10) / 2)

Now let's recompute the complete distance matrix,
treating DE as one taxon:
     A   B   C   DE
A    -   22  39  40
B        -   41  42
C            -   19
DE               -

C and DE are the closest sequences.
Mean distance from C to AB = 40, and mean
distance from DE to AB = 41.
63
C and DE are the closest sequences.
Three-point problem: a leads to C, b leads to DE,
and c leads to the group A-B.

      AB   C    DE
AB    -    40   41
C          -    19
DE              -

a = 9, b = 10 (mean)
(a = ((40 - 41) + 19) / 2)
b is not just for that segment; it represents the
complete distance from the connecting node to the
leaves.
So once again, there is one unit of branch length
difference to the node of C and DE, with a total
distance of 19.
[Figure: tree so far, with branch lengths C = 9,
internal branch to the DE node = 5, D = 4, E = 6,
and 31 to the group A-B]

Now let's recompute the complete distance matrix:
      A    B    CDE
A     -    22   39.5
B          -    41.5
CDE             -
64
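Now we are in the trivial case of 3 sequences.
Three-point problem: a leads to A, b leads to B,
and c leads to the group CDE.

      A    B    C-E
A     -    22   39.5
B          -    41.5
C-E             -

a = 10, b = 12
(a = ((39.5 - 41.5) + 22) / 2)
The value for a group is not just for that
segment; it represents the complete distance from
the connecting node to the leaves.
[Figure: final tree, with branch lengths A = 10,
B = 12, internal branch between the A-B node and
the CDE node = 20, C = 9, internal branch to the
DE node = 5, D = 4, E = 6]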
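
The whole walk-through can be sketched as a loop: find the closest pair, solve the three-point problem against the average of everything else, then merge the pair by averaging. This is a minimal sketch of the procedure as the slides carry it out, not a reference implementation of any named algorithm. The function names are illustrative.

```python
from itertools import combinations

def three_point(d_ij, d_i_rest, d_j_rest):
    """Branch lengths from the joining node to i and to j."""
    a = ((d_i_rest - d_j_rest) + d_ij) / 2
    return a, d_ij - a

def join_all(raw):
    """Repeat the slide procedure until two groups remain.

    Returns a list of (i, j, branch to i, branch to j) in join order.
    """
    dist = {frozenset(pair): d for pair, d in raw.items()}
    taxa = sorted({t for pair in raw for t in pair})
    steps = []
    while len(taxa) > 2:
        # Closest pair in the current matrix.
        i, j = min(combinations(taxa, 2), key=lambda p: dist[frozenset(p)])
        rest = [k for k in taxa if k not in (i, j)]
        # Mean distance from i and from j to everything else.
        d_i = sum(dist[frozenset((i, k))] for k in rest) / len(rest)
        d_j = sum(dist[frozenset((j, k))] for k in rest) / len(rest)
        a, b = three_point(dist[frozenset((i, j))], d_i, d_j)
        steps.append((i, j, a, b))
        # Merge i and j, averaging their distances to each other taxon.
        merged = i + j
        for k in rest:
            dist[frozenset((merged, k))] = (
                dist[frozenset((i, k))] + dist[frozenset((j, k))]
            ) / 2
        taxa = rest + [merged]
    return steps

raw = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20, ("D", "E"): 10,
}
steps = join_all(raw)
for step in steps:
    print(step)
```

As in the slides, the value reported for a merged group (for example 10 for DE) is the full distance from the joining node down to the leaves, so the internal segment is that value minus the mean depth already assigned inside the group.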
65
WHICH was the outgroup? Let's say C.
[Figure: the same tree redrawn with C as the
outgroup; branch lengths A = 10, B = 12,
internal = 20, C = 9, internal = 5, D = 4, E = 6]
66

e. Maximum Likelihood Models
What evolutionary rates (in terms of transitions
and transversions, etc.) are required to give us
the pattern and rate (as measured in branch
lengths) that we SEE?
So, different models of evolution are tested.
The models are probability matrices of
substitution rates between bases.
A tree is given. The branch lengths are given.
The model of mutation changes, and the
probabilities of generating the data (sequences)
change with the model. The likelihood of a tree
is the probability that it generates the data.

69

f. Neighbor Joining
Similar, but we don't prioritize which pair we
group first. Rather, we repeat the tree
formation using every possible pair-wise
combination, and then pick the tree with the
shortest total branch length (most conservative
evolutionary tree). Repeat, using this pair as
one node (like DE before).

71

g. Bootstrapping
Gain confidence in a node by subsampling the data
and creating a tree. Is the node still there?
How frequently is it present in 100 or 1000
subsamples of the data set?

72
Randomly sample characters (in this case, base
positions) WITH REPLACEMENT. Create the tree,
and report the frequency of a clade in the tree.
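
The resampling step can be sketched as follows. To keep the example short, "creating the tree" is replaced by a much simpler stand-in: checking whether two named taxa are the closest pair by number of differing sites. That simplification, the toy sequences, and the names `bootstrap_support` and `closest_pair` are all assumptions made for illustration.

```python
import random

def hamming(x, y):
    """Number of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(x, y))

def closest_pair(seqs):
    """The two taxa with the fewest differences (ties broken by name)."""
    names = sorted(seqs)
    pairs = [
        (hamming(seqs[x], seqs[y]), x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
    ]
    _, x, y = min(pairs)
    return frozenset((x, y))

def bootstrap_support(seqs, pair, n_reps=1000, seed=0):
    """Fraction of resampled data sets in which `pair` is the closest pair."""
    rng = random.Random(seed)
    n_sites = len(next(iter(seqs.values())))
    hits = 0
    for _ in range(n_reps):
        # Sample site positions WITH replacement, same total length.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {name: [s[c] for c in cols] for name, s in seqs.items()}
        if closest_pair(resampled) == frozenset(pair):
            hits += 1
    return hits / n_reps

# Hypothetical toy alignment, not data from the slides.
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTTC",
    "C": "TCGAACCTAG",
    "D": "TCGAACCTTG",
}
print(bootstrap_support(toy, ("A", "B")))
```

A real analysis would rebuild a full tree for each resampled data set and record which clades appear in it.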
73
Bootstrap using the entire 1100 bases of the
casein gene, N = 1000. Whales are within the
Artiodactyla in 99% of trees. Whales are in a
clade with deer, hippo, and cow (100%).
74

h. Bayesian inference

Must estimate the prior probability of trees
based on external knowledge. Or, assume
equality.

P(tree|data) = P(data|tree) x P(tree) / P(data)

where P(data|tree) is the likelihood, P(tree) is
the prior probability (pp), and
P(data) = SUM(tree likelihood x prior prob)
across all trees considered.
So, for the trees considered and given their
prior probabilities, what fraction of the total
probability of the data does each tree account
for? This is the posterior probability that we
want: P(tree|data).

Clade credibility is the sum of the posterior
probabilities of the trees in which the clade
occurs.
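
The posterior calculation and clade credibility can be sketched with made-up numbers. The tree names, likelihood values, and clade lists below are hypothetical, and the priors are assumed equal.

```python
def posteriors(likelihoods, priors=None):
    """Posterior probability of each tree: likelihood x prior / P(data)."""
    trees = list(likelihoods)
    if priors is None:  # assume equality
        priors = {t: 1 / len(trees) for t in trees}
    p_data = sum(likelihoods[t] * priors[t] for t in trees)
    return {t: likelihoods[t] * priors[t] / p_data for t in trees}

def clade_credibility(post, clades_by_tree, clade):
    """Sum of the posteriors of the trees that contain the clade."""
    return sum(p for t, p in post.items() if clade in clades_by_tree[t])

# Hypothetical likelihoods P(data|tree) for three candidate trees.
likelihoods = {"T1": 0.006, "T2": 0.003, "T3": 0.001}
# Hypothetical clades present in each tree.
clades_by_tree = {
    "T1": {frozenset("AB"), frozenset("CD")},
    "T2": {frozenset("AB")},
    "T3": {frozenset("AC")},
}

post = posteriors(likelihoods)
print(post)
print(clade_credibility(post, clades_by_tree, frozenset("AB")))
```

With equal priors the posteriors are just the likelihoods rescaled to sum to 1, so an informative prior is the only thing that can change the ranking of trees.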