Title: Patterns in Evolution
1- Patterns in Evolution
- I. Phylogenetic
- II. Morphological
- III. Historical (later)
- IV. Biogeographical
2- Patterns in Evolution
- I. Phylogenetic - determining the genealogical (familial) patterns among organisms, populations, species, and higher taxa: "family trees"
5- Patterns in Evolution
- I. Phylogenetic
- A. Systematics: Taxonomy and Classification
- 1. Taxonomy - the naming of taxa (singular 'taxon')
- a. Rules for naming species
- Latin binomen (Drosophila melanogaster)
- italicized or underlined
- author recognized in some groups (insects)
- genus name and specific epithet agree in gender
- unambiguous within a kingdom
- if a species is named twice, priority counts (the earliest published name stands)
- based on a 'holotype' or 'type' specimen
- 'paratypes' show range of variation
- 'species' is both singular and plural; genus (sing.), genera (pl.)
6- b. Rules for renaming species
- if a species is assigned to a new genus, the specific epithet stays
- the original author's name is then placed in parentheses
7- c. Rules for higher taxa
- Animal families end in "-idae" (Felidae)
- Animal subfamilies end in "-inae" (Homininae)
- These names are often derived from the same stem as the 'type genus' - the first genus described for the family (Felis)
- Plant families end in "-aceae" (Betulaceae)
- Higher taxa are capitalized, but not italicized (as above)
- adjectival and vernacular forms are not capitalized ("hominids")
14- 2. Classification - determining the hierarchical position of each species within higher taxa.
- a. The Hierarchy....
- b. Issues
- Cladogenesis: you want the branching ("clade") pattern of taxa to reflect phylogenetic relationships (Archosaurs for crocodilians and birds)
- Anagenesis: however, some evolutionary changes are so profound that we might honor the degree of difference ("Class Aves")
17- c. Terms
- Monophyletic taxon: includes all (and only) the species descended from a common ancestor. Aves is a good example.
- Paraphyletic taxon: includes all descendants of a common ancestor, except for those placed in another taxon. So, Reptilia is a paraphyletic group, as it includes all diapsids and anapsids EXCEPT birds. OR, it includes all amniotes EXCEPT mammals and birds (this version also takes in the synapsids).
- Polyphyletic taxon: includes organisms whose common ancestor is not itself in the group. To be avoided. Example: "Fliers" (birds, pterosaurs).
18- d. Philosophy of Cladistics
- Approach founded by Willi Hennig, who suggested that classification should only include monophyletic groups, and that phylogeny should be inferred from the analysis of shared derived traits.
- This gives strong preference to cladogenesis over anagenesis, such that birds should really be classified as a derived group of dinosaurs or reptiles, not as separate from them.
http://palaeos.com/vertebrates/theropoda/dinosaurs-birds.html
19- Linnaean Classification of Apes
- Families: Pongidae, Hylobatidae, Hominidae
- Apes: primates (grasping hands, binocular vision) with no tails
24- B. Reconstructing Phylogenies
- 1. Characters
- morphological
- behavioral
- cellular (structural or chemical)
- genetic - nitrogenous base sequence, amino acid sequence
- can be quantitative measurements, or qualitative "presence/absence"
25- 2. Trees
- a. Unrooted trees show patterns among groups without specifying ancestral relationships
          A  B  C  D
Trait 1   0  0  1  1
Trait 2   0  0  1  1
Trait 3   0  1  1  0
Trait 4   1  1  0  0
Trait 5   1  1  1  1
26- So, A and B share the same state for three traits (1, 2, and 4) where C and D have the other state, and so A and B are more similar to one another than they are to C and D.
27- The same is true for C and D: they share states for traits 1, 2, and 4 that A and B do not have.
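The pairwise comparison described on these slides can be sketched in a few lines of Python. This is only an illustration using the trait matrix shown above; the names `TRAITS` and `shared_states` are hypothetical and not part of the original slides.

```python
from itertools import combinations

# Trait matrix from the slides: states for traits 1-5, one list per taxon.
TRAITS = {
    "A": [0, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 1],
    "C": [1, 1, 1, 0, 1],
    "D": [1, 1, 0, 0, 1],
}

def shared_states(x, y):
    """Count the traits for which taxa x and y show the same state."""
    return sum(1 for a, b in zip(TRAITS[x], TRAITS[y]) if a == b)

# Raw similarity for every pair of taxa (no ancestry is implied yet).
similarity = {pair: shared_states(*pair) for pair in combinations(sorted(TRAITS), 2)}
for (x, y), n in similarity.items():
    print(f"{x}-{y}: {n} shared states")
```

A-B and C-D come out as the most similar pairs, which is the grouping the unrooted tree expresses. The count includes trait 5, which all four taxa share, so raw similarity alone says nothing about which states are ancestral.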
29- b. Rooted Trees: hypothetical patterns of descent that could produce this pattern. You might suppose it would have to be this:
[Figure: one possible rooted tree for A, B, C, and D]
30- b. Rooted Trees: But it could just as easily be one of these, depending on whether state 0 or state 1 for traits 1 and 2 was ancestral.
[Figure: two alternative rooted trees, labeled "1 derived" and "0 derived"]
31- b. Rooted Trees
- So, in order to assess ancestry, we need to compare the groups in question to an "outgroup". An outgroup is a sister taxon which should share only ancestral traits with the group in question.
- So reptiles would be the outgroup for comparisons among diverse mammals, for example; or a crocodile or dinosaur would be the outgroup for a comparison among diverse birds.
32- Now, we add an outgroup, species E, and assume that it expresses ANCESTRAL character states (plesiomorphies). Any different character state must have evolved FROM this ancestral state, and this evolved state is called DERIVED (an apomorphy).
          A  B  C  D  E
Trait 1   0  0  1  1  1
Trait 2   0  0  1  1  1
Trait 3   0  1  1  0  0
Trait 4   1  1  0  0  0
Trait 5   1  1  1  1  0
33- Now, all species in a clade might share plesiomorphies, because they are all ultimately derived from the same ancestor. So shared ancestral traits tell us nothing about patterns of relationship within the group. But DERIVED traits will only be shared by species that share a more recent common ancestor...
34- So, to reconstruct phylogenies and build a rooted tree, we don't just count shared traits... we count SHARED, DERIVED traits (synapomorphies).
35- So, A and B share 4 synapomorphies (traits 1, 2, 4, and 5): they share these traits, and their state is different from the outgroup. B and C share 2 synapomorphies (traits 3 and 5); trait 5 is derived in every ingroup species, so only trait 3 unites B and C specifically.
Number of synapomorphies
     A  B  C
B    4  -  -
C    1  2  -
D    1  1  1
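The synapomorphy counts in the table above can be reproduced with a short sketch. It assumes, as the slides do, that the outgroup E carries the ancestral state for every trait; `TRAITS`, `OUTGROUP`, and `synapomorphies` are illustrative names only.

```python
from itertools import combinations

# Trait matrix from the slides; E is treated as the outgroup (assumption stated on the slide).
TRAITS = {
    "A": [0, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 1],
    "C": [1, 1, 1, 0, 1],
    "D": [1, 1, 0, 0, 1],
    "E": [1, 1, 0, 0, 0],
}
OUTGROUP = "E"

def synapomorphies(x, y):
    """Return the 1-based trait numbers where x and y share a state that differs from the outgroup."""
    ancestral = TRAITS[OUTGROUP]
    return [
        i + 1
        for i, (a, b, e) in enumerate(zip(TRAITS[x], TRAITS[y], ancestral))
        if a == b and a != e
    ]

ingroup = [t for t in sorted(TRAITS) if t != OUTGROUP]
shared_derived = {pair: synapomorphies(*pair) for pair in combinations(ingroup, 2)}
for (x, y), traits in shared_derived.items():
    print(f"{x}-{y}: {len(traits)} synapomorphies {traits}")
```

Because trait 5 is derived in all four ingroup species, every pair shares at least that one trait; the informative differences are the extra shared derived traits on top of it.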
36- Now, there are a couple of rooted trees that fit these data equally well. First, our assumed tree:
[Figure: rooted tree with E as the outgroup; branches are labeled with the traits that change on them: 5; 1, 2, and 4; and 3]
In this case, the shared trait between B and C must be interpreted as an instance of "convergent/parallel evolution (CE)", in which the trait evolved independently in both species (not inherited from a common ancestor).
37- Now, there are a couple of rooted trees that fit these data equally well. But there is another:
[Figure: second rooted tree with E as the outgroup; branches are labeled with the traits that change on them: 5; 3; 1, 2, and 4; and 3 again]
In this case, the discrepancy between A, B, and C is explained as an evolutionary "reversal" in A, which has re-expressed the ancestral trait.
38- In both cases, species share traits for reasons OTHER than inheritance from an immediate common ancestor. These are called homoplasies, and they obviously can confound the reconstruction of phylogenies. Both trees require 6 evolutionary events, so they are equally "parsimonious" (simple). We could envision lots of other trees, but they would require more reversals and convergent events. We apply Occam's Razor, a philosophical dictum under which we accept (and subsequently test) the simplest trees, those that express "maximum parsimony". So these two trees are our phylogenetic hypotheses, to be tested by more data that explicitly address their differences.
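Counting the events a tree requires can be automated. The sketch below uses Fitch's small-parsimony algorithm, which the slides do not name; it is swapped in here as one standard way to count changes. The two six-step topologies are assumptions about what the (untranscribed) figures show, and the third tree is an arbitrary worse alternative for contrast.

```python
# Trait matrix from the slides, with E as the outgroup.
TRAITS = {
    "A": [0, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 1],
    "C": [1, 1, 1, 0, 1],
    "D": [1, 1, 0, 0, 1],
    "E": [1, 1, 0, 0, 0],
}

def fitch(tree, site):
    """Return (possible ancestral states, number of changes) for one trait.

    A tree is a taxon name (leaf) or a 2-tuple of subtrees (internal node).
    """
    if isinstance(tree, str):
        return {TRAITS[tree][site]}, 0
    left_states, left_changes = fitch(tree[0], site)
    right_states, right_changes = fitch(tree[1], site)
    common = left_states & right_states
    if common:
        return common, left_changes + right_changes
    # No state in common: one change is needed on a branch below this node.
    return left_states | right_states, left_changes + right_changes + 1

def tree_length(tree):
    """Total number of changes over all traits (the parsimony score)."""
    n_traits = len(TRAITS["A"])
    return sum(fitch(tree, site)[1] for site in range(n_traits))

# Hypothetical topologies (assumed, since the slide figures are not transcribed).
tree_convergence = ((("A", "B"), ("C", "D")), "E")   # trait 3 arises twice
tree_reversal = (((("A", "B"), "C"), "D"), "E")      # trait 3 arises once, reverses in A
tree_worse = (((("A", "C"), "B"), "D"), "E")         # breaks up the A-B group

for name, tree in [("convergence", tree_convergence),
                   ("reversal", tree_reversal),
                   ("worse", tree_worse)]:
    print(name, tree_length(tree))
```

Under these assumptions the first two trees both need 6 changes, matching the count on the slide, while the tree that separates A from B needs more.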
39- The only term we did not define was autapomorphy: a derived trait unique to one species. In our examples above, each trait has only two character states. But consider nucleotides, where each trait (position) has 4 possibilities. We can envision that a species might have a T whereas all other species in the tree have A, C, or G. This would be an autapomorphy, and it obviously doesn't help us in phylogeny reconstruction, because the species doesn't share this state with anything else.
40- 3. Molecular Evolution and Algorithms
- DNA, RNA, and protein sequence data
- thousands of characters
- multiple parsimonious trees
41- a. Synapomorphies and parsimony
- Are cetaceans artiodactyls, or a sister group to the Artiodactyla?
42- Exon 7 from the gene that encodes β-casein, a protein in milk. Shared derived traits with cetaceans at positions 162, 166, 177.
43- Comparing the two candidate trees:
- One tree: 6 changes required at these positions; 41 over the entire 60-base sequence
- The other tree: 9 changes required at these positions; 47 over the entire 60-base sequence
45- PROBLEMS WITH BASE DATA
- Scoring characters: it's easy if the character is categorical (A, C, T, G), but very difficult if it is continuous. Characters need to be independent, so that they are weighted evenly.
- Homoplasies are common, both as convergence and as reversal.
- Ancient changes are obscured by more recent ones... A to G, then G to C, looks like it could be one change, A to C.
- Rapid radiations mean that branches/subgroups may not have had time to evolve their own unique synapomorphies... so we have lots of species with autapomorphies (which are thus distinct), but it is difficult to group them.
- Trees of single genes may not "map" onto the phylogenetic tree among species. The loss of particular alleles may not parallel patterns of relationships.
- Hybridization and gene transfer: this can make populations look more similar at these loci than they really are across the whole genome.
- Rates of evolution of different characters and states differ... Some are "highly conserved" and don't change much... others change dramatically. This is called mosaic evolution. This affects the "branch lengths" that are used to represent the degree of departure (or the quantified number of genetic changes) in that unique lineage.
46- b. UPGMA (unweighted pair group method with arithmetic mean)
47- UPGMA (Unweighted Pair Group Method with Arithmetic Mean) assumes constant mutation rates, and so is the simplest distance-based method.
- The matrix gives the number of differences in amino acid (AA) sequences between species pairs.
48- The most similar sequences are those of human and monkey (1 difference).
- This difference accumulated over TWO lineages since their divergence (constant mutation rate)
- So, the branch length of each is 1 difference / 2 branches = 0.5
49- 1. So, we join taxa B (human) and F (monkey).
- 2. Then, we AVERAGE the differences between these taxa and each other taxon and reduce the matrix. B differs from A by 19 AA's, and F differs from A by 18 AA's. So the average difference between A and new taxon 'BF' = 18.5 (fusion of two orange boxes into one orange box in the new and reduced matrix). (That's why this is called UPGMA: unweighted pair-group method using arithmetic averages.)
51- 3. Now, in the reduced matrix, we look for the most similar pair (which is A and D: 8 differences). We halve the difference to calculate each unique branch length (4.0).
- 4. Now repeat the averaging process with the other taxa to reduce the matrix.
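One join-and-average step of UPGMA can be sketched as follows. Only the three distances quoted on the slides are used (human-monkey = 1, A-human = 19, A-monkey = 18); the function name `upgma_join` and the data layout are assumptions made for illustration. Averages are weighted by cluster size, which is what keeps every original taxon equally weighted; for two single taxa this reduces to the plain mean used on the slide.

```python
def upgma_join(dist, sizes, x, y):
    """Join clusters x and y and return the reduced problem.

    dist  : dict mapping frozenset({name1, name2}) -> distance
    sizes : dict mapping cluster name -> number of taxa it contains
    Returns (new_name, reduced_dist, new_sizes, node_height).
    """
    new = x + y
    others = [k for k in sizes if k not in (x, y)]
    # Keep every distance that does not involve x or y.
    reduced = {pair: d for pair, d in dist.items() if x not in pair and y not in pair}
    for k in others:
        dx = dist[frozenset((x, k))]
        dy = dist[frozenset((y, k))]
        # Arithmetic mean over all taxa in the merged cluster.
        reduced[frozenset((new, k))] = (sizes[x] * dx + sizes[y] * dy) / (sizes[x] + sizes[y])
    new_sizes = {k: sizes[k] for k in others}
    new_sizes[new] = sizes[x] + sizes[y]
    # Under a constant rate, the difference is split evenly over the two lineages.
    node_height = dist[frozenset((x, y))] / 2
    return new, reduced, new_sizes, node_height

# Distances quoted on the slides: B = human, F = monkey, A = a third taxon.
dist = {
    frozenset(("A", "B")): 19,
    frozenset(("A", "F")): 18,
    frozenset(("B", "F")): 1,
}
sizes = {"A": 1, "B": 1, "F": 1}

new, reduced, new_sizes, height = upgma_join(dist, sizes, "B", "F")
print(new, height, reduced[frozenset(("A", new))])
```

Repeating this step on the reduced matrix, always joining the closest remaining pair, builds the whole tree.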
52- Here, the branch lengths from each node to its tips are equal (and additive) because averaging and constant mutation are assumed. In other models, branch lengths vary, reflecting more complex models which accept different substitution rates.
53- c. Branch Length Units
- 1) In the UPGMA example, the branch length is the mean number of AA substitutions in cytochrome c. This protein has 104 AA in animals.
- 2) Typically, these raw data are converted to substitutions per site by dividing by the sequence length, or to percent change by multiplying that value by 100.
- Example: 18 differences. 18/104 AA = 0.173 amino acid substitutions per site; 0.173 x 100 = 17.3% difference.
55- 4) Evolutionary Modeling
- The relationship between observed difference and evolutionary divergence (substitution rate) may not be linear.
- Observed difference saturates: even 2 random sequences will only differ by 75% (just by chance there will be the same base at 25% of sites).
- Some changes are more likely than others. Transition mutations (A to G, C to T) are more likely than transversions (e.g., A to C or T). So, models incorporate a transition/transversion ratio (2.0, above right).
56- 4) Evolutionary Modeling (continued)
- The relationship between observed difference and evolutionary divergence (substitution rate) may not be linear.
- Our ability to detect change depends on the existing degree of similarity. We are more likely to detect changes in sequences that are identical than in sequences that are only 50% similar, because many changes in the latter case will make the sequences MORE SIMILAR.
- So a change in observed difference from 10% to 12% probably represents fewer mutations, and less genetic distance, than an observed change from 60% to 62%. If sequences are 60% different, a lot of mutations in one sequence will make it more similar to the other; thus the same NET change of 2% represents MORE evolutionary change (distance).
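The nonlinearity described above can be made concrete with a distance correction. The sketch below uses the Jukes-Cantor formula, which the slides do not name; it is chosen here only because it is the simplest model that saturates at 75% difference, and it ignores the transition/transversion bias mentioned earlier. Function names are illustrative.

```python
import math

def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq1, seq2) if a != b)
    return diffs / len(seq1)

def jukes_cantor(p):
    """Estimated substitutions per site for an observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75); beyond that the sequences look random")
    return -0.75 * math.log(1 - 4 * p / 3)

# The same 2-point rise in observed difference at low and at high divergence.
low = jukes_cantor(0.12) - jukes_cantor(0.10)
high = jukes_cantor(0.62) - jukes_cantor(0.60)
print(f"10% -> 12%: {low:.3f} extra substitutions per site")
print(f"60% -> 62%: {high:.3f} extra substitutions per site")
```

Under this model the step from 60% to 62% corresponds to several times more estimated substitutions than the step from 10% to 12% (roughly 0.11 versus 0.02 per site), which is the point the slide makes.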
57- d. Calculating Branch Lengths
Hypothetical sequence differences:
     A    B    C
A    -    22   39
B         -    41
C              -
[Figure: taxa A, B, and C joined at a single node by branches of length a, b, and c]
- 1) a + b = 22
- 2) a + c = 39
- 3) b + c = 41
- 4) (2) - (3): a - b = -2
- 5) (1) + (4): 2a = 20, so a = 10.
- 6) The distance from A to B = 22, so b = 12, and c = 29.
OR: a = ((AC - BC) + AB) / 2
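The three equations above have a closed-form solution, which a small helper makes explicit. The function name `three_point` is hypothetical; the arithmetic is just the slide's formula applied to each branch in turn.

```python
def three_point(ab, ac, bc):
    """Solve a + b = ab, a + c = ac, b + c = bc for the three branch lengths."""
    a = (ab + ac - bc) / 2   # same as ((AC - BC) + AB) / 2 on the slide
    b = (ab + bc - ac) / 2
    c = (ac + bc - ab) / 2
    return a, b, c

a, b, c = three_point(ab=22, ac=39, bc=41)
print(a, b, c)
```

With the slide's distances this gives a = 10, b = 12, and c = 29, and the three branch lengths add back up to the original pairwise distances.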
58- Hypothetical sequence differences among 5 taxa
     A    B    C    D    E
A    -    22   39   39   41
B         -    41   41   43
C              -    18   20
D                   -    10
E                        -
59- D and E are most similar (10 differences)
- Calculate the average distance from D and E to A, B, and C (this reduces the problem to a 3-point problem)
60- Average distance to A, B, and C: D = 32.7, E = 34.7
- So, E is 2 units farther away from the node than D, and the distance between them is 10, so the branch to D is 4 and the branch to E is 6.
61- D and E are the closest sequences
       A-C    D      E
A-C    -      32.7   34.7
D             -      10
E                    -
Using a = ((AC - BC) + AB) / 2, with D, E, and the group A-C as the three points: a = 4 (branch to D), b = 6 (branch to E)
Now let's recompute the complete distance matrix, with D and E merged into DE:
      A    B    C    DE
A     -    22   39   40
B          -    41   42
C               -    19
DE                   -
C and DE are the closest sequences
62- C and DE are the closest sequences (19). Mean distance from C to AB = 40, and mean distance from DE to AB = 41.
63- C and DE are the closest sequences
      AB   C    DE
AB    -    40   41
C          -    19
DE              -
Using a = ((AC - BC) + AB) / 2, with C, DE, and AB as the three points: a = 9 (branch to C), b = 10 (mean, to DE)
b is not just for that segment; it represents the complete distance from the connecting node to the leaves.
So once again, there is one unit of branch length difference to the node of C and DE, with a total distance of 19.
Now let's recompute the complete distance matrix, with C and DE merged into CDE:
       A    B    CDE
A      -    22   39.5
B           -    41.5
CDE              -
[Figure: tree so far, with branch lengths C = 9, D = 4, E = 6, 5 on the branch between the C-DE node and the D-E node, and 31 from the C-DE node to A-B]
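64- Now we are in the trivial case of 3 sequences
       A    B    C-E
A      -    22   39.5
B           -    41.5
C-E              -
Using a = ((AC - BC) + AB) / 2, with A, B, and C-E as the three points: a = 10 (branch to A), b = 12 (branch to B)
As before, the length to a merged group is not just for that segment; it represents the complete distance from the connecting node to the leaves.
[Figure: complete unrooted tree with branch lengths A = 10, B = 12, C = 9, D = 4, E = 6, 5 on the internal branch to the D-E node, and 20 on the internal branch between the A-B node and the C-DE node]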
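The whole walk-through on the preceding slides can be replayed in code as a check on the arithmetic. This sketch follows the slides' procedure as written (join the closest pair, solve a three-point problem against the mean distance to everything else, then average the two merged rows); the names `three_point`, `join`, and `matrix` are assumptions for illustration, and this is not presented as a general tree-building implementation.

```python
def three_point(ab, ac, bc):
    """Branch lengths (a, b, c) for three points with pairwise distances ab, ac, bc."""
    return (ab + ac - bc) / 2, (ab + bc - ac) / 2, (ac + bc - ab) / 2

def dist(m, x, y):
    return m[frozenset((x, y))]

def join(m, names, x, y):
    """Join x and y; return their branch lengths, the reduced matrix, and the new names."""
    rest = [n for n in names if n not in (x, y)]
    x_to_rest = sum(dist(m, x, r) for r in rest) / len(rest)
    y_to_rest = sum(dist(m, y, r) for r in rest) / len(rest)
    a, b, _ = three_point(dist(m, x, y), x_to_rest, y_to_rest)
    new = x + y
    reduced = {p: v for p, v in m.items() if x not in p and y not in p}
    for r in rest:
        # Average the two merged rows, as on the slides.
        reduced[frozenset((new, r))] = (dist(m, x, r) + dist(m, y, r)) / 2
    return (a, b), reduced, rest + [new]

# Hypothetical sequence differences among 5 taxa (from the slides).
pairs = [("A", "B", 22), ("A", "C", 39), ("A", "D", 39), ("A", "E", 41),
         ("B", "C", 41), ("B", "D", 41), ("B", "E", 43),
         ("C", "D", 18), ("C", "E", 20), ("D", "E", 10)]
matrix = {frozenset((x, y)): v for x, y, v in pairs}
names = ["A", "B", "C", "D", "E"]

(d_len, e_len), matrix, names = join(matrix, names, "D", "E")
(c_len, de_len), matrix, names = join(matrix, names, "C", "DE")
a_len, b_len, cde_len = three_point(dist(matrix, "A", "B"),
                                    dist(matrix, "A", "CDE"),
                                    dist(matrix, "B", "CDE"))

# Lengths to a merged group run all the way to its leaves, so the internal
# branch is what remains after subtracting the mean depth already assigned.
internal_de = de_len - (d_len + e_len) / 2
internal_cde = cde_len - (c_len + de_len) / 2
print(d_len, e_len, c_len, de_len, a_len, b_len, internal_de, internal_cde)
```

The values agree with the labels on the slide figures: tips of 4, 6, 9, 10, and 12, with internal branches of 5 and 20.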
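65- WHICH was the outgroup? Let's say C.
[Figure: the same tree shown unrooted and redrawn with the root on the branch to C; branch lengths A = 10, B = 12, C = 9, D = 4, E = 6, internal branches 20 and 5]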
66- e. Maximum Likelihood Models
- What evolutionary rates (in terms of transitions and transversions, etc.) are required to give us the pattern and rate (as measured in branch lengths) that we SEE?
- So, different models of evolution are tested. The models are probability matrices of substitution rates between bases.
- A tree is given. The branch lengths are given. The model of mutation changes, and the probabilities of generating the data (sequences) change with the model. The likelihood of a tree is the probability that it generates the data.
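To show what "the probability that it generates the data" means, here is a deliberately tiny sketch: two aligned sequences, one substitution model, and a range of candidate branch lengths scored by likelihood. The Jukes-Cantor model and the counts (90 identical sites, 10 different) are assumptions chosen for illustration; the slides describe varying the model on a full tree, which this example does not attempt.

```python
import math

def log_likelihood(d, n_same, n_diff):
    """Log-probability of the observed sites for two sequences d substitutions per site apart.

    Assumes the Jukes-Cantor model: equal base frequencies and equal rates
    between all pairs of bases.
    """
    decay = math.exp(-4 * d / 3)
    p_same = 0.25 + 0.75 * decay     # a site ends with the same base
    p_diff = 0.25 - 0.25 * decay     # a site ends with one specific other base
    # The factor 0.25 is the probability of the starting base at each site.
    return n_same * math.log(0.25 * p_same) + n_diff * math.log(0.25 * p_diff)

# Hypothetical data: 100 aligned sites, 10 of which differ.
n_same, n_diff = 90, 10

# Score a grid of candidate distances and keep the most likely one.
candidates = [i / 1000 for i in range(1, 1001)]
best = max(candidates, key=lambda d: log_likelihood(d, n_same, n_diff))
print(f"most likely distance: {best:.3f} substitutions per site")
```

The most likely distance comes out slightly above the raw 10% difference, because the model allows for some sites having changed more than once. Real analyses do the same kind of scoring over every branch of a tree and over competing substitution models.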
69- f. Neighbor Joining
- Similar, but we don't simply group the most similar pair first. Rather, at each step we evaluate every possible pairwise combination and join the pair that gives the tree with the shortest total branch length (the most conservative evolutionary tree). Then repeat, treating this pair as one node (like DE before).
71- g. Bootstrapping
- Gain confidence in a node by resampling the data and creating a tree. Is the node still there? How frequently is it present in 100 or 1000 resamples of the data set?
72- Randomly sample characters (in this case, base positions) WITH REPLACEMENT. Create the tree, and report the frequency of a clade across the resulting trees.
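The resampling step can be sketched with the small trait matrix used earlier in the deck. A real bootstrap rebuilds a full tree for every replicate; to stay short, this example substitutes a much simpler test (is A strictly closer to B than to C or D?) as a stand-in for "the clade is present". That stand-in, the function names, and the seed are all assumptions for illustration.

```python
import random

# Four taxa and five characters, taken from the trait matrix earlier in the deck.
TRAITS = {
    "A": [0, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 1],
    "C": [1, 1, 1, 0, 1],
    "D": [1, 1, 0, 0, 1],
}
N_CHARS = 5

def resample_columns(n_chars, rng):
    """Draw n_chars character indices WITH REPLACEMENT."""
    return [rng.randrange(n_chars) for _ in range(n_chars)]

def differences(x, y, cols):
    """Number of resampled characters at which x and y differ."""
    return sum(1 for c in cols if TRAITS[x][c] != TRAITS[y][c])

def a_groups_with_b(cols):
    """Simplified stand-in for tree building: A is strictly closest to B."""
    return differences("A", "B", cols) < min(differences("A", "C", cols),
                                             differences("A", "D", cols))

def bootstrap_support(n_reps=1000, seed=1):
    """Fraction of replicates in which the A-B grouping is recovered."""
    rng = random.Random(seed)
    hits = sum(a_groups_with_b(resample_columns(N_CHARS, rng)) for _ in range(n_reps))
    return hits / n_reps

support = bootstrap_support()
print(f"A-B grouping recovered in {support:.0%} of replicates")
```

With only five characters, some replicates happen to draw few of the traits that unite A and B, so support falls short of 100%; longer alignments give each replicate more information.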
73- Bootstrap using the entire 1100 bases of the casein gene, N = 1000. Whales are within the Artiodactyla in 99% of trees. Whales are in a clade with deer, hippo, and cow (100%).
74- h. Bayesian inference
76- Must estimate the prior probability of trees based on external knowledge. Or, assume all trees are equally probable.
- From likelihood to posterior probability (pp):
- $P(\mathrm{tree} \mid \mathrm{data}) = \dfrac{P(\mathrm{data} \mid \mathrm{tree}) \, P(\mathrm{tree})}{P(\mathrm{data})}$
- Where P(data) = SUM(tree likelihood x prior prob) across all trees considered.
- So, for the trees considered and given their prior probabilities, what fraction of the total probability of the data does each tree account for? This is the posterior probability that we want, P(tree | data).
- Clade credibility is the sum of the posterior probabilities of the trees in which the clade occurs.
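The posterior calculation above is simple enough to write out. In this sketch the three trees, their likelihoods, and their clade lists are invented placeholder values, and the function names are hypothetical; only the arithmetic follows the slide.

```python
def posterior(likelihoods, priors=None):
    """Posterior probability of each tree: likelihood x prior / P(data)."""
    trees = list(likelihoods)
    if priors is None:
        # "Or, assume equality": every tree gets the same prior.
        priors = {t: 1 / len(trees) for t in trees}
    p_data = sum(likelihoods[t] * priors[t] for t in trees)
    return {t: likelihoods[t] * priors[t] / p_data for t in trees}

def clade_credibility(post, clades, clade):
    """Sum of the posterior probabilities of the trees that contain the clade."""
    return sum(p for tree, p in post.items() if clade in clades[tree])

# Hypothetical values for illustration only.
likelihoods = {"tree1": 0.006, "tree2": 0.003, "tree3": 0.001}
clades = {
    "tree1": {"AB", "CD"},
    "tree2": {"AB"},
    "tree3": {"BC"},
}

post = posterior(likelihoods)
ab_credibility = clade_credibility(post, clades, "AB")
print(post, ab_credibility)
```

With equal priors the posteriors are just the likelihoods rescaled to sum to 1; supplying unequal priors shifts probability toward the trees favored by external knowledge.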
77- i. SINEs and LINEs
- Short and long interspersed nuclear elements: transposable elements.
- Highly unlikely to end up in the same place in the genome by chance
- Similarity is most likely a SHARED, DERIVED character.