Title: Phylogeny
1 Phylogeny
28 September, 1 October
Durbin 7.1-7.5, 8.1-8.4, 8.6. Claverie and Notredame, Chap. 13.

Tuesday                      Friday
Motivation, trees, methods   Methods, algorithms
Pause                        Pause
Methods, algorithms          Data formats, use of programs
2 Use of phylogenetic methods
Classification
Grouping of genes and proteins according to common ancestry
Epidemiological investigations
Clustering
3Phylogeny based on house-keeping gene (atpD)
sequence comparison
4 The F0F1 ATP synthase operon is used as a model.
The F1 unit is composed of five subunits, and two of these, α and β, are encoded by homologous genes.
At the protein level, these paralogs are compared in members of Enterobacteriaceae (Escherichia coli and Yersinia enterocolitica) and Pasteurellaceae (Haemophilus influenzae and Pasteurella multocida).
Are the orthologous genes more closely related than the paralogous ones?
6 Gene sorting
P. mult atpD β
H. inf atpD β
E. coli atpD β
H. inf atpB α
E. coli atpB α
Y. ent. atpB α
7 Use of phylogenetic methods for epidemiological studies (time and space).
Examples: HIV, avian influenza virus, Newcastle disease virus.
The high mutation rate allows real-time analysis.
Hypothetical example:
2004, USA
2004, DK
1970, Pakistan
1960, China
8 Phylogenetic methods used for epidemiological investigations: Newcastle disease virus spread with time and place (Ke et al. 2001. J. Virol. Methods 97, 1-11).
Taiwan 84-99
71-97
32-90
9 Phylogenetic and cluster methods used with non-phylogenetic data
Pulsed field gel electrophoresis (rare-cutting restriction enzymes)
Point mutations or indels in restriction sites are very difficult to interpret as phylogeny, since neither relationships nor directions can be modelled.
Only clusters can be defined.
10 Phylogenetic analysis. Variation, time and rate.
- Biological event. Point mutations accumulate continuously in DNA, and some of them lead to amino acid changes.
- Example.

Time after speciation   0                       1                       2
Lineage 1               Species000 atgctagcta   Species001 atgttagcta   Species001 atgttagcta
Lineage 2                                       Species002 atgctagcta   Species002 atgataggta
Variation               0.0                     0.1                     0.2

- Rate = Time / Variation (years per mutation), e.g. 200 million years / 20 mutations. The rate is the molecular clock.
- Limitations. Rates differ between genes and organisms (by at least one order of magnitude) and might not be constant over time. Back mutations may cause the time to be underestimated.
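The variation values and the rate on this slide can be checked with a few lines of Python. This is only a minimal sketch: the function names `p_distance` and `clock_rate` are made up for the illustration, and the variation is taken to be the plain fraction of differing positions between two aligned sequences.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def clock_rate(time_years, mutations):
    """Rate as defined on the slide: time divided by variation (years per mutation)."""
    return time_years / mutations


# Sequences of the two lineages at time 1 and time 2, copied from the slide.
variation_time_1 = p_distance("atgttagcta", "atgctagcta")
variation_time_2 = p_distance("atgttagcta", "atgataggta")

# The slide's numerical example: 200 million years and 20 mutations.
rate = clock_rate(200e6, 20)

print(variation_time_1, variation_time_2, rate)
```

With the slide's numbers the rate comes out at ten million years per mutation.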
11 Phylogenetic analysis. Divergence between sequences. Variation (or similarity) matrix

             species001   species002
species001   0            0.2
species002   0.8          0
12 Properties of phylogenetic trees.
Leaves (vertices) represent the species or sequences compared.
Nodes (vertices) are bifurcations and represent speciation events or hypothetical ancestor sequences.
Branches (edges) are linear and represent sequence diversity, but can also be of unit length.
Branch length represents sequence variation, time and rate of change.
The root (vertex) is optional and represents the hypothetical ancestor.
Linear form of tree: (((Species 1, Species 2), Species 3))
[Figure: tree of Species 1, Species 2 and Species 3 with a node, a branch and the root labelled]
13 Phylogenetic analysis. Radial tree versus dendrogram
[Figure: the same tree of taxa A, B, C and D drawn in radial and in dendrogram form]
14Phylogenetic analysis. Monophyletic
group Definition. A monophyletic group is
characterized by common descent of all members
(at least two) and all members share a common
node.
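The definition can be turned into a small check. The sketch below assumes a rooted tree written as nested Python tuples with leaf names as strings; the tree and the taxon names A to D are hypothetical and only serve the illustration.

```python
def leaf_set(node):
    """All leaf names below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def is_monophyletic(tree, group):
    """True if some internal node has exactly the given taxa as its leaves."""
    group = frozenset(group)
    if len(group) < 2:          # the definition asks for at least two members
        return False

    def visit(node):
        if isinstance(node, str):
            return False
        if leaf_set(node) == group:
            return True
        return any(visit(child) for child in node)

    return visit(tree)


# Hypothetical rooted tree (((A,B),C),D).
tree = ((("A", "B"), "C"), "D")

print(is_monophyletic(tree, {"A", "B"}))       # A and B share a node with no other taxa
print(is_monophyletic(tree, {"A", "B", "C"}))
print(is_monophyletic(tree, {"A", "C"}))       # B would have to be included
```

A group such as {A, C} fails the check because the smallest node containing both also contains B.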
15 Paraphyletic and polyphyletic groups
Definition. A paraphyletic group does not include all taxa with common descent. The members of a polyphyletic group do not share common descent.
[Figure: examples of a polyphyletic and a paraphyletic group]
16 Properties of data for phylogenetic analysis
Pulsed field gel electrophoresis (rare-cutting restriction enzymes)
Only clusters can be defined, not relationships or directions.
17 Gene sorting
P. mult atpD β
H. inf atpD β
E. coli atpD β
H. inf atpB α
E. coli atpB α
Y. ent. atpB α
18 Phylogenetic analysis/cluster analysis. Results are phylogenies: relationship and direction.
19 Properties of data for phylogenetic analysis
Pulsed field gel electrophoresis (rare-cutting restriction enzymes)
Only clusters can be defined, not relationships or directions.
20Phylogenetic analysis/cluster analysis. Results
are clusters. Only distribution of taxa
21 Use of phylogenetic and cluster methods and interpretation of results
Both cluster methods and true phylogenetic methods can be used for phylogenetic analysis, but the result depends on the nature of the data.

Data                                           Method                    Result
Random data                                    Phylogenetic or cluster   Cluster
Homologous characters (evolutionary meaning)   Phylogeny or cluster      Phylogeny
22 Examples of data
Data with phylogenetic meaning:
DNA or protein sequence data of homologous genes
Morphological characters shown to be homologous
Data without phylogenetic meaning:
Most restriction patterns (PFGE, AFLP, ribotyping)
Most hybridization data (array, Southern blot, etc.)
23 Phylogenetic methods (1)
Distance matrix/cluster (UPGMA, NJ): Bacterial taxonomy based on morphological, chemical, biochemical and physiological characters did not allow natural relationships to be deduced. Numerical taxonomy (Sneath and Sokal, 1963, 1973).
Parsimony (maximum parsimony): The taxonomy of animals shall reflect their natural relationships. Phylogenetic Systematics (Willi Hennig 1950, 1966). Without direction (e.g. Wiley 1980).
24 Phylogenetic methods (2)
Maximum likelihood methods: Phylogenies should be formulated in a probabilistic framework and be statistically testable. Protein and DNA sequence data are extraordinarily good for phylogenetic interpretation and can withstand such treatment.
Cavalli-Sforza and Edwards 1967 (theory).
Felsenstein 1981: first practically useful algorithms.
25Current use of phylogenetic methods.
Distance matrix/cluster
Maximum parsimony
Maximum likelihood
26 Phylogenetic analysis. Methods
The statistically best methods are the slowest and can analyse the fewest sequences.

ndvfull7b  CTTGCTATGG CTTGGGAATA ATACCCTCGA TCAGATGAGA GCCACTACAA
ndvfull7d  CTTGCTATGG CTTGGGAATA ATACCCTCGA TCAGATGAGA GCCACTACAA
ndvfull7c  CTTACTATGG CTTGGGAATA ACACCCTCGA TCAGATGAGA GCCACTACAA
ndvfull7a  CTTACTATGG CTTGGGAATA ATACCCTCGA TCAGATGAGA GCCACCACAA
ndvfull8   CTTACTATGG CTTGGGAATA ATACCCTCGA TCAGATGAGA GCCACCACAA
ndvfull6b  CTTACTATGG CTTGGGAACA ACACCCTCGA TCAGATGAGA GCCACTACAA
ndvfull3   CTTATTATGG CTTGGGAATA ATACCCTAAA TCAGATGAGG GCCACTACAA
ndvfull4   CTTATTATGG CTTGGGAATA ATACCCTTGA TCAGATGAGA GCCACTACAA
ndvfull1   CTTGTTGTGG CTTGGGAATA ATACCCTAGA CCAGATGAGG GCCACTACAA
ndvfull2   CTTATTATGG CTTGGGAATA ATACTCTAGA TCAGATGAGA GCCACTACAA

Distance matrix methods: distance matrix calculated. Tree assembled by cluster method, e.g. the Neighbour Joining method. One tree with branch lengths.
Maximum parsimony: one or many trees with the same number of steps. Consensus evaluation. One tree without branch lengths.
Maximum likelihood: trees ranked according to likelihood. The tree with the highest likelihood chosen. One tree with branch lengths.
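The distance-matrix route (distances first, then a tree assembled by clustering) can be sketched in a few lines. The example below uses UPGMA rather than Neighbour Joining because it is the shorter of the two cluster methods to write down; the function name `upgma`, the taxa A to D and the distance values are hypothetical, and branch lengths are left out to keep the sketch short.

```python
from itertools import combinations


def upgma(names, distances):
    """Cluster taxa by UPGMA and return the topology as nested tuples.

    `distances` maps frozenset({a, b}) to the distance between a and b.
    """
    sizes = {name: 1 for name in names}      # cluster -> number of leaves in it
    d = dict(distances)
    while len(sizes) > 1:
        # Join the two clusters that are currently closest.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance to the new cluster is the size-weighted average.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


# Hypothetical distance matrix for four taxa.
toy_distances = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 4.0,
    frozenset(("B", "C")): 4.0,
    frozenset(("A", "D")): 6.0,
    frozenset(("B", "D")): 6.0,
    frozenset(("C", "D")): 6.0,
}

tree = upgma(["A", "B", "C", "D"], toy_distances)
print(tree)
```

A and B are joined first, then C, then D, so the closest pair ends up deepest in the nested tuple.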
27 Phylogenetic analysis. Comparison of phylogenetic methods
Consistency: a phylogenetic method is consistent for an evolutionary model if the method converges on the correct tree as the amount of data approaches infinity.
Efficiency: a phylogenetic method has high efficiency if it converges quickly on the correct solution as more data are applied to the problem.
Robustness: a phylogenetic method is robust if it converges on the correct solution despite violations of the assumptions of the evolutionary model.
Hillis 1995. Syst. Biol. 44, 3-16.
28 Phylogenetic analysis. Test of robustness. Bootstrap
- Purpose. To show how well the nodes are supported by the data.
- Performance. The original data are resampled by drawing columns randomly with replacement 100 or 1000 times. The phylogenetic analysis is repeated on each replicate, and the number of times each node occurs among the 100 or 1000 trees is summarized.
- Example.

              Original data       Replicate 1         Replicate 2
Species 1     AGGA                AAGA                GGAA
Species 2     ACGT                AACT                CGTT
Species 3     ACGT                AACT                CGTT
Species 4     ACTT                AACT                CTTT
Species 5     CCGT                CCCT                CGTT
Linear form   ((((2,3),4),5),1)   ((((2,3),4),5),1)   ((((2,3),5),4),1)
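The resampling step can be sketched as follows. The alignment is the one from the example; the function names `resample_columns` and `bootstrap_replicate` are made up for the illustration, and the tree-building step that would follow each replicate is not shown.

```python
import random


def resample_columns(alignment, columns):
    """Build a new alignment from the given column indices (repeats allowed)."""
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_replicate(alignment, rng):
    """Draw as many columns as the alignment has, randomly and with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return resample_columns(alignment, columns)


original = {
    "Species 1": "AGGA",
    "Species 2": "ACGT",
    "Species 3": "ACGT",
    "Species 4": "ACTT",
    "Species 5": "CCGT",
}

# The two replicates of the example correspond to drawing
# columns 1, 1, 2, 4 and columns 2, 3, 4, 4 (counted from 1).
replicate_1 = resample_columns(original, [0, 0, 1, 3])
replicate_2 = resample_columns(original, [1, 2, 3, 3])

# 100 random replicates; the seed is fixed only to make the run repeatable.
rng = random.Random(1)
replicates = [bootstrap_replicate(original, rng) for _ in range(100)]
```

Each replicate keeps the species and the alignment length, and only the choice of columns changes.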
29 Bayesian inference of phylogeny. Improved efficiency.
Start with a best guess of a tree (prior probability)
Simulation of trees (MCMC, Markov Chain Monte Carlo)
Keep all the best trees
Posterior tree with probabilities
Program available from morphbank.ebc.uu.se/mrbayes/
Comparison to existing methods: comparable to maximum likelihood with bootstrap with respect to consistency, just faster (Douady et al., 2003. Mol. Biol. Evol. 20, 248-54).
30 Consistency of phylogenetic methods
The Felsenstein zone, or long-branch attraction. A major source of inconsistency is trees that mix very short and very long branches.
[Figure: two four-taxon trees (taxa 1 to 4), one generated by a consistent method and one generated by an inconsistent method]
31 Day 2. Phylogeny. Data formats and programs.
Program packages
PHYLIP format and programs
Consensus comparison of trees
Comparison of alignments and trees
Automatic drawing of trees
32Phylogenetic tree example
33 Phylip alignment format. Interleaved.
Example:

4 20
Species001aggcgctagc
Species002agtagctagc
Species003agtccctagc
Species004agtcgttagc

aggcgctagc
aggcgctagc
aggcgctagc
aggcgctagc

The first line gives the number of species and the sequence length. The sequences follow on the next lines. Each sequence starts with a name no longer than 10 characters, and the sequence itself starts at position 11. Longer sequences continue in blocks separated by a blank line, without the species name.
Hint. All spaces after position 11 should be removed. If you are not sure, use Word to search for blank characters (space bar) and replace them with nothing (do not type anything).
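The rules above are enough to read such a file by program. The following is a minimal sketch, not part of PHYLIP: the function name `read_phylip_interleaved` is hypothetical, and it assumes names padded to exactly 10 characters and blocks that list the species in the same order.

```python
def read_phylip_interleaved(text):
    """Parse an interleaved PHYLIP alignment into a dict of name -> sequence."""
    lines = text.splitlines()
    n_species, length = (int(value) for value in lines[0].split()[:2])
    body = [line for line in lines[1:] if line.strip()]   # drop blank separator lines
    names, sequences = [], []
    for index, line in enumerate(body):
        if index < n_species:
            # First block: 10-character name, sequence from position 11.
            names.append(line[:10].strip())
            sequences.append(line[10:].replace(" ", ""))
        else:
            # Later blocks: sequence only, same species order.
            sequences[index % n_species] += line.replace(" ", "")
    alignment = dict(zip(names, sequences))
    for name, sequence in alignment.items():
        if len(sequence) != length:
            raise ValueError(f"{name}: expected {length} sites, found {len(sequence)}")
    return alignment


example = """4 20
Species001aggcgctagc
Species002agtagctagc
Species003agtccctagc
Species004agtcgttagc

aggcgctagc
aggcgctagc
aggcgctagc
aggcgctagc
"""

alignment = read_phylip_interleaved(example)
print(alignment["Species002"])
```

The length check catches the common mistake of stray spaces or missing blocks before the file is handed to a program.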
34 Phylip alignment format (2)
Example:

4 20 I
Species001aggcgctagc aggcgctagc
Species002agtagctagc aggcgctagc
Species003agtccctagc aggcgctagc
Species004agtcgttagc aggcgctagc

The non-interleaved format is shown. Remember to add I on the first line.
For more advanced use: different options, always in the form of a capital letter, can be added on the first line, with further instructions given on the second line. For example, J for jumble (randomized input order of the sequences) on the first line and J 137 to jumble the alignment with the seeds 1, 3 and 7.
35 Adjustment of data formats
Other programs use different multiple alignment formats. Two main problems:
1. The output of the multiple alignment is interleaved, but the program only reads non-interleaved input.
2. The output of the multiple alignment has line breaks, but the program only works without line breaks.
36 Overview of data formats
Phylip, interleaved
Phylip, sequence after sequence
Sequence after sequence, with line breaks
Sequence after sequence, each sequence on a single line
Clues to get a new program to work: use the right OS, use the right data format, run the test data.
37 Adjustment of data formats. Solutions:
1. Try another output format from ClustalW/X.
2. Change format with http://bioweb.pasteur.fr/seqanal/formats-uk.html
3. The hard way: manual editing. Search and replace (space with nothing) in Word, join lines with the vi editor on Unix, etc.
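The manual editing of the hard way (removing spaces and joining lines) can also be scripted. The sketch below assumes a sequence-after-sequence file with line breaks, a first line giving the number of species and the sequence length, and names of exactly 10 characters; the function name `join_sequence_lines` is hypothetical.

```python
def join_sequence_lines(text):
    """Rewrite a sequential alignment so that each sequence sits on a single line."""
    lines = [line for line in text.splitlines() if line.strip()]
    n_species, length = (int(value) for value in lines[0].split()[:2])
    output = [f"{n_species} {length}"]
    position = 1
    for _ in range(n_species):
        name = lines[position][:10]                        # keep the padded name
        sequence = lines[position][10:].replace(" ", "")   # strip spaces
        position += 1
        # Keep joining continuation lines until the full length is reached.
        while len(sequence) < length:
            sequence += lines[position].replace(" ", "")
            position += 1
        output.append(name + sequence)
    return "\n".join(output) + "\n"


wrapped = """2 20
Species001aggcgctagc
aggcgctagc
Species002agtag ctagc
aggcgctagc
"""

single_line = join_sequence_lines(wrapped)
print(single_line)
```

The sequence length on the first line is what tells the script where one species ends and the next begins.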
38 Evolutionary models to be used with phylogenetic methods
Substitution matrix: the probability of changing one nucleotide or one amino acid into another.
DNA: 0.25 for all nucleotides (Jukes and Cantor); compensation for transition/transversion bias (F84); Kimura models; or different rates for all four nucleotides.
Amino acids: PAM or BLOSUM.
Weight of sites (only parsimony and maximum likelihood): conserved positions are given higher weight than more variable ones. DNArates: geta.life.uiuc.edu/gary/programs/DNArates.html
Heuristic tree search algorithms will find the best tree more easily when weight heterogeneity is invoked.
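As an illustration of what such a model does to a distance, the Jukes and Cantor model leads to the standard correction d = -(3/4) ln(1 - 4p/3), where p is the observed fraction of differing sites. The formula is not on the slide and is added here only as a sketch; the function name `jukes_cantor_distance` is made up for the example.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed fraction p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


observed = 0.2                       # e.g. 2 differences in 10 aligned sites
corrected = jukes_cantor_distance(observed)
print(round(corrected, 4))
```

The corrected distance is larger than the observed one because it allows for multiple changes at the same site, including back mutations.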
40 Phylogenetic program packages
Several hundred packages are available, but only a few are widely used.
PHYLIP: uses a standard input data format with many different programs (free). Open source code (in C), so every routine can be followed. Best on Unix. Can be used on PC with DOS (see below). Can be used online from a server (Pasteur). Can be used on Mac (no experience).
PAUP: more complicated input data format, with calls to different routines in the data format (small fee). Source code hidden, but see DNAPARS and PROTPARS in PHYLIP. Runs best on Mac.
NTSYS: for cluster analysis, including advanced use. Based on Sneath and Sokal 1973, Numerical taxonomy.
For all packages and programs see http://evolution.genetics.washington.edu/phylip/software.html
41 Consensus comparison between two (or more) trees.
Fraction of common nodes (CONSENSE, PHYLIP). Majority consensus rule. Requires the same number of species and the same names.
Primate example:
NJ: ((gibbonxxxx:0.12482,orangutang:0.09178):0.03594,gorillaexx:0.05836,(chimpanzee:0.05256,homosapien:0.04044):0.00304)
MP: ((gorillaexx,(homosapien,(orangutang,gibbonxxxx))),chimpanzee)
Input all trees in one file.
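Counting common nodes can be sketched without CONSENSE. The code below is an illustration only and not how CONSENSE is implemented: it assumes branch lengths are written in the usual name:length form, treats the trees as unrooted, and counts how often each split of the species occurs. The function names `parse_newick` and `splits` are hypothetical.

```python
import re
from collections import Counter


def parse_newick(text):
    """Parse a linear-form tree into nested tuples of leaf names (lengths dropped)."""
    text = re.sub(r":[0-9.eE+-]+", "", text)      # remove ':length'
    text = re.sub(r"\s+", "", text).rstrip(";")
    position = 0

    def node():
        nonlocal position
        if text[position] == "(":
            position += 1
            children = [node()]
            while text[position] == ",":
                position += 1
                children.append(node())
            if text[position] != ")":
                raise ValueError("expected ')'")
            position += 1
            return tuple(children)
        start = position
        while position < len(text) and text[position] not in ",()":
            position += 1
        return text[start:position]

    tree = node()
    if position != len(text):
        raise ValueError("unexpected text after the tree")
    return tree


def leaves(node):
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def splits(tree):
    """Non-trivial splits of an unrooted tree, stored as the side without a reference leaf."""
    everything = leaves(tree)
    reference = min(everything)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = everything - side
        if len(side) > 1 and len(other) > 1:
            found.add(other if reference in side else side)
        for child in node:
            visit(child)

    visit(tree)
    return found


nj = "((gibbonxxxx:0.12482,orangutang:0.09178):0.03594,gorillaexx:0.05836,(chimpanzee:0.05256,homosapien:0.04044):0.00304)"
mp = "((gorillaexx,(homosapien,(orangutang,gibbonxxxx))),chimpanzee)"

counts = Counter()
for tree_text in (nj, mp):
    counts.update(splits(parse_newick(tree_text)))

for group, times in counts.most_common():
    print(sorted(group), times)
```

For the primate example only the gibbon and orangutan grouping is found in both trees, while the other two splits occur once each.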
42 Output from CONSENSE

Majority-rule and strict consensus tree program, version 3.51c

Species in order: gorillaexx, homosapien, orangutang, gibbonxxxx, chimpanzee

Sets included in the consensus tree

Set (species in order)   How many times out of 2.00
..**.                    2.00
.*..*                    1.00

Sets NOT included in consensus tree:

Set (species in order)   How many times out of 2.00
.***.                    1.00

CONSENSUS TREE: the numbers at the forks indicate the number of times the group consisting of the species which are to the right of that fork occurred among the trees, out of 2.00 trees

            +----homosapien
       +--1.0
       !    +----chimpanzee
  +--1.0
  !    !    +----gibbonxxxx
  !    +--2.0
  !         +----orangutang
  !
  +--------------gorillaexx

remember: this is an unrooted tree!
43 Statistical comparison between phylogenetic trees
Likelihood ratio test. Given two trees with the same number of species, calculate the lnL of the two topologies given the original alignment from one of the trees. Calculate lnL1 - lnL2 and evaluate the significance.
fastDNAml is available for UNIX/LINUX OS from geta.life.uiuc.edu/gary/programs/fastDNAml.html
44
5 846 U
chimpanzeeAAGCTTCACC GGCGCAATTA TCCTCATAAT CGCCCACGGA CTTACATCCT
gibbonxxxxAAGCTTTACA GGTGCAACCG TCCTCATAAT CGCCCACGGA CTAACCTCTT
gorillaexxAAGCTTCACC GGCGCAGTTG TTCTTATAAT TGCCCACGGA CTTACATCAT
homosapienAAGCTTCACC GGCGCAGTCA TTCTCATAAT CGCCCACGGA CTTACATCCT
orangutangAAGCTTCACC GGCGCAACCA CCCTCATGAT TGCCCATGGA CTCACATCCT

CATTATTATT CTGCCTAGCA AACTCAAATT ATGAACGCAC CCACAGTCGC
CCCTGCTATT CTGCCTTGCA AACTCAAACT ACGAACGAAC TCACAGCCGC
CATTATTATT CTGCCTAGCA AACTCAAACT ACGAACGAAC CCACAGCCGC
CATTACTATT CTGCCTAGCA AACTCAAACT ACGAACGCAC TCACAGTCGC
CCCTACTGTT CTGCCTAGCA AACTCAAACT ACGAACGAAC CCACAGCCGC

ATCATAATTC TCTCCCAAGG ACTTCAAACT CTACTCCCAC TAATAGCCTT
ATCATAATCC TATCTCGAGG GCTCCAAGCC TTACTCCCAC TGATAGCCTT
ATCATAATTC TCTCTCAAGG ACTCCAAACC CTACTCCCAC TAATAGCCCT
ATCATAATCC TCTCTCAAGG ACTTCAAACT CTACTCCCAC TAATAGCTTT
ATCATAATCC TCTCTCAAGG CCTTCAAACT CTACTCCCCC TAATAGCCCT

TTGATGACTC CTAGCAAGCC TCGCTAACCT CGCCCTACCC CCTACCATTA
CTGATGACTC GCAGCAAGCC TCGCTAACCT CGCCCTACCC CCCACTATTA
TTGATGACTT CTGGCAAGCC TCGCCAACCT CGCCTTACCC CCCACCATTA
TTGATGACTT CTAGCAAGCC TCGCTAACCT CGCCTTACCC CCCACTATTA
CTGATGACTT CTAGCAAGCC TCACTAACCT TGCCCTACCA CCCACCATCA

ATCTCCTAGG GGAACTCTCC GTGCTAGTAA CCTCATTCTC CTGATCAAAT
ACCTCCTAGG TGAACTCTTC GTACTAATGG CCTCCTTCTC CTGGGCAAAC
ACCTACTAGG AGAGCTCTCC GTACTAGTAA CCACATTCTC CTGATCAAAT
ACCTACTGGG AGAACTCTCT GTGCTAGTAA CCACATTCTC CTGATCAAAT
ACCTTCTAGG AGAACTCTCC GTACTAATAG CCATATTCTC TTGATCTAAC

ACCACTCTCC TACTCACAGG ATTCAACATA CTAATCACAG CCCTGTACTC
ACTACTATTA CACTCACCGG GCTCAACGTA CTAATCACGG CCCTATACTC
ACCACCCTTT TACTTACAGG ATCTAACATA CTAATCACAG CCCTGTACTC
ATCACTCTCC TACTTACAGG ACTCAACATA CTAGTCACAG CCCTATACTC
ATCACCATCC TACTAACAGG ACTCAACATA CTAATCACAA CCCTATACTC

CCTCTACATG TTTACCACAA CACAATGAGG CTCACTCACC CACCACATTA
CCTTTACATA TTTATCATAA CACAACGAGG CACACTTACA CACCACATTA
CCTTTATATA TTTACCACAA CACAATGAGG CCCACTCACA CACCACATCA
CCTCTACATA TTTACCACAA CACAATGAGG CTCACTCACC CACCACATTA
TCTCTATATA TTCACCACAA CACAACGAGG TACACCCACA CACCACATCA

ATAACATAAA GCCCTCATTC ACACGAGAAA ATACTCTCAT ATTTTTACAC
AAAACATAAA ACCCTCACTC ACACGAGAAA ACATATTAAT ACTTATGCAC
CCAACATAAA ACCCTCATTT ACACGAGAAA ACATCCTCAT ATTCATGCAC
ACAACATAAA ACCCTCATTC ACACGAGAAA ACACCCTCAT GTTCATACAC
ACAACATAAA ACCTTCTTTC ACACGCGAAA ATACCCTCAT GCTCATACAC

CTATCCCCCA TCCTCCTTCT ATCCCTCAAT CCTGATATCA TCACTGGATT
CTCTTCCCCC TCCTCCTCCT AACCCTCAAC CCTAACATCA TTACTGGCTT
CTATCCCCCA TCCTCCTCCT ATCCCTCAAC CCCGATATTA TCACCGGGTT
CTATCCCCCA TTCTCCTCCT ATCCCTCAAC CCCGACATCA TTACCGGGTT
CTATCCCCCA TCCTCCTCTT ATCCCTCAAC CCCAGCATCA TCGCTGGGTT

CACCTCCTGT AAATATAGTT TAACCAAAAC ATCAGATTGT GAATCTGACA
TACTCCCTGT AAACATAGTT TAATCAAAAC ATTAGATTGT GAATCTAACA
CACCTCCTGT AAATATAGTT TAACCAAAAC ATCAGATTGT GAATCTGATA
TTCCTCTTGT AAATATAGTT TAACCAAAAC ATCAGATTGT GAATCTGACA
CGCCTACTGT AAATATAGTT TAACCAAAAC ATTAGATTGT GAATCTAATA

ACAGAGGCTC ACGACCCCTT ATTTACCGAG AAAGCTTATA AGAACTGCTA
ATAGAGGCTC GAAACCTCTT GCTTACCGAG AAAGCCCACA AGAACTGCTA
ACAGAGGCTC ACAACCCCTT ATTTACCGAG AAAGCTCGTA AGAGCTGCTA
ACAGAGGCTT ACGACCCCTT ATTTACCGAG AAAGCTCACA AGAACTGCTA
ATAGGGCCCC ACAACCCCTT ATTTACCGAG AAAGCTCACA AGAACTGCTA

ATTCATATCC CCATGCCTGA CAACATGGCT TTCTCAACTT TTAAAGGATA
ACTCACTATC CCATGTATGA CAACATGGCT TTCTCAACTT TTAAAGGATA
ACTCATACCC CCGTGCTTGA CAACATGGCT TTCTCAACTT TTAAAGGATA
ACTCATGCCC CCATGTCTAA CAACATGGCT TTCTCAACTT TTAAAGGATA
ACTCNTCACT CCATGTGTGA CAACATGGCT TTCTCAGCTT TTAAAGGATA

ACAGCCATCC GTTGGTCTTA GGCCCCAAAA ATTTTGGTGC AACTCCAAAT
ACAGCTATCC ATTGGTCTTA GGACCCAAAA ATTTTGGTGC AACTCCAAAT
ACAGCTATCC ATTGGTCTTA GGACCCAAAA ATTTTGGTGC AACTCCAAAT
ACAGCTATCC ATTGGTCTTA GGCCCCAAAA ATTTTGGTGC AACTCCAAAT
ACAGCTATCC CTTGGTCTTA GGATCCAAAA ATTTTGGTGC AACTCCAAAT

AAAAGTAATA ACCATGTATA CTACCATAAC CACCTTAACC CTAACTCCCT
AAAAGTAATA GCAATGTACA CCACCATAGC CATTCTAACG CTAACCTCCC
AAAAGTAATA ACTATGTACG CTACCATAAC CACCTTAGCC CTAACTTCCT
AAAAGTAATA ACCATGCACA CTACTATAAC CACCCTAACC CTGACTTCCC
AAAAGTAACA GCCATGTTTA CCACCATAAC TGCCCTCACC TTAACTTCCC

TAATTCTCCC CATCCTCACC ACCCTCATTA ACCCTAACAA AAAAAACTCA
TAATTCCCCC CATTACAGCC ACCCTTATTA ACCCCAATAA AAAGAACTTA
TAATTCCCCC TATCCTTACC ACCTTCATCA ATCCTAACAA AAAAAGCTCA
TAATTCCCCC CATCCTTACC ACCCTCGTTA ACCCTAACAA AAAAAACTCA
TAATCCCCCC CATTACCGCT ACCCTCATTA ACCCCAACAA AAAAAACCCA

TATCCCCATT ATGTGAAATC CATTATCGCG TCCACCTTTA TCATTAGCCT
TACCCGCACT ACGTAAAAAT GACCATTGCC TCTACCTTTA TAATCAGCCT
TACCCCCATT ACGTAAAATC TATCGTCGCA TCCACCTTTA TCATCAGCCT
TACCCCCATT ATGTAAAATC CATTGTCGCA TCCACCTTTA TTATCAGTCT
TACCCCCACT ATGTAAAAAC GGCCATCGCA TCCGCCTTTA CTATCAGCCT

TTTCCCCACA ACAATATTCA TATGCCTAGA CCAAGAAGCT ATTATCTCAA
ATTTCCCACA ATAATATTCA TGTGCACAGA CCAAGAAACC ATTATTTCAA
CTTCCCCACA ACAATATTTC TATGCCTAGA CCAAGAAGCT ATTATCTCAA
CTTCCCCACA ACAATATTCA TGTGCCTAGA CCAAGAAGTT ATTATCTCGA
TATCCCAACA ACAATATTTA TCTGCCTAGG ACAAGAAACC ATCGTCACAA

ACTGGCACTG AGCAACAACC CAAACAACCC AGCTCTCCCT AAGCTT
ACTGACACTG AACTGCAACC CAAACGCTAG AACTCTCCCT AAGCTT
GCTGACACTG AGCAACAACC CAAACAATTC AACTCTCCCT AAGCTT
ACTGACACTG AGCCACAACC CAAACAACCC AGCTCTCCCT AAGCTT
ACTGATGCTG AACAACCACC CAGACACTAC AACTCTCACT AAGCTT
2
(homosapien:0.04044,((gibbonxxxx:0.12482,orangutang:0.09178):0.03594,gorillaexx:0.05836):0.00304,chimpanzee:0.05256)
(gorillaexx:0.05674,(homosapien:0.03029,(orangutang:0.08053,gibbonxxxx:0.11008):0.04669):0.01847,chimpanzee:0.04728)
45 Phylogenetic programs assuming a molecular clock.
DNAMLK (DNAML and fastDNAml: no mol. clock requirement)
KITCH (FITCH: no mol. clock requirement)
UPGMA (NJ: no mol. clock requirement)
If the same phylogeny is obtained with the mol. clock assumption as without it, we can indirectly assume that the assumption has been satisfied.
[Figure: trees of the primate data-set (Man, Chimpanzee (Pan), Gorilla, Orangutan (Pongo), Gibbon (Hylobatidae)) obtained with DNAML; with DNAMLK, KITCH and UPGMA; and with NJ and FITCH]
46 Relationships of apes and humans (palaeontological)
[Figure: cladogram of Man, Chimpanzee (Pan), Gorilla, Orangutan (Pongo) and Gibbon (Hylobatidae) with nodes A, B, C and D]
Synapomorphies (shared derived characters):
D: palate deep, many common teeth and vertebrate structures, and many more
C: orbits higher than broad, nasals elongate, and many more
B: downward bending of face, adaptations for knuckle-walking, and many more
A: upper incisors all similar in shape, premaxillary suture obliterated in adults, premaxillary alveolar process very elongated, nasal premaxilla very short, and no more.
Benton 1997. Vertebrate Palaeontology. Chapman & Hall.
47 Installation of PHYLIP (phylogeny inference package) on Windows
For Windows OS, download the three files phylip, phylip95 and phylip96 from evolution.genetics.washington.edu/phylip/getme.html and save them in a directory called c:\phylip. Install them one by one by double-clicking, and answer yes to overwrite.
Run PHYLIP from DOS. Input should always be in a file named infile, and output will go to outfile or outtree. Type the name of the program at the DOS prompt, e.g. Dnadist.
48 Rules and interpretation of phylogenies.
Rules:
Try to avoid mixing very short and very long branches (the Felsenstein zone causes inconsistency).
Exclude identical sequences.
For complex phylogenies, include all taxa: you might have missed one that could change the whole phylogeny.
Interpretation:
A symmetric topology indicates balance between speciation and extinction.
A comb-like topology indicates either very high speciation or very high extinction rates.
49 Programs free to install on local PCs, allowing 90 % of published phylogenetic analyses to be performed.
ClustalX: inn-prot.weizmann.ac.il/software/ClustalX.html
PHYLIP: evolution.genetics.washington.edu/phylip/getme.html
For tree drawing and manipulation, Treeview: taxonomy.zoology.gla.ac.uk/rod/treeview.html