Title: Molecular data assisted morphological analyses
Use molecular data to define the limits of species. As in barcoding, some baseline information is needed to set the molecular limits of a species.
Polysiphonia study: rbcL sequences as molecular data
The level of expected intraspecific sequence divergence was established by McIvor et al. (2001, Mol. Ecol. 10: 911-919) in a study where they compared rbcL sequence data with karyological and interbreeding data.
Sequences were generated from multiple North Carolina specimens.
Sequence similarity was used for species identification.
Phylogenetic analyses were used to determine evolutionary relationships (an advantage of rbcL as a barcoding gene).
A species/specimen tree based on the phylogenetic tree was used for character state mapping.
[Figure: phylogenetic tree]
Analyze morphological characters in each molecularly-defined specimen
Determine the consistency of morphological characters within species
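One way to picture the consistency check is as a simple tabulation: group specimens by their molecularly defined species, then ask whether each character shows a single state within every species. The sketch below assumes that reading; the function name, the record layout, and the character names and states are all hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def character_consistency(specimens):
    """Report, for each character, whether it has one state per species.

    `specimens` is a list of dicts with a 'species' label (assumed to come
    from the molecular analysis) and a 'characters' dict mapping a character
    name to the state scored for that specimen.
    """
    # character -> species -> set of states observed in that species
    states = defaultdict(lambda: defaultdict(set))
    for specimen in specimens:
        for character, state in specimen["characters"].items():
            states[character][specimen["species"]].add(state)

    # A character is consistent if no species shows more than one state.
    return {
        character: all(len(seen) == 1 for seen in by_species.values())
        for character, by_species in states.items()
    }


# Made-up example records, for illustration only.
example_specimens = [
    {"species": "species_1", "characters": {"char_a": 4, "char_b": "absent"}},
    {"species": "species_1", "characters": {"char_a": 4, "char_b": "present"}},
    {"species": "species_2", "characters": {"char_a": 6, "char_b": "absent"}},
]
consistency = character_consistency(example_specimens)
```

In this made-up example `char_a` is constant within each species, while `char_b` varies within `species_1` and so would be a poor diagnostic character.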
Trees, Phylogenetic Analyses and other ventures down the Dark Side.
Phylogenetic Analyses, Trees, etc.
Tree terminology
Node: the point at which 2 or more branches diverge
Internal node: hypothetical last common ancestor
Terminal node: molecular or morphological data from which the tree is derived (OTUs: Operational Taxonomic Units)
Clade: a node and everything arising from it
[Figure: tree diagram labeling internal nodes, terminal nodes (OTUs), and clades]
Monophyletic group: a group in which all members are derived from a unique common ancestor
Polyphyletic group: a group whose members are not all derived from a unique common ancestor. The common ancestor of the group has many descendants that are not in the group
Paraphyletic group: a group that includes a common ancestor but excludes some of that ancestor's descendants
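The definitions above can be made concrete with a small check on a tree. This is a minimal sketch, assuming a tree written as nested Python tuples with strings as terminal nodes (a representation chosen here for illustration) and treating a set of terminal taxa as monophyletic when it matches the full set of terminals under some node. It does not try to separate paraphyletic from polyphyletic groups, which would need information about ancestors.

```python
def leaves(node):
    """Return the set of terminal nodes (OTUs) at or below `node`."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def clades(node):
    """Yield the terminal set of every clade: each node plus all that arises from it."""
    yield frozenset(leaves(node))
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if `group` is exactly the set of terminals of some clade in `tree`."""
    return frozenset(group) in set(clades(tree))


# Hypothetical five-taxon tree, for illustration only.
example_tree = (("A", "B"), ("C", ("D", "E")))
```

With this example tree, `{"D", "E"}` and `{"C", "D", "E"}` pass the check, while `{"C", "D"}` fails because it leaves out E, a descendant of the same ancestor.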
A couple of points about trees
[Figure: example tree with terminal nodes A, B, C, D, E]
All branches can rotate freely around a node, i.e. B is not more closely related to C than A, and C is not more closely related to D than E
Branch lengths may be proportional to the hypothesized distance between nodes: PAUP's phylogram
Branch lengths may be drawn as equal between nodes: cladograms (these are used when one is interested only in the branching pattern)
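The point about rotation can be checked mechanically: two drawings show the same tree if they differ only in the order of the branches at each node. The sketch below assumes trees written as nested tuples with strings as terminal nodes (an illustrative representation, not the slide's figure) and compares them after sorting the children of every node into a fixed order.

```python
def canonical(node):
    """Return the tree with the children of every node in a fixed order.

    Sorting by repr() is an arbitrary but consistent choice that lets
    strings (terminals) and tuples (subtrees) be ordered together.
    """
    if isinstance(node, str):
        return node
    return tuple(sorted((canonical(child) for child in node), key=repr))


def same_topology(tree_a, tree_b):
    """True if the two trees differ only by rotations around nodes."""
    return canonical(tree_a) == canonical(tree_b)


# Hypothetical trees, for illustration only.
drawn_one_way = (("A", "B"), ("C", ("D", "E")))
drawn_rotated = ((("E", "D"), "C"), ("B", "A"))
different_tree = (("A", "C"), ("B", ("D", "E")))
```

Here `drawn_one_way` and `drawn_rotated` compare as the same topology even though the left-to-right order of the terminals differs, while `different_tree` does not.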
DNA or protein sequence trees are hypotheses of how a particular DNA locus or protein has evolved. We assume that the way the DNA or protein has evolved reflects the way the species has evolved, i.e. gene tree = species tree. This may or may not reflect reality, i.e. molecules do not necessarily trump morphology, etc.
DNA sequence analyses (protein as well)
www.sinauer.com
Molecular trees are only as good as the data they are based upon, i.e. GARBAGE IN, GARBAGE OUT
Sequence alignment is the most important step in phylogenetic analysis: the same sites in different sequences need to be homologous
[Figure: example alignment marking an area to possibly remove because of uncertain homology between sites, and inferred insertion/deletion mutations (gaps)]
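As a rough illustration of excluding alignment regions, the sketch below drops columns in which too many sequences have a gap. This is an assumption made for the example: gap content is only a crude stand-in for uncertain homology, which in practice is usually judged by eye or with dedicated tools. The function names, the `-` gap symbol, and the threshold parameter are hypothetical choices.

```python
def gap_fraction_per_column(alignment):
    """Return, for each alignment column, the fraction of sequences with a gap."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [
        sum(base == "-" for base in column) / len(alignment)
        for column in zip(*alignment)
    ]


def drop_gappy_columns(alignment, max_gap_fraction=0.0):
    """Keep only columns whose gap fraction is at most `max_gap_fraction`."""
    fractions = gap_fraction_per_column(alignment)
    keep = [i for i, fraction in enumerate(fractions) if fraction <= max_gap_fraction]
    return ["".join(seq[i] for i in keep) for seq in alignment]


# Made-up aligned sequences, for illustration only.
example_alignment = ["ACG-T", "AC-GT", "ACGGT"]
trimmed = drop_gappy_columns(example_alignment)
```

With the default threshold every column containing a gap is removed; raising `max_gap_fraction` keeps columns where only a minority of sequences have a gap.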
Analysis methods
Distance methods: based on similarity between OTUs
Optimization methods:
Parsimony: searching for the tree that requires the fewest mutational steps
Maximum Likelihood: searching for the most likely tree, i.e. the tree under which the observed OTUs (sequences) have the highest probability, given a model of evolution
Bayesian: searching for a set of trees in which the likelihoods are so similar that changes between them are essentially random
The choice of analysis method may be deeply philosophical, or it may be based on practicality: What method can I use and get a result in a reasonable amount of time?
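To make two of these ideas concrete, the sketch below computes an uncorrected pairwise distance (the proportion of differing sites, one simple measure a distance method could start from) and counts the minimum number of changes a single site requires on a given tree using Fitch's algorithm, which is the quantity a parsimony search tries to minimize. Trees are assumed to be rooted, binary, and written as nested tuples; the sequences and states are made up for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping sites with a gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def fitch_steps(tree, states):
    """Minimum number of state changes for one site on a rooted binary tree.

    `tree` is a nested tuple of terminal names; `states` maps each terminal
    name to its character state at this site.
    """
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:
            return shared
        steps += 1  # no shared state: one change is needed on this branch pair
        return left | right

    visit(tree)
    return steps


# Made-up site pattern and candidate trees, for illustration only.
site = {"A": "G", "B": "G", "C": "T", "D": "T"}
tree_one = (("A", "B"), ("C", "D"))
tree_two = (("A", "C"), ("B", "D"))
```

For this site, `tree_one` needs one change and `tree_two` needs two, so parsimony would prefer `tree_one` on the evidence of this site alone.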
Testing Trees
Decay Analysis or Bremer Support Values: a test used in parsimony analyses where one determines how many steps less parsimonious than minimal a tree must be before a particular branch in your tree is no longer resolved in the consensus of all possible trees of that length.
Bootstrapping: a way to test the level of support in your data for a particular relationship in your tree. By default, most programs will show bootstrap values when they are greater than 50%, but does a bootstrap value of 50% mean anything?
Hillis & Bull (1993) Systematic Biology 42: 182-192 (tested bootstrap values based on a known phylogeny)
Wilson's Rule:
60-80%: is there other evidence to support the relationship? Be cautious
80-90%: usually pretty solid
90-100%: solid and unlikely to be misleading
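The core of bootstrapping is resampling alignment columns with replacement to build pseudoreplicate data sets, each of which is then analyzed like the original. The sketch below shows only that resampling step, plus a small lookup that applies the rule of thumb above. The function names are hypothetical, and assigning the shared boundary values (80 and 90) to the higher band is an assumption, since the ranges as stated overlap.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudoreplicate by sampling alignment columns with replacement.

    `alignment` maps sequence names to aligned sequences of equal length.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def interpret_support(percent):
    """Map a bootstrap percentage to the rule-of-thumb reading given above."""
    if not 0 <= percent <= 100:
        raise ValueError("bootstrap support must be between 0 and 100")
    if percent >= 90:
        return "solid and unlikely to be misleading"
    if percent >= 80:  # assumption: boundary values go to the higher band
        return "usually pretty solid"
    if percent >= 60:
        return "be cautious; look for other evidence"
    return "below the ranges covered by the rule"


# Made-up alignment, for illustration only.
example = {"A": "ACGTAC", "B": "ACGTTC", "C": "TCGAAC"}
replicate = bootstrap_alignment(example, random.Random(1))
```

Each replicate has the same sequences and the same number of sites as the original; the support value for a branch is the percentage of replicate trees in which that branch appears.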