Title: Testing the out-of-India dispersal of Crypteroniaceae by molecular dating
1Title
Molecular dating of cladograms: Discussion of various approaches
Frank Rutschmann, PhD student with Prof. Elena Conti
Institute of Systematic Botany, University of Zurich, Switzerland
2Molecular dating steps
1. Construct an additive, phylogenetic tree with branch lengths
2. Perform a statistical test to reject or accept rate constancy within the tree
3. Transform the additive tree into an ultrametric tree (a chronogram) by using different molecular dating methods
4. Calibrate the tree at one internal node by using fossils or geologic events
5. Infer the ages for the nodes of interest
6. Calculate confidence intervals
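Steps 4 and 5 can be illustrated with a minimal sketch. It assumes the ultrametric tree is already available as relative node heights (root = 1.0); the node names, the heights, and the 50 my calibration age are hypothetical values chosen for illustration only.

```python
# Relative node heights read from a hypothetical ultrametric tree (root = 1.0).
relative_heights = {"root": 1.0, "node_of_interest": 0.8, "calibration_node": 0.5}


def calibrate(heights, calib_node, calib_age_my):
    """Scale all relative heights so that calib_node gets the given age (my)."""
    scale = calib_age_my / heights[calib_node]
    return {node: height * scale for node, height in heights.items()}


# Hypothetical calibration: a fossil fixes the calibration node at 50 my.
ages = calibrate(relative_heights, "calibration_node", 50.0)
for node, age in ages.items():
    print(f"{node}: {age:.1f} my")
```

Because the tree is ultrametric, one calibrated node is enough to convert every other relative height into an absolute age.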
3The dating step itself
Diagram: a phylogram (additive tree) is transformed into a chronogram (ultrametric tree)
Branch length in the phylogram: absolute number of substitutions per site
Branch length in the chronogram: amount of time (e.g. million years)
4Characteristics of this transformation
topology remains the same, but branch lengths represent different things
root has to be defined and fixed before transformation, by removal of a dating outgroup
3 dependent variables in an equation, one of them unknown:
  K: absolute number of substitutions
  r: substitution rate (substitutions/site/million years)
  T: absolute time (e.g. million years)
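The slide does not spell the equation out; the sketch below assumes the usual relation K = r * T and solves for whichever of the three quantities is missing. The function name `solve_clock` and the numbers are hypothetical.

```python
def solve_clock(K=None, r=None, T=None):
    """Given exactly two of K, r and T, return the third, assuming K = r * T."""
    if [K, r, T].count(None) != 1:
        raise ValueError("provide exactly two of K, r and T")
    if K is None:
        return r * T
    if r is None:
        return K / T
    return K / r


# Hypothetical branch: 0.05 substitutions/site accumulated over 50 million years.
rate = solve_clock(K=0.05, T=50.0)
print(f"rate: {rate:.4f} substitutions/site/my")

# Reusing that rate to date another hypothetical branch of 0.02 substitutions/site.
print(f"time: {solve_clock(K=0.02, r=rate):.1f} my")
```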
5The dating step itself
Diagram: a phylogram (additive tree) is transformed into a chronogram (ultrametric tree)
Branches in the phylogram: short or long
Branches in the chronogram: all equal
6The most common methods
1. In the special case when rates for all branches are equal (one common substitution rate across the tree, i.e. an enforced clock):
  Enforce the molecular clock in Maximum Likelihood analysis (PAUP, baseml in PAML, etc.) or in MrBayes (v2.x; Huelsenbeck and Ronquist 2002)
  Use the Langley-Fitch method in r8s (Langley and Fitch 1974; Sanderson 2003)
  Use multidivtime (Thorne and Kishino 2002) with brownseed 0
7The most common methods
2. In the case when rates vary among lineages (relaxed clock, constant clock rejected, absence of a clock):
a) NPRS in r8s (Sanderson 1997, 2003) or TreeEdit (Rambaut and Charleston 2002)
  non-parametric
  autocorrelation of rates
  smooths local transformations in rate as the rate itself changes over the tree, by minimizing the ancestor-descendant changes of rate
  optimality criterion: sum of squared differences in local rate estimates, compared from branch to neighboring branch
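The optimality criterion can be sketched on a toy tree. This is a simplified illustration, not the r8s implementation: the tree, the branch lengths, the fixed root age of 100 my, and the grid search over one node age are all assumptions, and branches attached to the root are simply skipped.

```python
# Hypothetical tree: R is the root, X an internal node, A, B and C are tips.
PARENT = {"A": "X", "B": "X", "X": "R", "C": "R"}
LENGTH = {"A": 0.10, "B": 0.30, "X": 0.10, "C": 0.40}  # substitutions per site


def branch_rates(parent, length, ages):
    """Local rate on the branch above each node: branch length / duration."""
    return {n: length[n] / (ages[parent[n]] - ages[n]) for n in parent}


def roughness(parent, length, ages):
    """Sum of squared rate differences between each branch and its ancestor."""
    rates = branch_rates(parent, length, ages)
    total = 0.0
    for node, ancestor in parent.items():
        if ancestor in rates:  # branches attached to the root have no ancestor
            total += (rates[node] - rates[ancestor]) ** 2
    return total


def best_age_for_x(parent, length, root_age=100.0):
    """Grid search (1 my steps) for the age of X that minimizes roughness."""
    def score(x_age):
        ages = {"R": root_age, "X": float(x_age), "A": 0.0, "B": 0.0, "C": 0.0}
        return roughness(parent, length, ages)
    return min(range(1, int(root_age)), key=score)


print("age of X minimizing roughness:", best_age_for_x(PARENT, LENGTH), "my")
```

A real analysis optimizes all unknown node ages at once with a numerical optimizer; the grid search here only keeps the example short.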
8The most common methods
2. In the case when rates vary among lineages (relaxed clock, constant clock rejected, absence of a clock):
b) Penalized Likelihood (PL), implemented in r8s (Sanderson 2002, 2003)
  similar to NPRS, but a roughness penalty is assigned whenever rates change from ancestral to descendant branches; depending on its weight, this prevents (or allows) strong rate changes among branches
  the penalty can be defined as a smoothing parameter in the input file
  the optimal smoothing parameter is estimated by running a preceding cross-validation procedure
  optimality criterion: log likelihood - roughness penalty
  if the penalty is large, the model is reasonably clocklike (function dominated by the roughness penalty)
  if the penalty is small, much rate variation is permitted
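The trade-off between likelihood and roughness penalty can be sketched as follows. This is an illustrative toy, not the r8s code: it assumes a Poisson model for the substitution counts on each branch, fixed branch durations, and hypothetical counts, and it compares only two candidate rate assignments instead of optimizing.

```python
import math

# Hypothetical tree: R is the root, X an internal node, A, B and C are tips.
PARENT = {"A": "X", "B": "X", "X": "R", "C": "R"}
COUNTS = {"A": 10, "B": 30, "X": 10, "C": 40}              # observed substitutions
DURATION = {"A": 50.0, "B": 50.0, "X": 50.0, "C": 100.0}   # million years


def log_likelihood(rates):
    """Poisson log likelihood of the counts given per-branch rates."""
    total = 0.0
    for branch, k in COUNTS.items():
        expected = rates[branch] * DURATION[branch]
        total += k * math.log(expected) - expected - math.lgamma(k + 1)
    return total


def roughness(rates):
    """Squared rate change from each ancestral branch to its descendants."""
    return sum((rates[n] - rates[a]) ** 2 for n, a in PARENT.items() if a in rates)


def penalized(rates, smoothing):
    """Optimality criterion: log likelihood - smoothing * roughness."""
    return log_likelihood(rates) - smoothing * roughness(rates)


# Candidate 1: every branch gets its own best-fitting rate.
free_rates = {b: COUNTS[b] / DURATION[b] for b in COUNTS}
# Candidate 2: one common rate for all branches (clocklike).
clock_rate = sum(COUNTS.values()) / sum(DURATION.values())
clock_rates = {b: clock_rate for b in COUNTS}

for smoothing in (1.0, 1000.0):
    free_score = penalized(free_rates, smoothing)
    clock_score = penalized(clock_rates, smoothing)
    preferred = "free rates" if free_score > clock_score else "clock rates"
    print(f"smoothing {smoothing}: preferred candidate = {preferred}")
```

With the small smoothing value the better-fitting free rates score higher; with the large value the penalty dominates and the clocklike candidate is preferred, which mirrors the last two points on the slide.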
9The most common methods
2. In the case when rates vary among lineages (relaxed clock, constant clock rejected, absence of a clock):
c) Bayesian molecular dating, implemented in PAML/multidivtime (Yang 1997; Thorne et al. 1998; Thorne and Kishino 2002)
  fully probabilistic and highly parametric method
  derives the posterior distribution of rates and time from a prior distribution
  uses a Markov chain Monte Carlo procedure
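The general idea of sampling a posterior over rate and time with MCMC can be shown with a minimal Metropolis sampler. This is a toy with a single lineage, not the multidivtime model: the Poisson likelihood, the exponential prior on rate, the uniform prior on time, and all numbers are assumptions made for illustration.

```python
import math
import random

SITES = 1000             # hypothetical alignment length
OBSERVED = 50            # hypothetical substitutions counted on one lineage
MAX_TIME = 200.0         # uniform prior on time: 0 to 200 my
PRIOR_MEAN_RATE = 0.001  # exponential prior on rate (substitutions/site/my)


def log_posterior(rate, time):
    """Unnormalized log posterior: Poisson likelihood plus the two priors."""
    if rate <= 0.0 or not 0.0 < time < MAX_TIME:
        return -math.inf
    expected = rate * time * SITES
    log_lik = OBSERVED * math.log(expected) - expected
    log_prior = -rate / PRIOR_MEAN_RATE  # the uniform time prior is constant
    return log_lik + log_prior


def metropolis(n_steps=20000, burn_in=2000, seed=1):
    """Random-walk Metropolis over (rate, time); returns sampled times."""
    rng = random.Random(seed)
    rate, time = 0.001, 50.0
    current = log_posterior(rate, time)
    times = []
    for step in range(n_steps):
        new_rate = rate + rng.gauss(0.0, 0.0003)
        new_time = time + rng.gauss(0.0, 15.0)
        proposed = log_posterior(new_rate, new_time)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            rate, time, current = new_rate, new_time, proposed
        if step >= burn_in:
            times.append(time)
    return times


times = sorted(metropolis())
lower = times[int(0.025 * len(times))]
upper = times[int(0.975 * len(times))]
print(f"posterior mean time: {sum(times) / len(times):.1f} my")
print(f"95% credible interval: {lower:.1f} to {upper:.1f} my")
```

With a single lineage, rate and time are only separated by their priors, so the credible interval is wide; this is one reason calibration points and informative priors matter in practice.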
10Other methods...
remove clades with deviant rates (Takezaki et al. 1995)
exclude data partitions that falsify the clock assumption (Kato et al. 2003)
use several different clocks for rate-homogeneous clades (local clocks; Yoder and Yang 2000)
Quartet dating (Cooper and Penny 1997; Rambaut and Bromham 1998; Brochu 2004): estimates evolutionary rates independently for two pairs of related taxa, each calibrated with a separate fossil
Heuristic rate smoothing using ML (AHRS) (Yang, Acta Zoologica Sinica, in press): tries to minimize the discrepancy in branch lengths and rate changes
11Comparison table
Comparison table (to discuss...)
12Better molecular dating studies
NPRS tends to overfit the data, especially if the root node is not constrained
provide confidence intervals:
  either point estimates with SD/SE (estimate the population from a sample)
  or interval estimates (95% CI/CrI) based on a t distribution
be aware of possible taxon sampling effects on molecular dating (Linder et al., in prep.)
use primary calibration points, as many as possible, and as close as possible to the node of interest
13Bootstrapping
1. Execute the matrix and specify the model of evolution
2. Under the 'analysis' menu, select 'load constraints' and choose the best tree
3. Under 'analysis' again, select 'bootstrap', specify 100 reps and tick 'save trees to file'
4. When asked to name the file, use the options to include branch lengths, save as rooted trees, and choose the NEXUS format (no translation table)
5. Press 'continue' and set up the search options (no swapping), making sure that topological constraints are enforced
6. This should produce a tree file with the 100 tree statements, which you can then copy and paste into r8s
7. To acquire estimated divergence ages for particular nodes (with standard deviations), you need to use the profile command in r8s. For nodes analysed across a sample of bootstrap trees, you should create and run in r8s a batch file with input from the 100 trees.
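Once the ages of one node have been collected across the bootstrap trees, the summary itself is simple. The sketch below is independent of r8s: it assumes the ages have already been extracted into a plain list, and the function name `summarize_ages` and the ten ages are hypothetical (a real run would have 100 values).

```python
import statistics


def summarize_ages(ages):
    """Return mean, standard deviation and a 95% percentile interval."""
    ordered = sorted(ages)
    n = len(ordered)
    mean = statistics.mean(ordered)
    sd = statistics.stdev(ordered)
    lower = ordered[int(0.025 * n)]
    upper = ordered[min(n - 1, int(0.975 * n))]
    return mean, sd, lower, upper


# Hypothetical ages (my) of one node across bootstrap replicates.
bootstrap_ages = [52.1, 48.7, 50.3, 55.0, 47.9, 51.2, 49.5, 53.4, 50.8, 46.6]

mean, sd, lower, upper = summarize_ages(bootstrap_ages)
print(f"mean age: {mean:.1f} my, SD: {sd:.1f} my")
print(f"95% percentile interval: {lower:.1f} to {upper:.1f} my")
```

The percentile interval is used here instead of a t-based interval only because it needs no table of critical values; either choice should be stated when reporting results.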