Title: Combining data versus consensus methods
1. Combining data versus consensus methods
2. Multiple data sets for the same set of taxa
- Two strategies
  - Analyze each data set separately and then compare the trees (consense)
  - Concatenate the data and conduct a single combined analysis (combine)
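The two strategies can be contrasted in a minimal Python sketch. The representation is an assumption made for illustration: each alignment is a dict mapping a taxon name to an aligned sequence, and each tree is a set of clades (frozensets of taxon names). The helper names `concatenate` and `strict_consensus` are hypothetical, and a strict consensus is only one of several ways to compare trees.

```python
def concatenate(*alignments):
    """Combine: join the sequences of each taxon end to end."""
    taxa = set(alignments[0])
    for aln in alignments:
        if set(aln) != taxa:
            raise ValueError("all data sets must cover the same taxa")
    return {t: "".join(aln[t] for aln in alignments) for t in sorted(taxa)}


def strict_consensus(*trees):
    """Consense: keep only the clades that appear in every tree."""
    return set.intersection(*(set(tree) for tree in trees))


# Toy data (invented for illustration): two data sets for taxa A-D.
genes_1 = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGT"}
genes_2 = {"A": "GG", "B": "GG", "C": "CC", "D": "CC"}
combined = concatenate(genes_1, genes_2)

# Trees from hypothetical separate analyses, written as sets of clades.
tree_1 = {frozenset("AB"), frozenset("ABC")}
tree_2 = {frozenset("AB"), frozenset("ABD")}
shared = strict_consensus(tree_1, tree_2)
```

Here `combined` would be passed to a single tree search, whereas `shared` keeps only the clade (A, B) that both separate analyses recovered.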
3. Argument for consensus
- If the same clades appear for multiple data sets, we can be more confident in them
- The method is conservative
4. Is consensus conservative? Barrett et al. 1991. Syst. Zool. 40:486
5. Is consensus conservative? Barrett et al. 1991. Syst. Zool. 40:486
6. Argument for combining data
- Data partitions are arbitrary
- Better signal-to-noise ratio
- Can evaluate confidence in the combined data set
- Should look at the total evidence
7. Arguments against combined analysis
- Some data sets might have strong misleading signals (e.g., due to lab errors)
- How should one weight partitions?
- Different partitions might have tracked different histories
8. Conditional combined analysis
- First assess whether the data sets conflict significantly
- If they do not, combine
- If they do, analyze separately
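The decision rule above can be written as a short Python sketch. The function name `choose_strategy` and the 0.05 threshold are assumptions for illustration; the p-value would come from whichever test of data set conflict is used.

```python
def choose_strategy(p_value, alpha=0.05):
    """Pick an analysis strategy from a conflict-test p-value.

    alpha is an assumed significance threshold, not one given in the slides.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    # Significant conflict: keep the data sets apart. Otherwise: combine.
    return "analyze separately" if p_value < alpha else "combine"


strategy_conflict = choose_strategy(0.01)
strategy_no_conflict = choose_strategy(0.40)
```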
9. Tests of data set conflict
- Topology tests (Templeton, Kishino-Hasegawa, Shimodaira-Hasegawa)
  - One data partition versus trees from the other partition
- Incongruence Length Difference (ILD) test (Partition Homogeneity test)
  - Direct comparison of the partitions
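As a rough illustration of how the ILD test compares partitions directly, the sketch below computes the statistic as the length of the shortest tree for the combined data minus the sum of the shortest-tree lengths for each partition, and builds a null distribution by randomly re-dividing the characters into partitions of the original sizes. Several things here are simplifying assumptions: the data have only four taxa (so all three unrooted trees can be scored exhaustively instead of searched), characters are unweighted and scored with Fitch parsimony, and the names `site_length`, `min_length`, and `ild_test` are hypothetical.

```python
import random

# The three unrooted topologies for four taxa, as pairings of taxon indices.
QUARTETS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def site_length(column, topology):
    """Fitch parsimony length of one character on one quartet topology."""
    (a, b), (c, d) = topology
    left, right = {column[a], column[b]}, {column[c], column[d]}
    # Each pair of tips costs one change if the tips differ; joining the
    # two pairs costs one more if their state sets do not overlap.
    return (len(left) - 1) + (len(right) - 1) + (0 if left & right else 1)


def min_length(columns):
    """Length of the most parsimonious quartet tree for these characters."""
    return min(sum(site_length(col, t) for col in columns) for t in QUARTETS)


def ild_test(part_1, part_2, replicates=999, seed=0):
    """Return (observed ILD, permutation p-value) for two lists of columns."""
    def ild(p1, p2):
        return min_length(p1 + p2) - (min_length(p1) + min_length(p2))

    observed = ild(part_1, part_2)
    pooled = part_1 + part_2
    rng = random.Random(seed)
    as_extreme = 0
    for _ in range(replicates):
        # Randomly re-divide the characters, keeping the partition sizes.
        shuffled = rng.sample(pooled, len(pooled))
        if ild(shuffled[:len(part_1)], shuffled[len(part_1):]) >= observed:
            as_extreme += 1
    # Count the observed partition itself among the replicates.
    return observed, (as_extreme + 1) / (replicates + 1)


# Toy characters (invented): each string lists the states of taxa A, B, C, D.
supports_ab_cd = ["AACC"] * 5
supports_ac_bd = ["ACAC"] * 5
conflict_ild, conflict_p = ild_test(supports_ab_cd, supports_ac_bd)
congruent_ild, congruent_p = ild_test(supports_ab_cd, ["AACC"] * 5)
```

In the conflicting toy case the combined data need five more steps than the two partitions need separately, and few random partitions match that; in the congruent case the difference is zero. Note that the result says only that the partitions conflict, not which taxa or clades are responsible.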
10. Topology tests for conflict
[Figure not preserved; surviving label: "reject?"]
What is wrong with this test?
11. Topology tests for conflict
[Figure: confidence interval for data set 1; confidence interval for data set 2]
Do these data sets conflict?
Does each data set reject the optimal tree from the other data set?
12. But topology tests can be used more carefully
- Two data sets do conflict if
  - Data set 1 rejects all trees that lack a certain clade
  - Data set 2 rejects all trees that have that same clade
- Look at clades in the separate analyses that are well supported and contradict relationships in the other analysis
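The clade-based criterion above can be sketched in Python. The sketch assumes that each data set's topology test has already produced the set of trees it does not reject, and that each tree is stored as a set of clades (frozensets of taxon names). How those sets are obtained is outside the sketch, and the name `clade_conflict` is hypothetical.

```python
def clade_conflict(accepted_1, accepted_2, clade):
    """True if data set 1 rejects every tree lacking the clade and
    data set 2 rejects every tree containing it.

    accepted_1 and accepted_2 are the trees each data set does NOT reject.
    """
    if not accepted_1 or not accepted_2:
        raise ValueError("each data set must accept at least one tree")
    clade = frozenset(clade)
    # Rejecting all trees that lack the clade means every accepted tree has it.
    required_by_1 = all(clade in tree for tree in accepted_1)
    # Rejecting all trees that have the clade means no accepted tree has it.
    excluded_by_2 = not any(clade in tree for tree in accepted_2)
    return required_by_1 and excluded_by_2


# Toy confidence sets (invented for illustration) for taxa A-D.
accepted_1 = [{frozenset("AB"), frozenset("ABC")},
              {frozenset("AB"), frozenset("ABD")}]
accepted_2 = [{frozenset("AC"), frozenset("ACD")},
              {frozenset("CD"), frozenset("BCD")}]
conflict_on_ab = clade_conflict(accepted_1, accepted_2, "AB")

# If data set 2 cannot reject a tree containing (A, B), there is no conflict.
overlapping = clade_conflict(accepted_1, accepted_2 + [accepted_1[0]], "AB")
```

Unlike a check of whether each data set rejects the other's single optimal tree, this version reports conflict only when the two confidence sets disagree about a specific clade.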
13. ILD versus topology tests
- ILD can quickly identify data set conflict, but does not localize the conflict
  - Use selective deletion?
- Topology tests can often miss conflict
  - When conflict is found, it is easily interpreted
14. Options if you find conflict
- Conduct separate analyses only
- Delete taxa until conflict disappears, then combine
- Combine anyway
15. Conditional conditional combined analysis
- You believe that conflict reflects data partitions tracking different histories
  - Keep the data separate and find ways to summarize the discrepancy
- You believe that conflict reflects artifactual signals (noise) in one or both data sets
  - Combine anyway in the hope that the real signal will come to dominate