Constructing Synthetic Phylogenies

1
Constructing Synthetic Phylogenies
  • by Alex Kovarsky
  • Project Presentation of C606

2
Outline
  • Motivation
  • Process Description
  • Reconstruction and Comparison Algorithms
  • Experimental Results
  • Conclusion

3
Motivation
4
Phylogeny - What is it?
  • The study of the evolution of life forms
  • Tree of life forms and their relationships
  • The history of lineages as they change through
    time

5
Real Phylogenies
  • An estimated 5 to 100 million species of organisms
    live on Earth today
  • All organisms are connected by the passage of
    genes along the branches of the phylogenetic tree
  • In phylogenies we can find similarities and
    differences in plants, animals, etc.
  • Tree of Life Web Project (ToL) [1]
      • Collaborative worldwide project
      • 3000 web pages containing information about
        particular groups of organisms
      • ToL pages are linked to one another
        hierarchically

6
Evolutionary Process
  • Main instrument of evolution is mutation
  • Organisms with novel traits are created
  • Organisms with mutated alleles migrate and mate
    in order to create variation in the population
  • Forces of natural selection then choose the
    better-adapted organisms
  • However, most mutations are harmful

7
Synthetic Phylogenies
  • Phylogenies created by pseudo-random mutation of
    a single organism located in the root of the
    phylogenetic tree
  • Conditions are based on current knowledge of
    evolutionary processes
      • Mutation rates
      • Mutation usability
      • Gene sizes

8
Project Goals
  • By applying current evolutionary knowledge to
    generate synthetic phylogenies, we can learn
    about the process of evolution
  • Many phylogeny inference algorithms exist
      • Need a way to test their accuracy and efficiency
      • Real phylogenies are not suitable since we do
        not know the true tree topology
      • Thus, we can use synthetic phylogenies

9
Method Description
10
Method General Description
  • Start with a randomly generated genome made up of
    N genes
  • This genome mutates creating new organisms
  • Some of those organisms survive and multiply,
    while most do not
  • This process is run for a set number of cycles to
    create a phylogenetic tree of species

11
Mutation Types
  • Point - common
      • Replacement
      • Insertion
      • Deletion
  • Duplication - rare
  • Translocation - very rare

12
Point Mutations
  • Replacement
      • Changing of a random nucleotide in a random
        gene from 0 to 1 or vice versa
  • Insertion
      • Insertion of a random nucleotide in a random
        position of a random gene
  • Deletion
      • Deletion of a random nucleotide in a random
        position of a random gene
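
The three point mutations above can be sketched in Python. This is only an illustration, not the presenter's implementation: it assumes a genome is a list of genes and each gene is a string of 0/1 nucleotides, and every function name here is hypothetical. It also assumes a gene is never deleted down to nothing, which the slides do not address.

```python
import random

def replace(gene, rng):
    """Flip one randomly chosen nucleotide (0 -> 1 or 1 -> 0)."""
    i = rng.randrange(len(gene))
    flipped = "1" if gene[i] == "0" else "0"
    return gene[:i] + flipped + gene[i + 1:]

def insert(gene, rng):
    """Insert a random nucleotide at a random position (ends included)."""
    i = rng.randrange(len(gene) + 1)
    return gene[:i] + rng.choice("01") + gene[i:]

def delete(gene, rng):
    """Delete the nucleotide at a random position."""
    i = rng.randrange(len(gene))
    return gene[:i] + gene[i + 1:]

def point_mutate(genome, rng):
    """Return a copy of the genome with one point mutation in one random gene."""
    g = rng.randrange(len(genome))
    ops = [replace, insert]
    if len(genome[g]) > 1:
        # Assumption: never delete the last nucleotide of a gene.
        ops.append(delete)
    mutated = list(genome)
    mutated[g] = rng.choice(ops)(genome[g], rng)
    return mutated
```

Passing an explicit `random.Random(seed)` as `rng` keeps a run reproducible, which matters if the same synthetic phylogeny has to be regenerated for several reconstruction programs.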

13
Rare Mutations
  • Duplication
      • Production of an exact copy of a random gene in
        a given genome
  • Translocation
      • Occurs when two neighbouring genes swap places
        in the genomic sequence
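
A minimal Python sketch of the two rare mutations, again assuming a genome is a list of gene strings. The function names are hypothetical, and placing the copy directly after the original gene is an assumption: the slides do not say where a duplicated gene ends up.

```python
import random

def duplicate(genome, rng):
    """Return a copy of the genome with one random gene duplicated.

    Assumption: the copy is placed immediately after the original gene.
    """
    i = rng.randrange(len(genome))
    return genome[:i + 1] + [genome[i]] + genome[i + 1:]

def translocate(genome, rng):
    """Return a copy of the genome with two neighbouring genes swapped."""
    if len(genome) < 2:
        return list(genome)  # nothing to swap
    i = rng.randrange(len(genome) - 1)
    swapped = list(genome)
    swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
    return swapped
```

Both functions leave their input untouched, so a parent organism can keep its genome while a mutated child gets a new one.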

14
Harmful vs Beneficial
  • Harmful mutations
      • The vast majority of mutations
      • Have detrimental effects on the fitness of the
        organism to its environment
      • Mutated genes will not transfer to the general
        population
  • Beneficial mutations
      • Only a few mutations improve the organism's
        chances to survive

15
Stable vs Evolving Population
  • Stable
      • Species whose population size is large enough
        to have a good likelihood to produce beneficial
        mutations
  • Evolving
      • Species whose population size is not high
        enough to have a good likelihood to produce
        beneficial mutations
  • The typical trend
      • Evolving → Stable → Extinct

16-21
Population Growth Process
  • [Diagram built up over six slides: a tree of
    genomes, each labeled with its population size
    and growth rate (G)]
  • Genomes and labels in the final build, in the
    order they were added
      • 01101110..11: Size 1, Growth 0.1
      • 01101110..11, 01101111..11: Size 0.95, G -0.05;
        Size 0.072, G -0.2 (the second size is shown
        as 0.08 on slide 17)
      • 01101110..11, 01001110..11: Size 0.86, G -0.1;
        Size 0.12, G 0.2
      • 01101111..11: Size 0.009, G -0.2
      • 01101110..11, 01001110..11: Size 0.009, G -0.1;
        Size 0.3, G 0.1
      • 01001110..01, 01001110..11: Size 0.08, G -0.2;
        Size 0.285, G -0.05
22
Reconstruction Programs
  • All come from the PHYLIP website
  • PHYLIP is a free package of programs for
    reconstructing phylogenies
  • Programs used
      • PARS
      • MIX
      • PENNY
      • DOLPENNY

23
PARS
  • Executes Wagner Parsimony
  • Finds the tree which requires the minimum number
    of changes
  • Assumptions
      • Ancestral states are unknown
      • Different characters evolve independently
      • Different lineages evolve independently
      • Changes to all other states are equally probable
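
To make "minimum number of changes" concrete, the sketch below counts the changes a single fixed tree needs for 0/1 characters, using Fitch's small-parsimony algorithm. This is not PARS itself, which also searches over tree topologies; the nested-tuple tree representation and the name `fitch_score` are assumptions made for illustration.

```python
def fitch_score(tree, states):
    """Minimum number of state changes needed on a fixed rooted binary tree.

    tree:   a leaf name (str) or a pair (left_subtree, right_subtree)
    states: dict mapping leaf name -> string of character states, e.g. "0110"
    """
    n_chars = len(next(iter(states.values())))

    def walk(node, c):
        """Return (candidate states at node, changes below node) for character c."""
        if isinstance(node, str):
            return {states[node][c]}, 0
        left_set, left_changes = walk(node[0], c)
        right_set, right_changes = walk(node[1], c)
        common = left_set & right_set
        if common:
            # Children can agree on a state: no extra change needed here.
            return common, left_changes + right_changes
        # Children disagree: one change is needed on a branch below this node.
        return left_set | right_set, left_changes + right_changes + 1

    # Characters are scored independently and summed.
    return sum(walk(tree, c)[1] for c in range(n_chars))
```

A parsimony search would evaluate a score like this for many candidate trees and keep the tree, or trees, with the lowest total.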

24
MIX
  • Executes the Wagner and Camin-Sokal parsimony
    methods in mixture
  • Assumptions similar to PARS
      • Except assumes that changes from 0 → 1 are more
        probable than changes from 1 → 0
  • Not the most suitable method for reconstruction
    of our tree

25
PENNY and DOLPENNY
  • PENNY
      • Uses branch and bound method
      • Very slow
  • DOLPENNY
      • Uses branch and bound method
      • Asserts that in evolution it is harder to gain
        a feature than to lose it
      • 1 → 0 more common than 0 → 1
      • Extremely slow

26
Tree Comparison
  • Based on comparing the distances between all
    pairs of leaves in one tree to the distances
    between the same pairs of leaves in the other tree
  • Process
      1. Find all distances between leaves in both trees
      2. Normalize the distances to (0, 1] for both trees
      3. Compute the sum of distances in the generated
         tree
      4. Compute the sum of differences between all
         respective leaf pairs
  • Similarity ratio is
      • Sum of Differences (step 4) / Sum of Distances
        (step 3)
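
The comparison steps above can be sketched in Python under several assumptions that the slides do not spell out: trees are nested tuples of leaf names, the distance between two leaves is the number of edges on the path between them, normalization divides by the largest distance in each tree, and "difference" means absolute difference. All function names are hypothetical.

```python
from itertools import combinations

def leaf_distances(tree):
    """Map each unordered leaf pair to the number of edges between the leaves."""
    dist = {}

    def depths(node):
        # Returns {leaf name: number of edges from this node down to the leaf}.
        if isinstance(node, str):
            return {node: 0}
        per_child = [{leaf: d + 1 for leaf, d in depths(child).items()}
                     for child in node]
        # Leaves in different subtrees are joined by a path through this node.
        for a, b in combinations(per_child, 2):
            for leaf_a, depth_a in a.items():
                for leaf_b, depth_b in b.items():
                    dist[frozenset((leaf_a, leaf_b))] = depth_a + depth_b
        merged = {}
        for child_depths in per_child:
            merged.update(child_depths)
        return merged

    depths(tree)
    return dist

def normalize(dist):
    """Scale distances into (0, 1] by dividing by the largest distance."""
    largest = max(dist.values())
    return {pair: d / largest for pair, d in dist.items()}

def similarity_ratio(generated, reconstructed):
    """Sum of pairwise differences divided by sum of distances in the generated tree."""
    gen = normalize(leaf_distances(generated))
    rec = normalize(leaf_distances(reconstructed))
    sum_of_distances = sum(gen.values())                          # step 3
    sum_of_differences = sum(abs(gen[p] - rec[p]) for p in gen)   # step 4
    return sum_of_differences / sum_of_distances
```

Under these assumptions a ratio of 0 means the two trees give identical normalized leaf distances, and larger values mean a worse reconstruction. For example, comparing `(("A", "B"), ("C", "D"))` with `(("A", "C"), ("B", "D"))` gives 0.4.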

27
EXPERIMENTAL RESULTS
28
Experiments - Goals
  • Examine performance of inference methods
  • PARS, MIX, PENNY, and DOLPENNY
  • Test the properties of created phylogenies
  • Variability
  • Growth rates

29
Experiment 1
  • Performance of Reconstruction Methods on Simplest
    Mutations
  • Only insertion and duplication mutations
  • PARS method is the best performer

30
Experiment 2
  • Performance of Reconstruction Methods on Complex
    Mutations
  • Point Mutations, Duplication, and Translocation
  • Much poorer results
  • PARS method is the fastest performer

31
Experiment 3
  • Variance of the current species size
  • Randomization has a substantial effect on the
    progress of evolution

32
Experiment 4
  • Varying the Number of Current Cycles
  • Suggests exponential growth in the number of
    species

33
Conclusion
  • A tool that produces discrete synthetic
    phylogenies according to user-specified
    parameters
  • All parameters are adjustable to accommodate
    quick modifications
  • A test-bed for phylogeny inference methods

34
QUESTIONS?
35
References
  • Maddison, D. R. and K.-S. Schulz (ed.) 2004. The
    Tree of Life Web Project. Internet address:
    http://tolweb.org
  • PHYLIP - http://evolution.genetics.washington.edu/
    phylip.html