Towards Identifying Lateral Gene Transfer Events - PowerPoint PPT Presentation


1
Towards Identifying Lateral Gene Transfer Events
  • L. Addario-Berry, M. Hallett, J. Lagergren
  • Presented By Jeff Mathew

2
Roadmap
  • Key terms
  • t-transfer problem
  • H-moves and I-moves algorithm
  • Tree generation for simulation
  • Experimental results
  • Conclusions and future work

3
Lateral transfer scenario
  • LGT (lateral gene transfer) = HGT (horizontal gene transfer)
  • Root of scenario tree must correspond to root of
    gene tree
  • The scenario tree is connected and respects the
    direction of evolution implied by the arcs of T
    and S.

4
α-activity
  • An α-active scenario for a gene tree and species
    tree allows at most α copies of a gene to
    exist simultaneously in the genome of an
    ancestral taxon.
  • Authors focus on 1-active scenarios though
    intractability results have been proved earlier
    for a 1.

5
t-transfer problem
  • Input: species tree S, gene tree T, integer t
  • Output: a t'-lateral transfer scenario for S and
    T, with t' ≤ t
  • Intractability result:
  • The decision version of the α-Active, t-Transfer
    Problem (does there exist an α-active scenario
    with cost t?) is NP-complete.
  • t is the number of lateral transfer events needed
    to explain the difference between S and T

6
Algorithm
  • Two-phase approach
  • Phase 1:
  • While H-fat or I-fat vertices remain,
  • perform an H-fat move or an I-fat move
  • At the end of phase 1, we are guaranteed that the
    scenario is 1-active. What about cycles?
  • Phase 2:
  • Remove the minimum number of LGT events from each
    candidate to make it acyclic.
  • Running time: O(2^(4t) n^2)
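
Phase 2 can be illustrated with a small brute-force sketch. It is not the authors' procedure: it assumes a candidate scenario can be modelled as a directed graph whose tree arcs are fixed and whose transfer arcs may be dropped, and it searches for the smallest set of transfer arcs whose removal leaves the graph acyclic. The function names and the graph encoding are hypothetical.

```python
from itertools import combinations

def is_acyclic(nodes, arcs):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    visited = 0
    while ready:
        u = ready.pop()
        visited += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return visited == len(nodes)

def min_transfers_to_remove(nodes, tree_arcs, transfer_arcs):
    """Smallest list of transfer arcs whose removal makes the graph acyclic.

    Assumption (not from the slides): tree arcs are never removed.
    Subsets are tried in order of increasing size, so the first hit is minimum.
    """
    indices = range(len(transfer_arcs))
    for k in range(len(transfer_arcs) + 1):
        for removed in combinations(indices, k):
            kept = [transfer_arcs[i] for i in indices if i not in removed]
            if is_acyclic(nodes, list(tree_arcs) + kept):
                return [transfer_arcs[i] for i in removed]
    raise ValueError("tree arcs alone already contain a cycle")

# Toy candidate: the two transfers together close the cycle a -> c -> b -> d -> a.
nodes = ["r", "a", "b", "c", "d"]
tree_arcs = [("r", "a"), ("r", "b"), ("a", "c"), ("b", "d")]
transfer_arcs = [("c", "b"), ("d", "a")]
print(min_transfers_to_remove(nodes, tree_arcs, transfer_arcs))  # [('c', 'b')]
```

The search is exponential in the number of transfers, which is tolerable only because t is treated as a small parameter.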

7
Simulating species trees
  • Create a random species tree S on n leaves, with
    Θ(log n) expected depth
  • S is supposed to reflect the actual evolutionary
    relationships between taxa
  • S is ultrametric. Therefore, edge-weights
    correspond to time.
  • Randomly assign weights to every edge such that
    every root-to-leaf path has weighted sum 1.
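
One way to realise these properties is sketched below. The slides do not say how the random tree was drawn, so the splitting rule here (repeatedly split a uniformly chosen living lineage at increasing random times) is an assumption, as are the function names. Giving the root time 0 and every leaf time 1 makes each root-to-leaf path sum to 1 automatically.

```python
import random

def random_ultrametric_tree(n, seed=None):
    """Return (children, time) for a random binary tree with n leaves.

    children maps each internal vertex to its two children; time maps every
    vertex to its distance from the root (root = 0.0, every leaf = 1.0).
    """
    if n < 2:
        raise ValueError("need at least two leaves")
    rng = random.Random(seed)
    # The root splits at time 0; the other n - 2 splits happen at sorted random times.
    split_times = [0.0] + sorted(rng.random() for _ in range(n - 2))
    children, time = {}, {}
    alive = [0]      # lineages that have not split yet
    next_id = 1
    for t in split_times:
        v = alive.pop(rng.randrange(len(alive)))  # uniformly chosen lineage
        time[v] = t
        children[v] = (next_id, next_id + 1)
        alive += [next_id, next_id + 1]
        next_id += 2
    for leaf in alive:
        time[leaf] = 1.0
    return children, time

def edge_weights(children, time):
    """Edge weight = time elapsed between parent and child."""
    return {(p, c): time[c] - time[p]
            for p, kids in children.items() for c in kids}

children, time = random_ultrametric_tree(6, seed=0)
for edge, weight in sorted(edge_weights(children, time).items()):
    print(edge, round(weight, 3))
```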

8
Simulating gene trees
  • Begin with generated ultrametric species tree
  • Lateral transfer events occur according to a
    Poisson process with mean rate λ
  • Moving from root to leaves, for each vertex x0
    with children x1 and x2, examine both edges
  • If the Poisson process provides us with a lateral
    transfer event along (x0, x1), we add it and
    point it to a randomly chosen edge alive at that
    point in time.
  • Else add a speciation event for x1
  • Repeat the analysis for (x0, x2)
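
The edge-by-edge procedure above can be sketched as follows. This is a simplified illustration, not the authors' simulator: it considers only the first Poisson event on each edge, takes the species tree as a `children` map plus a `time` map (distance from the root), and records events as tuples. All of these names and encodings are assumptions.

```python
import random

def simulate_edge_events(children, time, root, lam, seed=None):
    """Walk the species tree from the root and label every edge.

    children: internal vertex -> (child1, child2)
    time:     vertex -> distance from the root (leaves at 1.0)
    lam:      Poisson rate of lateral transfers per unit time (must be > 0)
    """
    rng = random.Random(seed)
    edges = [(p, c) for p, kids in children.items() for c in kids]
    events = []
    stack = [root]
    while stack:
        x0 = stack.pop()
        for x1 in children.get(x0, ()):
            stack.append(x1)
            # Waiting time to the first event of a rate-lam Poisson process.
            s = time[x0] + rng.expovariate(lam)
            # Other edges that exist at time s are the possible targets.
            alive = [(p, c) for (p, c) in edges
                     if (p, c) != (x0, x1) and time[p] <= s < time[c]]
            if s < time[x1] and alive:
                events.append(("transfer", (x0, x1), s, rng.choice(alive)))
            else:
                events.append(("speciation", (x0, x1)))
    return events

# Hypothetical five-vertex species tree: 0 is the root, 2, 3 and 4 are leaves.
children = {0: (1, 2), 1: (3, 4)}
time = {0: 0.0, 1: 0.5, 2: 1.0, 3: 1.0, 4: 1.0}
for event in simulate_edge_events(children, time, root=0, lam=2.0, seed=7):
    print(event)
```

Because the target is drawn from every other living edge, the sketch can produce transfers between sibling edges.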

9
Degenerate Cases
  • Simulation can result in plausible biological
    events that are not detectable by the algorithm.
  • Useless transfers: LGTs that don't change the
    gene tree
  • Transfer-loss events: one child of a node is an
    LGT event; the other child is a loss event.

10
Results
  • O: number of repetitions
  • t: true number of LGT events
  • t': cost of the minimum-cost LGT scenario found by
    the algorithm
  • λ: mean rate of LGTs from the Poisson process

11
Finding the saturation point
  • The point when the average t stops increasing.
  • Random trees from a large pool were chosen as
    gene trees and species trees
  • Trials suggest that the saturation point is slightly
    above n/2, i.e., when t > n/2, the algorithm
    stops detecting new LGT events
  • Thus, if t > n/2, the correspondence between T
    and S via LGT events is not very meaningful.

15
Conclusions
  • Empirically verified feasibility of the
    t-transfer algorithm
  • Degenerate events such as transfer-loss events
    that result in over-estimates of transfers occur
    with low probability
  • Achieved near-optimal scenarios when λ is low
    enough not to cause saturation
  • The cycle-elimination phase of the algorithm is
    rarely needed in practice, implying an
    O(2^(2t) n^2) running time.
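
To see what skipping cycle elimination buys, the two bounds from the slides can be compared numerically. The snippet below is plain arithmetic on the stated formulas, ignoring constant factors. The values of t and n are hypothetical, chosen only for illustration.

```python
def bound(exponent_per_transfer, t, n):
    """Evaluate 2^(exponent_per_transfer * t) * n^2, ignoring constants."""
    return 2 ** (exponent_per_transfer * t) * n ** 2

t, n = 5, 100  # hypothetical instance: 5 transfers, 100 leaves

with_cycle_phase = bound(4, t, n)     # 2^(4t) n^2 = 10,485,760,000
without_cycle_phase = bound(2, t, n)  # 2^(2t) n^2 = 10,240,000

print(with_cycle_phase // without_cycle_phase)  # 1024, i.e. a factor of 2^(2t)
```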

16
Future work and open problems
  • Use weighted gene trees and species trees
  • Species trees are nearly ultrametric while gene
    trees are not
  • Do fast algorithms exist when the input is a set
    of gene trees with no species tree?
  • Tractability on larger phylogenies
  • Can we consider gene duplication, lateral gene
    transfers, and other events simultaneously?
  • Can we use probabilistic models that assign
    likelihoods to various events and optimize
    over such models in a tractable manner?