Genome Rearrangements and YOU - PowerPoint PPT Presentation

Slides: 66
Provided by: kevinga7

1
Genome Rearrangements and YOU!!
  • Presented by
  • Kevin Gaittens

2
Overview
  • Bio background
  • Definitions and Set-up
  • Reality-Desire
  • Good Components
  • Bad Components
  • Fin

3
Biological Background
  • Comparing entire genomes across species
  • Need a 'distance' measure
  • Interested in larger differences than just single
    insertions/deletions etc.
  • Genome rearrangements: a chromosome piece (gene)
    being moved or copied to another location, or
    transferred to another chromosome altogether

4
Definitions
  • Block: a section of the genome, possibly containing
    more than one gene, treated as one unit
  • Homologous: when two blocks contain the same
    genes. Homologous blocks have the same number
    label
  • Reversal: reversing the order of a series of blocks and also
    their orientations. Distance is measured in
    number of reversals

5
Example of Reversal
  • 3 4 1 2 5 -> 3 2 1 4 5 (blocks 4 1 2 reversed)
  • Red: right orientation
  • Black: left orientation

6
Goals
  • Want the smallest number of reversals that transforms
    one genome into another
  • Parsimony assumption: assume Nature changes
    optimally
  • Desire a polynomial-time solution
  • The oriented problem has a poly-time solution; the unoriented
    one is NP-hard

7
Example
  • 1 2 3 4 5
  • 5 2 1 3 4
  • Add a circle if the orientation changes

8
One solution
9
Breakpoints
  • Act as a minimum on the number of reversals needed
  • A breakpoint happens when
  • the first/last label in the original is not the first/last
    label in the target
  • OR 2 labels are consecutive in the original, but not
    in the target
  • OR 2 labels are consecutive in the original and the target, but their
    relative orientation differs between the two
  • 5 4 and 5 4
  • NOTE: If a pair of labels appears as an exact reversal in
    the target, there is NO breakpoint
  • 4 5 and 5 4 do not have a breakpoint

10
Breakpoints for Last Example
  • 1 2 3 4 5
  • Goal reminder
  • 5 2 1 3 4

1 is different than first of target
No breakpoint between 1 and 2 since exact
reversal in target
2 and 3 not consecutive in target
3 and 4 match, thus no breakpoint
4 and 5 are not consecutive in target
5 is different from last in target
11
Mathy Stuff o)
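  • Let $L$ be a finite set of labels
  • $L^o = \bigcup_{a \in L} \{\overrightarrow{a}, \overleftarrow{a}\}$, the set of oriented labels
  • $|x|$ -> remove arrows
  • Ex:
  • $|\overrightarrow{a}| = |\overleftarrow{a}| = a$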

12
Contd
  • An oriented permutation over $L$ is a
  • mapping $\alpha: [1..n] \to L^o$ such that for any $a \in L$,
  • there is exactly one $i \in [1..n]$ with $|\alpha(i)| = a$
  • Basically, a permutation picks an orientation for
    each label. If $\overrightarrow{a}$ is picked, then $\overleftarrow{a}$ will not be

13
Example
  • $n = 4$
  • $L = \{1, 2, 3, 4\}$
  • $\alpha = (2, 1, 4, 3)$
  • So $\alpha(3) = 4$

14
Identity Permutation
  • Special case
  • Permutation $I$ such that $I(i) = \overrightarrow{i}$ for all $i$
    between 1 and $n$
  • For $n = 3$, $I = (\overrightarrow{1}, \overrightarrow{2}, \overrightarrow{3})$

15
Reversals
  • Let $i$ and $j$ be two indices with $1 \le i \le j \le n$
  • $[i, j]$ indicates a reversal affecting elements
    $\alpha(i)$ through $\alpha(j)$

16
Example
  • Given $\alpha = (2, 3, 4, 1)$
  • $\alpha[2,3] = (2, \overline{4}, \overline{3}, 1)$, where the bar marks a flipped orientation
  • Note: similar to the boxing scheme used earlier

17
More Math!
  • In general
  • $(\alpha[i,j])(k) = \overline{\alpha}(i + j - k)$ if $i \le k \le j$
  • $(\alpha[i,j])(k) = \alpha(k)$ otherwise
  • $\overline{\alpha}(k)$ means $\alpha(k)$ with its orientation reversed
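
The reversal rule above can be sketched in a few lines of Python. The encoding is an assumption made for this example, not something the slides prescribe: a right-pointing label k is stored as +k and a left-pointing one as -k, and the function name `apply_reversal` is hypothetical.

```python
def apply_reversal(alpha, i, j):
    """Return alpha[i, j]: positions i..j (1-based, inclusive) are put in
    reverse order and every label in that range has its orientation flipped.

    Assumed encoding: +k is a right-pointing label k, -k a left-pointing one.
    """
    alpha = list(alpha)
    if not (1 <= i <= j <= len(alpha)):
        raise ValueError("need 1 <= i <= j <= n")
    # Position k in the range receives the flipped label from position i + j - k.
    flipped = [-x for x in reversed(alpha[i - 1:j])]
    return alpha[:i - 1] + flipped + alpha[j:]


if __name__ == "__main__":
    # The slide example with every label assumed right-pointing.
    print(apply_reversal([2, 3, 4, 1], 2, 3))
```

Applying the same reversal twice restores the original permutation, which is a quick sanity check on any implementation of this rule.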

18
Sorting by Reversals
  • Is the main goal
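  • Given 2 permutations $\alpha$ and $\beta$, seek the minimum number
    of reversals to transform $\alpha$ into $\beta$
  • $\alpha\rho_1\rho_2\rho_3 \cdots \rho_t = \beta$, where $\rho_1, \rho_2, \ldots, \rho_t$ are reversals
  • The minimum such $t$ is called the reversal distance of $\alpha$ with
    respect to $\beta$ and is denoted by $d_\beta(\alpha)$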

19
Sorting cont
  • Look for reversals $\rho$ that make progress towards $\beta$
  • $d_\beta(\alpha\rho) < d_\beta(\alpha)$, or equivalently
  • $d_\beta(\alpha\rho) = d_\beta(\alpha) - 1$
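
For very small permutations the reversal distance can be found by brute force, which is handy for checking the bounds developed later. The following is only a sketch under stated assumptions: labels are signed integers (+k right-pointing, -k left-pointing), the target is the identity, and the name `reversal_distance_bfs` is made up for this example. It is exponential in n and is not the polynomial-time method the talk builds towards.

```python
from collections import deque


def reversal_distance_bfs(alpha):
    """Exact reversal distance from alpha to the identity by breadth-first
    search over all signed permutations (feasible only for tiny n)."""
    start = tuple(alpha)
    n = len(start)
    target = tuple(range(1, n + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur]
        # Try every reversal [i, j] (0-based here, inclusive).
        for i in range(n):
            for j in range(i, n):
                middle = tuple(-x for x in reversed(cur[i:j + 1]))
                nxt = cur[:i] + middle + cur[j + 1:]
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
    raise ValueError("input is not a signed permutation of 1..n")


if __name__ == "__main__":
    print(reversal_distance_bfs([3, -2, -1, 4, -5]))
```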

20
Breakpoints
  • Add labels L and R to $\alpha$ to get the extended version
  • One example of an extended $\alpha$ is (L, 2, 3, 1, 6, 5, 4, R)
  • If $\beta$ is the identity, then the breakpoints are at:

21
Breakpoints
L 2 3 1 6 5 4 R
L 1 2 3 4 5 6 R
2 is not the first block of $\beta$
2 and 3 are consecutive, but their orientations are
not what they need to be and are not a
complete reversal
3 and 1 are not consecutive in $\beta$
1 and 6 are not consecutive in $\beta$
6 and 5 are consecutive, but not a complete
reversal (orientation of 6 prevents it)
None at 5 4: the reverse pair 4 5 is in $\beta$
4 is not the final block in $\beta$
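
Counting breakpoints against the identity is mechanical once orientations are encoded as signs. This is a minimal sketch with several assumptions: +k means right-pointing and -k left-pointing, L is encoded as 0 and R as n + 1, and `count_breakpoints` is a hypothetical name. The arrows of the slide example are not visible in this transcript, so the orientation used in the demo, (2, -3, 1, 6, -5, -4), is just one choice that agrees with the notes above.

```python
def count_breakpoints(alpha):
    """Number of breakpoints of signed permutation alpha w.r.t. the identity.

    In the extended permutation 0, alpha, n+1 a neighboring pair (x, y) is
    NOT a breakpoint exactly when y - x == 1. That covers both (+i, +(i+1))
    and the exact reversal (-(i+1), -i).
    """
    n = len(alpha)
    extended = [0] + list(alpha) + [n + 1]
    return sum(1 for x, y in zip(extended, extended[1:]) if y - x != 1)


if __name__ == "__main__":
    # Assumed orientations for the example L 2 3 1 6 5 4 R.
    print(count_breakpoints([2, -3, 1, 6, -5, -4]))
```

With these assumed signs every adjacency except the pair 5 4 is a breakpoint, matching the annotations.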
22
Breakpoints cont
  • Can remove at most 2 breakpoints with each
    reversal
  • Thus, $b(\alpha) - b(\alpha\rho) \le 2$
  • This also means that $b(\alpha)/2 \le d(\alpha)$
  • This is a lower bound for $d(\alpha)$

23
Bps contd
  • $b(\alpha)/2$ is a lower bound
  • However, this is rarely achievable
  • Want a better lower bound
  • Look to something called the reality-desire diagram

24
Reality-Desire
  • A breakpoint happens when 2 labels are adjacent, but do not
    want to be adjacent
  • Reality: the neighbor a certain label has in $\alpha$
  • Desire: the neighbor the label has in $\beta$

25
Diagram
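  • Oriented labels can be viewed as a battery
  • Positive terminal at the tip of the arrow
  • Negative at the tail
  • $\overrightarrow{a}$ becomes the terminal pair $-a$, $+a$; $\overleftarrow{a}$ becomes $+a$, $-a$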

26
Example
Desire
  • a
  • ap

Reality
27
Example
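  • Extended $\alpha$ = L $\overrightarrow{3}$ $\overleftarrow{2}$ $\overleftarrow{1}$ $\overrightarrow{4}$ $\overleftarrow{5}$ R
  • Replace labels by terminals; reality edges join the facing terminals of neighboring labels
  • L -3 3 2 -2 1 -1 -4 4 5 -5 R (a terminal without a sign is positive)
  • Add desire edges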

28
Diagram
  • To create the reality-desire diagram:
  • Arrange all terminal nodes around a circle with L
    and R at the top
  • L to the left of R, and all other nodes following
    counterclockwise
  • Reality edges will be along the circumference
  • Desire edges will be the chords
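
The edge sets of the diagram can be generated directly from a signed permutation. This is a sketch under assumptions: +k encodes a right-pointing label and -k a left-pointing one, the target is the identity, terminals are plain integers with "L" and "R" as strings, and the function names are invented for the example. It relies on the observation that the desire edges are simply the reality edges of the target.

```python
def terminals(alpha):
    """Terminal sequence of the extended permutation: L, then tail and tip of
    each label in reading order, then R. A right-pointing k gives -k, +k;
    a left-pointing k gives +k, -k."""
    seq = ["L"]
    for x in alpha:
        k = abs(x)
        seq += [-k, k] if x > 0 else [k, -k]
    seq.append("R")
    return seq


def reality_edges(alpha):
    """Pairs of terminals that face each other across neighboring labels."""
    t = terminals(alpha)
    return [(t[i], t[i + 1]) for i in range(0, len(t), 2)]


def desire_edges(n):
    """Desire edges for the identity target: the reality edges of 1..n."""
    return reality_edges(list(range(1, n + 1)))


if __name__ == "__main__":
    alpha = [3, -2, -1, 4, -5]
    print(reality_edges(alpha))
    print(desire_edges(len(alpha)))
```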

29
Diagram of Reality-Desire
Happens where there is no breakpoint
30
Interpretation
  • The number of cycles in $RD(\alpha)$ is $c_\beta(\alpha)$; it is the number
    of connected parts
  • $\beta$ itself has no breakpoints
  • Notice $c_\beta(\beta) = n + 1$
  • Why?
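
Counting the cycles amounts to walking alternately along reality and desire edges until the walk closes. The sketch below assumes signed-integer labels (+k right-pointing, -k left-pointing) and the identity as target; `count_cycles` is a hypothetical name. In the sorted case every reality edge coincides with a desire edge, so each of the n + 1 edges forms its own two-edge cycle.

```python
def count_cycles(alpha):
    """Number of alternating cycles in the reality-desire diagram of the
    signed permutation alpha, with the identity as target."""
    n = len(alpha)

    def edges(perm):
        t = ["L"]
        for x in perm:
            t += [-abs(x), abs(x)] if x > 0 else [abs(x), -abs(x)]
        t.append("R")
        return [(t[i], t[i + 1]) for i in range(0, len(t), 2)]

    reality, desire = {}, {}
    for u, v in edges(alpha):
        reality[u], reality[v] = v, u
    for u, v in edges(range(1, n + 1)):
        desire[u], desire[v] = v, u

    seen, cycles = set(), 0
    for start in reality:
        if start in seen:
            continue
        cycles += 1
        node = start
        while node not in seen:
            seen.add(node)
            other = reality[node]   # follow the reality edge
            seen.add(other)
            node = desire[other]    # then the desire edge
    return cycles


if __name__ == "__main__":
    print(count_cycles([3, -2, -1, 4, -5]), count_cycles([1, 2, 3, 4, 5]))
```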

31
Effects of a Reversal
  • Let (s,t) and (u,v) be two reality edges
    characterizing a reversal $\rho$, with (s,t) preceding
    (u,v) in the permutation $\alpha$. Then $RD(\alpha\rho)$ differs from
    $RD(\alpha)$ as follows:
  • 1. Reality edges (s,t) and (u,v) are replaced
    by (s,u) and (t,v)
  • 2. Desire edges remain unchanged
  • 3. The section of the circle going from node t
    to node u, including these extremities, in
    counterclockwise direction, is reversed.

32
Our Example
Reversal characterized by reality edges (-1,-4) and (4, 5)
33
Definitions
  • Let e and f be two reality edges belonging to the
    same cycle in $RD(\alpha)$
  • If the orientations induced by e and f coincide, they
    are convergent
  • Walk counterclockwise from the start of e (passing
    through desire edges) until you reach the
    beginning of f. If the end of f is still
    counterclockwise, then they converge
  • They are divergent otherwise

34
Walking Convergent
(3,2) to (-1,-4)
Still counterclockwise
35
How Reversals Affect Cycles
  • If e and f belong to different cycles, $c(\alpha\rho) = c(\alpha) - 1$

36
If e and f belong to the same cycle and
converge, $c(\alpha\rho) = c(\alpha)$
37
If e and f belong to the same cycle and
diverge, $c(\alpha\rho) = c(\alpha) + 1$
38
Summary
  • If e and f
  • belong to different cycles, $c(\alpha\rho) = c(\alpha) - 1$
  • belong to the same cycle and converge, $c(\alpha\rho) = c(\alpha)$
  • belong to the same cycle and diverge, $c(\alpha\rho) = c(\alpha) + 1$
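
The three cases can be observed numerically by trying every reversal on a small permutation and recording how the cycle count moves. This is a sketch under assumptions: signed integers encode orientation (+k right-pointing, -k left-pointing), the target is the identity, and `count_cycles` and `cycle_changes` are hypothetical helper names. It does not classify edges as convergent or divergent; it only reports the resulting change.

```python
def count_cycles(alpha):
    """Alternating cycles of the reality-desire diagram (identity target)."""
    def edges(perm):
        t = ["L"]
        for x in perm:
            t += [-abs(x), abs(x)] if x > 0 else [abs(x), -abs(x)]
        t.append("R")
        return [(t[i], t[i + 1]) for i in range(0, len(t), 2)]

    reality, desire = {}, {}
    for u, v in edges(alpha):
        reality[u], reality[v] = v, u
    for u, v in edges(range(1, len(alpha) + 1)):
        desire[u], desire[v] = v, u
    seen, cycles = set(), 0
    for start in reality:
        if start in seen:
            continue
        cycles += 1
        node = start
        while node not in seen:
            seen.add(node)
            other = reality[node]
            seen.add(other)
            node = desire[other]
    return cycles


def cycle_changes(alpha):
    """Map each reversal (i, j), 1-based, to the change it causes in c."""
    alpha = list(alpha)
    base = count_cycles(alpha)
    changes = {}
    for i in range(1, len(alpha) + 1):
        for j in range(i, len(alpha) + 1):
            flipped = [-x for x in reversed(alpha[i - 1:j])]
            changes[(i, j)] = count_cycles(alpha[:i - 1] + flipped + alpha[j:]) - base
    return changes


if __name__ == "__main__":
    for reversal, delta in sorted(cycle_changes([3, -2, -1, 4, -5]).items()):
        print(reversal, delta)
```

For the running example, the reversal (4, 4) is the one characterized by reality edges (-1,-4) and (4,5), which lie in different cycles, so it merges them.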

39
Lower Bound
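  • Since the number of cycles changes by at most 1 per
    reversal, we can get a new lower bound for reversals
  • Suppose $\alpha\rho_1\rho_2 \cdots \rho_t = \beta$
  • Then $c_\beta(\alpha\rho_1\rho_2 \cdots \rho_t) = c_\beta(\beta) = n + 1$
  • $c_\beta(\alpha\rho_1) - c_\beta(\alpha) \le 1$
  • $c_\beta(\alpha\rho_1\rho_2) - c_\beta(\alpha\rho_1) \le 1$
  • $c_\beta(\alpha\rho_1 \cdots \rho_t) - c_\beta(\alpha\rho_1 \cdots \rho_{t-1}) \le 1$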

40
Lower Bound
  • Add to get
  • $n + 1 - c_\beta(\alpha) \le t$
  • If $\rho_1, \rho_2, \ldots, \rho_t$ is an optimal sorting, then
    $t = d_\beta(\alpha)$
  • $n + 1 - c_\beta(\alpha) \le d_\beta(\alpha)$
  • Very good lower bound
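
The "add to get" step is a telescoping sum. Spelled out below, with $\alpha_0 = \alpha$ and $\alpha_k = \alpha\rho_1 \cdots \rho_k$ as shorthand introduced only for this derivation, so that $\alpha_t = \beta$:

```latex
\begin{align*}
c_\beta(\beta) - c_\beta(\alpha)
  &= \sum_{k=1}^{t} \bigl( c_\beta(\alpha_k) - c_\beta(\alpha_{k-1}) \bigr)
   \le \sum_{k=1}^{t} 1 = t, \\
n + 1 - c_\beta(\alpha) &\le t .
\end{align*}
```

Every intermediate term cancels, and each of the $t$ differences is at most 1 because one reversal changes the cycle count by at most 1.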

41
Good/Bad Cycles
  • A cycle is good if it has two divergent reality
    edges
  • If not, it is considered bad
  • Good cycles have at least two desire edges that
    cross
  • Not all cycles that have crossing edges are good
  • Call cycles proper if they have at least four
    edges

42
Good/Bad contd
  • If we only have good cycles, the lower bound
    $d(\alpha) \ge n + 1 - c(\alpha)$ is an equality
  • How could it still be an equality
    if there are a few bad cycles mixed in to start?

43
Interleave
  • Twisting one cycle while breaking another is
    only possible if the two cycles are such that
    some desire edge from one of the cycles crosses
    some desire edge from the other
  • In this case the two cycles interleave

44
Interleave
45
Interleaving Graph
  • Important to verify which cycles interleave with
    which other cycles
  • Take as nodes the proper cycles of $RD(\alpha)$
  • Two nodes are adjacent iff the cycles interleave
  • Connected components are classified as good or
    bad
  • If a component consists entirely of bad cycles, it is
    bad. Otherwise, it is said to be good
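
The interleaving graph and its components can be sketched as follows. The input format is an assumption made for this example: each proper cycle is given as a list of its desire edges, and each desire edge as a pair of integer positions around the circle. Whether a cycle is good is taken as given, since deciding it needs the convergent/divergent test. All function names are hypothetical.

```python
def chords_cross(p, q):
    """True if two chords, each a pair of distinct circle positions, cross.
    They cross when exactly one endpoint of q lies strictly between the
    endpoints of p."""
    a, b = sorted(p)
    c, d = sorted(q)
    return a < c < b < d or c < a < d < b


def interleave(cycle_x, cycle_y):
    """Two cycles interleave if some desire edge of one crosses one of the other."""
    return any(chords_cross(p, q) for p in cycle_x for q in cycle_y)


def components(cycles):
    """Connected components of the interleaving graph, as lists of cycle indices."""
    seen, result = set(), []
    for start in range(len(cycles)):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in range(len(cycles)):
                if v not in seen and interleave(cycles[u], cycles[v]):
                    seen.add(v)
                    stack.append(v)
        result.append(sorted(comp))
    return result


def component_is_good(component, cycle_is_good):
    """A component is good as soon as it contains at least one good cycle."""
    return any(cycle_is_good[i] for i in component)


if __name__ == "__main__":
    # Toy chord positions, not taken from the slides.
    toy = [[(0, 5), (2, 8)], [(3, 7)], [(10, 11)]]
    print(components(toy))
```

The pairwise check costs quadratic time in the number of desire edges, in line with the O(n^2) figure quoted later in the talk.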

46
RD to Interleave
Gray filled-in circles are good cycles
47
Choosing a Reversal
48
Choosing a Reversal
  • C is the only good cycle
  • Let e = (L, 3), f = (-3, -4), g = (-1, 2)
  • f and g converge, so they are not a good choice

49
e and g
  • e and g diverge and produce 2 good components
    with 1 cycle each

50
e and f
  • e and f produce a single good component with two
    cycles

51
Reversal Choosing contd
  • A reversal characterized by two divergent edges
    of the same cycle is a sorting reversal iff its
    application does not lead to the creation of bad
    components
  • So the reversals characterized by e and f, or by e and g, are both acceptable

52
Bad Components
  • Good components can be sorted as in previous
    slide
  • First step in dealing with bad components is to
    classify them
  • Component Y separates components X and Z if all
    chords in $RD(\alpha)$ that link a terminal in X to one
    in Z cross a desire edge of Y

53
  • E separates F and D
  • What are some other separations?

54
Definitions
  • Hurdle: a bad component that does not separate two
    bad components
  • Nonhurdle: a bad component that separates two bad
    components

55
Definitions contd
  • X protects nonhurdle Y if removal of X would
    cause Y to become a hurdle
  • That is, any time Y separates 2 bad components, X is one
    of them
  • Superhurdle: a hurdle that protects a nonhurdle
  • Simple hurdle: a hurdle that does not protect a nonhurdle

F protects E
56
Classification
57
Formula for Reversal Distance
  • $d(\alpha) = n + 1 - c(\alpha) + h(\alpha) + f(\alpha)$
  • $h(\alpha)$ = number of hurdles
  • $f(\alpha)$ = 0 or 1
  • 1 if $\alpha$ is a fortress
  • In a fortress, a nonhurdle will become a hurdle at some point

58
Fortress
  • A fortress is a permutation where there is an
    odd number of hurdles and all of them are
    superhurdles. It requires an extra reversal since a
    nonhurdle will become a hurdle at some point
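
Once the cycles and hurdles have been counted, the distance formula is plain arithmetic. This is a small sketch in which the hurdle classification is assumed to be available already, passed as one boolean per hurdle (True for a superhurdle); the function names are hypothetical.

```python
def is_fortress(hurdle_is_super):
    """Fortress test from the definition: an odd number of hurdles, all of
    them superhurdles. hurdle_is_super holds one bool per hurdle."""
    return len(hurdle_is_super) % 2 == 1 and all(hurdle_is_super)


def reversal_distance(n, cycles, hurdle_is_super):
    """d = n + 1 - c + h + f for a permutation of n labels with c cycles."""
    h = len(hurdle_is_super)
    f = 1 if is_fortress(hurdle_is_super) else 0
    return n + 1 - cycles + h + f


if __name__ == "__main__":
    # L 3 2 1 4 5 R example: n = 5, three cycles, no hurdles.
    print(reversal_distance(5, 3, []))
    # Two right-pointing labels in the order 2 1: one cycle, one simple hurdle.
    print(reversal_distance(2, 1, [False]))
```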

59
Definitions
  • X and Y are opposite hurdles when we find the
    same number of hurdles when walking around the
    circle counterclockwise from X to Y as we do
    clockwise.

Note: only possible when the number of hurdles is even
60
Hurdle Cutting
  • Reverse edges in same component
  • Used only with simple hurdles

61
Final Algorithm
  • While $\alpha \ne \beta$
    • If there is a good component in $RD(\alpha)$ then pick
      two divergent edges in this component, ensuring
      that the reversal does not create a bad component
    • Else
      • if $h(\alpha)$ is even then
        • return merging of two opposite hurdles
      • else
        • if there is a simple hurdle
          • return a reversal cutting this hurdle
        • else //fortress
          • return merging of any two hurdles
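
The branching in the loop body can be written as a small decision function. This sketch only picks the kind of move; finding the actual edges is left out. Its inputs are assumptions made for the example: a flag saying whether a good component exists, and one boolean per hurdle that is True when the hurdle is simple. Treating "no good component and no hurdles" as a caller error is also a choice of this sketch.

```python
def choose_move(has_good_component, hurdle_is_simple):
    """Return which kind of reversal the loop body would pick next."""
    if has_good_component:
        return "sorting reversal on two divergent edges of a good component"
    h = len(hurdle_is_simple)
    if h == 0:
        # Assumed precondition: alpha != beta, so there should be work to do.
        raise ValueError("no good component and no hurdle")
    if h % 2 == 0:
        return "merge two opposite hurdles"
    if any(hurdle_is_simple):
        return "cut a simple hurdle"
    return "merge any two hurdles"  # fortress: odd count, all superhurdles


if __name__ == "__main__":
    print(choose_move(False, [False, False, False]))
```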

62
Fortress Handling
A
C
B
Fortress, so choose any 2 hurdles and merge. C is
good
63
Complexity
  • Constructing $RD(\alpha)$ takes linear time
  • Finding the cycles is $O(n)$
  • For each cycle, determine good/bad
  • This is $O(n)$ per cycle, so $O(n^2)$ total
  • Determining interleaving can be done in $O(n^2)$
  • Counting hurdles etc. can be done in linear time with
    the other knowledge

64
Complexity contd
  • Finding a sorting reversal for good
    components is the most expensive step, since we need to ensure we
    don't create bad components
  • Since a reversal is identified with a pair of
    edges, there are $O(n^2)$ candidate reversals.
  • For each one, $O(n^2)$ time checking the resulting
    permutation. $O(n^4)$ total
  • We need to do this $d_\beta(\alpha)$ times, and $d_\beta(\alpha)$ is $O(n)$, so $O(n^5)$ all
    together

65
Final Slide, Huzzah!
  • Found accurate distance measure for genome
    movements
  • Found a poly-time solution for solving the
    problem
  • Played with fun graphs