1
On the complexity of several haplotyping problems
  • Rudi Cilibrasi
  • Steven Kelk
  • John Tromp
  • Leo van Iersel

Part of this research has been funded by the
Dutch BSIK/BRICKS project
2
Overview
  • Problem 1: Minimum Error Correction
  • Problem 2: Longest Haplotype Reconstruction
  • Problem 3: Pure Parsimony Haplotyping

3
  • DNA: chromosome pairs
  • Chromosome: string over {A,C,G,T}
  • SNP: site where variability is observed
  • Haplotype: string over {0,1} representing the
    SNPs of a chromosome

4
  • SNP-matrix: elements from {0,1,-}; every row is
    a haplotype fragment, columns correspond to SNPs
  • "-" is a hole: this fragment gives no information
    at this SNP
  • Gap: block of holes, flanked by non-holes
  • Example: ---00101-----101011-----
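
The gap definition can be sketched in a few lines of Python. The function name `count_gaps` is hypothetical. The sketch assumes that leading and trailing holes do not count as gaps, because they are not flanked by non-holes on both sides.

```python
def count_gaps(row: str) -> int:
    """Count maximal blocks of holes ('-') flanked on both sides by non-holes."""
    core = row.strip("-")  # leading/trailing holes are not flanked, so drop them
    gaps = 0
    in_gap = False
    for ch in core:
        if ch == "-" and not in_gap:
            gaps += 1        # a new block of holes starts here
            in_gap = True
        elif ch != "-":
            in_gap = False   # back on a 0 or 1
    return gaps


example = "---00101-----101011-----"
print(count_gaps(example))  # 1
```

Under this reading the example row has exactly one gap, the block of five holes in the middle.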

5
  • Two rows are said to be in conflict if, at some
    SNP, one row has a 0 and the other row has a 1
  • An SNP-matrix M is called feasible if there exist
    two haplotypes such that every row of M has no
    conflicts with (at least) one of these haplotypes
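
A minimal brute-force sketch of both definitions, for small matrices only. The names `in_conflict` and `is_feasible` are hypothetical, and rows are assumed to be equal-length strings over 0, 1 and "-". The feasibility test uses the observation that M is feasible exactly when its rows can be split into two groups in which no column contains both a 0 and a 1.

```python
from itertools import product


def in_conflict(a: str, b: str) -> bool:
    """True if at some SNP one string has a 0 and the other has a 1."""
    return any({x, y} == {"0", "1"} for x, y in zip(a, b))


def is_feasible(matrix: list[str]) -> bool:
    """Try every assignment of rows to one of two groups (exponential time)."""

    def group_ok(rows: list[str]) -> bool:
        # A group fits a single haplotype iff no column holds both 0 and 1.
        return all(not {"0", "1"} <= set(col) for col in zip(*rows))

    for labels in product((0, 1), repeat=len(matrix)):
        groups: tuple[list[str], list[str]] = ([], [])
        for row, lab in zip(matrix, labels):
            groups[lab].append(row)
        if group_ok(groups[0]) and group_ok(groups[1]):
            return True
    return False


print(is_feasible(["01-", "-10", "1-1"]))  # True
print(is_feasible(["00", "01", "11"]))     # False
```

In the second matrix all three rows conflict pairwise, so no two haplotypes can cover them.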

7
Feasible SNP matrix
First haplotype
Second haplotype
8
Minimum Error Correction
  • Input: SNP-matrix M
  • Output: smallest number of corrections needed to
    make M feasible
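
For tiny inputs, MEC can be computed by exhaustive search. This is only a sketch. It assumes that a correction flips a single 0/1 entry and that holes are left untouched, which is the usual meaning in MEC but is not spelled out on the slide. The function name `mec` is hypothetical.

```python
from itertools import product


def mec(matrix: list[str]) -> int:
    """Minimum number of 0/1 flips that makes the matrix feasible (brute force)."""

    def group_cost(rows: list[str]) -> int:
        # Within one group, each column is fixed by flipping its minority value.
        return sum(min(col.count("0"), col.count("1")) for col in zip(*rows))

    best = None
    for labels in product((0, 1), repeat=len(matrix)):
        groups: tuple[list[str], list[str]] = ([], [])
        for row, lab in zip(matrix, labels):
            groups[lab].append(row)
        cost = group_cost(groups[0]) + group_cost(groups[1])
        if best is None or cost < best:
            best = cost
    return best


print(mec(["00", "01", "11"]))  # 1
```

The search tries all 2^n row partitions, so it is useful for checking small examples and nothing more.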

9
Unfeasible SNP matrix
Feasible SNP matrix
10
Gapless MEC is NP-hard
  • Every row is gapless
  • Reduction from MAX-CUT
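
As a reminder of the source problem: MAX-CUT asks for a split of the vertices of a graph into two sides that maximizes the number of edges crossing between the sides. A brute-force sketch for small graphs, with the hypothetical name `max_cut` and vertices numbered from 0:

```python
from itertools import product


def max_cut(n: int, edges: list[tuple[int, int]]) -> int:
    """Size of a maximum cut of a graph on vertices 0..n-1 (exponential time)."""
    best = 0
    for side in product((0, 1), repeat=n):
        # An edge is in the cut when its endpoints lie on different sides.
        cut = sum(1 for u, v in edges if side[u] != side[v])
        best = max(best, cut)
    return best


triangle = [(0, 1), (1, 2), (0, 2)]
print(max_cut(3, triangle))  # 2
```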

11
Figure: an edge in G, drawn with vertices v7 and v2, is encoded as a row in M (labels v1 to v9).
14
  • $h_i = h_{i+1}$ (for all odd $i$)
  • $h^1_i = 1 - h^2_i$ (for all $i$)
  • Haplotypes correspond to a cut in G
  • Example: cut between vertices v1, v3, v4, v7, v9
    and v3, v5, v6, v8

Figure: first haplotype and second haplotype, with labels v1 to v9.
15
  • Edge in cut: $|V| - 2$ corrections

Figure: an edge in the cut, shown against the first and second haplotype (labels v1 to v9).
16
  • Edge not in cut: $|V|$ corrections

Figure: an edge not in the cut, shown against the first and second haplotype (labels v1 to v9).
17
Gapless MEC is NP-hard
  • Edge in cut: $|V| - 2$ corrections
  • Edge not in cut: $|V|$ corrections
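
Taking these two counts at face value, with V read as the number of vertices $|V|$ of G, a cut that contains $c$ of the $|E|$ edges gives the total below over the edge rows. It counts only rows that encode edges. Any other rows of the construction are not shown in this transcript, so this is a sketch of why a larger cut means fewer corrections, not the full argument.

```latex
\[
  c\,(|V| - 2) + (|E| - c)\,|V| \;=\; |E|\,|V| - 2c
\]
```

Since $|E|\,|V|$ is fixed by the graph, minimizing the corrections over the edge rows amounts to maximizing $c$.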

18
1-Gap MEC is APX-hard
  • Reduction from CUBIC-MAX-CUT
  • Reduction is approximation preserving
  • Does not work in the gapless case

19
Binary-MEC
  • MEC without holes
  • Thought to be NP-hard
  • Based on the paper "Segmentation Problems" by Jon
    Kleinberg, Christos Papadimitriou and Prabhakar
    Raghavan
  • Complexity of Binary-MEC is still unknown

20
Summary and open problems
21
Longest Haplotype Reconstruction
  • Input: SNP-matrix
  • Objective: remove rows from the matrix to make it
    feasible such that the sum-of-lengths of the
    induced haplotypes is maximal
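
A brute-force sketch of this objective for very small matrices. Two points are assumptions, not taken from the slide: the haplotype induced by a group of rows has a 0 or 1 wherever some row of the group does, and its length is the number of such positions. The names `induced_length` and `lhr` are hypothetical.

```python
from itertools import product


def induced_length(rows: list[str]):
    """Length of the haplotype induced by a group, or None if the group conflicts."""
    length = 0
    for col in zip(*rows):
        values = set(col) - {"-"}
        if len(values) > 1:
            return None          # column holds both 0 and 1
        length += len(values)    # 1 if the column is determined, else 0
    return length


def lhr(matrix: list[str]) -> int:
    """Best sum of induced haplotype lengths over all ways to keep or drop rows."""
    best = 0
    # Label 0 or 1: row goes to that haplotype's group; label 2: row is removed.
    for labels in product((0, 1, 2), repeat=len(matrix)):
        groups: tuple[list[str], list[str], list[str]] = ([], [], [])
        for row, lab in zip(matrix, labels):
            groups[lab].append(row)
        len0, len1 = induced_length(groups[0]), induced_length(groups[1])
        if len0 is not None and len1 is not None:
            best = max(best, len0 + len1)
    return best


print(lhr(["00", "01", "11"]))  # 4
```

In the example, dropping the row 01 leaves 00 and 11, each inducing a haplotype of length 2.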

First haplotype
Second haplotype
22
Longest Haplotype Reconstruction
23
Pure Parsimony Haplotyping
  • Two haplotypes can be combined into one genotype

First haplotype
Second haplotype
Genotype
24
  • Genotype g can be resolved by haplotypes h1 and h2

g
h1
h2
  • A 2 is called an ambiguous position
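
A small sketch of how two haplotypes combine, assuming the usual convention: the genotype copies the shared value where the haplotypes agree and has a 2 where they differ. The function names `combine` and `resolves` are hypothetical.

```python
def combine(h1: str, h2: str) -> str:
    """Genotype of two equal-length 0/1 haplotypes: '2' marks a disagreement."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def resolves(g: str, h1: str, h2: str) -> bool:
    """True if genotype g can be resolved by haplotypes h1 and h2."""
    return combine(h1, h2) == g


print(combine("0110", "0101"))  # 0122
```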

25
PPH Problem
  • Input: set of genotypes G
  • Output: minimum-cardinality set H of haplotypes
    such that each genotype from G can be resolved by
    two haplotypes from H
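
For toy instances the problem can be solved by trying haplotype sets of increasing size. The sketch assumes genotypes are equal-length strings over 0, 1 and 2, and that a genotype without a 2 may be resolved by using the same haplotype twice. The names `resolves` and `pph` are hypothetical, and the search is doubly exponential in the genotype length.

```python
from itertools import combinations, product


def resolves(g: str, h1: str, h2: str) -> bool:
    """0/1 positions must match both haplotypes; a 2 needs them to differ."""
    return all(
        (a == b == x) if x in "01" else (a != b)
        for x, a, b in zip(g, h1, h2)
    )


def pph(genotypes: list[str]) -> list[str]:
    """Smallest haplotype set resolving every genotype (brute force)."""
    m = len(genotypes[0])
    all_haps = ["".join(bits) for bits in product("01", repeat=m)]
    for k in range(1, len(all_haps) + 1):
        for H in combinations(all_haps, k):
            if all(any(resolves(g, a, b) for a in H for b in H) for g in genotypes):
                return list(H)
    return all_haps  # not reached: the full set resolves every genotype


print(pph(["12", "21", "22"]))  # ['01', '10', '11']
```

Three haplotypes suffice for the three genotypes in the example, because 01 and 10 are each reused.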

26
2-Ambiguous PPH Problem
  • Input: set of genotypes G with at most 2
    ambiguous positions per genotype
  • Output: minimum cardinality of a set H of
    haplotypes such that each genotype from G can be
    resolved by two haplotypes from H

27
2-amb PPH is polynomial time solvable
  • Construct bipartite graph B
  • Maximum Independent Set in B solves 2-amb Pure
    Parsimony Haplotyping
  • MIS is polynomial for bipartite graphs
  • Giuseppe Lancia, Romeo Rizzi, A Polynomial
    Solution to a Special Case of the Parsimony
    Haplotyping Problem, OR Letters (2005)
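
Why MIS is polynomial on bipartite graphs: by König's theorem, the size of a maximum independent set equals the number of vertices minus the size of a maximum matching. The sketch below computes that size with simple augmenting paths. It is a generic illustration on an arbitrary bipartite graph, not the construction of B from the cited paper, which the slide does not describe. The names `max_bipartite_matching` and `mis_size` are hypothetical.

```python
def max_bipartite_matching(adj: dict[int, list[int]]) -> int:
    """Maximum matching size; adj maps each left vertex to its right neighbours."""
    match_right: dict[int, int] = {}  # right vertex -> matched left vertex

    def try_augment(u: int, seen: set[int]) -> bool:
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # Take v if it is free, or if its current partner can move elsewhere.
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in adj)


def mis_size(n_left: int, n_right: int, adj: dict[int, list[int]]) -> int:
    """Maximum independent set size via König: |V| minus maximum matching."""
    return n_left + n_right - max_bipartite_matching(adj)


# Edges L0-R0, L0-R1, L1-R0 form a path on four vertices.
print(mis_size(2, 2, {0: [0, 1], 1: [0]}))  # 2
```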

28
Open problems
  • Complexity of Binary MEC?
  • PTAS for gapless MEC?
  • Constant factor approximation for 1-gap MEC?
  • Constant factor approximation for 1-gap LHR?