Title: Computational Molecular Biology
1Computational Molecular Biology
- Pooling Designs: Inhibitor Models
2An Inhibitor Model
- In some sample spaces, inhibitors exist
- Inhibitor = anti-positive
- (Positives + Inhibitor) = Negative
[Figure: pool diagram with the labels "Inhibitor" and "Negative"]
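The rule on this slide can be sketched in a few lines of Python. The function name `pool_outcome` and the set-based representation of pools are illustrative assumptions, not part of the original slides.

```python
def pool_outcome(pool, positives, inhibitors):
    """Return True if the pool tests positive under the inhibitor model.

    A pool is positive only when it holds at least one positive clone
    and no inhibitor; one inhibitor cancels every positive in the pool.
    """
    pool = set(pool)
    has_positive = bool(pool & set(positives))
    has_inhibitor = bool(pool & set(inhibitors))
    return has_positive and not has_inhibitor
```

For example, with clone 1 positive and clone 9 an inhibitor, the pool {1, 2, 3} is positive, while {1, 2, 9} is negative.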
3An Example of Inhibitors
4Inhibitor Model
- Definition
- Given a sample with d positive clones, subject to at most r inhibitors
- Find a pooling design with a minimum number of tests to identify all the positive clones (and design a decoding algorithm for that pooling design)
5Inhibitors with Fault Tolerance Model
- Definition
- Given n clones with at most d positive clones and at most r inhibitors, subject to at most e testing errors
- Identify all positive items with as few tests as possible
6Preliminaries
72-stages Algorithm
What is AI? The set AI should contain all the
inhibitors and no positives. Hence the set PN
contains all positives (and some negatives) but
no inhibitors.
82-stages Algorithm
At this stage, the problem becomes the
e-error-correcting problem.
9Non-adaptive Solution (1 stage)
- P contains all positives
- N contains all negatives
- O contains all inhibitors and no positives
10Non-adaptive Solution
11Generalization
- Positive outcomes are due to the combined effect of several items
- Items are molecules
- The outcome depends on a complex: a subset of molecules
- Example: complexes in eukaryotic DNA transcription and RNA translation
12A Complex Model
- Definition
- Given n items and a collection of at most d positive subsets
- Identify all positive subsets with the minimum number of tests
- Pool: a set of subsets of items
- Positive pool: contains a positive subset
13What is Hypergraph H?
- H = (V, E) where
- V is a set of n vertices (items)
- E is a set of m hyperedges $E_j$, where each $E_j$ is a subset of V
- Rank: $r = \max \{ |E_j| : E_j \in E \}$
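As a small illustration of these definitions, a hypergraph can be held as a vertex set plus a list of hyperedges. The toy instance and the helper name `rank` below are assumptions made for this sketch only.

```python
def rank(hyperedges):
    """Rank r of a hypergraph: the size of its largest hyperedge."""
    return max(len(edge) for edge in hyperedges)


# Hypothetical toy hypergraph: n = 5 vertices, m = 3 hyperedges.
V = {0, 1, 2, 3, 4}
E = [frozenset({0, 1}), frozenset({1, 2, 3}), frozenset({4})]
r = rank(E)  # the largest hyperedge here is {1, 2, 3}
```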
14Group Testing in Hypergraph H
- Definition
- Given H with at most d positive hyperedges
- Identify all positive hyperedges with the minimum number of tests
- Hyperedges = suspect subsets
- Positive hyperedges = positive subsets
- Positive pool: contains a positive hyperedge
- Assume that $E_i \not\subseteq E_j$ for $i \ne j$
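The test outcome in this model can be sketched as follows. The function name `pool_is_positive` and the use of Python sets for pools and hyperedges are assumptions for illustration.

```python
def pool_is_positive(pool, positive_edges):
    """A pool is positive iff it contains every vertex of some positive hyperedge."""
    pool = set(pool)
    return any(set(edge) <= pool for edge in positive_edges)
```

Note that holding only part of a positive hyperedge is not enough: with {0, 1} as the single positive hyperedge, the pool {0, 2} is negative.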
15d(H)-disjunct Matrix
- Definition
- M is a binary matrix with t rows and n columns
- For any d + 1 edges $E_0, E_1, \dots, E_d$ of H, there exists a row containing $E_0$ but none of $E_1, \dots, E_d$ (a row contains an edge when it has a 1 in every column of that edge)
- Decoding Algorithm
- Remove all edges that are contained in negative pools
- The remaining edges are positive
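The decoding rule above can be sketched as a single pass over the negative pools. The name `decode` and the list-of-lists matrix layout are assumptions of this sketch; it presumes error-free test outcomes.

```python
def decode(M, outcomes, edges):
    """Return the edges that are not contained in any negative pool.

    M        : list of t rows, each a list of n entries equal to 0 or 1
    outcomes : list of t booleans, True for a positive test
    edges    : list of hyperedges, each a set of column indices
    """
    candidates = list(edges)
    for row, positive in zip(M, outcomes):
        if positive:
            continue
        pool = {v for v, bit in enumerate(row) if bit}
        # An edge lying entirely inside a negative pool cannot be positive.
        candidates = [e for e in candidates if not set(e) <= pool]
    return candidates
```

With edges {0, 1}, {1, 2}, {0, 2}, rows [1, 1, 0], [0, 1, 1], [1, 0, 1], and only the first test positive, the two negative pools eliminate {1, 2} and {0, 2}, leaving {0, 1}.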
16Construction Algorithms
- Consider a finite field GF(q). Choose k, s, and q
- Step 1
- for each v in V
- associate v with a polynomial $p_v$ of degree k - 1 over GF(q) (distinct vertices get distinct polynomials)
17A Proposed Algorithm
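- Step 2: Construct the $s \times m$ matrix A as follows
- for x from 0 to s - 1 (with rkd < s < q)
- for each edge $E_j \in E$
- $A[x, E_j] = P_{E_j}(x) = \{ p_v(x) : v \in E_j \}$
- Layout of A: rows are indexed by x = 0, 1, ..., s - 1 and columns by the edges $E_1, E_2, \dots, E_m$; the entry in row x, column $E_j$ is $P_{E_j}(x)$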
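Steps 1 and 2 can be sketched as below. Several choices here are assumptions, not taken from the slides: q is taken to be prime so that integers mod q form GF(q), vertices are numbered 0 to n - 1, each vertex gets its polynomial from its base-q digits (which gives distinct polynomials of degree at most k - 1 when n <= q**k), and all function names are hypothetical.

```python
def vertex_polynomials(n, k, q):
    """Step 1: vertex v gets the coefficient vector of v written in base q.

    Assumes q is prime and n <= q**k, so the n polynomials are distinct
    and have degree at most k - 1.
    """
    if n > q ** k:
        raise ValueError("need n <= q**k for distinct polynomials")
    return [[(v // q ** i) % q for i in range(k)] for v in range(n)]


def evaluate(coeffs, x, q):
    """Evaluate a polynomial (lowest-degree coefficient first) at x mod q."""
    value = 0
    for c in reversed(coeffs):
        value = (value * x + c) % q  # Horner's rule
    return value


def build_A(edges, polys, s, q):
    """Step 2: A[x][j] = { p_v(x) : v in E_j } for x = 0, ..., s - 1."""
    return [
        [frozenset(evaluate(polys[v], x, q) for v in edge) for edge in edges]
        for x in range(s)
    ]
```

Each entry of A is a set of field elements, so its size is at most the rank r of the hypergraph.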
18A Proposed Algorithm
- Step 3: Construct the $t \times n$ matrix B from the $s \times m$ matrix A as follows
- for x from 0 to s - 1
- for each $P_{E_j}(x)$
- for each vertex v in V
- if $p_v(x) \in P_{E_j}(x)$, then $B[(x, P_{E_j}(x)), v] = 1$
- else $B[(x, P_{E_j}(x)), v] = 0$
- Layout of B: columns are indexed by the vertices $v_1, v_2, \dots, v_n$ and rows by the pairs $(x, P_{E_j}(x))$, from $(0, P_{E_1}(0))$ to $(s-1, P_{E_m}(s-1))$; every entry is 0 or 1
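Step 3 can be sketched as follows. The function name `build_B`, the precomputed table `vals`, and the decision to emit one row per distinct pair (x, F) rather than one per edge are assumptions of this sketch.

```python
def build_B(A, vals):
    """Step 3: one row of B per distinct pair (x, F), F an entry in row x of A.

    A    : list of s rows; A[x][j] is the frozenset P_{E_j}(x)
    vals : vals[x][v] is p_v(x), assumed precomputed for every vertex v
    Returns (labels, B), where labels[i] = (x, F) names row i of B.
    """
    labels, B = [], []
    for x, row in enumerate(A):
        seen = set()
        for F in row:
            if F in seen:  # identical entries would give the same pool
                continue
            seen.add(F)
            labels.append((x, F))
            # Vertex v joins the pool exactly when p_v(x) lies in F.
            B.append([1 if value in F else 0 for value in vals[x]])
    return labels, B
```

Under this reading, the number of tests t is the number of distinct pairs (x, F), which is at most s times m.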
19Analysis
- Theorem: If $rd(k-1) + 1 \le s \le q$, then B is
d(H)-disjunct
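For small instances, the conclusion of the theorem can be checked by brute force. The checker below is a sketch written for this purpose, not part of the original construction; the name `is_dH_disjunct` is hypothetical and the running time grows exponentially in d.

```python
from itertools import combinations


def is_dH_disjunct(B, edges, d):
    """Brute-force check of the d(H)-disjunct property for a 0/1 matrix B."""
    pools = [{v for v, bit in enumerate(row) if bit} for row in B]
    for i, E0 in enumerate(edges):
        others = [E for j, E in enumerate(edges) if j != i]
        for group in combinations(others, min(d, len(others))):
            # Look for a row that contains E0 but none of the other d edges.
            ok = any(
                set(E0) <= pool and not any(set(E) <= pool for E in group)
                for pool in pools
            )
            if not ok:
                return False
    return True
```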
20Proof of d(H)-disjunct Matrix Construction
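- Matrix A has this property
- For any d + 1 columns $C_0, \dots, C_d$, there exists a row at which the entry of $C_0$ does not contain the entry of $C_j$ for any j = 1, ..., d
- Proof: By contradiction. Assume that no such row exists. Since A has $s \ge rd(k-1) + 1$ rows, there exists a j (in 1, ..., d) such that the entries of $C_0$ contain the corresponding entries of $C_j$ in at least r(k-1) + 1 rows. Then $P_{E_j}(x) \subseteq P_{E_0}(x)$ for at least r(k-1) + 1 distinct values of x. Because $E_0$ has at most r vertices, each $p_v$ with $v \in E_j$ agrees with some $p_u$, $u \in E_0$, at k or more values of x, and two polynomials of degree k - 1 that agree at k points are equal, so v = u (distinct vertices have distinct polynomials). This means that $E_j \subseteq E_0$, which contradicts the assumption on the hyperedges of H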
21Proof of d(H)-disjunct Matrix Construction (cont)
- Prove B is d(H)-disjunct
- Proof: A has a row x such that the entry F in cell $(x, E_0)$ does not contain the entry in cell $(x, E_j)$ for any j = 1, ..., d. Then the row $\langle x, F \rangle$ in B contains $E_0$ but not $E_j$ for any j = 1, ..., d