Title: DNA Structure Notation Operations
1 DNA Structure Notation Operations
- Vincenzo Manca
- Dipartimento di Informatica
- Università di Verona
2 10 Years of Molecular Computing
- 1994 Adleman's Experiment
- 1995 Lipton's Model
- 1996 Int. Conf. on Math. Linguistics (Marcus)
- 1997 Mangalia (Păun, Head)
- 1998 MFCS Brno (Molecular Computing)
- 1999 (Păun's WMC)
- 2000 DNA6 Leiden
- 2001 DNA7 Tampa (FL) 3-SAT
- 2002 DNA8 Sapporo DNA Duplication
- 2004 DNA10 Milan XPCR Extraction
- 2005 DNA11 Ontario XPCR Recombination
3 DNA Computing Motto
- Problem Data and Requirements
- Algorithm Solutions
- Encode data by DNA strands
- Encode algorithms by biotech procedures
- Decode final strands as solutions
4 A General Schema of Combinatorial Problems
A set of requirements for assignments, that is, 0/1 sequences of some length n. The space of possible solutions has 2^n elements, but only some of them satisfy the requirements.
Encode assignments by DNA strands. Encode requirements as biotech protocols that filter the strands encoding the true solutions.
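The schema above can be sketched in ordinary software as "generate the whole space, then filter". This is only an illustration: the function names and the two example requirements are hypothetical, and a sequential program enumerates the 2^n assignments one by one, whereas the DNA approach relies on massive parallelism in the test tube.

```python
from itertools import product

def solution_space(n):
    """Return all 0/1 assignments of length n (there are 2**n of them)."""
    return list(product((0, 1), repeat=n))

def filter_solutions(space, requirements):
    """Keep only the assignments that satisfy every requirement.

    Each requirement is a predicate on an assignment; it plays the role
    of one filtering protocol applied to the pool of strands.
    """
    return [a for a in space if all(req(a) for req in requirements)]

# Hypothetical requirements on assignments of length 3 (illustration only).
requirements = [
    lambda a: a[0] == 1 or a[1] == 1,         # at least one of the first two bits is 1
    lambda a: not (a[1] == 1 and a[2] == 1),  # the last two bits are not both 1
]

space = solution_space(3)
true_solutions = filter_solutions(space, requirements)
```

Here `space` stands for the pool of all candidate strands and `true_solutions` for the strands that survive every filtering step.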
5 Possible Solutions
True Solutions
Solution Extraction In linear time !!!
Space Generation In linear time
6 New Trends in DNAC
- DNA Self Assembly (Seeman, Winfree, …)
- DNA Automata (Shapiro)
- DNA Algorithms => new biotech protocols
7 A Change of Perspective
Biotech Protocols
DNA Computing
Computing DNA
Algorithms
8 In the search for implementing algorithms on DNA, general algorithmic principles are discovered in fundamental biomolecular processes.
9 Nucleotides
330 Dalton per nucleotide; 1 Dalton ≈ 1.66 × 10^-24 g; 1 g of H ≈ 6.02 × 10^23 atoms
[Figure: nucleotide chain showing phosphate groups (P), bases (B), and sugar carbons numbered 1 to 5, with CH2OH / CH2 at position 5]
A few grams of DNA can hold the amount of all electronic information stored in the world.
10 Strings
- Strings over an alphabet are sequences of symbols of the alphabet
- abbabbba
- On strings an associative concatenation operation is defined: (αβ)γ = α(βγ)
- α = αλ = λα (λ denotes the empty string)
- A language L is a set of strings
11 DNA Sequences are Mobile Double Strings
- B = {A, T, C, G}
- B* = strings over B
- ?i,j
- ?
- s is a ?-strand or s ? or type(s )?
- ? n or mult(?)n
12
- Complementation c (involutive)
- Reverse rev (involutive)
- Mirror mir (involutive)
- mir(α) = rev(α^c)
- Reverse and Complementation commute
- Hybridization
-
- ?
- Pairing ?
- ?
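The three operators listed on this slide can be sketched on plain Python strings. This is a minimal sketch that assumes the standard Watson-Crick pairing (A with T, C with G) and reads the mirror as reverse composed with complementation; the function names are chosen here for illustration.

```python
# Watson-Crick complementation table over the alphabet B = {A, T, C, G}.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def comp(s):
    """Complementation: replace every base by its paired base."""
    return "".join(COMPLEMENT[base] for base in s)

def rev(s):
    """Reverse: read the string from the last symbol to the first."""
    return s[::-1]

def mir(s):
    """Mirror: reverse of the complement, mir(s) = rev(comp(s))."""
    return rev(comp(s))

sample = "AACGT"
```

Applying any of the three functions twice returns the original string, and `rev(comp(s))` equals `comp(rev(s))` for every string, which is the commutation property stated on the slide.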
13B A, T, C, G BB strings over B
fraction notation Axiom ?
rev(?) rev(?)
? ext Overlap --x-- overlapping
concatenation Z ?-gt up lt- ? down ?-gt ? -gt/
? ?? -gt/ ?
14 Bilinearity, Complementarity, Antiparallelism
[Figure: double strand with the 5' and 3' ends marked]
15- Hybridization
- ? mir(?)
-
- ? ? ? ltgt ? ? ? , ? ? mir(?)
-
- ? ? ltgt ? ? ? for some ?
- Pairing ? ? gt ? / rev(?)
-
16- Notation ? / ? ? ?-gt
-
- ? / mir(?) lt?gt
- ? / ? rev(?) lt- ?
-
- gt lt?gt ltmir(?)gt
- BB is the set of DNA strings , BB ? B
18 A pool P of DNA molecules is a multiset of strands
- i) Set of strands typed by strings
- ii) Set of strings with multiplicities
- P = {s1 : α1, s2 : α2, ...}
- P = {α1 : n1, α2 : n2, ...}
- mult_P(α1) = n1, mult_P(α2) = n2
- s ∈ P
- α ∈ P
19 Types of DNA Pools are Languages of BB
Type(T) ? ? BB s ? , s ? T
20 Test Tube Operations in DNAC
- Denature (Melting)
- Renature (Hybridization, Annealing)
- Mix
- Split
- Fish (by Affinity)
- Remove
- Length
- Separate (Gel Electrophoresis)
- Ligate (Ligase)
- Extend (Polymerase)
- Synthesize (Oligos)
- Infix
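Some of these operations have a direct reading as multiset operations. The sketch below models a pool as a `collections.Counter` mapping each strand type (a string) to its multiplicity. It covers only mix, split, separate, fish and remove, and it is a software analogy under simplifying assumptions (exact halving, perfect selectivity), not a model of the laboratory procedures.

```python
from collections import Counter

def mix(p, q):
    """Mix: pour two pools together (multiplicities add up)."""
    return p + q

def split(p):
    """Split: divide a pool into two halves, type by type.

    Assumption: copies are divided evenly; an odd copy goes to the first half.
    """
    first = Counter({s: (n + 1) // 2 for s, n in p.items()})
    second = Counter({s: n // 2 for s, n in p.items() if n // 2 > 0})
    return first, second

def separate(p, length):
    """Separate: keep the strands of a given length (idealized gel electrophoresis)."""
    return Counter({s: n for s, n in p.items() if len(s) == length})

def fish(p, probe):
    """Fish: keep the strands that contain the probe as a substring."""
    return Counter({s: n for s, n in p.items() if probe in s})

def remove(p, probe):
    """Remove: discard the strands that contain the probe as a substring."""
    return Counter({s: n for s, n in p.items() if probe not in s})

pool = Counter({"ACGT": 4, "AC": 3})
half_a, half_b = split(pool)
```

Mixing the two halves again gives back the original pool, and `fish` and `remove` partition a pool between them.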
21 Strand Hybridization
24 Polymerase Extension
25 DNA Ligase
[Figure: ligase joining adjacent strands]
Ligase joins the 5' phosphate to the 3' hydroxyl
26 Ligase Catenation
28 More Complex Operations
- Amplification (PCR)
- Sequencing
- Restriction (R. Enzymes)
- Cloning (Plasmid Transfection)
29 PCR: Polymerase Chain Reaction
30 PCR with 3' sticky end
[Figure: strands labeled a, b, h(a), h(b); long products (linear amplification) and short products (exponential amplification)]
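The contrast between linear and exponential amplification can be checked with a small counting model. It is an idealized sketch: it assumes one double-stranded template (two single strands), perfect primer annealing and full extension in every cycle. "Long" products are copies of an original template strand, bounded by a primer at one end only; "short" products are bounded by primers at both ends. The function name is hypothetical.

```python
def pcr_counts(cycles, template_strands=2):
    """Count long and short single-stranded products after a number of PCR cycles.

    Idealized model: in every cycle each strand present is copied once.
    - copying a template strand yields a long product,
    - copying a long or a short product yields a short product.
    """
    long_products, short_products = 0, 0
    for _ in range(cycles):
        new_long = template_strands                  # one copy per original strand
        new_short = long_products + short_products   # copies of earlier products
        long_products += new_long
        short_products += new_short
    return long_products, short_products
```

Under these assumptions the long products grow as 2n after n cycles, while the short products grow as 2^(n+1) - 2n - 2, so the short products dominate the pool after a few cycles.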
31 PCR Lemma
Given a pool P of type ??? and two primers ?,
? that hybridize with ? and ? respectively ( ?
? ? ).If the extensions e1 and e2 of the two
primers with the relative single strands overlap,
then an exponential amplification of ??? strands
happens which has the blunt form lte1 Z
exte2gtwhich appears within the first two steps.
32 Operation
T of type L
T of type L
33 Mathematically: Test Tube Operations
- Type(T) = L means that the types of the strands of T constitute the language L
- Given some test tubes as arguments with some types, the operations provide as results test tubes with other types
35 DNA Test Tube Machine
- Register Machines where
- Registers are Test Tubes (multisets of strands instead of numbers)
- Operations are DNA Test Tube operations (instead of arithmetic operations)
36 Adleman's Problem
Given a graph (of seven nodes), find (if there are any) the paths between two given nodes (0, 6) that pass exactly once through every node (Hamiltonian paths).
37 Adleman-Lipton's Extract Model in Combinatorial Problems
- The Generation of all possible solutions in linear time
- The Extraction of true solutions in linear time
- Extraction is performed in a number of sub-steps, and each of them selects all the strands that include a sub-strand of a given type
38 Adleman's Graph
39 Adleman's Encoding
[Figure: strands labeled Ai Bi, Bj, Bj Ai]
Node i: αi βi
Arc ij: mir(βi αj) = rev(βi^c αj^c)
i, j = 1, ..., 7
|αi| = |βi| = 10
40 Adleman's Algorithm
- Generation of Hamiltonian paths from v1 to v7
- Generate paths of G (hybridization/ligation)
- Perform PCR with primers α0, mir(β6)
- Separate paths of length 140 (7 x 20)
- For J = 1 to 7 do: select strands where αj βj occurs
- Output remaining strands
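The generate-and-filter structure of these steps can be mimicked in software. The sketch below is an analogy only: exhaustive enumeration of walks stands in for random hybridization and ligation, tuples of node labels stand in for strands, nodes are assumed to be labeled 0 to n-1, and the small graph `EDGES` is hypothetical (it is not Adleman's seven-node graph).

```python
def walks(edges, max_nodes):
    """Enumerate all walks of the graph with at most max_nodes nodes."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    nodes = sorted({x for edge in edges for x in edge})
    frontier = [(v,) for v in nodes]
    result = list(frontier)
    for _ in range(max_nodes - 1):
        frontier = [w + (v,) for w in frontier for v in successors.get(w[-1], [])]
        result.extend(frontier)
    return result

def adleman(edges, start, end, n):
    """Filter the pool of walks down to the Hamiltonian paths from start to end."""
    pool = walks(edges, n)                                       # generate paths
    pool = [w for w in pool if w[0] == start and w[-1] == end]   # keep the right endpoints (PCR step)
    pool = [w for w in pool if len(w) == n]                      # keep the right length (separation step)
    for node in range(n):                                        # one selection per node
        pool = [w for w in pool if node in w]
    return pool

# Hypothetical directed graph on the nodes 0..3 (illustration only).
EDGES = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3), (2, 1)]
paths = adleman(EDGES, start=0, end=3, n=4)
```

A walk with exactly n nodes that contains each of the n nodes visits every node once, which is why the length filter followed by the per-node selection leaves only Hamiltonian paths.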
41 Mix and Split Method
- Generation of the solution space of N variables
- Merge X1 and ¬X1 in a tube T
- Split T into A and B
- For J = 2 to N
-   Extend strands of A with XJ
-   Extend strands of B with ¬XJ
-   Merge A and B into T
-   Split T into A and B
- Merge A and B
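The method can be traced in software by tracking only the types of strands present in each tube. This sketch assumes that every split leaves copies of every type in both halves (reasonable when each type is present in many copies); the literal names such as "X2" and "-X2" are hypothetical labels for a variable and its negation.

```python
def mix_and_split(n):
    """Return the set of strand types produced for n variables.

    Each type is a tuple of literals, one per variable, for example
    ("X1", "-X2", "X3").
    """
    tube = {("X1",), ("-X1",)}               # merge X1 and its negation in T
    for j in range(2, n + 1):
        half_a, half_b = set(tube), set(tube)  # split: both halves keep every type
        half_a = {s + (f"X{j}",) for s in half_a}    # extend A with XJ
        half_b = {s + (f"-X{j}",) for s in half_b}   # extend B with the negation of XJ
        tube = half_a | half_b                 # merge A and B into T
    return tube

space = mix_and_split(3)
```

After N - 1 extension rounds the tube contains all 2^N assignments, so the number of laboratory steps grows linearly with N while the number of types doubles at every round.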
42 Lipton's Algorithm 3-SAT(N, M)
- Generate N-space solutions in T
- For J = 1 to M
-   T1 := Extract(T, L(1,J))
-   T := T - T1
-   T2 := Extract(T, L(2,J))
-   T := T - T2
-   T3 := Extract(T, L(3,J))
-   T := Merge(T1, T2)
-   T := Merge(T, T3)
- Detect T
- If T ≠ ∅, then take a clone and sequence it (Solution); else Unsolvable Problem
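The loop above can be sketched with sets of assignments in place of test tubes. The encoding is an assumption made for this illustration: an assignment is a tuple of booleans, and a literal L(k,J) is a nonzero integer whose sign gives the polarity (3 means X3, -3 means the negation of X3).

```python
from itertools import product

def extract(tube, literal):
    """Extract the assignments of the tube that make the literal true."""
    index, wanted = abs(literal) - 1, literal > 0
    return {a for a in tube if a[index] == wanted}

def lipton_3sat(n, clauses):
    """Return the assignments of n variables that satisfy every 3-literal clause."""
    tube = set(product((False, True), repeat=n))   # generate the solution space
    for l1, l2, l3 in clauses:
        t1 = extract(tube, l1)
        tube = tube - t1
        t2 = extract(tube, l2)
        tube = tube - t2
        t3 = extract(tube, l3)
        tube = t1 | t2 | t3                        # merge the three extracted tubes
    return tube

# Hypothetical instance: (X1 or X2 or X3), (-X1 or -X2 or X3), (X1 or -X2 or -X3).
solutions = lipton_3sat(3, [(1, 2, 3), (-1, -2, 3), (1, -2, -3)])
```

An empty result corresponds to the "Unsolvable Problem" branch, and any element of a nonempty result is a satisfying assignment.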
43 DNA Extraction
- Strands of type α are called α-strands (or instances of α)
- An α-strand with α including γ as a substring is called a γ-superstrand (α is a γ-superstring)
- Problem: extract all the γ-superstrands of a pool P
44 A Formulation of the DNA Extraction Problem
Given an input pool P of heterogeneous DNA strands with the same length and with the same prefix and suffix, and given a string γ, provide an output pool P_γ such that all and only the types of γ-superstrands of P are represented in P_γ.
45 In other words, extraction of the γ-superstrands of P means to provide a pool P_γ such that, for any two strings φ, ψ: φγψ ∈ P <=> φγψ ∈ P_γ, i.e. the strings represented in P_γ are all and only the γ-superstrings belonging to P.
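The condition stated here is a specification of the output pool, and it can be written down and checked directly. The sketch below models pools as `collections.Counter` objects keyed by strand type; it states what an extraction must achieve and does not model how a laboratory protocol achieves it. The function names and the sample pool are hypothetical.

```python
from collections import Counter

def extract_superstrands(pool, gamma):
    """Idealized extraction: keep exactly the strands whose type contains gamma."""
    return Counter({s: n for s, n in pool.items() if gamma in s})

def satisfies_spec(pool, output, gamma):
    """Check that the types in output are all and only the gamma-superstrings of pool."""
    expected_types = {s for s in pool if gamma in s}
    return set(output) == expected_types

# Hypothetical pool: same length, same prefix "AA" and suffix "TT".
pool = Counter({"AACGCTT": 5, "AAGGGTT": 2, "AACGGTT": 1})
extracted = extract_superstrands(pool, "CG")
```

Only the types matter for the specification: multiplicities in the output pool are free to differ from those in the input pool, as happens after amplification.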
46 Cross Pairing PCR
47 XPCR provides an efficient method for affix concatenation of double strands (Head's null context splicing rule)
[Figure: affix concatenation of double strands]
N.B. Genome sequencing is related to affix concatenation closure
48 [Figure: melting, hybridization, polymerase extension]
49 [Figure: melting, hybridization, polymerase extension]
51 [Figure: one product with exponential amplification, two products with linear amplification]
53 XPCR was tested in many different situations
- in pools generated by recombination of 22 strands of lengths between 10 and 20
54 RhoA XPCR
- Lane 2: RhoA ???? of 582 bp
- Lane 3: ??? of 253 bp
- Lane 4: XPCR ????? of 582 + 253 - 229 = 606 bp
- ? starts at position -229 of RhoA
55 XPCR DNA Extraction
- XPCR-Extract(P, γ)
- L := length(P), R1 := ∅, R2 := ∅
- For each n ∈ L do
-   Q := separate(P, n)
-   P := infix(Q, ?, ?)
-   (P1, P2) := split(P)
-   P1 := PCR(P1, ?, ?)
-   For each m < n do R1 := R1 ∪ separate(P1, m)
-   P2 := PCR(P2, ?, mir(?))
-   For each m < n do R2 := R2 ∪ separate(P2, m)
-   Q := mix(R1, R2)
-   Q := PCR(Q, ?, mir(?))
-   Q := separate(Q, n ? ?)
- Output Q
56 Experimental Check
- Consider a pool P of ??-strands that are
- either ?-superstrands or ?-superstrands, and
- where all ?-superstrands are either
- ?1-superstrands, ?2-superstrands, or
- ?3-superstrands (? ? ?, ?1 ? ?2 ? ?3 15 bp).
57 Experimental Check
- Our extraction is correct and complete in the sense that:
- 1. XPCR-Extraction selected only γ-superstrands
- 2. XPCR-Extraction selected all kinds of γ-superstrands (?1, ?2, ?3-superstrands)
58 Gamma Extraction
Lane 2 ?? strands of 120 bp (? 15 bp) Lane 3
?? of 45 bp Lane 4 XPCR ?? and ?? 150 bp Lane
5 PCR(?, ? a.s.) (? at -45) Lane 6 PCR(?, ?
a.s.) Lane 7 PCR(?1, ? a.s.) (?1 at -125) Lane
8 PCR(?2, ? a.s.) (?2 at -75)
59 Applications
- XPCR in the generation of solution spaces
- XPCR in in vitro mutagenesis
- XPCR in gene extraction
61 XPCR Mutagenesis
XPCR-Mutagenesis(P, ?, ?)
1. let P = <???>
2. input Q = <?-20,-1 ? ?1,20>
3. (P1, P2) := split(P)
4. P1 := PCR(P1, 1, 20, mir(?-18,-1))
5. P2 := PCR(P2, 1, 20, mir(?-20,-1))
6. P1 := separate(P1, ?)
7. P2 := separate(P2, ?)
8. P1 := mix(P1, Q)
9. P1 := PCR(P1, ?1, 18, mir(?1, 20))
10. P1 := separate(P1, ? ? 20)
11. P := mix(P1, P2)
12. P := PCR(P, ?1, 20, mir(?-20,-1))
13. P := separate(P, ? ? ?)
14. output P
62 XPCR Mutagenesis
Figure 10: Electrophoresis results
- Lane 1: molecular size marker ladder (100 bp)
- Lane 2: amplification of strand ? (230 bp)
- Lane 3: amplification of strand ? (229 bp)
- Lane 4: amplification of strand ?-18,-1 ? ?1,20 (188 bp)
- Lane 5: cross pairing amplification of ? and ?-18,-1 ? ?1,20 (400 bp)
- Lane 6: cross pairing amplification of ? and ? ? ?1,20 (609 bp)
- Lane 7: RhoAwt (582 bp)
- Lane 8: positive control by PCR(?, ?-20,-1) (354 bp)
63 Ongoing Research
- XPCR Cloning
- Dry DNA Computing