Title: DNA Computing: Implications for Theoretical Computer Science
1DNA Computing: Implications for Theoretical Computer Science
- Lila Kari
- Dept. of Computer Science
- University of Western Ontario
- London, ON, Canada
- http://www.csd.uwo.ca/lila/
- lila_at_csd.uwo.ca
2From DNA to TCS
- The genetic code
- Splicing systems
- Optimal encodings for DNA Computing
- Sticker systems
- Watson-Crick automata
- Combinatorics on DNA words
- Cellular computing
- DNA computation by self-assembly
3 1953: Watson and Crick discover DNA structure
4DNA
5The RNA Tie Club
- 1954: Solve the riddle of the RNA structure and understand how it builds proteins (photo, clockwise from upper left: Francis Crick, L. Orgel, James Watson, Al. Rich)
- There are 20 amino acids that build up proteins
6The Diamond Code
- G. Gamow: double-stranded DNA acts as a template for protein synthesis; various combinations of bases could form distinctively shaped cavities into which the side chains of amino acids might fit
7Comma-Free Codes (the prettiest wrong idea in 20th-century science)
8The prettiest wrong idea in all of 20th-century science
- Suckling-pig model of protein synthesis
- Construct a code in which, when two sense codons (triplets) are catenated, the triplets that straddle the junction are nonsense codons
- If CGU and AAG are sense codons, then GUA and UAA must be nonsense because they appear in CGUAAG
9Comma-free codes (Crick 1957)
- How many words can a comma-free code include?
- For n = 4 and k = 3 the size of a maximal comma-free code is the magic number 20
- For an alphabet of n letters grouped into k-letter words, if k is prime, a comma-free code contains at most (n^k - n)/k words; for n = 4 and k = 3 this bound is 20
- For n = 4 and k = 3 the number of maximal comma-free codes is 408
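The comma-free condition and the size bound can be checked mechanically. The following is a minimal sketch, not taken from the slides: the function names `is_comma_free` and `size_bound` are hypothetical, and the code assumes all codewords have the same length.

```python
from itertools import product

def is_comma_free(code):
    """True if no codeword appears in a shifted frame across the
    junction of any two catenated codewords (equal-length words assumed)."""
    words = set(code)
    if not words:
        return True
    k = len(next(iter(words)))
    for a, b in product(words, repeat=2):
        ab = a + b
        # Offsets 1..k-1 are exactly the frames that straddle the junction.
        if any(ab[i:i + k] in words for i in range(1, k)):
            return False
    return True

def size_bound(n, k):
    """Upper bound (n^k - n)/k on the size of a comma-free code, k prime."""
    return (n ** k - n) // k

print(is_comma_free({"CGU", "AAG"}))         # sense codons from the example
print(is_comma_free({"CGU", "AAG", "GUA"}))  # GUA occurs inside CGUAAG
print(size_bound(4, 3))
```

With the codons from the example, adding GUA to {CGU, AAG} breaks comma-freeness because GUA is read across the junction of CGUAAG.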
10Reality Intrudes
- News from the lab bench: Nirenberg and Matthaei (1961) synthesize RNA, namely poly-U, coding for phenylalanine
- By 1965 the genetic code was solved
- The code resembled none of the theoretical notions
- The extra codons are merely redundant
11The Genetic Code
12Splicing Systems (Head 1987)
- 5' CCCCCTCGACCCCC 3'
- 3' GGGGGAGCTGGGGG 5'
- 5' AAAAAGCGCAAAAA 3'
- 3' TTTTTCGCGTTTTT 5'
- Enzyme 1 site: 5' TCGA 3' / 3' AGCT 5'
- Enzyme 2 site: 5' GCGC 3' / 3' CGCG 5'
13Splicing Systems
- 5' CCCCCT CGACCCCC 3'
- 3' GGGGGAGC TGGGGG 5'
- 5' AAAAAG CGCAAAAA 3'
- 3' TTTTTCGC GTTTTT 5'
- DNA strands with compatible sticky ends recombine to produce two new strands
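At the level of single-stranded words, the cut-and-recombine step above can be sketched as a string operation. This is an illustrative abstraction, not the formal definition from the talk: the rule format `(u1, u2, u3, u4)` and the function name `splice` are assumptions. A rule cuts x between u1 and u2, cuts y between u3 and u4, and swaps the right-hand parts.

```python
def splice(x, y, rule):
    """Apply the splicing rule (u1, u2, u3, u4) to words x and y.

    For x = x1 u1 u2 x2 and y = y1 u3 u4 y2, each result is the pair
    (x1 u1 u4 y2, y1 u3 u2 x2). All pairs of site occurrences are tried.
    """
    u1, u2, u3, u4 = rule
    site_x, site_y = u1 + u2, u3 + u4
    results = set()
    i = x.find(site_x)
    while i != -1:
        j = y.find(site_y)
        while j != -1:
            x1, x2 = x[:i], x[i + len(site_x):]
            y1, y2 = y[:j], y[j + len(site_y):]
            results.add((x1 + u1 + u4 + y2, y1 + u3 + u2 + x2))
            j = y.find(site_y, j + 1)
        i = x.find(site_x, i + 1)
    return results

# Upper strands from the slide; enzyme 1 cuts T|CGA, enzyme 2 cuts G|CGC.
print(splice("CCCCCTCGACCCCC", "AAAAAGCGCAAAAA", ("T", "CGA", "G", "CGC")))
```

Both enzymes leave the same overhang (CG), which is why the two right-hand halves can be exchanged.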
14Splicing operation
15Splicing system sample results
- Theorem (Paun 95; Freund, Kari, Paun 99): Every type-0 language can be generated by a splicing system with finitely many axioms and finitely many rules.
- Theorem (Freund, Kari, Paun 99): For every given alphabet T there exists a splicing system, with finitely many axioms and finitely many rules, that is universal for the class of systems with terminal alphabet T.
16From DNA to TCS
- The genetic code
- Splicing systems
- Optimal encodings for DNA Computing
- Sticker systems
- Watson-Crick automata
- Combinatorics on DNA words
- Cellular computing
- DNA computation by self-assembly
17DNA Computing (Adleman94)
- Input / Output (DNA)
- Data encoded using the DNA alphabet A, C, G, T and synthesized as DNA strands
- Bio-operations
- Cut
- Paste
- Recombination
- Anneal / Melt
- Copy
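As a toy illustration of encoding data over the DNA alphabet, the sketch below maps every two bits to one base. The mapping 00->A, 01->C, 10->G, 11->T and the function names are hypothetical choices made for this example; they are not the encoding used in any of the cited experiments.

```python
BASES = "ACGT"  # hypothetical mapping: 00->A, 01->C, 10->G, 11->T

def encode(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data
                   for shift in (6, 4, 2, 0))

def decode(strand: str) -> bytes:
    """Invert encode(); the strand length must be a multiple of four."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

strand = encode(b"DNA")
print(strand, decode(strand))
```

A practical encoding would also have to satisfy bonding constraints on the strands, which this sketch ignores.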
18Biomolecular (DNA) Computing
- Hamiltonian Path Problem (Adleman, Science, 1994)
- DNA-based addition (Guarnieri et al., Science, 1996)
- Maximal Clique Problem (Ouyang et al., Science, 1997)
- DNA computing by self-assembly (Winfree et al., Nature, 1998)
- Computations by circular insertions, deletions (Daley, Kari, Gloor, Siromoney, SPIRE 99)
- DNA computing on surfaces (Liu et al., Nature, 2000)
- Molecular computation by DNA hairpin formation (Sakamoto et al., Science, 2000)
- 20-variable Satisfiability (Braich et al., Science, 2002)
- An autonomous molecular computer for logical control of gene expression (Benenson et al., Nature, 2004)
- Folding DNA to create nanoscale shapes and patterns (Rothemund, Nature, 2006)
- Efficient Turing-universal computation with DNA polymers (Qian, Soloveichik, Winfree, DNA Computing and Molecular Programming, 2010)
- Molecular robots guided by prescriptive landscapes (Lund et al., Nature, 2010)
19Encoding Information for DNA Computing
- DNA strands should form desired bonds
- DNA strands should be free of undesirable intra-molecular bonds
- DNA strands should be free of undesirable inter-molecular bonds
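One simple way to screen candidate strands against the last two requirements is to look for Watson-Crick complementary subwords. The sketch below is a simplified combinatorial check under stated assumptions: the length threshold `k` and the function names are hypothetical, and real bonding also depends on thermodynamics that this ignores.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def wk(strand):
    """Watson-Crick complement: complement each base and reverse."""
    return "".join(COMPLEMENT[b] for b in reversed(strand))

def can_bind(s, t, k):
    """Inter-molecular check: some length-k subword of s is the
    WK complement of a subword of t."""
    subwords_t = {t[i:i + k] for i in range(len(t) - k + 1)}
    return any(wk(s[i:i + k]) in subwords_t for i in range(len(s) - k + 1))

def has_hairpin(s, k):
    """Intra-molecular check: s contains a length-k subword followed,
    without overlap, by its WK complement."""
    for i in range(len(s) - 2 * k + 1):
        if s.find(wk(s[i:i + k]), i + k) != -1:
            return True
    return False

print(has_hairpin("ACGGTTTTCCGT", 4))  # ACGG ... CCGT can fold back
print(can_bind("AAAAC", "GTTTT", 4))   # AAAA pairs with TTTT
```

A strand set would pass this screen when `has_hairpin` is false for every strand and `can_bind` is false for every unintended pair.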
20Intramolecular Bonds
21Intra- and inter-molecular bonds
22DNA-complementarity model (Kari, Kitto, Thierrin 02)
23Bond-free languages
- Bonds between DNA strands
24Sample Results (Hussini, Kari, Konstantinidis, Losseva, Sosik 03)
25Sticker Systems (Freund, Paun, Rozenberg, Salomaa 98; Kari, Paun, Rozenberg, Salomaa, Yu 98; Hoogeboom, van Vugt 00; Kuske, Weigel 04; Paun, Rozenberg 98)
- Given a complementarity relation, define an alphabet of double-stranded columns
26Sticking operation
27Complex Sticker Systems
- Sakakibara, Kobayashi 01: Sticker systems based on hairpins
- Alhazov, Cavaliere 05: Observable sticker systems
28Watson-Crick Automata (Freund, Paun, Rozenberg, Salomaa 99; Paun, Rozenberg 98; Martin-Vide, Paun, Rozenberg, Salomaa 98; Czeizler, Czeizler 06; Paun, Paun 99; Czeizler, Czeizler, Kari, Salomaa 08)
29From DNA to TCS
- The genetic code
- Splicing systems
- Optimal encodings for DNA Computing
- Sticker systems
- Watson-Crick automata
- Combinatorics on DNA words
- Cellular computing
- DNA computation by self-assembly
30Combinatorics on DNA Words
- IDEA: Consider the word w and its WK-complement, WK(w), as equivalent
- The word ACTG CAGT CAGT can be considered repetitive (periodic) because it can be written as ACTG WK(ACTG)^2
- Generalize classical notions such as power of a word, border, primitive word, palindrome, conjugacy, commutativity
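The generalized notion of a power can be sketched as follows: a word counts as a pseudo-power of u if it starts with u and every subsequent block is either u or WK(u). The function name `is_pseudo_power` is a hypothetical label for this illustration, and WK is taken to be the reverse complement.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def wk(word):
    """Watson-Crick complement: complement each letter and reverse."""
    return "".join(COMPLEMENT[c] for c in reversed(word))

def is_pseudo_power(w, u):
    """True if w = u1 u2 ... un with u1 = u and every ui in {u, WK(u)}."""
    k = len(u)
    if k == 0 or len(w) % k != 0 or not w.startswith(u):
        return False
    blocks = {u, wk(u)}
    return all(w[i:i + k] in blocks for i in range(0, len(w), k))

print(wk("ACTG"))                               # CAGT
print(is_pseudo_power("ACTGCAGTCAGT", "ACTG"))  # True
```

Under the classical definition ACTGCAGTCAGT is not a power of any shorter word, so the generalized notion strictly enlarges the class of repetitive words.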
31Identity -> Antimorphic involution f
- Pseudo-palindrome (de Luca, De Luca 06; Kari, Mahalingam 09): u = f(u)
- Pseudo-commutativity (Kari, Mahalingam 08): u v = f(v) u
- Pseudo-bordered word (Kari, Mahalingam 07): w = v x = y f(v)
- Pseudoknot-bordered word (Kari, Seki 09): w = u v x = y f(u) f(v)
- Pseudo-conjugacy of u, v (Kari, Mahalingam 08): u x = f(x) v
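Two of these notions are easy to test directly when f is the Watson-Crick complement. The sketch below reads the definitions as u = f(u) for a pseudo-palindrome and w = v x = y f(v) for a pseudo-bordered word; the function names are hypothetical, and borders are restricted to nonempty proper prefixes as an assumption.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def f(word):
    """Antimorphic involution: here, the Watson-Crick (reverse) complement."""
    return "".join(COMPLEMENT[c] for c in reversed(word))

def is_pseudo_palindrome(u):
    """u = f(u)."""
    return u == f(u)

def pseudo_borders(w):
    """All nonempty proper prefixes v of w such that w ends with f(v),
    i.e. w = v x = y f(v)."""
    return [w[:i] for i in range(1, len(w)) if w.endswith(f(w[:i]))]

print(is_pseudo_palindrome("ACGT"))  # ACGT is its own WK complement
print(pseudo_borders("ACTTTGT"))     # prefixes A and AC match suffixes T and GT
```

Replacing `f` with the identity function recovers the classical palindrome-free notion of a border, which is the generalization step named in the slide title.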
32Fine and Wilf Theorem
33Extended Fine and Wilf Theorem
34Extended Fine and Wilf Theorem
35Lyndon-Schutzenberger Equation
36Extended Lyndon-Schutzenberger
37Extended Lyndon-Schutzenberger
38Cellular Computing
Photo courtesy of L.F. Landweber
39Ciliates Genetic Info Exchange
Photo courtesy of L.F. Landweber
40Ciliates Gene Rearrangement
Photo courtesy of L.F. Landweber
41Ciliates Bio-operations
42Ciliate Computing
- Guided Recombination System: a formal computational model based on contextual circular insertions and deletions
- Such systems have the computational power of Turing Machines (Landweber, Kari 99; Kari, Kari 99)
43Other ciliate computing models
- ld, hi, dlad model (Harju, Rozenberg 03; Harju, Petre, Rozenberg 03; Prescott, Ehrenfeucht, Rozenberg 03)
- Template guided recombination model (Angeleska, Jonoska, Saito, Landweber 07; Daley, McQuillan 06; Kari, Rahman 10)
- RNA guided recombination model (Nowacki et al. 07)
44From DNA to TCS
- The genetic code
- Splicing systems
- Optimal encodings for DNA Computing
- Sticker systems
- Watson-Crick automata
- Combinatorics on DNA words
- Cellular computing
- DNA computation by self-assembly
45DNA Computation by Self-Assembly (Mao, LaBean, Reif, Seeman, Nature, 2000)
46DNA self-assembly model (Adleman00, Winfree98)
- Tile: a square with the edges labelled from a finite alphabet of glues (Wang 61)
- Tiles cannot be rotated
- Two adjacent tiles on the plane stick if they
have the same glue at the touching edges
47Dynamic Self-Assembly
- Tile System T: finite set of tiles, unlimited supply of each tile type
- Supertiles self-assemble with tiles from T
- Start with an arbitrary single tile seed
- Proceed by incremental additions of single tiles
that stick
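The growth process can be sketched as a small simulation. This is a simplified reading of the model, with assumptions stated loudly: a tile is taken to stick at an empty position if its glue equals the glue of at least one already placed neighbour on the touching edge, mismatches on other edges are ignored, and the names `Tile`, `sticks`, and `grow` are hypothetical.

```python
from collections import namedtuple

# Glue labels on the four edges; tiles are never rotated.
Tile = namedtuple("Tile", "north east south west")

# Neighbour offset -> (edge of the new tile, touching edge of the neighbour).
NEIGHBOURS = {(0, 1): ("north", "south"), (1, 0): ("east", "west"),
              (0, -1): ("south", "north"), (-1, 0): ("west", "east")}

def sticks(tile, pos, supertile):
    """Assumed rule: same glue as at least one placed neighbour."""
    x, y = pos
    for (dx, dy), (mine, theirs) in NEIGHBOURS.items():
        nb = supertile.get((x + dx, y + dy))
        if nb is not None and getattr(tile, mine) == getattr(nb, theirs):
            return True
    return False

def grow(tiles, seed, steps):
    """Start from a single seed tile and add one sticking tile per step."""
    supertile = {(0, 0): seed}
    for _ in range(steps):
        frontier = sorted({(x + dx, y + dy)
                           for (x, y) in supertile
                           for (dx, dy) in NEIGHBOURS} - set(supertile))
        move = next(((pos, t) for pos in frontier for t in tiles
                     if sticks(t, pos, supertile)), None)
        if move is None:
            break  # no tile sticks anywhere: the supertile cannot grow
        supertile[move[0]] = move[1]
    return supertile

t = Tile("a", "b", "a", "b")
print(len(grow([t], t, 5)))
```

The simulation picks the first sticking tile in a fixed order, so it explores only one of the many possible assembly sequences.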
48Self-Assembly Problem
- Given a tile system T, can arbitrarily large supertiles self-assemble with tiles from T?
- Equivalent to: given a tile system T, does there exist an infinite ribbon of tiles from T?
49Sample Results
- Undecidability of existence of an infinite ribbon (L. Adleman, J. Kari, L. Kari, D. Reishus, P. Sosik 09)
- Consequence: undecidability of existence of arbitrarily large supertiles that self-assemble from a given tile set, starting from an arbitrary seed
- Self-assembly model with variable strength and negative strength (repelling) glues (Doty, Kari, Masson 10)
50DNA Nanotechnology (Chen, Seeman, Nature, 01)
51DNA Clonable Octahedron (Shih, Joyce, Nature 04)
52Nanoscale DNA Tetrahedra (Goodman, Turberfield, Science, 05)
53DNA Origami (Rothemund, Nature, 2006)
54From DNA to TCS
- The genetic code
- Splicing systems
- Optimal encodings for DNA Computing
- Sticker systems
- Watson-Crick automata
- Combinatorics on DNA words
- Cellular computing
- DNA computation by self-assembly
55Impact of DNA Computing on Theoretical Computer
Science
- Novel computing paradigms abstracted from biological phenomena
- Alternative physical substrates on which to implement computations, e.g. DNA
- Viewing natural processes as computations has become essential, desirable, and inevitable
- These developments challenge our assumptions, and our very definition of computation
- Biology and Computer Science, life and computation, are related (Adleman)
56Our Challenge
- Discover a new, broader notion of computation
- Understand the world around us in terms of information processing
- "Biology and Computer Science, life and computation, are related. I am confident that at their interface great discoveries await those who seek them." (Adleman 98)