Biosimplicity: Engineering Simple Life - PowerPoint PPT Presentation

Slides: 43
Provided by: kni68
Transcript and Presenter's Notes

1
Biosimplicity: Engineering Simple Life
  • Tom Knight
  • MIT Computer Science and
  • Artificial Intelligence Laboratory

Simplicity is the ultimate sophistication --
Leonardo da Vinci
2
Measuring Complexity
  • Number of components
  • Size of description
  • Size of functional model

Have you ever thought... about whatever man
builds, that all of man's industrial efforts, all
his calculations and computations, all the nights
spent over working draughts and blueprints,
invariably culminate in the production of a thing
whose sole and guiding principle is the ultimate
principle of simplicity? In any thing at all,
perfection is finally attained, not when there
is no longer anything to add, but when there is
no longer anything to take away. -- Antoine
de Saint-Exupéry, in "Wind, Sand, and Stars"
3
Engineered Simple Organisms
  • modular
  • understood
  • malleable
  • low complexity
  • Start with a simple existing organism
  • Remove structure until failure
  • Rationalize the infrastructure
  • Learn new biology along the way

The chassis and power supply for our computing
4
Some history
  • Confusion over what PPLO/Mycoplasma were
  • The microbe of pleuropneumonia, Nocard 1896
  • 1932 isolation of PPLO. Koch's postulates.
  • 1958 Klieneberger-Nobel identifies them as free
    living bacterial species
  • Morowitz 1962, Scientific American: the smallest living cell
  • 1980 Gilbert effort to sequence M. capricolum
  • 1982 Morowitz complete understanding of life
  • 1996 Fraser et al. M. genitalium sequence
  • 1999 Hutchison et al. Minimal genome set for M.
    genitalium

5
Relative Complexity
[Figure: genome sizes on a log scale (Log Genome Size, base pairs; axis from 100 to 100G), with "Alive" and "Autotroph" thresholds marked]
  • Gene
  • Plasmid
  • T7 phage (36 kb)
  • Mycoplasma genitalium (580 kb)
  • Mesoplasma florum (793 kb)
  • E. coli (4.6 Mb)
  • S. cerevisiae (12 Mb)
  • Human (3.3 Gb)
  • Lily (12 Gb)
6
From Giovannoni et al. 2005
7
Choosing an organism
  • Safe
  • BSL-1 organism -- insect commensal
  • Un-regulated
  • Not a crop plant or domesticated animal pathogen
  • Fast growing
  • 40 minute doubling time
  • vs. six hours for M. genitalium
  • Convenient to work with
  • Facultative anaerobe
  • Small genome
  • Known sequence
  • Complete annotation
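To put the two doubling times above in perspective, here is a minimal sketch of ideal exponential growth. It assumes a constant doubling time with no lag phase and no nutrient limit, which real cultures do not satisfy; the function names are illustrative only.

```python
def doublings(hours: float, doubling_time_min: float) -> float:
    """Number of doublings completed in `hours` at a fixed doubling time."""
    return hours * 60.0 / doubling_time_min


def fold_increase(hours: float, doubling_time_min: float) -> float:
    """Idealized fold increase in cell number: 2 ** (number of doublings)."""
    return 2.0 ** doublings(hours, doubling_time_min)


# Doubling times taken from the slide: 40 minutes vs. six hours (360 minutes).
FLORUM_MIN = 40
GENITALIUM_MIN = 360
```

Under these assumptions, `doublings(24, FLORUM_MIN)` gives 36 doublings in a day against 4 for `doublings(24, GENITALIUM_MIN)`, which is the practical difference between an overnight culture and a week-long one.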

8
The Mollicute Bibliome
  • Complete collection of mycoplasma related papers
  • 6,418 and counting
  • All books and book chapters also
  • Endnote Refworks
  • Downloaded PDFs for articles > 1995
  • Scanned articles and books, OCR with Abbyy
    Finereader
  • Plans for a Google appliance search engine
  • Collaboration for shallow semantic
    understanding
  • people.csail.mit.edu/tk/mfpapers/
    user: meso, pass: meso

9
(No Transcript)
10
Mesoplasma florum
Scale bar: 1 µm
11
Tomographic TEM
Courtesy Jensen Lab, Caltech
12
Culture Medium 1161
  • Beef Heart infusion
  • 4% sucrose
  • Fresh yeast culture broth
  • 20% horse serum
  • Penicillin
  • Phenol red

13
Defined medium
  • Amino acids (minus asp, glu)
  • Guanine
  • Uracil
  • Thymine
  • Adenine
  • Glucose
  • BSA
  • Palmitic acid
  • Oleic acid
  • Sodium phosphate
  • KCl
  • Magnesium sulfate
  • Glycerol
  • Spermine
  • Nicotinic acid
  • Thiamine
  • Riboflavin
  • Pyridoxamine
  • Thioctic acid
  • Coenzyme A
14
Synthesis vs. Import
  • Mycoplasma import virtually all small biochemical
    molecules
  • Each import is done with a specific membrane
    protein; some are capable of importing a class
  • Complexity is reduced if the import is simpler
    than the synthesis
  • Example of the opposite
  • Glutamine → Glutamic acid
  • Asparagine → Aspartic acid

15
Sequencing the genome
  • First sequenced small portions of the genome to
    test that we had the correct species
  • Compared the results to Genbank entries
  • Sequenced PTS system gene, identical to reported
    sequence
  • Sequenced 16S rRNA (unreported)
  • Discovered identical to Mesoplasma entomophilum
    16S rRNA sequence; probably the same species
  • Genbank entry
  • Measured genome size with pulsed-field gels
  • Sequenced 12 of genome to see what we were up
    against

16
PFGE of Mesoplasma genomic DNA
Lanes: y = yeast marker, l = lambda marker, me = Mesoplasma
entomophilum, mf = Mesoplasma florum, ml = Mesoplasma lactucae
Conditions: 1.2% agarose, 9 °C, 6 V/cm, ramped 90 - 120 sec, 48
hours
17
I-CeuI digestion
  • Special restriction enzymes cut only at 23S rRNA
    sites
  • 5' ..TAACTATAACGGTC CTAAGGTAGCGA.. 3'
  • 3' ..ATTGATATTGCCAGGATT CCATCGCT.. 5'
  • Calculate the number of rRNA sites in the genome
    from the number of cut fragments
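The counting argument above can be sketched as follows. On a circular chromosome, n cuts give n fragments, so the number of bands on the gel equals the number of 23S rRNA sites, provided no two fragments co-migrate. The cut positions in the usage note are made-up values for illustration, not measured data, and the function name is hypothetical.

```python
def circular_fragments(genome_length: int, cut_positions: list[int]) -> list[int]:
    """Fragment sizes produced by cutting a circular genome at the given positions.

    Positions are 0-based coordinates on the circle. With n distinct cuts
    (n >= 1) a circle yields exactly n fragments; with no cuts the intact
    circle is returned as a single entry.
    """
    cuts = sorted({p % genome_length for p in cut_positions})
    if not cuts:
        return [genome_length]  # uncut circle
    sizes = []
    for i, pos in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        # Distance to the next cut, wrapping around the origin.
        # A single cut linearizes the circle into one full-length fragment.
        sizes.append((nxt - pos) % genome_length or genome_length)
    return sizes
```

For example, `circular_fragments(793281, [100000, 300000])` returns two fragments whose sizes sum to the genome length, which is the same check one would apply to the band sizes read off the gel.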

18
I-CeuI digests
19
rRNA sequences
  • Degenerate primers
  • The two rRNA sequences are different

20
(No Transcript)
21
Library creation
  • Randomly cut genomic DNA with EcoRI
  • Shotgun cloned into pUC18 vector
  • Sequenced the inserts (0.1 - 8 kb)
  • Sheared the genomic DNA with a needle
  • End repaired
  • Cloned into defective lambda phage vector
  • Packaged vector into phage heads
  • Infected E. coli cells with phage
  • 40 Kb inserts

22
What we learned from partial sequence
  • Almost all old friends
  • Little or no extra junk
  • Inter-gene sequences small (-4 to 30 bp)
  • Little transcriptional control
  • High AT vs. GC content: 27% average GC
  • 20% in typical genes, even less in control
    regions
  • 40% in rRNA regions
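The GC figures above are simple base-composition ratios. A minimal sketch of how such numbers can be computed, both for a whole sequence and in sliding windows (useful for spotting GC-rich rRNA regions against an AT-rich background), is shown here. The function names and window parameters are illustrative, not taken from the original analysis.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """GC fraction in consecutive windows, returned as (start, fraction) pairs."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Multiplying `gc_fraction` by 100 gives the percentage form used on the slide.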

23
Origin of Replication
24
Whitehead Agreement
  • Whitehead agreed in January 2002 to sequence the
    organism
  • Estimated to take about two hours of time on
    their sequencers
  • "Sure, we can do it Tom, but what do we do with
    the rest of the day after the coffee break?"
  • How many other organisms like this are there?
  • 300
  • Why don't we sequence them all?
  • Good draft available November 2002
  • Gaps closed July 2003 -- final November 2003

25
Gap closure
  • 9 gaps remained
  • Long range PCR
  • Primer walking
  • One difficult sequence
  • Poly A region 16-17 bp long
  • Sequencing stuttered
  • Reprime with aaaaaaaaaaaaaaaag
  • Repeat region 186 bp in surface lipoprotein
  • Give up on accurate sequence, PCR for length
  • Final assembly verification
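Long homopolymer runs like the poly A region above can be located in a draft assembly before sequencing trouble shows up. The sketch below finds such runs and builds a primer made of the tail of the run plus the next base. That is only one plausible reading of how the slide's primer (16 a's followed by g) was anchored, so treat the helper as an assumption rather than the method actually used.

```python
import re


def homopolymer_runs(seq: str, base: str = "A", min_len: int = 16) -> list[tuple[int, int]]:
    """Return (start, length) for every run of `base` at least `min_len` long."""
    pattern = re.compile(f"{re.escape(base)}{{{min_len},}}", re.IGNORECASE)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq)]


def anchored_primer(seq: str, run_start: int, run_len: int, n_bases: int = 16) -> str:
    """Last `n_bases` of a run plus the single base that follows it.

    Hypothetical helper: the trailing non-run base fixes where the primer
    sits, so it cannot slide along the homopolymer.
    """
    run_end = run_start + run_len
    if run_len < n_bases or run_end >= len(seq):
        raise ValueError("run too short or no base after the run")
    return seq[run_end - n_bases:run_end + 1]
```

On a toy sequence such as `"GCT" + "A" * 17 + "GCC"`, the run is reported at position 3 with length 17, and the anchored primer is sixteen A's followed by G.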

26
Genome characteristics
  • 793,281 base pairs
  • 26.52% G+C
  • 682 protein coding regions
  • UGA for tryptophan
  • No CGG codon or corresponding tRNA
  • Classic circular genome
  • oriC, terminator region, gene orientation
  • 39 stable RNAs
  • 29 tRNAs
  • 2x 16S, 23S, 5S
  • RNAse-P, tmRNA, SRP
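Reading UGA as tryptophan matters for any tool that translates these genes: with the standard code, every such codon looks like a premature stop. The sketch below builds both tables from the conventional TCAG ordering (the Mycoplasma/Spiroplasma code is NCBI translation table 4) and translates a toy sequence, not a real Mfl gene. The absent CGG codon noted above is not modeled; the table still maps it to arginine.

```python
BASES = "TCAG"
# Amino acids for the 64 codons in TTT, TTC, TTA, TTG, TCT, ... order.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Table 4 differs from the standard code only at TGA (index 14): stop -> W.
MOLLICUTE = STANDARD[:14] + "W" + STANDARD[15:]


def codon_table(amino_acids: str) -> dict[str, str]:
    """Map each DNA codon to its one-letter amino acid ('*' for stop)."""
    codons = (a + b + c for a in BASES for b in BASES for c in BASES)
    return dict(zip(codons, amino_acids))


def translate(dna: str, table: dict[str, str]) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

Translating the toy sequence `ATGTGATAA` gives `MW` with the mollicute table but only `M` with the standard one, which is why genes from this organism usually need their TGA codons changed before expression in E. coli.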

27
Standard Motifs
  • -10 present, usually very highly conserved
  • Often preceded by a TG 1-2 bp upstream
  • Seldom a conserved -35 region
  • RBS is standard Shine-Dalgarno
  • Alternate RBS matches complementary region of 16S
    rRNA UAACAACAU (Loechel 91)
  • Standard stem-loop terminators with loop TTAA
  • 6-8 poly T tail in forward direction
  • dnaA box TTATCCACA
  • Four ribo-box sequences (Thi, Ile, Val, Guanine)
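Fixed motifs such as the dnaA box above can be located with an exact-match scan of both strands. This is a minimal sketch under that assumption; real motif finding usually tolerates mismatches or uses position weight matrices, which this does not attempt. Function names are illustrative.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq: str, motif: str) -> list[tuple[int, str]]:
    """Exact matches of `motif` on both strands.

    Returns (start, strand) pairs; `start` is the 0-based position on the
    forward strand, and strand is '+' or '-'. A palindromic motif is
    reported once per strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


DNAA_BOX = "TTATCCACA"  # consensus quoted on the slide
```

Clusters of `find_motif(genome, DNAA_BOX)` hits are one of the usual signals for an origin of replication.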

28
(No Transcript)
29
Understand the metabolism
  • Identify major metabolic pathways by finding
    critical genes coding for known enzymes
  • Predict necessary enzymes which may not have been
    found
  • Evaluate the list of unknown function genes for
    candidates
  • Build the major metabolic pathway map of the
    organism
  • Consider elimination of entire pathways

30
[Figure, G. Fournier 02/23/04: Identified Metabolic Pathways in Mesoplasma florum. A pathway map in which each pathway and transporter is annotated with its Mfl gene identifiers; the gene-to-pathway layout did not survive transcription. Labelled elements:]
  • Sugar import: PTS II system (sucrose, trehalose, xylose, beta-glucoside, glucose, fructose, unknown), ribose ABC transporter, chitin degradation
  • Energy and central metabolism: glycolysis, pentose-phosphate pathway, L-lactate and acetate, ATP synthase complex, electron carrier pathways, pyridine nucleotide cycling (NAD, NADP, NADH, NADPH), flavin synthesis (FMN, FAD)
  • Lipids: lipid synthesis, fatty acid/lipid transporter, sn-glycerol-3-phosphate ABC transporter, cardiolipin/phospholipids, membrane synthesis
  • Nucleic acids: purine/pyrimidine salvage, PRPP, xanthine/uracil permease, DNA polymerase, RNA polymerase, competence/DNA transport, degradation
  • Translation: ribosome, ribosomal RNA, transfer RNA, messenger RNA, tRNA aminoacylation, formyl-THF synthesis, met-tRNA formylation
  • Protein export: Signal Recognition Particle (SRP), protein secretion (ftsY), protein translocation complex (Sec)
  • Amino acid transport: oligopeptide, spermidine/putrescine, glutamine and unknown amino acid ABC transporters; putrescine/ornithine and lysine APC transporters; arginine/ornithine antiporter; alanine/Na and glutamate/Na symporters
  • Other transporters: K, Na; metal ion; cobalt ABC; phosphate ABC; phosphonate ABC; formate/nitrate; malate?; niacin?; riboflavin?; unknown substrate
  • Membrane proteins: variable surface lipoproteins, hypothetical lipoproteins, hypothetical transmembrane proteins
31
How Simple is this?
  • Missing cell wall, outer membrane
  • Missing TCA cycle
  • Missing amino acid synthesis
  • Missing fatty acid synthesis
  • One sigma factor
  • Small number of DNA-binding proteins
  • One insertion sequence, probably not active
  • One restriction system (Sau3AI-like)
  • CTG/CAG methylation (function?)
  • Evidence for shared protein function
  • MDH/LDH (Pollack 97, Crit Rev Microbiol 23:269)

32
Minimal is not always simple
  • Shared function of parts
  • Overlapped genes
  • Tradeoff of import vs. synthesis
  • Example
  • Television set design
  • Shared deflection coil, high voltage power
    supply, isolated filament supply
  • Three functions, a single circuit, a difficult
    engineering, modeling, debugging, and repair task
  • How many genes have multiple functions?

33
DNA Methylation
  • Bisulfite conversion of genomic DNA
  • Sequencing of converted DNA to identify
    methylated C positions
  • Results: GATC sequences, as expected
  • Unexpected CAG and CTG sequences
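The logic behind bisulfite sequencing is that unmethylated C is converted and reads as T, while methylated C is protected and still reads as C. A minimal sketch of the resulting call, assuming the converted read is already aligned to the reference on the same strand with no indels (real pipelines handle alignment, both strands, and incomplete conversion, none of which is modeled here):

```python
def methylated_positions(reference: str, converted_read: str) -> list[int]:
    """0-based positions where a reference C is still read as C after conversion."""
    if len(reference) != len(converted_read):
        raise ValueError("reference and read must be the same length (pre-aligned)")
    return [
        i
        for i, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper()))
        if ref == "C" and obs == "C"
    ]


def sequence_context(reference: str, pos: int, flank: int = 2) -> str:
    """Bases around a called position, for grouping calls by motif."""
    return reference.upper()[max(0, pos - flank):pos + flank + 1]
```

Grouping the calls by `sequence_context` is how recurring motifs, such as the GATC, CAG and CTG sites reported above, would stand out. The toy sequences used to exercise this sketch are invented, not data from the experiment.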

34
Current work
  • Array experiments
  • Close species
  • Transcriptional units, pseudo-genes
  • TRASH
  • Protein species by LC/LC/MS/MS
  • Elimination of the restriction system
  • Plasmid system
  • pBG7AU based
  • Recombination system
  • Positive/negative selection
  • Yeast chromosome transfer
  • Genome edits to reduce size
  • Genome edits to modularize
  • Genome edits to eliminate complexity
  • Use as a construction chassis

35
Reconstruct the Genome
  • Use recombination techniques to edit the genome
  • Eliminate unnecessary genes
  • Remove overlaps
  • Standardize promoters, ribosomal binding sites
  • Identify transcriptional and translational
    regulators
  • Recode proteins to use a reduced portion of the
    coding space
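Recoding to a reduced portion of the coding space can be pictured as choosing one codon per amino acid and rewriting every gene with it. The sketch below does this under the mollicute code (NCBI table 4, UGA = Trp). The selection rule, picking the synonymous codon with the fewest G/C bases, is purely an assumption for illustration given the genome's low GC content; the slide does not say which codons would be kept.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 4, codons in TTT, TTC, TTA, TTG, TCT, ... order.
TABLE4 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), TABLE4)}


def preferred_codons() -> dict[str, str]:
    """One codon per amino acid (and stop): fewest G/C bases, ties alphabetical."""
    best: dict[str, tuple[tuple[int, str], str]] = {}
    for codon, aa in CODON_TO_AA.items():
        key = (codon.count("G") + codon.count("C"), codon)
        if aa not in best or key < best[aa][0]:
            best[aa] = (key, codon)
    return {aa: codon for aa, (_, codon) in best.items()}


def recode(cds: str) -> str:
    """Rewrite a coding sequence with the preferred codons; the protein is unchanged."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    preferred = preferred_codons()
    return "".join(preferred[CODON_TO_AA[cds[i:i + 3]]] for i in range(0, len(cds), 3))
```

With this rule 21 codons remain in use (20 amino acids plus stop), leaving the other 43 free. Any real scheme would also have to respect overlapping genes, RNA structure and regulatory motifs embedded in coding regions.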

The code is 4 billion years old; it's time for a
rewrite
36
YAC mutagenesis
  • Bring up YAC technology
  • Spheroplasts, old YAC plasmid sequencing
  • Triple transform with MF chromosome and
  • pRML1 (Spencer 92)
  • pRML2
  • Genome inactive except for ARS, telomeres,
    selection markers
  • Use yeast recombination systems for genome
    editing
  • Isolate and recircularize YAC to form a new
    genome
  • Lipid encapsulate genome into vesicles
  • Fuse vesicles with genome-killed wild type cells

37
Proteome
  • Collaboration with Steve Tannenbaum / Yingwu Wang
  • 2-D gels, MS spot ID
  • LC/LC/MS/MS ID of trypsin digests

38
Riboswitch Analysis
  • Collaboration with Ron Breaker / Adam Roth
  • Discovery of unique riboswitches specific for GTP
    rather than dGTP
  • Found in no other sequenced genomes
  • Analysis of close relatives under way

39
Engineer plasmids
  • No known plasmids for this class of organism
  • Renaudin has made plasmids using the chromosomal
    OriC as the replicative element
  • Lartigue 03
  • M. mycoides has pADB201 and pBG7AU rolling circle
    plasmids similar to pE194 (1082 bp)
  • We know the antibiotic sensitivities and have
    working resistance genes

40
Kit Part the genome
  • Make Biobrick parts from each gene, tRNA,
    promoter, other part-like genome element
  • Attempt to develop techniques for recombining
    parts into coherent modules
  • Develop techniques for assembling and modeling
    the resulting structures

41
Thanks to
  • Harold Morowitz
  • Greg Fournier
  • Gail Gasparich
  • Bob Whitcomb
  • Eric Lander
  • Bruce Birren
  • Nicole Stange-Thomann
  • George Church
  • Roger Brent
  • Grant Jensen
  • Yingwu Wang
  • Nick Papadakis
  • Ron Weiss
  • Drew Endy
  • Randy Rettberg
  • Austin Che
  • Reshma Shetty
  • MIT Synthetic biology working group
  • DARPA, NTT, NSF, Microsoft

42
Synthetic Biology
  • An alternative to understanding complexity is to
    remove it
  • This complements rather than replaces standard
    approaches
  • Engineering synthetic constructs will be easier
  • Enabling quicker more facile experiments
  • Enabling deeper understanding of the basic
    mechanisms
  • Enabling applications in nanotechnology, medicine
    and agriculture

Simplicity is the ultimate sophistication --
Leonardo da Vinci