Title: A Systems Biology Research Program at DOE
1A Systems Biology Research
Program at DOE
- Ari Patrinos
- Office of Biological and Environmental Research
- U.S. Department of Energy
2High Level Tank Wastes
- Single Shell Tanks at Hanford
- 149 tanks
- 65 leakers
- 35M gallons of wastes
- 190K tons of chemicals
- 132M curies of radioactivity
- 75% Sr-90, 24% Cs-137
- Issues
- Waste characterization
- Contaminant transport in the subsurface
- Waste treatment/disposal
- Characterization/monitoring
- Tank corrosion
3DOE Subsurface Contamination
- Contaminated Soil/Groundwater
- 29 million m³ contaminated soil
- 36 million m³ mill tailings
- 4.7 billion m³ contaminated groundwater
- Chemicals, metals and radionuclides
- Contaminant concentrations may exceed drinking
water standards at some sites for hundreds of
years
4U.S. Department of Energy
Climate Modeling
Office of Science
Increased computational capability enables
increased resolution for realistic simulations of
ocean (and atmosphere) processes.
Note the emergence of the simulated Kuroshio
current off the coast of Japan with
increased resolution.
5(No Transcript)
6A Diverse Technology Strategy Supports a Broad
and Competitively Balanced Portfolio (There's No
Silver Bullet)
- Nuclear
- Fission
- Fusion
- Carbon Capture and Storage
- Advanced Transformation Systems
- Electricity
- Hydrogen
- Bio-derivative fuels
- Energy Intensity Improvements
- Industry
- Buildings
- Transportation
- Wind and Solar
- Biotechnology
- Soils
- Biomass crops
- Advanced biotechnology
7Achieving the Reference Case Will Not Necessarily
Be Easy
- Assumed Advances In
- Fossil Fuels
- Energy intensity
- Nuclear
- Renewables
- Gap Technologies
- Carbon capture and disposal
- Adv. fossil
- H2 and Adv. Transportation
- Biotechnologies
- Soils, Bioenergy, adv. Biological energy
The Gap
8Adding to the Portfolio Will Be Essential
Teragrams of carbon
960
90
10(No Transcript)
11Sequestering carbon in terrestrial
ecosystems: Multiple benefits, but research is
needed
- Potential opportunities
- Optimize land use
- Biomass above and below ground
- Soil and groundwater carbon
- Multiple benefits
- Reduced erosion
- Improved soil fertility
- Increased water retention
- Research needed
- Ecosystem dynamics assessment
- Soil improvement
- Land ecosystem management
- Species selection and biotechnology
- Measurement and monitoring
Small changes can lead to large benefit
Photosynthesis: 62 GtC/y
Respiration: -60 GtC/y
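The arithmetic behind "small changes can lead to large benefit" can be sketched as follows. The two fluxes are the figures quoted on the slide; the 1% increase in photosynthesis is a purely illustrative assumption, not a projection from the talk.

```python
# Gross terrestrial carbon fluxes quoted on the slide, in GtC per year.
PHOTOSYNTHESIS_GTC = 62.0  # uptake
RESPIRATION_GTC = 60.0     # release (shown as -60 GtC/y on the slide)


def net_uptake(photosynthesis: float, respiration: float) -> float:
    """Net carbon retained by terrestrial ecosystems, in GtC/y."""
    return photosynthesis - respiration


baseline = net_uptake(PHOTOSYNTHESIS_GTC, RESPIRATION_GTC)

# Hypothetical scenario: photosynthesis rises by 1%, respiration unchanged.
boosted = net_uptake(PHOTOSYNTHESIS_GTC * 1.01, RESPIRATION_GTC)

print(f"baseline net uptake: {baseline:.2f} GtC/y")
print(f"net uptake after +1% photosynthesis: {boosted:.2f} GtC/y")
print(f"relative change in net uptake: {(boosted / baseline - 1) * 100:.0f}%")
```

Because the net flux is a small difference between two large gross fluxes, a 1% shift in one of them changes the net by roughly a third in this sketch.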
12The Populus Tree for Carbon Sequestration
A Populus tree
Greenhouse testing
13U.S. Department of Energy
Ocean Carbon Sequestration: Southern Ocean Iron
Enrichment Experiments (SOFeX)
Office of Science
Iron added to ocean surface (50 ppt, 100 times
background) triggered massive phytoplankton bloom
Vertical profiles of particulate organic carbon
(POC) outside (left) and inside (right) an
enriched patch
Phytoplankton from outside (left) and inside
(right) an enriched patch
14Antarctica
Sites of ocean fertilization
New Zealand
15Remotely Operated Vehicle (the Tiburon) used to
test the environmental impacts of deep sea CO2
sequestration
16Deep-Sea CO2 Experiment
Closeup of CO2 release
Collecting animals with suction sampler
CO2 Treatment Station At 3,000 meters
CO2 release into Corral
Placing animals in cage
Time lapse video
Benthic organisms sensitive to pH drop that
occurs as CO2 diffuses out of initially formed
clathrate
Animal cage
Full CO2 corral
17Biotechnology for Energy
- Terrestrial plants produce 120B metric tons of biomass per year
- The stored energy is 2,400 Quads (1 Quad = 10^15 BTU)
- World energy consumption for 2001: 315 Quads
- In the U.S., 85% of energy needs from coal, natural gas and petroleum
- Carbon emissions in the U.S.: 1.6B tons of C
- 25 Quads of petroleum had to be imported
- Carbon neutral energy in the U.S. only 3% of total
- Only 0.2 Quads of corn-derived ethanol
POTENTIAL FOR BIOMASS IS GREATER!
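The comparison this slide is drawing can be made explicit with a few ratios. This is a minimal sketch that uses only the Quad figures quoted above; the variable names are illustrative.

```python
# Figures quoted on the slide, in Quads (1 Quad = 10**15 BTU).
BIOMASS_ENERGY_QUADS = 2400.0     # energy stored in annual terrestrial biomass
WORLD_CONSUMPTION_QUADS = 315.0   # world energy consumption, 2001
IMPORTED_PETROLEUM_QUADS = 25.0   # U.S. petroleum imports
CORN_ETHANOL_QUADS = 0.2          # U.S. corn-derived ethanol

# How many times over annual biomass energy covers world consumption.
biomass_to_consumption = BIOMASS_ENERGY_QUADS / WORLD_CONSUMPTION_QUADS

# Corn ethanol as a fraction of imported petroleum energy.
ethanol_vs_imports = CORN_ETHANOL_QUADS / IMPORTED_PETROLEUM_QUADS

print(f"biomass energy / world consumption: {biomass_to_consumption:.1f}x")
print(f"corn ethanol / imported petroleum: {ethanol_vs_imports:.1%}")
```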
18Biomass Potential
- 2B gallons of ethanol produced per year in U.S.
- 2 stage process: corn to carbohydrate, then carbohydrate + yeast to ethanol
- 1 bushel of corn = 2.5 gallons of ethanol (one acre = 125 bushels)
- Entire U.S. corn crop = 22 billion gallons of ethanol = 20% of U.S. automotive needs
- Efficiency of conversion of sunlight to carbohydrate: 1%
- Efficiency of conversion of cornstarch to ethanol: 50%
- Overall Efficiency: 0.5%
A SMALL INCREASE IN
THE EFFICIENCY CAN SIGNIFICANTLY
INCREASE THE POTENTIAL!!
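The numbers on this slide can be checked with a short calculation. The sketch below uses the slide's figures; the linear scaling of ethanol yield with photosynthetic efficiency in `share_of_automotive_needs` is an assumption made here for illustration, not a claim from the talk.

```python
# Figures quoted on the slide.
GALLONS_PER_BUSHEL = 2.5
BUSHELS_PER_ACRE = 125
SUNLIGHT_TO_CARBOHYDRATE = 0.01  # 1%
CORNSTARCH_TO_ETHANOL = 0.50     # 50%
CROP_SHARE_OF_AUTOMOTIVE = 0.20  # entire U.S. corn crop, 20% of needs

gallons_per_acre = GALLONS_PER_BUSHEL * BUSHELS_PER_ACRE
overall_efficiency = SUNLIGHT_TO_CARBOHYDRATE * CORNSTARCH_TO_ETHANOL


def share_of_automotive_needs(sunlight_efficiency: float) -> float:
    """Share of U.S. automotive needs met by the corn crop.

    ASSUMPTION: ethanol yield scales linearly with the
    sunlight-to-carbohydrate efficiency, everything else held fixed.
    """
    scale = sunlight_efficiency / SUNLIGHT_TO_CARBOHYDRATE
    return CROP_SHARE_OF_AUTOMOTIVE * scale


print(f"ethanol per acre: {gallons_per_acre} gallons")
print(f"overall efficiency: {overall_efficiency:.1%}")
print(f"share at 2% photosynthetic efficiency: {share_of_automotive_needs(0.02):.0%}")
```

Under that linear assumption, raising the sunlight-to-carbohydrate efficiency from 1% to 2% would double the share of automotive needs the same crop could cover.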
19Microbes Produce Various Fuels
H2O, CO2, O2
Yeast: Ethanol
Clostridia: Hydrogen
Methanogens: Methane
20A More Efficient Process for Hydrogen Production
21Green Algae
- Sulfur deprivation induces hydrogenases
- Can a continuous process for hydrogen production be developed?
- For example, a reengineered hydrogenase with lower sensitivity to oxygen
22Genomics:GTL - A Systems Biology Research Program
From Molecules to Cells to Ecosystems
Ecosystems
Subcellular
Cellular
Identification, subcellular location, and
dynamics of molecular machines
Regulation of gene expression in individual cells
Who is expressing what, when, where, and under
what conditions? How do they work together?
23GTL Program Goals
Using DNA sequence and high-throughput technologies:
- Goal 1: Identify and characterize the molecular machines of life
- Goal 2: Characterize gene regulatory networks
- Goal 3: Characterize the functional repertoire of complex microbial communities in their natural environments at the molecular level
- Goal 4: Develop the computational capabilities to advance understanding of complex biological systems and predict their behavior
Systems Biology: Gain a comprehensive and predictive understanding of the dynamic, interconnected processes underlying living systems
24Microbes Provide Biotechnology Payoffs for the
Nation
25so we should venture on the study of every kind
of animal without distaste for each and all will
reveal to us something natural and something
beautiful. Aristotle's On the Parts of Animals
26Tree of Life
Bacteria Archaea
Eukarya
Euryarchaeota
Methanosarcina
Chlamydiae
Thermoplasma
Thermococcus
Slime molds
Crenarchaeota
Thermoproteus Pyrodictium
Entamoebae
Ciliates
Green nonsulfur bacteria
Microsporidia
Diplomonads
- Representative species completely sequenced or in the process of being sequenced by DOE (6 Feb 2003)
27Sequencing to Date (6/29/04)
- http://www.genomesonline.org/
- Published Complete Genomes (including 4 chromosomes): 199
- Prokaryotic Ongoing Genomes: 508
- Eukaryotic Ongoing Genomes (including 8 chromosomes): 421
- Total: 1,128
- (18% increase since 1/13/04)
28Using the natural diversity of microbes
to find biotechnology solutions
Methane production
Ocean carbon pumping
29 7 Core Genomics:GTL Projects
Molecular Machines
Ecogenomics Genomics
Shewanella Federation
30Genomics:GTL Center for Molecular and Cellular
Systems
- Oak Ridge National Lab; Pacific Northwest National Lab; Argonne National Lab; Sandia National Lab; U of North Carolina, Chapel Hill; U of Utah
- Develop and use technologies needed to identify and characterize a complete set of microbial multiprotein complexes
- Focus on the carbon fixing, hydrogen producing, organic degrading microbe R. palustris, and on Shewanella
- Current goal: characterize 500 complexes this year!
ORNL/PNNL Protein Complex Pipeline
31Rapid Deduction of Stress Response Pathways in
Metal/Radionuclide Reducing Bacteria
- Lawrence Berkeley National Lab; Sandia National Lab; Oak Ridge National Lab; U of Washington; U of Missouri; Miami U; Diversa
- Develop computational models for behavior of microbial gene regulatory networks in response to environmental conditions at DOE waste sites
- Targets: Desulfovibrio, Geobacter, Shewanella
- Implemented bacterial systems biology pipeline: controlled biomass production, physiologic profiling, metabolomics, imaging, proteomics, computational framework for comparative analysis of genomic and functional genomic data
- Rapid assessment of the effect of environmental conditions on cellular behavior
32Carbon Sequestration in Synechococcus: From
Molecular Machines to Hierarchical Modeling
- Sandia National Lab; Oak Ridge National Lab; Lawrence Berkeley National Lab; Los Alamos National Lab; U California, San Diego; U of Georgia; U of Michigan; U of California, Riverside; U of Illinois; National Center for Genome Resources
- Develop experimental/computational methods to understand proteins, protein-protein interactions, and regulatory networks in a microbe with a significant role in the carbon cycle
- Developed new microarray analysis method with increased sensitivity and reduced noise
- Developed microbial simulation model that will enable microbial biochemistry to be modeled together with the behavior of the molecular machines that carry out those reactions
33Microbial Ecology, Proteogenomics and
Computational Optima
- Harvard U; Massachusetts Institute of Technology; Brigham and Women's Hospital; Massachusetts General Hospital
- Proteins, protein-protein interactions, gene regulatory networks, community behavior, and computational models of microbes important to the carbon cycle and bioremediation
- New proteomics method for finding untagged protein complexes
- Developing whole cell flux balance model of Prochlorococcus (major ocean photosynthetic organism) for use in hypothesis generation about its natural behavior, different strains, and gene knockouts
- New effort on synthetic genome
- Targets: Prochlorococcus (carbon), Caulobacter (bioremediation)
34Genetic Potential of Microbial Communities
Involved in the in situ Bioremediation of Uranium
- U of Massachusetts, Amherst; Argonne National Lab; U of Tennessee, Memphis; The Institute for Genomic Research
- Develop computational models to predict activity of natural communities of microbes for bioremediation
- Common, naturally-occurring microbes reduce U, Tc, Cr and other metals
- Metal reduction enhanced by feeding microbes carbon sources
- We can stimulate growth and activity of metal reducing organisms in situ
Geobacter precipitating U (Lovley, U. Mass.)
35Shewanella Federation
- Pacific Northwest National Lab; Oak Ridge National Lab; Argonne National Lab; Biatech; U of Southern California; Michigan State U; Marine Biology Lab (Mass); Desert Research Inst (Nevada)
- Characterize/model biology of a versatile metal reducing microbe to understand how it senses and responds to its environment
- Most comprehensive study of a microbe, from proteomics to biochemistry to imaging to pathway modeling to bioremediation potential
36U.S. Department of Energy
First Steps to a Synthetic Genome
Office of Science
- Institute for Biological Energy Alternatives
- Independent bioethical review
- 5,386 base pair bacteriophage synthesized in 10 days from a pool of 42 base pair, chemically synthesized, gel purified oligomers
- Oligomers ligated and converted into full length molecules using polymerase cycling assembly
- Accuracy demonstrated by DNA sequencing and phage infectivity
- First step to a new field of (much larger and more challenging) synthetic genomes research
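To give a feel for the scale of the assembly, the sketch below tiles a 5,386-base sequence with 42-base oligomers on both strands. The layout is an assumption made for illustration (a linear sequence, with the second strand staggered by half an oligomer so that each junction is bridged by the opposite strand); it is not a description of the institute's actual protocol, and `tile_strand` is a hypothetical helper.

```python
GENOME_LENGTH = 5386  # bases, from the slide
OLIGO_LENGTH = 42     # bases, from the slide


def tile_strand(length: int, oligo_len: int, offset: int = 0) -> list[tuple[int, int]]:
    """Return half-open (start, end) intervals that tile a linear strand.

    A non-zero offset produces a short leading piece, so the remaining
    oligomers are staggered relative to an unshifted tiling.
    """
    intervals = []
    if offset > 0:
        intervals.append((0, offset))
    start = offset
    while start < length:
        intervals.append((start, min(start + oligo_len, length)))
        start += oligo_len
    return intervals


top = tile_strand(GENOME_LENGTH, OLIGO_LENGTH)
bottom = tile_strand(GENOME_LENGTH, OLIGO_LENGTH, offset=OLIGO_LENGTH // 2)

# Every internal junction on the top strand should fall strictly inside
# some bottom-strand oligomer, which is what lets the pieces be joined.
junctions = [end for _, end in top[:-1]]
bridged = all(any(s < j < e for s, e in bottom) for j in junctions)

print(f"top-strand oligomers: {len(top)}")
print(f"bottom-strand oligomers: {len(bottom)}")
print(f"all top-strand junctions bridged: {bridged}")
```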
37Microbial Solutions from
the Sargasso Sea?
- Institute for Biological Energy Alternatives
- Collection of 200 L samples with shotgun sequencing
- Preliminary analysis suggests presence of 1,600 species
- Identification of >1.3 million new genes
- Discovery of 800 distinct rhodopsin homologs (light-sensitive photoreceptors)
- Understanding the genetic and biochemical diversity in our oceans may lead to new methods for carbon sequestration or alternative energy production.
38Ecogenomics: A New Frontier
- < 1% of microbes are culturable
- Many unculturables live in interdependent consortia of considerable diversity
- Ecogenomics: applying the tools of genomics, proteomics, etc. to ecology
- Can we recover genome-scale sequences and reveal metabolic capabilities?
- What is the structure of natural microbial populations? How do they interact? Are they interdependent?
- Can we harness their metabolic capabilities?
39Ecogenomics: Studying an environment with
minimal microbial complexity
Iron Mountain: microbially mediated toxic metal discharge, Superfund Site, pH < 1
- Biofilm with a few uncultured organisms
- Identification, through genome assembly, of the organisms present and responsible
- Understanding of microbial interactions, roles and molecular mechanisms could lead to biochemical solutions
40There are two kinds of scientific revolutions, those driven by new tools and those driven by new concepts. The effect of a concept-driven revolution is to explain old things in new ways. The effect of a tool-driven revolution is to discover new things that have to be explained. In almost every branch of science, and especially in biology and astronomy, there has been a preponderance of tool-driven revolutions. We have been more successful in discovering new things than in explaining old ones.
Freeman Dyson, Imagined Worlds, Cambridge, MA: Harvard University Press, 1997, pp. 49ff.
41- New tools for looking at molecular interactions and molecular machines
- Computational tools to model microbial pathways and behavior
- New tools to study uncultured microbes, microbial communities, and microbial metabolites
- Piloting new methods for high throughput protein tag production
42High-throughput Proteomics: Deinococcus
radiodurans R1
- 3.1 megabases, 3,116 predicted ORFs
- Large redundancy in DNA complement
- Potential relevance to bioremediation and understanding DNA repair
- 83% of all predicted proteins observed/confirmed using AMT tags
43GTL Facilities for the Future of Science: A
Twenty-Year Outlook
Cellular Activities Understanding how cells
respond to environmental cues
Cellular Components Providing the basis to study
proteins in living systems
Developing a predictive understanding of the
functions of cells and communities of cells
Understanding how molecular machines are formed
and how they function
A New Infrastructure for Biological Research
44Biology Involves Many Scales of Time and Size
[Figure: biological processes and modeling methods arranged by size scale (atoms, biopolymers, cells, organisms, ecosystems) and timescale in seconds (from 10^-15 up to geologic and evolutionary timescales). Processes shown: enzyme mechanisms, protein folding, DNA replication, cell signalling, organ function, ecosystems and epidemiology, evolutionary processes. Methods shown: ab initio quantum chemistry, first principles molecular dynamics, empirical force field molecular dynamics, homology-based protein modeling, electrostatic continuum models, finite element models, discrete automata models.]
45High-Performance Computing Roadmap for
the Genomics:GTL Program
[Figure: computing capability in teraflops (1 TF, 10 TF, 100 TF, 1000 TF) plotted against biological complexity, with current U.S. computing marked. Applications shown: comparative genomics; genome-scale protein threading; constrained rigid docking; constraint-based flexible docking; protein machine interactions; molecular machine classical simulation; cell, pathway, and network simulation; community metabolic, regulatory, and signaling simulations; molecule-based cell simulation.]
46GTL Experiment Template: Generating Petascale Data
Sets
While this example does not account for data
processing and compression it illustrates how
even simple raw data storage will quickly become
a bottleneck for biologists.
47GTL-type Science will Require High Performance
Computing for Both Capacity and Capability
Problems
48Biology and Computing
- Shift from qualitative, data-poor, experiment-driven to quantitative, data-rich discipline where simulations guide and interpret experiments
- Flood of diverse data: DNA sequence, protein structures, imaging, etc.
- Use of first principle methods originally developed for chemistry
However
- Much about even the simplest of life forms is terra incognita
- No overarching theories to provide context for experimental data at the level of organisms, cells, etc.
- Biology defies Occam's Razor
49Computers and Biology
- Two Principal Applications
- Bioinformatics
- Biomolecular simulations
- Bioinformatics: Assembly, annotation, analysis, comparison of DNA sequences, protein structures, etc.
- Biomolecular Simulations: Chemical modeling techniques applied to biochemical phenomena
- Emerging Areas: Kinetic models of metabolic/regulatory pathways and reaction-diffusion models of transport within and between cells
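As a concrete instance of a kinetic pathway model, the sketch below integrates a single enzymatic step (substrate to product) with Michaelis-Menten kinetics using forward Euler. The rate law, the parameter values, and the integration scheme are illustrative choices made here, not something specified in the talk; real pathway models couple many such steps and typically use a stiff ODE solver.

```python
def michaelis_menten_rate(s: float, vmax: float, km: float) -> float:
    """Reaction rate v = vmax * s / (km + s) for substrate concentration s."""
    return vmax * s / (km + s)


def simulate(s0: float, vmax: float, km: float, dt: float, steps: int):
    """Forward-Euler integration of substrate -> product.

    Returns a list of (time, substrate, product) tuples.
    """
    s, p = s0, 0.0
    trajectory = [(0.0, s, p)]
    for i in range(1, steps + 1):
        converted = michaelis_menten_rate(s, vmax, km) * dt
        converted = min(converted, s)  # cannot consume more than is present
        s -= converted
        p += converted
        trajectory.append((i * dt, s, p))
    return trajectory


# Hypothetical parameters in arbitrary concentration and time units.
trajectory = simulate(s0=10.0, vmax=1.0, km=2.0, dt=0.01, steps=2000)
t_end, s_end, p_end = trajectory[-1]
print(f"t={t_end:.1f}  substrate={s_end:.3f}  product={p_end:.3f}")
```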
50Bioinformatics
- Significant research challenges in sequence analysis and higher-level genome annotation
- Long way from reliable protein-function prediction from sequence
- Large data sets from emerging biotechnologies: expression levels, structures of protein complexes and subcellular resolution of biochemicals
- Need for reliable, robust, and more efficient tools, e.g., for genome reconstruction and gene finding; also protein comparative modeling based on distant homologies
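To show what "gene finding" means at its simplest, here is a naive open reading frame (ORF) scanner. It is a toy sketch under strong assumptions (forward strand only, ATG start, standard stop codons, no statistical model); production gene finders are far more sophisticated, which is the point the slide makes about needing better tools.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return half-open (start, end) spans of ORFs on the forward strand.

    An ORF runs from an ATG to the first in-frame stop codon, inclusive.
    min_codons counts every codon in the span, including start and stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


example = "CCATGAAATTTTAGGG"  # made-up sequence for illustration
for start, end in find_orfs(example):
    print(start, end, example[start:end])
```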
51Biomolecular Simulations
- Holy Grail: atomic-level simulation of every biochemical process in an organism
- Difficulties
- large size of biomolecules
- long simulation times (for biology)
- subtle energetics of biochemical reactions
- biochemistry far from equilibrium
- Range of Modeling Tools
- (I) One extreme is quantum mechanical calculations: high accuracy but computationally limited to small molecular systems
- (II) The other extreme is molecular dynamics using simplified ball-spring force fields: limited accuracy
- Combination of the two approaches, e.g., (II) providing data on large-scale conformational changes in enzymes and (I) predicting the effects of catalytic activity of the enzyme
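The "ball-spring" picture behind classical molecular dynamics can be sketched with two particles joined by a harmonic bond, integrated with the velocity Verlet scheme. This is a one-dimensional toy in arbitrary units with made-up parameters, assumed here only to illustrate the idea; real force fields add angle, torsion, electrostatic, and van der Waals terms over thousands of atoms.

```python
def harmonic_force(x1: float, x2: float, k: float, r0: float):
    """Forces on particles 1 and 2 from a spring (1D, assumes x2 > x1)."""
    stretch = (x2 - x1) - r0
    f1 = k * stretch  # a stretched bond pulls particle 1 toward particle 2
    return f1, -f1


def total_energy(x1, x2, v1, v2, m, k, r0):
    """Kinetic plus harmonic potential energy of the two-particle system."""
    kinetic = 0.5 * m * (v1 * v1 + v2 * v2)
    potential = 0.5 * k * ((x2 - x1) - r0) ** 2
    return kinetic + potential


def velocity_verlet(x1, x2, v1, v2, m, k, r0, dt, steps):
    """Advance the system by `steps` velocity Verlet steps of size dt."""
    f1, f2 = harmonic_force(x1, x2, k, r0)
    for _ in range(steps):
        v1 += 0.5 * dt * f1 / m   # half-step velocity update
        v2 += 0.5 * dt * f2 / m
        x1 += dt * v1             # full-step position update
        x2 += dt * v2
        f1, f2 = harmonic_force(x1, x2, k, r0)
        v1 += 0.5 * dt * f1 / m   # second half-step with new forces
        v2 += 0.5 * dt * f2 / m
    return x1, x2, v1, v2


# Hypothetical parameters: equal masses, bond stretched 20% past rest length.
m, k, r0 = 1.0, 1.0, 1.0
state0 = (0.0, 1.2, 0.0, 0.0)
e0 = total_energy(*state0, m, k, r0)
state1 = velocity_verlet(*state0, m, k, r0, dt=0.01, steps=1000)
e1 = total_energy(*state1, m, k, r0)
print(f"energy before: {e0:.6f}  after: {e1:.6f}")
```

Checking that total energy stays nearly constant is a standard sanity test for an integrator of this kind.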
52http://DOEGenomesToLife.org
53To give away money is an easy matter and in any
man's power. But to decide to whom to give it,
and how much and when, and for what purpose and
how, is neither in every man's power nor an easy
matter. Hence, it is that such excellence is
rare, praiseworthy, and noble. Aristotle