GTL Facilities Computing

GTL Facilities Computing: Infrastructure for 21st Century Systems Biology. Ed Uberbacher, ORNL & Mike Colvin, LLNL

Slides: 17
Provided by: KimN46

1
GTL Facilities Computing
  • Infrastructure for 21st Century
  • Systems Biology

Ed Uberbacher ORNL Mike Colvin LLNL
2
Ultimate Goal is to Provide Predictive Models of
Microbes
This goal drives data collection and computing
strategy.
  • Experimental
      • Complete datasets
      • Quantitative measurements
      • Comprehensive physical characterization
      • Protein expression and interactions
      • Spatial distributions
      • Process kinetics
  • Computational
      • Automated data analysis and validation
      • Automated integration of diverse data sets
      • Human- and computer-accessible databases
      • Molecular, pathway, and cell-level simulations

The goals require a new synergy between computing
and biology.
3
GTL Biology Paradigm: Integrated Large-Scale
Experiment-Computing Cycles
(Cycle diagram; stage labels: Real-Time Analysis, Large-Scale Data Sets, Design or Revise Models, Experiment, Simulate and Generate Hypotheses)
4
Facility I: Production and Characterization of Proteins
Estimating Microbial Genome Capability
  • Computational Analysis
      • Genome analysis of genes, proteins, and operons
      • Metabolic pathways analysis from reference data
      • Protein machines estimate from PM reference data
  • Knowledge Captured
      • Initial annotation of genome
      • Initial perceptions of pathways and processes
      • Recognized machines, function, and homology
      • Novel proteins/machines (including prioritization)
      • Production conditions and experience

5
Facility II: Whole Proteome Analysis
Modeling Proteome Expression, Regulation, and Pathways
  • Analysis and Modeling
      • Mass spectrometry expression analysis
      • Metabolic and regulatory pathway/network analysis and modeling
  • Knowledge Captured
      • Expression data and conditions
      • Novel pathways and processes
      • Functional inferences about novel proteins/machines
      • Genome super-annotation: regulation, function, and processes (deep knowledge about cellular subsystems)

6
Regulatory Gene Network Model for Endomesoderm
Specification
Eric Davidson
(Figure: network diagram; legible label: Skeletogenic)
7
Facility III: Characterization and Imaging of Molecular Machines
Exploring Molecular Machine Geometry and Dynamics
  • Computational Analysis, Modeling and Simulation
      • Image analysis/cryoelectron microscopy
      • Protein interaction analysis/mass spec
      • Machine geometry and docking modeling
      • Machine biophysical dynamic simulation
  • Knowledge Captured
      • Machine composition, organization, geometry, assembly and disassembly
      • Component docking and dynamic simulations of machines

8
Example of Combined Experiment and Modeling to
Understand a Multiprotein Complex: DNA Clamps
Clamp-Loading Mechanisms
  • Homology modeling: Venclovas et al., Prot. Sci. 11:2403 (2002)
  • Electron microscopy: Mayanagi et al., J. Struct. Biol. 134:35 (2001)
  • Atomic force microscopy: Shiomi et al., PNAS 97:14127 (2002)
  • Classical molecular dynamics: Jeruzalmi et al., Cell 106:417 (2001)
  • Mechanistic model based on physical and biochemical data: Jeruzalmi et al., Cell 106:429 (2001)
9
Facility IV: Analysis and Modeling of Cellular Systems
Simulating Cell and Community Dynamics
  • Analysis, Modeling and Simulation
      • Couple knowledge of pathways, networks, and machines to generate an understanding of cellular and multi-cellular systems
      • Metabolism, regulation, and machine simulation
      • Cell and multicell modeling and flux visualization
  • Knowledge Captured
      • Cell and community measurement data sets
      • Protein machine assembly time-course data sets
      • Dynamic models and simulations of cell processes

10
Centrally Planned Analysis and Modeling Tools
Libraries
Facility 1
  • Genome annotation
  • Regulatory element and operon identification
  • Metabolic pathway analysis
Facility 2
  • Mass spec data analysis
  • Expression analysis and clustering
  • Metabolic and regulatory network modeling
Facility 3
  • Image analysis
  • Mass spec analysis
  • Protein/machine modeling
  • Docking and molecular dynamics
Facility 4
  • Metabolic simulation
  • Regulatory simulation
  • Cell modeling and simulations
Collect and manage software
  • Maintain current versions
  • Ensure hardware compatibility
  • User interfaces
  • Documentation
11
GTL Facilities Will Require High-Performance
Computing for Both Capacity and Capability
(Figure: sequence fragments such as ATCGTAGCAATCGACCGT... with steps labeled "Best match" and "Thread onto templates")
  • Capacity: e.g., high-throughput protein structure predictions
  • Capability: e.g., large-scale biophysical simulations
      • Large size and timescale classical simulations
      • Highly accurate quantum mechanical simulations
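
The capacity case on this slide (many independent sequence-to-template matches) can be illustrated with a small sketch. This is only a toy under stated assumptions, not the method used by the GTL facilities: the function names (`identity_score`, `best_match`, `run_capacity_batch`), the template names, and the position-by-position identity score are all hypothetical stand-ins for a real threading score. The point it shows is that each query is scored on its own, so throughput scales by adding workers.

```python
from concurrent.futures import ThreadPoolExecutor


def identity_score(query: str, template: str) -> float:
    """Toy score (assumption): fraction of matching positions over the shorter length."""
    n = min(len(query), len(template))
    if n == 0:
        return 0.0
    matches = sum(1 for a, b in zip(query, template) if a == b)
    return matches / n


def best_match(query: str, templates: dict) -> str:
    """Return the name of the highest-scoring template for one query."""
    return max(templates, key=lambda name: identity_score(query, templates[name]))


def run_capacity_batch(queries, templates, workers=4):
    """Score every query independently; results keep the order of the queries."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda q: best_match(q, templates), queries))


# Hypothetical templates and queries, for illustration only.
templates = {"fold_A": "ATCGTAGC", "fold_B": "GGCTTAAT"}
queries = ["ATCGTTGC", "GGCTAAAT", "ATCGTAGC"]
assignments = run_capacity_batch(queries, templates)
print(assignments)
```

Capability workloads differ in kind: a single large simulation cannot be split into independent jobs this way, which is why the slide lists the two needs separately.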
12
GTL High-Performance Computing Roadmap
(Chart: computing requirement in teraflops, with axis marks at 1 TF, 10 TF, 100 TF, and 1000 TF, against biological complexity; "Current U.S. Computing" is marked on the chart)
  • Protein machine interactions
  • Molecule-based cell simulation
  • Molecular machine classical simulation
  • Cell, pathway, and network simulation
  • Community metabolic, regulatory, signaling simulations
  • Constrained rigid docking
  • Constraint-based flexible docking
  • Genome-scale protein threading
  • Comparative genomics
13
Swimming in Data: Exploding Need to Capture and
Manipulate Data
  • Across Scales of Space and Time - Petabytes
  • From Acquisition, Refinement, Reduction and
    Deposition

14
Central Database Planning
  • Data Repositories
      • Genomes, annotation and community genomes
      • Expression data and proteome composition
      • Metabolite and flux data
      • Metabolic pathways and kinetic parameters
      • Protein interactions
      • Protein machines repository: machine composition, function, homology, models
      • Image data repository
      • Regulatory network data and models
      • Cell models repository
  • Integrated or integrable
  • Requires development of cross-facilities approach

(Diagram labels: microbial genomes, phylogeny, regulatory networks, protein domains, pathways, community genomes, metabolic models, proteomics, regulatory elements, literature, expression, protein machines, protein structure)
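
As a rough illustration of what "integrated or integrable" repositories could mean in practice, the sketch below links protein, machine, and expression records through shared identifiers so that one query can span data from several facilities. Everything here is a hypothetical assumption made for illustration: the `ProteinRecord` and `KnowledgeBase` names, their fields, and the sample identifiers do not come from the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinRecord:
    """Hypothetical record joining genome, machine, and pathway repositories."""
    protein_id: str
    gene_id: str
    machines: list = field(default_factory=list)   # protein machine IDs
    pathways: list = field(default_factory=list)   # pathway IDs


class KnowledgeBase:
    """Toy in-memory store; a real system would sit on top of databases."""

    def __init__(self):
        self.proteins = {}     # protein_id -> ProteinRecord
        self.expression = {}   # (protein_id, condition) -> measured level

    def add_protein(self, record):
        self.proteins[record.protein_id] = record

    def add_expression(self, protein_id, condition, level):
        # Reject measurements that cannot be linked back to a known protein.
        if protein_id not in self.proteins:
            raise KeyError(f"unknown protein: {protein_id}")
        self.expression[(protein_id, condition)] = level

    def machine_expression(self, machine_id, condition):
        """Cross-repository query: expression of every component of one machine."""
        return {
            pid: self.expression[(pid, condition)]
            for pid, rec in self.proteins.items()
            if machine_id in rec.machines and (pid, condition) in self.expression
        }


# Made-up sample data.
kb = KnowledgeBase()
kb.add_protein(ProteinRecord("P1", "geneA", machines=["clamp_loader"], pathways=["replication"]))
kb.add_protein(ProteinRecord("P2", "geneB", machines=["clamp_loader"]))
kb.add_protein(ProteinRecord("P3", "geneC", machines=["ribosome"]))
kb.add_expression("P1", "aerobic", 12.5)
kb.add_expression("P2", "aerobic", 3.0)
kb.add_expression("P3", "aerobic", 40.0)
result = kb.machine_expression("clamp_loader", "aerobic")
print(result)
```

The design choice worth noting is the shared identifier: the query works only because the expression data and the machine composition data agree on `protein_id`, which is the kind of cross-facilities agreement the slide calls for.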
15
The GTL Knowledge Base: Integration of Large
Datasets is a Precursor to Predictive Modeling
  • GTL knowledge base will change how information
    about microbes reaches the community
  • Models and simulations will be online
  • We will know more and more about systems in each
    consecutive microbe

16
(No Transcript)