Title: Computational Biology Part 1: Biomolecular Modeling
1. Computational Biology, Part 1: Biomolecular Modeling
Instructor: Prof. Jesus Izaguirre
Textbook: Tamar Schlick, Molecular Modeling and Simulation: An Interdisciplinary Guide, Springer-Verlag, Berlin-New York, in press, 2002
Reference: C. Brooks, M. Karplus, B. Pettitt, Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics, Wiley, 1988
2. Outline
- What is biomolecular modeling?
- Historical perspective
- Theory and experiments
- Protein characterization
- Computational successes
- Remaining challenges
3. What is biomolecular modeling?
- Application of computational models to understand the structure, dynamics, and thermodynamics of biological molecules
- The models must be tailored to the question at hand: the Schrödinger equation is not the answer to everything! A purely reductionist view is bound to fail!
- This implies that biomolecular modeling must be both multidisciplinary and multiscale
4. Historical Perspective
- 1946 molecular mechanics (MM) calculations
- 1960 force fields
- 1969 Levinthal's paradox on protein folding
- 1970 MD of biological molecules
- 1971 Protein Data Bank
- 1998 ion channel protein crystal structure
- 1999 IBM announces Blue Gene project
5. Theoretical Foundations
- Born-Oppenheimer approximation (fixed nuclei)
- Force field parameters for families of chemical compounds
- System modeled using Newton's equations of motion
- Examples: hard-sphere simulations (Alder and Wainwright, 1959); liquid water (Rahman and Stillinger, 1970); BPTI (McCammon and Karplus); villin headpiece (Duan and Kollman, 1998)
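As a rough illustration of what "modeled using Newton's equations of motion" means in practice, the sketch below integrates a single particle in a harmonic potential, which stands in for a real force field. The velocity Verlet scheme is a common choice in MD codes but is an assumption here, not something these slides specify; the names `harmonic_force`, `velocity_verlet`, and `total_energy`, the unit mass, and the time step are likewise illustrative.

```python
import math

def harmonic_force(x, k=1.0):
    """Force for the toy potential U(x) = 0.5 * k * x**2 (stand-in for a force field)."""
    return -k * x

def velocity_verlet(x, v, mass, dt, n_steps, force=harmonic_force):
    """Integrate m * d2x/dt2 = F(x) for n_steps steps; returns the final (x, v)."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
    return x, v

def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

# Assumed toy setup: unit mass and force constant, so one period is 2*pi.
dt = 0.01
n_steps = round(2.0 * math.pi / dt)
x_end, v_end = velocity_verlet(1.0, 0.0, mass=1.0, dt=dt, n_steps=n_steps)
drift = abs(total_energy(x_end, v_end) - total_energy(1.0, 0.0))
print(f"x after ~one period: {x_end:.4f}, energy drift: {drift:.2e}")
```

In a biomolecular simulation the same loop runs over all atoms at once, with the force obtained from the gradient of the force-field energy rather than from a single spring.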
6. Experimental Foundations I
- X-ray crystallography
  - Analysis of the X-ray diffraction pattern produced when a beam of X-rays is directed onto a well-ordered crystal. The phase has to be reconstructed.
  - Phase problem solved by direct methods for small molecules
  - For larger molecules, the more sophisticated Multiple Isomorphous Replacement (MIR) technique is used
  - Current resolution below 2 Å
- Protein crystallography
  - Difficult to grow well-ordered crystals
  - Early success in predicting alpha helices and beta sheets (Pauling, 1950s)
7. Experimental Foundations II
- NMR spectroscopy
  - Nuclear Magnetic Resonance provides structural and dynamic information about molecules. It is not as detailed as X-ray crystallography and is limited to molecular masses of about 35 kDa
  - Distances between neighboring hydrogens are used to reconstruct the 3D structure using global optimization
8. Proteins I
- Polypeptide chains made up of amino acids, or residues, linked by peptide bonds
- 20 amino acids
- 50-500 residues, 1000-10000 atoms
- Native structure believed to correspond to an energy minimum, since proteins unfold when temperature is increased
9. Proteins II
- Secondary structure: alpha helices, beta sheets, turns
- Tertiary structure: proteins are tightly packed, with hydrophobic groups in the core and charged sidechains on the surface
- Quaternary structure: protein domains may assemble into so-called quaternary structures
10. Proteins III
- Protein motions of importance are torsional oscillations about the bonds that link groups together
- Substantial displacements of groups occur over long time intervals
- Collective motions are either local (cage structure) or rigid-body (displacement of different regions)
- What is the importance of these fluctuations for biological function?
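The torsional oscillations mentioned above are changes in a torsion (dihedral) angle defined by four consecutively bonded atoms. A minimal sketch of that angle computation follows. The helper names, the toy coordinates, and the sign convention (positive when, viewed along the central bond from `p1` toward `p2`, `p3` is rotated clockwise relative to `p0`) are assumptions of this example and are not taken from the slides.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two bond planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Toy geometry: central bond along z; the last atom is swung around that bond.
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
for phi in (0.0, 60.0, 90.0, -120.0):
    rad = math.radians(phi)
    p3 = (math.cos(rad), math.sin(rad), 1.0)
    print(phi, round(torsion_angle(p0, p1, p2, p3), 6))
```

Tracking this angle along a trajectory is one simple way to follow a torsional oscillation about a given bond.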
11. Proteins IV
- Effect of fluctuations
  - Thermodynamics: equilibrium behavior is important; example: energy of ligand binding
  - Dynamics: displacements from the average structure are important; example: local sidechain motions that act as conformational gates in oxygen transport (myoglobin), enzymes, and ion channels
12. Proteins V: Local Motions
- 0.01-5 Å, 1 fs to 0.1 s
- Atomic fluctuations
  - Small displacements for substrate binding in enzymes
  - Energy source for barrier crossing and other activated processes (e.g., ring flips)
- Sidechain motions
  - Opening pathways for ligand (myoglobin)
  - Closing active site
- Loop motions
  - Disorder-to-order transition as part of virus formation
13. Proteins VI: Rigid-Body Motions
- 1-10 Å, 1 ns to 1 s
- Helix motions
  - Transitions between substates (myoglobin)
- Hinge-bending motions
  - Gating of active-site region (liver alcohol dehydrogenase)
  - Increasing binding range of antigens (antibodies)
14. Proteins VII: Large-Scale Motions
- > 5 Å, 1 microsecond to 10000 s
- Helix-coil transition
  - Activation of hormones
  - Protein folding transition
- Dissociation
  - Formation of viruses
- Folding and unfolding transition
  - Synthesis and degradation of proteins
- Role of motions sometimes only inferred from two or more conformations in structural studies
15. Study of Dynamics I
- The computational study of atomic fluctuations in BPTI and other proteins has shown that
  - The directional character of active-site fluctuations in enzymes contributes to catalysis
  - Small-amplitude fluctuations act as a lubricant
  - It may be possible to extrapolate from short-time fluctuations to larger-scale protein motions
16. Study of Dynamics II
- Collective motions are particularly important for biological function, e.g., displacements for the transition from the inactive to the active state
- The extended nature of these motions makes them sensitive to the environment: there is a great difference between vacuum and solution simulations
- Collective motions transmit external solvent effects to the protein interior
17. Study of Dynamics III
- For myoglobin, the oxygen storage protein related to hemoglobin
  - Fluctuations in the globin are essential to binding: the protein matrix in the X-ray structure is so tightly packed that there is no low-energy path for the ligand to enter or leave the heme pocket
  - Only through structural fluctuations can the barriers be lowered sufficiently
  - Demonstrated through energy minimization and molecular dynamics
18. Study of Dynamics IV
- For the transport protein hemoglobin there are several important motions
  - Oxygen binding produces a tertiary structural change
  - A quaternary structural change from the deoxy (low oxygen affinity) to the oxy configuration takes place. This transmits information over a long distance
  - From the X-ray deoxy and oxy structures, a stochastic reaction path has been found. Detailed simulations of ligand binding have been performed using MD. A statistical mechanical model has provided the coupling between these two processes
19. Study of Dynamics VI
- Three open problems are the following
  - Ion channel gating: highly correlated fluctuations are likely to be of great importance. Long-time dynamics problem
  - Flexible docking: for MMP, enzymes, etc., fluctuations enter into the thermodynamics and kinetics of reactions. Sampling problem
  - Protein folding: too complicated for full treatment except for the smallest proteins; beyond current methodology. Coarsening problem
20. Lengthening of Scales: DPD
- Dissipative Particle Dynamics combines coarsening of atoms into fluid packages with dissipative pair interactions and a stochastic pair interaction
- Total momentum conserved
- Self-organization of lipid bilayers: self-assembled aggregates formed by amphiphilic lipid molecules in water
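The pairwise structure described above can be sketched as follows. This is a minimal illustration, assuming the widely used soft-repulsion form with a linear weight function and the fluctuation-dissipation relation sigma^2 = 2 * gamma * kT; the slides do not specify the force expressions. The function name `dpd_forces`, the parameter values, the absence of periodic boundaries, and the toy coordinates are all assumptions. Because every pair contribution is applied with equal magnitude and opposite sign along the line joining the two particles, the forces sum to zero, which is the momentum conservation noted in the slide.

```python
import math
import random

def dpd_forces(pos, vel, a=25.0, gamma=4.5, kT=1.0, rc=1.0, dt=0.01, rng=None):
    """Conservative + dissipative + random pair forces (no periodic boundaries)."""
    if rng is None:
        rng = random.Random(0)                 # fixed seed for reproducibility
    sigma = math.sqrt(2.0 * gamma * kT)        # fluctuation-dissipation relation
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            rij = [pos[i][k] - pos[j][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in rij))
            if r >= rc or r == 0.0:
                continue                        # outside cutoff: no interaction
            e = [c / r for c in rij]            # unit vector from j to i
            vij = [vel[i][k] - vel[j][k] for k in range(3)]
            w = 1.0 - r / rc                    # linear weight function
            f_c = a * w                                               # soft repulsion
            f_d = -gamma * w * w * sum(e[k] * vij[k] for k in range(3))  # friction
            f_r = sigma * w * rng.gauss(0.0, 1.0) / math.sqrt(dt)     # random kick
            f = f_c + f_d + f_r
            for k in range(3):
                forces[i][k] += f * e[k]        # equal and opposite contributions
                forces[j][k] -= f * e[k]
    return forces

# Toy configuration (assumed): three nearby beads and one far outside the cutoff.
pos = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.2, 0.6, 0.1), (3.0, 3.0, 3.0)]
vel = [(0.1, 0.0, 0.0), (-0.2, 0.3, 0.0), (0.0, 0.0, 0.5), (1.0, 1.0, 1.0)]
forces = dpd_forces(pos, vel)
net = [sum(f[k] for f in forces) for k in range(3)]
print("net force:", net)
```

A production code would add periodic boundary conditions and a neighbor list; the double loop here is kept only for clarity.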
21. Lengthening of Scales: SRP
- Enzyme simulation over a millisecond using a stochastic reaction path; disadvantage: needs initial and final configurations
- Finds a trajectory where global energy is minimized
22. Lengthening of Scales: MUSICO
- Multiscale molecular dynamics combining
  - Symplectic splitting into nearly linear and nonlinear parts
  - Implicit integration of the linear part (similar to SRP) with constraining of internal degrees of freedom
  - Explicit treatment of the highly nonlinear part
  - Optional pairwise stochasticity for stability
- No coarsening yet
23. Scalable Parallelization of ProtoMol
ProtoMol: a parallel software framework for the simulation of biomolecules
- OBJECTIVE: Make ProtoMol a more scalable parallel program
  - Hundreds of nodes
  - Heterogeneous platforms
- APPROACH
  - Abstract parallel layer
  - Dynamic load balancing
  - Multithreading
  - More scalable algorithms
ProtoMol is open source and available at http://www.cse.nd.edu/lcls/Protomol.html
24. Web-based Simulation Services
[Diagram labels: Simulation Request; Results via XML; ProtoMol Parallel Server]
- OBJECTIVE: Make ProtoMol a web application
  - Web service for molecular and cellular simulations
  - Component that provides data and simulation capabilities through the web
- APPROACH
  - .NET platform for Windows and Linux
  - .NET: Microsoft's platform for XML Web services
25. Interactive Simulation Interfaces
- OBJECTIVE: Interactive interfaces for ProtoMol
  - User-friendly interface to set up, monitor, and steer simulations
  - Ability to quickly experiment with molecules and cells
- APPROACH
  - 3-D visualization using OpenGL
  - Sockets interface between ProtoMol and the visualization component
  - Haptic device interface
A haptic device interface was demonstrated at SuperComputing 2000, and will be again at the 2001 event.
26. Acknowledgements
- The LCLS would like to thank the following:
  - National Science Foundation Biocomplexity grant PHY-0083653
  - Department of Computer Science and Engineering, Univ. of Notre Dame
- and our collaborators:
  - Dr. Mark Alber, Mathematics and Center for Applied Mathematics, Notre Dame
  - Dr. Petter E. Bjorstad, Institutt for Informatikk, U. of Bergen, Norway
  - Dr. Gabor Forgacs, Physics and Biology, University of Missouri-Columbia
  - Dr. James A. Glazier, Physics, Notre Dame
  - Dr. George Hentschel, Physics, Emory University
  - Dr. Edward Maginn, Chemical Engineering, Notre Dame
  - Dr. J. Andrew McCammon, Chemistry and Biochemistry, University of California, San Diego
  - Dr. Stuart Newman, Cell Biology and Anatomy, New York Medical College
  - Dr. Martin Tenniswood, Biological Sciences and Walther Cancer Institute, Notre Dame
  - Dr. Robert Skeel, Computer Science and Beckman Institute, University of Illinois at Urbana-Champaign