Computational Biology: Introduction to Biomolecular Modeling

Provided by: cseNdEdu
1
Computational Biology: Introduction to
Biomolecular Modeling
Instructor: Prof. Jesús A. Izaguirre
Textbook: Tamar Schlick, Molecular Modeling and
Simulation: An Interdisciplinary Guide,
Springer-Verlag, Berlin-New York, 2002
Reference: C. Brooks, M. Karplus, B. Pettitt,
Proteins: A Theoretical Perspective of Dynamics,
Structure, and Thermodynamics, Wiley, 1988
2
Outline
  • What is biomolecular modeling?
  • Historical perspective
  • Theory and experiments
  • Protein characterization
  • Computational successes
  • Remaining challenges

3
What is biomolecular modeling?
  • Application of computational models to understand
    the structure, dynamics, and thermodynamics of
    biological molecules
  • The models must be tailored to the question at
    hand: the Schrödinger equation is not the answer
    to everything! A reductionist view is bound to
    fail!
  • This implies that biomolecular modeling must be
    both multidisciplinary and multiscale

4
Historical Perspective
  • 1946: MD calculation
  • 1960: force fields
  • 1969: Levinthal's paradox on protein folding
  • 1970: MD of biological molecules
  • 1971: Protein Data Bank
  • 1998: ion channel protein crystal structure
  • 1999: IBM announces Blue Gene project

5
Theoretical Foundations
  • Born-Oppenheimer approximation (fixed nuclei)
  • Force field parameters for families of chemical
    compounds
  • System modeled using Newton's equations of motion
  • Examples: hard-sphere simulations (Alder and
    Wainwright, 1959), liquid water (Rahman and
    Stillinger, 1970), BPTI (McCammon and Karplus),
    villin headpiece (Duan and Kollman, 1998)
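
Modeling a system with Newton's equations of motion means advancing positions and velocities step by step under a force law. The sketch below is only an illustration, not the integrator used in any of the cited studies: it applies the velocity Verlet scheme to a single particle on a harmonic spring. The spring constant `K`, mass `M`, timestep, and function names are all assumptions chosen for the example.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * d2x/dt2 = force(x) with the velocity Verlet scheme."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # first half kick
        x = x + dt * v_half                # drift with the half-step velocity
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half kick
        trajectory.append((x, v))
    return trajectory

K = 1.0  # hypothetical spring constant (arbitrary units)
M = 1.0  # hypothetical particle mass (arbitrary units)

def harmonic_force(x):
    """Toy force law standing in for a real force field."""
    return -K * x

def total_energy(x, v):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * M * v * v + 0.5 * K * x * x

traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                       mass=M, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
print(f"energy drift over {len(traj) - 1} steps: {abs(e_end - e_start):.2e}")
```

For this toy system the exact solution is x(t) = cos(t), so the final position can be compared with `math.cos(10.0)`; the small energy drift is the usual reason this family of integrators is favored for long simulations.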

6
Experimental Foundations I
  • X-ray crystallography
  • Analysis of the X-ray diffraction pattern
    produced when a beam of X-rays is directed onto a
    well-ordered crystal. The phase has to be
    reconstructed.
  • Phase problem solved by direct methods for small
    molecules
  • For larger molecules, the more sophisticated
    Multiple Isomorphous Replacement (MIR) technique
    is used
  • Current resolution below 2 Å
  • Protein crystallography
  • Difficult to grow well-ordered crystals
  • Early success in predicting alpha helices and
    beta sheets (Pauling, 1950s)

7
Experimental Foundations II
  • NMR Spectroscopy
  • Nuclear Magnetic Resonance provides structural
    and dynamic information about molecules. It is
    not as detailed as X-ray crystallography and is
    limited to masses of about 35 kDa
  • Distances between neighboring hydrogens are used
    to reconstruct the 3D structure using global
    optimization
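
Reconstructing coordinates from pairwise distances can be illustrated with a toy problem. The sketch below is an assumption-laden stand-in, not an actual NMR refinement protocol: it takes a hypothetical four-atom reference geometry, derives exact distance restraints from it, and recovers coordinates by gradient descent on a squared-violation penalty. Several random restarts serve as a crude substitute for a real global optimization strategy. All names, step sizes, and the reference geometry are invented for the example.

```python
import math
import random

def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def restraint_penalty(coords, restraints):
    """Sum of squared violations of the (i, j, target_distance) restraints."""
    return sum((distance(coords[i], coords[j]) - target) ** 2
               for i, j, target in restraints)

def refine(coords, restraints, step=0.05, n_iter=2000):
    """Plain gradient descent on the restraint penalty (a local method)."""
    coords = [list(p) for p in coords]
    for _ in range(n_iter):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i, j, target in restraints:
            d = distance(coords[i], coords[j])
            if d == 0.0:
                continue  # gradient undefined for coincident points
            scale = 2.0 * (d - target) / d
            for k in range(3):
                g = scale * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for p, g in zip(coords, grads):
            for k in range(3):
                p[k] -= step * g[k]
    return coords

def fit_with_restarts(n_atoms, restraints, n_restarts=5, seed=0):
    """Keep the best of several random starts (crude global search)."""
    rng = random.Random(seed)
    best_coords, best_penalty = None, float("inf")
    for _ in range(n_restarts):
        start = [[rng.uniform(-1.0, 1.0) for _ in range(3)]
                 for _ in range(n_atoms)]
        coords = refine(start, restraints)
        penalty = restraint_penalty(coords, restraints)
        if penalty < best_penalty:
            best_coords, best_penalty = coords, penalty
    return best_coords, best_penalty

# Hypothetical reference geometry used only to generate the restraints.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0), (0.0, 0.0, 1.5)]
restraints = [(i, j, distance(reference[i], reference[j]))
              for i in range(4) for j in range(i + 1, 4)]
best_coords, best_penalty = fit_with_restarts(4, restraints)
print(f"best restraint penalty: {best_penalty:.3e}")
```

Distances alone fix a structure only up to rotation, translation, and mirror image, so the recovered coordinates need not match the reference values themselves; only the pairwise distances are expected to agree.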

8
Proteins I
  • Polypeptide chains made up of amino acids or
    residues linked by peptide bonds
  • 20 amino acids
  • 50-500 residues, 1000-10000 atoms
  • Native structure believed to correspond to energy
    minimum, since proteins unfold when temperature
    is increased

9
Protein Function in Cell
  • Enzymes
  • Catalyze biological reactions
  • Structural role
  • Cell wall
  • Cell membrane
  • Cytoplasm

10
Protein: The Machinery of Life
NH2-Val-His-Leu-Thr-Pro-Glu-Glu-Lys-Ser-Ala-Val-
Thr-Ala-Leu-Trp-Gly-Lys-Val-Asn-Val-Asp-Glu-Val-
Gly-Gly-Glu-..
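
Sequences written in three-letter residue codes are usually converted to the one-letter code before being passed to modeling tools. As a minimal sketch, the snippet below converts the fragment shown on this slide; the function name `to_one_letter` and the handling of the `NH2` terminus marker are assumptions made for the example.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(chain):
    """Convert a dash-separated three-letter sequence to one-letter code.

    The leading "NH2" marks the amino terminus and is skipped.
    """
    residues = [r for r in chain.split("-") if r and r != "NH2"]
    return "".join(THREE_TO_ONE[r] for r in residues)

fragment = ("NH2-Val-His-Leu-Thr-Pro-Glu-Glu-Lys-Ser-Ala-Val-Thr-Ala-"
            "Leu-Trp-Gly-Lys-Val-Asn-Val-Asp-Glu-Val-Gly-Gly-Glu")
one_letter = to_one_letter(fragment)
print(one_letter, len(one_letter))
```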
11
Proteins II
  • Secondary structure: alpha helices, beta sheets,
    turns
  • Tertiary structure: proteins are tightly packed,
    with hydrophobic groups in the core and charged
    sidechains on the surface
  • Quaternary structure: protein domains may
    assemble into so-called quaternary structures

12
Protein Structure
13
Protein Structure
14
Model Molecule: Hemoglobin
15
Hemoglobin Background
  • Protein in red blood cells

16
Red Blood Cell (Erythrocyte)
17
Hemoglobin Background
  • Protein in red blood cells
  • Composed of four subunits, each containing a heme
    group, a ring-like structure with a central iron
    atom that binds oxygen

18
Heme Groups in Hemoglobin
19
Hemoglobin Background
  • Protein in red blood cells
  • Composed of four subunits, each containing a heme
    group, a ring-like structure with a central iron
    atom that binds oxygen
  • Picks up oxygen in lungs, releases it in
    peripheral tissues (e.g. muscles)

20
Hemoglobin Quaternary Structure
Two alpha subunits and two beta subunits (141
amino acids per alpha, 146 per beta)
21
Hemoglobin Tertiary Structure
One beta subunit (8 alpha helices)
22
Hemoglobin Secondary Structure
alpha helix
23
Proteins III
  • Protein motions of importance are torsional
    oscillations about the bonds that link groups
    together
  • Substantial displacements of groups occur over
    long time intervals
  • Collective motions are either local (cage
    structure) or rigid-body (displacement of
    different regions)
  • What is the importance of these fluctuations for
    biological function?

24
Proteins IV
  • Effect of fluctuations
  • Thermodynamics: equilibrium behavior is
    important. Example: energy of ligand binding
  • Dynamics: displacements from the average
    structure are important. Example: local sidechain
    motions that act as conformational gates in
    oxygen transport (myoglobin), enzymes, ion
    channels

25
Proteins V: Local Motions
  • 0.01-5 Å, 1 fs to 0.1 s
  • Atomic fluctuations
  • Small displacements for substrate binding in
    enzymes
  • Energy source for barrier crossing and other
    activated processes (e.g., ring flips)
  • Sidechain motions
  • Opening pathways for ligand (myoglobin)
  • Closing active site
  • Loop motions
  • Disorder-to-order transition as part of virus
    formation

26
Proteins VI: Rigid-Body Motions
  • 1-10 Å, 1 ns to 1 s
  • Helix motions
  • Transitions between substates (myoglobin)
  • Hinge-bending motions
  • Gating of active-site region (liver alcohol
    dehydrogenase)
  • Increasing binding range of antigens (antibodies)

27
Proteins VII: Large-Scale Motion
  • > 5 Å, 1 microsecond to 10000 s
  • Helix-coil transition
  • Activation of hormones
  • Protein folding transition
  • Dissociation
  • Formation of viruses
  • Folding and unfolding transition
  • Synthesis and degradation of proteins
  • Role of motions sometimes only inferred from two
    or more conformations in structural studies

28
Study of Dynamics I
  • The computational study of atomic fluctuations in
    BPTI and other proteins has shown that:
  • The directional character of active-site
    fluctuations in enzymes contributes to catalysis
  • Small-amplitude fluctuations act as a lubricant
  • It may be possible to extrapolate from short time
    fluctuations to larger-scale protein motions

29
Study of Dynamics II
  • Collective motions are particularly important for
    biological function, e.g., displacements for the
    transition from inactive to active
  • The extended nature of these motions makes them
    sensitive to the environment: there is a great
    difference between vacuum and solution
    simulations
  • Collective motions transmit external solvent
    effects to protein interior

30
Study of Dynamics III
  • For the related storage protein, myoglobin:
  • Fluctuations in the globin are essential to
    binding: the protein matrix in the X-ray
    structure is so tightly packed that there is no
    low-energy path for the ligand to enter or leave
    the heme pocket
  • Only through structural fluctuations can the
    barriers be lowered sufficiently
  • Demonstrated through energy minimization and
    molecular dynamics

31
Study of Dynamics IV
  • For the transport protein hemoglobin there are
    several important motions
  • Oxygen binding produces tertiary structural
    change
  • A quaternary structural change from deoxy (low
    oxygen affinity) to oxy configuration takes
    place. This transmits information over a long
    distance
  • From the X-ray deoxy and oxy structures, a
    stochastic reaction path has been found. Detailed
    ligand binding simulations have been performed
    using MD. A statistical mechanical model has
    provided the coupling between these two processes

32
Study of Dynamics VI
  • Three open problems are the following:
  • Ion channel gating: highly correlated
    fluctuations are likely to be of great
    importance. Long-time dynamics problem
  • Flexible docking: for MMP, enzymes, etc.,
    fluctuations enter into the thermodynamics and
    kinetics of reactions. Sampling problem
  • Protein folding: too complicated for full
    treatment; beyond current methodology for all but
    the smallest proteins. Coarsening problem

33
Possible topics for final projects
  • Applications
  • Virtual screening
  • Extend recommender for MD protocols
  • Algorithms
  • Multiscale integrators or sampling methods
  • Cellular automata solvers for diffusion,
    reaction, advection, etc.
  • Software
  • 3D Visualization
  • Extend simulation engines

34
How to create hierarchical, multiscale,
multilevel algorithms?
  • Examples
  • Algorithms for N-body problem (linear complexity,
    multiple grids) e.g., Matthey and Izaguirre
    (2004) J. Par. Dist. Comp.
  • Multiscale integration (gap of 15 orders of
    magnitude in timescales), e.g., Ma and Izaguirre
    (2003), Multisc. Model. Simul.
  • Coarse approximations (use averaging, stochastic,
    or ensemble solutions), e.g., Izaguirre and
    Hampton (2004), J. Comp. Phys.
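
The size of the timescale gap can be made concrete with a short calculation. The sketch below assumes a 1 fs integration timestep (the fastest motion listed earlier in these slides) and counts how many steps a brute-force simulation would need to reach longer timescales; the function names are hypothetical.

```python
import math

FEMTOSECOND = 1e-15  # assumed integration timestep, in seconds

def steps_required(target_time_s, timestep_s=FEMTOSECOND):
    """Number of integration steps needed to cover target_time_s."""
    return target_time_s / timestep_s

def orders_of_magnitude(target_time_s, timestep_s=FEMTOSECOND):
    """Base-10 gap between the timestep and the target timescale."""
    return math.log10(steps_required(target_time_s, timestep_s))

for label, seconds in [("1 ns", 1e-9), ("1 microsecond", 1e-6), ("1 s", 1.0)]:
    print(f"{label}: about 10^{orders_of_magnitude(seconds):.0f} steps")
```

Reaching one second at this timestep takes on the order of 10^15 steps, which is the gap that multiscale integrators and coarse approximations aim to bridge.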

35
Lengthening Scales: DPD
  • Dissipative Particle Dynamics combines coarsening
    of atoms into fluid packages with dissipative
    pair interactions, and a stochastic pair
    interaction
  • Total momentum conserved
  • Self-organization of lipid bilayer,
    self-assembled aggregates formed by amphiphilic
    lipid molecules in water.
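
The way pairwise dissipative and stochastic forces conserve total momentum can be sketched in a few lines. The example below assumes the commonly used soft-repulsion form of DPD, with a linear weight function and the fluctuation-dissipation relation sigma^2 = 2 * gamma * kT. The parameter values (`a`, `gamma`, `kT`, `rc`, `dt`) and function names are illustrative assumptions, not values taken from this presentation.

```python
import math
import random

def dpd_pair_force(ri, rj, vi, vj, a=25.0, gamma=4.5, kT=1.0,
                   rc=1.0, dt=0.01, rng=random):
    """Force on particle i due to particle j (conservative + dissipative + random)."""
    rij = [p - q for p, q in zip(ri, rj)]
    r = math.sqrt(sum(c * c for c in rij))
    if r >= rc or r == 0.0:
        return [0.0, 0.0, 0.0]          # no interaction beyond the cutoff
    e = [c / r for c in rij]            # unit vector from j to i
    vij = [p - q for p, q in zip(vi, vj)]
    w = 1.0 - r / rc                    # linear weight function
    sigma = math.sqrt(2.0 * gamma * kT) # fluctuation-dissipation relation
    conservative = a * w
    dissipative = -gamma * w * w * sum(ec * vc for ec, vc in zip(e, vij))
    stochastic = sigma * w * rng.gauss(0.0, 1.0) / math.sqrt(dt)
    magnitude = conservative + dissipative + stochastic
    return [magnitude * c for c in e]

def total_forces(positions, velocities, rng):
    """Accumulate pair forces, applying each one with opposite signs to i and j."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            f = dpd_pair_force(positions[i], positions[j],
                               velocities[i], velocities[j], rng=rng)
            for k in range(3):
                forces[i][k] += f[k]
                forces[j][k] -= f[k]    # Newton's third law
    return forces

rng = random.Random(0)
positions = [[rng.uniform(0.0, 1.5) for _ in range(3)] for _ in range(20)]
velocities = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20)]
forces = total_forces(positions, velocities, rng)
net_force = [sum(f[k] for f in forces) for k in range(3)]
print("net force on the system:", net_force)
```

Because every contribution, including the random one, acts along the line between two particles and is applied with opposite signs to each, the net force vanishes up to rounding error, which is what keeps total momentum conserved.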

36
Lengthening Scales: SRP
  • Enzyme simulation of a millisecond using a
    stochastic reaction path (SRP). Disadvantage:
    needs initial and final configurations
  • Finds a trajectory where global energy is
    minimized

37
How to predict protein interaction networks?
  • Goal
  • Predict proteins in a genome that are likely to
    interact, thus giving clues as to their function.
  • Our current solution starts from experimental
    interaction data and uses clustering and a set
    cover approach to predict novel interactions.
  • This is documented in Huang et al. (2004),
    IEEE/ACM TCBB, submitted
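
To illustrate the set cover idea in general terms, the sketch below applies the textbook greedy heuristic: repeatedly pick the candidate that explains the most still-unexplained observed interactions. This is a generic illustration and should not be read as the method of Huang et al.; the protein labels, candidate names, and the data are all hypothetical.

```python
def greedy_set_cover(universe, candidate_sets):
    """Greedy heuristic: pick the candidate covering the most uncovered items."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(candidate_sets,
                   key=lambda name: len(candidate_sets[name] & uncovered))
        gained = candidate_sets[best] & uncovered
        if not gained:
            raise ValueError("universe cannot be covered by the candidates")
        chosen.append(best)
        uncovered -= gained
    return chosen

# Hypothetical observed interactions between proteins P1..P5.
interactions = {"P1-P2", "P1-P3", "P2-P4", "P3-P4", "P4-P5"}

# Hypothetical candidate explanations, each accounting for some interactions.
candidates = {
    "A-B": {"P1-P2", "P1-P3"},
    "B-C": {"P2-P4", "P3-P4"},
    "C-D": {"P4-P5"},
    "A-C": {"P1-P2"},
}

cover = greedy_set_cover(interactions, candidates)
print(cover)
```

A small set of explanations that covers the known data can then be applied to protein pairs with no experimental evidence, which is how a cover might suggest novel interactions.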

38
How to create high-performance software that is
easy to use?
ProtoMol, CompuCell3D, Biologo
  • Goals
  • Encapsulate optimizations like parallelism and
    cluster/grid computing so that these can be used
    easily. MATLAB and Mathematica are examples of
    easy to use scientific software
  • Allow easy prototyping of algorithms, extensions
    of the software by computational scientists (not
    expert computer scientists)
  • Our current solutions use
  • Generic and object-oriented programming
  • Design patterns
  • XML-based domain specific languages
  • Related publications
  • Matthey et al. (2004) ACM Trans. Math. Software,
    30(3)
  • Cickovski et al. (2004) IEEE/ACM Trans. Comput.
    Biol. and Bioinformatics
  • Cickovski and Izaguirre (2004) ACM Trans. Prog.
    Lang. and Systems, in preparation

ProtoMol is open source and available at
http://protomol.sourceforge.net
39
How to help user select software, algorithms, and
parameters to solve their problems?
[Diagram: Simulation Requirements; Optimal
parameters via XML; ProtoMol/MDSimAid Server]
Our solution uses performance models and machine
learning to generate rules, and run-time
optimization to fine-tune suggestions. We want to
use agents and machine learning to update the
rules. This is documented in Ko (2002) and Crocker
et al. (2004), J. Comp. Chem.
Goal: Recommend optimal software and
architectural parameters to solve particular
problems. Make this easily available as web
portals.