Title: Seminar 1. Protein Biophysics Notes, M. Buck, 3/20/07
1. Seminar 1. Protein Biophysics Notes, M. Buck, 3/20/07
Structural Modeling and Conformational Search/Minimization
Main concept: The way the molecular structure is represented determines
both the potential function and the conformational search/sampling that
can be carried out. It also determines the way we can analyze the
results, and thus ultimately the questions we can address by
modeling/simulation.
2. Today we will consider three (interlinked) topics:
- Ways to represent/model a protein structure
- Ways to sample/search conformational space
- Ways to analyze the results
3. Table 1
Level of representation | Computational method | Experimental method
Atoms + electrons | Quantum Mechanics | X-ray, Raman, vibrational spectroscopy
Atoms + explicit water | Molecular Dynamics | NMR
Atoms + implicit solvation | MD, Normal Mode Analysis | NMR, X-ray
Atoms, segments | NBO(N)D | HX, CD, fluorescence
- | threading | hydrodynamics
Subunits, Calpha | Brownian Dynamics | cryoEM
4. Table 2
Typical conformational parameters to consider with proteins/polypeptides:
- Atom position, r_i or (x_i, y_i, z_i)
- Dihedral and chiral angles, peptide group planarity
- Surface accessibility, vdW contact surface, packing (volume analysis)
- Solvation by H2O, (counter) ions
- Contact distance, H-bonds, salt-bridges
- Population of conformations in N-dimensional coordinate space: depends
  foremost on the potential energy function (+ kinetic energy)
- Thus the energy landscape determines the accessible conformational states
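5. Non-bonded and bonded potentials used in molecular mechanics/dynamics
Classical bonded Molecular Mechanics (MM) terms, with atoms treated as
points (point charges):
$$V = \sum_{1,2\ \mathrm{pairs}} \tfrac{1}{2} K_b (b - b_0)^2 + \sum_{\mathrm{bond\ angles}} \tfrac{1}{2} K_\theta (\theta - \theta_0)^2 + \sum_{\mathrm{dihedral\ angles}} K_\phi \left[ 1 + \cos(n\phi - \delta) \right]$$
plus improper dihedral/torsion terms for planar or chiral atoms.
Classical non-bonded MM potentials:
$$V = \sum_{\mathrm{non\text{-}bonded}\ ij\ \mathrm{pairs}} \left\{ 4\varepsilon_{ij} \left[ \left( \frac{\sigma_{ij}}{r} \right)^{12} - \left( \frac{\sigma_{ij}}{r} \right)^{6} \right] + \frac{q_i q_j}{\varepsilon r} \right\}$$
Solvation term (a free energy contribution), e.g. $\gamma_i \, \mathrm{ASA}_i$
Hydrogen bond term, e.g.
$$V_{HB} = \sum_{\mathrm{H\text{-}bonds}} \varepsilon \left[ \left( \frac{\sigma}{R_{D-A}} \right)^{6} - \left( \frac{\sigma}{R_{D-A}} \right)^{4} \right] \cos^4(\theta - \theta_0) \, SW(R_{D-A})$$
(F. Fabiola et al., 2001)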
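To make the bonded and non-bonded terms above concrete, here is a minimal Python sketch that evaluates a harmonic bond term, a dihedral term, and the Lennard-Jones plus Coulomb pair energy. It is an illustration under stated assumptions, not the implementation of any particular force field: the function names are hypothetical, a single epsilon and sigma are used for every pair, no 1-2/1-3 exclusions or cutoffs are applied, and the Coulomb conversion factor (about 332.06 kcal Angstrom / (mol e^2)) assumes distances in Angstrom and charges in units of e.

```python
import math

COULOMB_K = 332.06  # approx. kcal*Angstrom/(mol*e^2); assumed unit system


def harmonic_bond(b, kb, b0):
    """Bond term: 1/2 * Kb * (b - b0)^2."""
    return 0.5 * kb * (b - b0) ** 2


def dihedral_term(phi, k_phi, n, delta):
    """Dihedral term: K_phi * [1 + cos(n*phi - delta)], phi in radians."""
    return k_phi * (1.0 + math.cos(n * phi - delta))


def lennard_jones(r, epsilon, sigma):
    """4*eps*[(sigma/r)^12 - (sigma/r)^6] for one atom pair."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, qi, qj, dielectric=1.0):
    """qi*qj/(dielectric*r), converted to kcal/mol."""
    return COULOMB_K * qi * qj / (dielectric * r)


def nonbonded_energy(coords, charges, epsilon, sigma, dielectric=1.0):
    """Sum LJ + Coulomb over all unique ij pairs (no exclusions, no cutoff)."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            total += lennard_jones(r, epsilon, sigma)
            total += coulomb(r, charges[i], charges[j], dielectric)
    return total
```

The Lennard-Jones term is zero at r = sigma and reaches its minimum of -epsilon at r = 2^(1/6) sigma, which is a quick sanity check on any implementation.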
6. Goal: Sampling of energetically meaningful structures in conformational space
- Cartesian / dihedral angle space: advantages and problems
- Systematic grid search / random search
- Trees: pruning, dead-end elimination
- Fragments and templates
7. Monte Carlo: Metropolis acceptance criterion
The computer rolls the dice and, based upon a random number, constructs
the next conformation. E.g., if working in dihedral angle space: first,
randomly select an amino acid position; second, randomly select which
bond is to be rotated; third, select a new torsion angle. Then build the
new structure and calculate the new potential energy ($\Delta U$ =
potential energy difference to the previous conformation). The trial
conformation is accepted or rejected based on a temperature-dependent
probability, p, of the Metropolis type:
$$p = e^{-\beta \Delta U} \ \text{if}\ e^{-\beta \Delta U} < 1, \qquad p = 1 \ \text{if}\ e^{-\beta \Delta U} \ge 1, \qquad \text{where}\ \beta = 1/(k_B T)$$
Then a random number r in [0,1] is compared to p: if r < p, the
conformation is accepted. Note the role of temperature T. Because there
is a comparison with the previous potential energy, there is a
probability of climbing up energy slopes.
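The Metropolis procedure described above can be sketched in a few lines of Python. This is a toy example under stated assumptions: the "protein" is reduced to a short list of independent torsion angles, the threefold torsional potential and its 1.5 kcal/mol amplitude are made up for illustration, and the function names are hypothetical. Only the acceptance rule itself follows the slide.

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K)


def torsion_energy(phi):
    """Toy threefold torsional potential in kcal/mol (hypothetical parameters)."""
    return 1.5 * (1.0 + math.cos(3.0 * phi))


def metropolis_accept(delta_u, temperature, rng):
    """Accept with p = min(1, exp(-beta * delta_u)), beta = 1/(kB*T)."""
    if delta_u <= 0.0:
        return True  # exp(-beta*dU) >= 1, so p = 1
    beta = 1.0 / (K_B * temperature)
    return rng.random() < math.exp(-beta * delta_u)


def run_mc(n_steps, temperature, n_angles=4, seed=0):
    """Monte Carlo in dihedral space; returns angles, energy, acceptance ratio."""
    rng = random.Random(seed)
    angles = [math.pi] * n_angles  # start in a minimum of the toy potential
    energy = sum(torsion_energy(a) for a in angles)
    accepted = 0
    for _ in range(n_steps):
        k = rng.randrange(n_angles)             # which torsion to rotate
        trial = rng.uniform(-math.pi, math.pi)  # new torsion angle
        delta_u = torsion_energy(trial) - torsion_energy(angles[k])
        if metropolis_accept(delta_u, temperature, rng):
            angles[k] = trial
            energy += delta_u
            accepted += 1
    return angles, energy, accepted / n_steps
```

Running `run_mc(2000, 1000.0)` and `run_mc(2000, 100.0)` and comparing the returned acceptance ratios shows the role of T: at the higher temperature, uphill moves are accepted more often.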
8. Differences: Low/High Temp. Molecular Dynamics, Minimization, Simulated Annealing
9. Distance Geometry
Given a set of distance restraints between atoms a and b and between b
and c, an upper and a lower bound can be put on the distance between a
and c (not just for bonded atoms but in general).
E.g. upper bound (u): $u_{ac} \le u_{ab} + u_{bc}$
and lower bound (l): $l_{ac} \ge l_{ab} - u_{bc}$
Matrix manipulations transform these restraints into structures that
satisfy them (called embedding), followed by refinement (e.g.
minimization).
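The triangle-inequality bounds above can be applied to every triple of atoms to tighten a full matrix of bounds before embedding. The following Python sketch is one simple way to do this, assuming symmetric n x n lists `upper` and `lower` with zeros on the diagonal; the function name is hypothetical, and the lower-bound pass is a single sweep that is not guaranteed to reach the tightest possible bounds.

```python
def smooth_bounds(upper, lower):
    """Tighten distance bounds in place using triangle inequalities.

    upper[i][j] and lower[i][j] are symmetric n x n lists of bounds.
    """
    n = len(upper)
    # Upper bounds: u_ij <= u_ik + u_kj (shortest-path style sweep).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = upper[i][k] + upper[k][j]
                if i != j and upper[i][j] > via_k:
                    upper[i][j] = via_k
    # Lower bounds: l_ij >= l_ik - u_kj and l_ij >= l_kj - u_ik.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                candidate = max(lower[i][k] - upper[k][j],
                                lower[k][j] - upper[i][k])
                if lower[i][j] < candidate:
                    lower[i][j] = candidate
    return upper, lower
```

For three atoms with a-b restrained to 5 to 6 Angstrom and b-c to 1 to 2 Angstrom, and no direct information on a-c, the sweep yields an a-c upper bound of 6 + 2 = 8 and a lower bound of 5 - 2 = 3.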
10. Genetic Algorithms
- encoding of information into a chromosome
- mutations, cross-overs
- selection of a viable population using a fitness function
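The three ingredients listed above (encoding, mutation/cross-over, fitness-based selection) can be sketched as follows. Everything specific here is an assumption made for illustration: the chromosome is a list of torsion angles, the fitness is the negative of a made-up threefold torsional energy, selection simply keeps the better half of the population, and all function names and parameter values are hypothetical.

```python
import math
import random


def energy(chromosome):
    """Toy energy: sum of threefold torsional terms (hypothetical)."""
    return sum(1.5 * (1.0 + math.cos(3.0 * phi)) for phi in chromosome)


def fitness(chromosome):
    """Higher fitness means lower energy."""
    return -energy(chromosome)


def mutate(chromosome, rate, rng):
    """Replace each gene by a random angle with probability `rate`."""
    return [rng.uniform(-math.pi, math.pi) if rng.random() < rate else gene
            for gene in chromosome]


def crossover(parent_a, parent_b, rng):
    """One-point cross-over: head of parent_a joined to tail of parent_b."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def evolve(n_genes=6, pop_size=30, generations=40, rate=0.1, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(-math.pi, math.pi) for _ in range(n_genes)]
           for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        best_history.append(energy(pop[0]))
        survivors = pop[:pop_size // 2]  # selection of the viable half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        pop = survivors + children
    pop.sort(key=fitness, reverse=True)
    return pop[0], best_history
```

Because the survivors are carried over unchanged, the best energy in `best_history` can never increase from one generation to the next.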
11. More tricks to search conformational space
- Low mode search (climbing out of minima along the least steep ascent)
- Poling (below)
- Scaling of potential function (right; Hamelberg et al., 2004)
12. Characterization of ensemble of conformations
Root mean square deviation:
$$\mathrm{RMSD} = \left[ \frac{1}{N} \sum_{k=1}^{N} (r_{ik} - r_{jk})^2 \right]^{1/2}$$
(unit is a distance, i.e. Angstrom)
First perform a best-fit superposition; this needs identical molecules
(same number of atoms). Alternatives: comparing distance matrices;
local/sliding RMSD.
RMS in dihedral space:
$$\left[ \frac{1}{N} \sum_{k=1}^{N} \min\left\{ (T_{ik} - T_{jk})^2,\ (2\pi - |T_{ik} - T_{jk}|)^2 \right\} \right]^{1/2}$$
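Both measures above are short to compute. The Python sketch below assumes the two structures are already superimposed (the best-fit superposition step is not included) and that dihedral angles are given in radians; the function names are hypothetical.

```python
import math


def rmsd(coords_i, coords_j):
    """Cartesian RMSD between two pre-superimposed structures.

    coords_i, coords_j: equal-length lists of (x, y, z) tuples.
    """
    if len(coords_i) != len(coords_j):
        raise ValueError("structures must have the same number of atoms")
    total = 0.0
    for atom_i, atom_j in zip(coords_i, coords_j):
        total += sum((a - b) ** 2 for a, b in zip(atom_i, atom_j))
    return math.sqrt(total / len(coords_i))


def dihedral_rmsd(angles_i, angles_j):
    """RMS difference in dihedral space, respecting 2*pi periodicity."""
    if len(angles_i) != len(angles_j):
        raise ValueError("need the same number of dihedral angles")
    total = 0.0
    for t_i, t_j in zip(angles_i, angles_j):
        diff = abs(t_i - t_j) % (2.0 * math.pi)
        diff = min(diff, 2.0 * math.pi - diff)  # shorter way around the circle
        total += diff * diff
    return math.sqrt(total / len(angles_i))
```

The periodic treatment matters near the wrap-around: angles of 179 and -179 degrees differ by 2 degrees, not 358.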
13. Clustering (Ward's method) merges the two clusters whose fusion
minimizes information loss. Information loss is defined in terms of the
total sum of squared deviations from the mean of each cluster.
Hierarchical clustering is shown visually by constructing a dendrogram
(x-axis: objects; y-axis: distance between objects / inter-cluster
distance).
Clustering: choice of discriminating parameter (here pairwise, average
RMSD)
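A minimal sketch of Ward-style agglomeration is given below. It assumes each conformation is described by a feature vector (for example a few coordinates or dihedral angles) rather than by a pairwise RMSD matrix, and it uses the standard result that merging clusters A and B raises the total within-cluster sum of squares by n_A n_B / (n_A + n_B) times the squared distance between their centroids. The function names are hypothetical, and the brute-force search is only suitable for small ensembles.

```python
def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def ward_cost(cluster_a, cluster_b):
    """Increase in total sum of squared deviations if a and b are merged."""
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    na, nb = len(cluster_a), len(cluster_b)
    return na * nb / (na + nb) * dist2


def ward_cluster(points, n_clusters):
    """Merge the cheapest pair repeatedly until n_clusters remain."""
    clusters = [[p] for p in points]
    merge_costs = []  # heights one would plot on the dendrogram's y-axis
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        merge_costs.append(cost)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merge_costs
```

For example, `ward_cluster([(0.0,), (0.1,), (5.0,), (5.2,)], 2)` groups the two points near 0 and the two points near 5.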
14. Principal component analysis (PCA) is a technique for
simplifying a data set, by reducing
multidimensional data sets to lower dimensions
for analysis. PCA is an orthogonal linear
transformation that projects the data to a new
coordinate system such that the greatest
variance by any projection of the data comes to
lie on the first coordinate (called the first
principal component), the second greatest
variance on the second coordinate, and so on.
PCA can be used for
dimensionality reduction in a data set while
retaining those characteristics of the data set
that contribute most to its variance, by keeping
lower-order principal components and ignoring
higher-order ones. Such low-order components
often contain the "most important" aspects of
the data. (x to x' transformation, where x' is
the direction of greatest variance, e.g. the
greatest structural change in a protein.)
Diagonalizing the covariance matrix gives eigenvectors that capture most
of the variation in the positions of the individual atoms (c.f. Normal
Mode Analysis next time).
3N - 6 dimensional space; m structures in ensemble/trajectory; n by m
matrix
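As a small illustration of the covariance-matrix route, the Python sketch below builds the covariance matrix from m structures with n coordinates each and extracts only the first principal component by power iteration. This is a simplification chosen to stay within the standard library: a full PCA would diagonalize the matrix to obtain all eigenvectors. The function names are hypothetical, and the power iteration assumes the starting vector is not orthogonal to the leading eigenvector.

```python
import math


def covariance_matrix(data):
    """data: m rows (structures), each with n coordinates.

    Returns the n x n covariance matrix and the mean-centered data.
    """
    m, n = len(data), len(data[0])
    means = [sum(row[c] for row in data) / m for c in range(n)]
    centered = [[row[c] - means[c] for c in range(n)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / (m - 1) for b in range(n)]
           for a in range(n)]
    return cov, centered


def first_principal_component(cov, n_iter=500):
    """Leading eigenvalue and eigenvector of cov by power iteration."""
    n = len(cov)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-uniform start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero covariance: no variance to capture
        v = [x / norm for x in w]
    cov_v = [sum(cov[a][b] * v[b] for b in range(n)) for a in range(n)]
    eigenvalue = sum(v[a] * cov_v[a] for a in range(n))
    return eigenvalue, v


def project(centered, component):
    """Coordinate of each structure along the given component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]
```

The eigenvalue is the variance of the ensemble along the returned direction, and `project` gives each structure's position along that direction, which is what is usually plotted to compare conformations.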
15. Next time:
- Modeling fluctuations/dynamics in proteins and analysis
- Question sheet and materials will follow