Title: Flexible-Protein Docking
1Flexible-Protein Docking
- Dr Jonathan Essex
- School of Chemistry
- University of Southampton
2Southampton
3Programme
- Existing small-molecule docking
- Typical approximations, and outcomes
- Evidence for receptor flexibility, and consequences
- Methods for accommodating protein flexibility in docking
- The ensemble approach
- The induced fit approach
4Existing small-molecule docking
- Taylor, R.D. et al. J. Comput. Aided Mol. Des. 16, 151-166 (2002)
- Many docking algorithms (some 127 references in this 2002 review!)
- Most docking algorithms
- Rigid receptor hypothesis
- Limited receptor flexibility in, for example, GOLD (polar hydrogens)
5Existing small-molecule docking
- Most docking algorithms
- Range of ligand sampling methods
- Pattern matching, GA, MD, MC
- Treatment of intermolecular forces
- Simplified scoring functions: empirical, knowledge-based and molecular mechanics
- Very simple treatment of solvation and entropy, or completely ignored!
6Existing small-molecule docking
- And how well do they work?
- Jones, G. et al. J. Mol. Biol. 267, 727-748 (1997)
- In re-docking studies, achieved a 71% success rate
- This is probably typical of most of these methods
- So what's missing?
7The scoring function
- Existing functions inadequate
- Too simplified, for reasons of computational expediency
- Solvation and entropy often inadequately treated
- Possible solutions?
- More physics
8The rigid receptor hypothesis
- Murray, C.W. et al. J. Comput. Aided Mol. Des. 13, 547-562 (1999)
- Docking to thrombin, thermolysin, and neuraminidase
- PRO_LEADS: Tabu search
- In self-docking, ligand conformation correctly identified as the lowest energy structure: 76%
- For cross-docking: 49% successful
- Some of the associated protein movements very small
9The rigid receptor hypothesis
- Erickson, J.A. et al. J. Med. Chem. 47, 45-55 (2004)
- Docking of trypsin, thrombin and HIV-1 protease
- Self-docking, docking to a single structure that is closest to the average, and docking to apo structures
- Docking accuracy declines on docking to the average structure, and is very poor for docking to apo
- Decline in accuracy correlated with degree of protein movement
10The rigid receptor hypothesis
- Erickson, J.A. et al. J. Med. Chem. 47, 45-55 (2004)
11Models of Protein-Ligand Binding
- Goh, C.-S. et al. Curr. Opin. Struct. Biol. 14, 104-109 (2004)
- Review of receptor flexibility for protein-protein interactions
12Models of Protein-Ligand Binding
- This paper classifies protein-protein binding in terms of these models
- Induced fit assumed if there is no experimental evidence for a pre-existing equilibrium of multiple conformations
- Note that strictly this is an artificial distinction
- Statistical mechanics: all states are accessible with a non-zero probability
- For induced fit, probability of observing bound conformation without the ligand may be very small
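The statistical-mechanics point can be illustrated with a minimal two-state sketch. It assumes the unliganded protein has only a dominant apo conformation and one bound-like conformation, separated by a hypothetical free energy difference; the function name and the example values are illustrative and not taken from the cited review.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def bound_like_population(delta_g_kcal, temperature=298.15):
    """Fraction of unliganded protein in the bound-like state (two-state model).

    delta_g_kcal is a hypothetical free energy of the bound-like
    conformation relative to the dominant apo conformation.
    """
    boltzmann_factor = math.exp(-delta_g_kcal / (R_KCAL * temperature))
    return boltzmann_factor / (1.0 + boltzmann_factor)

# The population never reaches zero, but it falls off exponentially.
for dg in (0.0, 2.0, 5.0, 10.0):
    print(f"dG = {dg:4.1f} kcal/mol -> population = {bound_like_population(dg):.2e}")
```

In this picture "population shift" and "induced fit" differ only in how small the pre-existing population is, which is why the distinction is artificial.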
13Protein flexibility in drug design
- Teague, S.J. Nature Reviews Drug Discovery 2, 527-541 (2003)
- Effect of ligand binding on free energy
14Protein flexibility in drug design
- Multiple conformations of a few residues
- Acetylcholinesterase
- Phe330 flexible: acts as a swinging gate
15Protein flexibility in drug design
- Movement of a large number of residues
- Acetylcholinesterase (again!)
16Protein flexibility in drug design
- Table 1 in Teague paper lists pharmaceutically relevant flexible targets (some 30 systems!)
- Consequences of protein flexibility for ligand design
- One site, several ligand binding modes possible
17Protein flexibility in drug design
- Consequences
- Allosteric inhibition
- Binding often remote from active site: NNRTIs
- Proteins in metabolism and transport
- Promiscuous
- Bind many compounds, in many orientations
- E.g. P450cam substrates: camphor versus thiocamphor (two orientations, different to camphor!)
18Experimental evidence for population shift
- Binding kinetics
- Binding to low-population conformation should yield slow kinetics (ΔG barrier)
- Observed for p38 MAP kinase: mobile loop
- Rates of association vary between 8.5 × 10⁵ and 4.3 × 10⁷ M⁻¹ s⁻¹, depending on whether conformational change involved
- Slow kinetics can make experimental comparison between assays difficult
- Slow kinetics can improve ADME properties!
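To put the two association rates in context, the sketch below computes their ratio and the apparent free-energy difference it would correspond to. This assumes a simple transition-state-like picture in which both processes share the same pre-exponential factor; that assumption, the temperature, and the function name are illustrative only.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def barrier_difference(k_fast, k_slow, temperature=298.15):
    """Apparent extra barrier (kcal/mol) implied by a rate ratio.

    Assumes equal pre-exponential factors, so that
    k_fast / k_slow = exp(ddG / RT).
    """
    return R_KCAL * temperature * math.log(k_fast / k_slow)

k_slow = 8.5e5  # M^-1 s^-1, slower association rate quoted on the slide
k_fast = 4.3e7  # M^-1 s^-1, faster association rate quoted on the slide

ratio = k_fast / k_slow
ddg = barrier_difference(k_fast, k_slow)
print(f"rate ratio ~ {ratio:.0f}, apparent ddG ~ {ddg:.1f} kcal/mol")
```

A roughly fifty-fold difference in rate corresponds to only a couple of kcal/mol under these assumptions, so modest conformational penalties are enough to produce clearly slower binding.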
19Experimental evidence for population shift
- Nitrogen Regulatory Protein C (NtrC) plays a central role in the bacterial metabolism of nitrogen
20Protein conformational change
- Changing nitrogen levels promote the activity of NtrB kinase
- NtrB kinase phosphorylates NtrC at aspartate 54 in the receiver domain
21Protein conformational change
- Phosphorylation promotes conformational change in the receiver domain
22Protein conformational change
- NtrC: active and inactive conformations apparent
- P-NtrC: protein shifted towards activated conformation
- Volkman, B.F. et al. Science 291, 2429-33 (2001)
23Summary
- Protein flexibility important in ligand design
- Two basic mechanisms
- Selection of a binding conformation from a pre-existing ensemble: population shift
- Induced fit: binding to a previously unknown conformation
- Thermodynamically, these mechanisms are identical
- Evidence for population shift from binding kinetics, and protein NMR
24Docking methods for incorporating receptor flexibility
- Ensemble docking
- Docking to individual protein structures, or parts of protein structures: ensemble docking
- Docking to a single average structure: soft docking
- Induced fit modelling
- Carlson, H.A. Curr. Opin. Chem. Biol. 6, 447-452 (2002)
25Ensemble docking
- Generate an ensemble of structures, and dock to them
- Experimentally derived structures
- NMR or X-ray structures
- Computationally derived structures
- Molecular dynamics
- Simulated annealing
- Normal mode propagation
26FlexE
- Claussen, H. et al. J. Mol. Biol. 308, 377-395 (2001)
- Extension of the FlexX algorithm
- Preferred conformations for ligands identified
- Simplified scoring function adopted, based on hydrogen bonds, ionic interactions etc.
- Break ligand into base fragments by severing acyclic single bonds
27FlexE
- Extension of the FlexX algorithm
- Base fragments placed in active site by superposing interaction centres
- Incrementally reconstruct ligand onto base fragments
- Test each partial solution and continue with the best for further reconstruction
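The incremental construction idea can be sketched as a small beam search: grow each partial solution by one fragment, score the results, and carry only the best few forward. This is a toy illustration, not the FlexX implementation; the `extend` and `score` callables, the torsion-state encoding and the `keep` parameter are all hypothetical.

```python
def incremental_construction(base_placements, fragments, extend, score, keep=3):
    """Grow partial solutions one fragment at a time, keeping the best `keep`.

    base_placements: candidate placements of the base fragment
    fragments: remaining fragments, in the order they are added
    extend(partial, fragment): returns the candidate extensions of `partial`
    score(partial): lower is better
    """
    partial = sorted(base_placements, key=score)[:keep]
    for fragment in fragments:
        candidates = [new for p in partial for new in extend(p, fragment)]
        partial = sorted(candidates, key=score)[:keep]
    return partial[0]

# Toy model: a placement is a tuple of torsion states (0, 1 or 2),
# and the score is the distance from a hypothetical ideal pose.
ideal = (1, 2, 0, 1)

def toy_extend(partial, fragment):
    return [partial + (state,) for state in (0, 1, 2)]

def toy_score(partial):
    return sum(abs(x - t) for x, t in zip(partial, ideal))

best = incremental_construction([(0,), (1,), (2,)], ["f1", "f2", "f3"],
                                toy_extend, toy_score)
print(best)
```

Pruning keeps the cost manageable, at the price that a partial solution discarded early can never be recovered later.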
28FlexE
- United protein description
- Use a set of protein structures representing flexibility, mutations, or alternative protein models
- Assumes that overall shape of the protein and active site is maintained across the series
- FlexE selects the combination of partial protein structures that best suits the ligand
- Flexibility given by FlexE is therefore defined by the protein input structures
29FlexE
- United protein description
- Similar parts of the protein structures are merged
- Dissimilar parts of the protein are treated as separate alternatives
30FlexE
- United protein description
- Some combinations of the structural features are incompatible and not considered
- As the ligand is constructed, the optimum protein structure is identified
- Combination strategy for the protein may result in a structure not present in the original data set
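A minimal sketch of the united protein description follows, assuming each flexible region has a list of alternative conformations labelled by the input structure they came from, and that incompatibilities are known as forbidden pairs. The data layout, region names and labels are hypothetical and do not reflect FlexE's internal representation.

```python
from itertools import product

def compatible_combinations(alternatives, incompatible):
    """Yield one choice per flexible region, skipping forbidden pairs.

    alternatives: dict mapping region name -> list of conformation labels
    incompatible: set of frozensets, each holding two (region, label) pairs
    """
    regions = sorted(alternatives)
    for labels in product(*(alternatives[r] for r in regions)):
        chosen = list(zip(regions, labels))
        clash = any(frozenset((a, b)) in incompatible
                    for i, a in enumerate(chosen) for b in chosen[i + 1:])
        if not clash:
            yield dict(chosen)

# Labels A, B, C name the (hypothetical) input structures each alternative came from.
alternatives = {"loop": ["A", "B"], "sidechain": ["A", "B", "C"]}
incompatible = {frozenset((("loop", "B"), ("sidechain", "C")))}

combos = list(compatible_combinations(alternatives, incompatible))
print(len(combos))
```

Here the combination of the loop from structure A with the side chain from structure B is allowed even though no single input structure contains it, which mirrors the last bullet above. The number of combinations grows as the product of the alternatives per region.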
31FlexE
- Evaluation
- 10 proteins, 105 crystal structures
- RMSD < 2.0 Å, within top ten solutions: 67% success
- Cross-docking with FlexX gave 63%
- FlexE faster than cross-docking with FlexX
- Aldose reductase - very flexible active site
- FlexE docking successful (3 ligands)
- Using only one rigid protein structure would not have worked
32Ensemble docking
- Advantages
- Well-defined computational problem
- Computational cost generally scales linearly with number of structures (potential combinatorial explosion)
- Can use either experimental information, or structures derived from computation
- Disadvantages
- What happens if the appropriate bound receptor conformation is not present in the ensemble?
33Soft-Docking
- Knegtel, R.M.A. et al. J. Mol. Biol. 266, 424-440 (1997)
- Build interaction grids within DOCK that incorporate the effect of more than one protein structure
- Effectively soften and average the different structures
34Soft-Receptor Modelling
- Österberg, F. et al. Proteins 46, 34-40 (2002)
- Similar approach applied to AutoDock grids
- Energy-weighted grid
- Boltzmann-type weighting applied to reduce the influence of repulsive terms
- Combined grids performed very well: HIV protease
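One plausible reading of the energy-weighted grid is sketched below: at every grid point the energies from the individual structures are averaged with Boltzmann weights, so a strongly repulsive value from one structure contributes almost nothing when another structure is favourable there. The function name, the kT value and the toy grids are assumptions for illustration, not the published AutoDock procedure.

```python
import math

def boltzmann_weighted_grid(grids, kT=0.593):
    """Combine per-structure energy grids point by point.

    grids: list of equal-length lists of energies (kcal/mol), one per structure.
    kT: thermal energy in kcal/mol (about 0.593 at 298 K).
    """
    combined = []
    for energies in zip(*grids):
        e_min = min(energies)  # shift energies for numerical stability
        weights = [math.exp(-(e - e_min) / kT) for e in energies]
        total = sum(weights)
        combined.append(sum(w * e for w, e in zip(weights, energies)) / total)
    return combined

# Two hypothetical three-point grids; structure A clashes at point 1, B at point 2.
grid_a = [-1.0, 50.0, 0.2]
grid_b = [-1.0, -0.5, 30.0]

combined = boltzmann_weighted_grid([grid_a, grid_b])
mean = [(a + b) / 2 for a, b in zip(grid_a, grid_b)]
print(combined)
print(mean)
```

Compared with the plain mean, the weighted grid stays close to the more favourable structure at each point, which is also why mutually exclusive regions can end up simultaneously available in the combined model.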
35Soft-Receptor Modelling
36Soft-Receptor Modelling
- Advantages
- Low computational cost: use of single averaged protein model
- Can use experimental or simulation-derived structures
- Disadvantages
- Cope with large-scale motion?
- How reliable is this averaged representation?
- Mutually exclusive binding regions could be simultaneously exploited
- Active sites enlarged
37Induced-Fit Docking Methods
- Allow protein conformational change at the same time as the docking proceeds
- Taking some of these algorithms, in no particular order
38Induced-Fit Docking Methods
- Molecular dynamics methods
- Mangoni, R. et al. Proteins 35, 153-162 (1999)
- Separate thermal baths used for protein and ligand to facilitate sampling
- Multicanonical molecular dynamics
- Nakajima, N. et al. Chem. Phys. Lett. 278, 297-301 (1997)
- Bias normal molecular dynamics to yield a flat energy distribution
39Induced-Fit Docking Methods
- Monte Carlo methods
- Apostolakis, J. et al. J. Comput. Chem. 19, 21-37 (1998)
- Hybrid Monte Carlo and minimisation method. Poisson-Boltzmann continuum solvation used
- ICM, Abagyan, R. et al. J. Comput. Chem. 15, 488-506 (1994)
- Conventional MC, plus side-chain moves from a rotamer library
- Minimisation again required
- VS: J. Mol. Biol. 337, 209-225 (2004)
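The Monte Carlo methods listed above build on Metropolis sampling. A minimal sketch on a one-dimensional toy energy surface is given below; the energy function, step size and kT are hypothetical stand-ins for a real docking score and move set, and no minimisation or rotamer moves are included.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng):
    """Accept downhill moves always, uphill moves with probability exp(-dE/kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

def monte_carlo(energy, x0, n_steps=5000, step=0.5, kT=0.6, seed=0):
    """Random-walk Metropolis sampling of a 1D toy coordinate."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        trial = x + rng.uniform(-step, step)
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e, kT, rng):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

def toy_energy(x):
    # Hypothetical double-well surface standing in for a docking score.
    return (x * x - 1.0) ** 2 + 0.3 * x

best_x, best_e = monte_carlo(toy_energy, x0=1.0)
print(best_x, best_e)
```

The occasional acceptance of uphill moves is what lets the walk leave the starting well; in the published methods this is combined with minimisation and, for ICM, with side-chain moves drawn from a rotamer library.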
40Induced-Fit Docking Methods
- FDS: Taylor, R. et al. J. Comput. Chem. 24, 1637-1656 (2003)
- Flexible ligand/flexible protein docking
- large side chain motions, rotamer library
- Solvation included on the fly
- continuum solvation model GB/SA
- Soft-core potential energy function
- anneal the potential to improve sampling
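To illustrate what a soft-core potential does, the sketch below uses one common soft-core Lennard-Jones form in which a coupling parameter `lam` is annealed from 0 (soft) to 1 (standard 12-6 potential). The functional form and the `alpha` parameter are assumptions chosen for illustration; the slides do not state which form FDS uses.

```python
def soft_core_lj(r, lam, epsilon=1.0, sigma=1.0, alpha=0.5):
    """Soft-core Lennard-Jones energy; lam = 1 recovers the standard 12-6 form.

    For lam < 1 the energy stays finite as r approaches 0, so atoms can
    pass through each other early in the annealing schedule.
    """
    s6 = (r / sigma) ** 6 + alpha * (1.0 - lam)
    return 4.0 * epsilon * (1.0 / (s6 * s6) - 1.0 / s6)

# Anneal the potential: the repulsive wall hardens as lam goes from 0 to 1.
for lam in (0.0, 0.5, 1.0):
    print(lam, [round(soft_core_lj(r, lam), 3) for r in (0.5, 0.8, 1.0, 1.2)])
```

Starting the search with a soft wall lowers the barriers between binding modes, and hardening it gradually restores the physical potential before final scoring.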
41Arabinose Binding Protein
- Rigid protein docking
- Low energy structures are essentially identical to the X-ray structure
- Dock starting from experimental result, does not return to it
42Arabinose Binding Protein
- Flexible protein docking
- Experimental structure found
- A number of other structures are isoenergetic
- Cannot uniquely identify the experimental structure
43Arabinose Binding Protein
- Flexible protein docking
- Most successful structure with experiment (transparent)
- Most successful structure, experiment, and isoenergetic mode
44Monte Carlo Docking
- 15 complexes studied
- Rigid receptor
- 13/15 identified X-ray binding mode
- 8/15 were the unique, lowest energy structures
- 3/15 were part of a cluster of low-energy binding modes
- Flexible receptor
- 11/15 identified X-ray binding mode
- 3/15 were the unique, lowest energy structure
- 6/15 were part of a cluster of low-energy binding modes
45FAB Fragment
- Two isoenergetic binding modes
- Figure panels: closest seed; isoenergetic seed
46Conclusion
- Rigid protein docking as successful as other methods, but much more expensive
- Flexible protein docking does find X-ray structures, but does not uniquely identify them
- Refine scoring function?
- Using this methodology, need to consider a number of structures
- Further validation required
47Summary
- Two main approaches for modelling receptor flexibility
- Use of multiple structures (experimental or theoretical) either independently, or averaged in some way: ensemble approach
- Allow the receptor to adopt conformations under the influence of the ligand: induced fit approach
48Summary
- Ensemble is the more widely employed: less expensive, but limited somewhat by the composition of the ensemble
- Induced fit should overcome this disadvantage of ensemble methods
- Induced fit methods can have significant sampling problems
- not computationally limited
- search space large, and increasing as extra degrees of freedom added
49Flexible protein docking: a case study
- Wei, B.Q. et al. J. Mol. Biol. 337, 1161-1182 (2004)
- Use experimental structures
- Like FlexE, flexible regions move independently, and are able to recombine
- Modified version of DOCK used
50Flexible protein docking: a case study
- Receptor decomposed into three parts
- Green: rigid
- Blue and red: two flexible parts
- Ligand scored against each component
- Best-fit protein conformation assembled from these components
51Flexible protein docking: a case study
- Scoring function
- Electrostatic (potential from PB), van der Waals
- Solvation (scaled AMSOL result according to buried surface area)
- Large ligands favoured for large cavities
- Penalty for forming the larger cavity introduced
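The decomposition and the cavity penalty can be combined in a short sketch: the ligand is scored against the rigid part once, then for each flexible part the conformation with the best ligand score plus receptor penalty is chosen independently. The function, the dictionary layout and all numbers are hypothetical and only illustrate the bookkeeping, not the modified DOCK code.

```python
def assemble_best_receptor(rigid_score, flexible_scores, cavity_penalty):
    """Pick the best conformation of each flexible part independently.

    rigid_score: ligand score against the rigid part (lower is better)
    flexible_scores: dict part -> dict conformation -> ligand score
    cavity_penalty: dict (part, conformation) -> receptor energy penalty;
        conformations not listed are assumed to cost nothing
    """
    total = rigid_score
    chosen = {}
    for part, scores in flexible_scores.items():
        best = min(scores,
                   key=lambda c: scores[c] + cavity_penalty.get((part, c), 0.0))
        chosen[part] = best
        total += scores[best] + cavity_penalty.get((part, best), 0.0)
    return chosen, total

# Hypothetical scores for the two flexible parts in closed and open states.
flexible_scores = {
    "blue": {"closed": -2.0, "open": -3.5},
    "red": {"closed": -1.0, "open": -1.2},
}
cavity_penalty = {("blue", "open"): 1.0, ("red", "open"): 0.5}

chosen, total = assemble_best_receptor(-4.0, flexible_scores, cavity_penalty)
print(chosen, total)
```

Without the penalty both parts would open, because a larger cavity always gives a large ligand more contacts; with it, the red part stays closed since the gain in ligand score does not cover the cost of opening.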
52Flexible protein docking: a case study
- In screening, enrichment improved compared to docking against individual conformations
- ACD screened against L99A/M102Q mutant of T4L
- 18 compounds that were predicted to bind and change cavity conformation were tested
- 14 found to bind
- X-ray structures obtained for 7
53Flexible protein docking: a case study
- Predicted ligand geometries reproduced (< 0.7 Å)
- In five structures, part of observed cavity changes reproduced
- In two structures, receptor conformations not part of original data set, and therefore not reproduced!
54Flexible protein docking: a case study
- New ligands found by flexible receptor docking
- Receptor conformational energy needs to be
considered
55Conclusion
- Rigid receptor approximation not universal
- Two main approaches to modelling receptor flexibility
- Ensemble
- Induced fit
- Further validation of these methods needed
56Acknowledgements
- Flexible Docking
- Richard Taylor, Phil Jewsbury, AstraZeneca
- Practical
- Donna Goreham, Sebastien Foucher