Title: Scoring Functions
1Scoring Functions
- What works and what doesn't
Mark McGann, OpenEye Scientific Software, www.eyesopen.com
2What is a scoring function?
- Approximate binding energy
- Easy to evaluate (less than 1 ms/pose)
- Rule based, not physics based
3The Scoring Functions
- Gaussian Scoring Function [1]
- ChemGauss [2]
- PLP [3]
- Chemscore [4]
- Shape Surface [2]
- Electrostatic Surface [2]
[1] Mark R. McGann, Harold R. Almond, Anthony Nicholls, J. Andrew Grant, and Frank K. Brown, "Gaussian Docking Functions", Biopolymers, Vol. 68, pp. 76-90, 2003.
[2] No publication at present.
[3] Gennady M. Verkhivker, Djamal Bouzida, Daniel K. Gehlhaar, Paul A. Rejto, Sandra Arthurs, Anthony B. Colson, Stephan T. Freer, Veda Larson, Brock A. Luty, Tami Marrone, and Peter W. Rose, "Deciphering common failures in molecular docking of ligand-protein complexes", Journal of Computer-Aided Molecular Design, Vol. 14, pp. 731-751, 2000.
[4] Matthew D. Eldridge, Christopher W. Murray, Timothy R. Auton, Gaia V. Paolini, and Roger P. Mee, "Empirical scoring functions: I. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes", Journal of Computer-Aided Molecular Design, Vol. 11, pp. 425-445, 1997.
4Test Case: The PDB
- Pros
  - Readily available
  - Many structures (24,244 total)
- Cons
  - Not all structures are complexes
  - Not all complexes involve high-affinity binders
  - PDB files are not always of high quality
5PDB Complexes
5,462 of the 24,244 entries in the PDB (23%) were determined to be complexes
- Ligand candidate rules
  - Not covalently bound to protein
  - Between 10 and 50 heavy atoms
  - Only H, C, N, O, F, P, S, Cl, Br, or I atoms
  - At least 1 carbon atom
- Ligand selection rules (when there are 2 or more candidates)
  - Favor ligands with closer to 30 heavy atoms
  - Favor ligands that are more deeply buried
- Discard rules
  - Abnormal valences on ligand (e.g., carbons with more than 4 bonds)
  - Abnormal bond lengths on ligand (any bond longer than 2.75 Å)
  - Ligand-protein heavy-atom contacts closer than 1.5 Å (center to center)
  - Ligand makes few close contacts with the protein
  - Ligand has more than 12 rotatable bonds
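The candidate and selection rules above can be sketched in a few lines of Python. This is only an illustration, not the code used for the study: the `Candidate` record, the `burial` measure, the inclusive 10-50 bounds, and the order in which the two "favor" rules are applied are all assumptions.

```python
from dataclasses import dataclass

# Elements permitted in a ligand candidate (from the rules above).
ALLOWED_ELEMENTS = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"}


@dataclass
class Candidate:
    """Hypothetical record for one small molecule found in a PDB entry."""
    name: str
    elements: list          # element symbol of every atom, hydrogens included
    covalently_bound: bool  # True if any bond links it to the protein
    burial: float           # assumed burial measure; larger means more buried


def heavy_atom_count(c):
    return sum(1 for e in c.elements if e != "H")


def is_candidate(c):
    """Apply the four candidate rules (bounds assumed inclusive)."""
    return (
        not c.covalently_bound
        and 10 <= heavy_atom_count(c) <= 50
        and set(c.elements) <= ALLOWED_ELEMENTS
        and "C" in c.elements
    )


def select_ligand(candidates):
    """Pick one ligand: closest to 30 heavy atoms first, then most buried.

    The priority between the two preferences is an assumption.
    """
    valid = [c for c in candidates if is_candidate(c)]
    if not valid:
        return None
    return min(valid, key=lambda c: (abs(heavy_atom_count(c) - 30), -c.burial))
```

The discard rules (valence, bond length, contacts, rotatable bonds) need coordinates and bond perception, so they are left out of this sketch.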
6Repeated Ligands
7Testing Method
- Omega
  - Conformer generation
- Fred
  - Pose ensemble generation
  - Filter poses
  - Score poses
8Conformer Generation: Omega
- Start with connection table (SMILES string)
- Generate initial structure using distance geometry
- Refine structure with MMFF
- Enumerate conformers using torsion and ring library
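The final enumeration step can be pictured as a Cartesian product over the allowed angles of each rotatable bond. The sketch below illustrates that idea only; the bond-type names and angle values in `TORSION_LIBRARY` are invented for the example and are not Omega's actual library.

```python
from itertools import product

# Hypothetical torsion library: allowed dihedral angles (degrees) per bond type.
TORSION_LIBRARY = {
    "sp3-sp3": [60.0, 180.0, 300.0],
    "sp2-sp3": [0.0, 90.0, 180.0, 270.0],
    "amide": [0.0, 180.0],
}


def enumerate_torsion_settings(bond_types):
    """Return every combination of library angles, one angle per rotatable bond."""
    choices = [TORSION_LIBRARY[t] for t in bond_types]
    return list(product(*choices))


# A molecule with one sp3-sp3 bond and one amide bond gives 3 * 2 settings.
settings = enumerate_torsion_settings(["sp3-sp3", "amide"])
```

The count grows multiplicatively with the number of rotatable bonds, which is one reason to cap that number when choosing test ligands.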
9Conformer Generation Results (best conformer overlay with binding mode)
10Pose Generation: Ensemble Generation
[Figure labels: Rotations, Translations]
- Systematic rotation and translation of conformers within the active site
- 1 Å step size
- Conformers are rigid
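A minimal sketch of a systematic rigid-body search follows: each rigid conformer is rotated and then translated onto a regular grid. The 1 Å translation step comes from the slide. The box bounds, the Euler-angle grid, and its angular step are assumptions made for the example, and a plain Euler grid samples orientations non-uniformly and produces duplicates when the middle angle is 0 or 180 degrees.

```python
import math
from itertools import product


def translation_grid(box_min, box_max, step=1.0):
    """Grid points covering an axis-aligned box with the given step (Å)."""
    axes = []
    for lo, hi in zip(box_min, box_max):
        n = int(math.floor((hi - lo) / step + 1e-9)) + 1
        axes.append([lo + i * step for i in range(n)])
    return list(product(*axes))


def euler_rotations(step_deg):
    """z-y-z Euler-angle triples on a regular grid (assumed sampling scheme)."""
    alphas = [i * step_deg for i in range(int(360 / step_deg))]
    betas = [i * step_deg for i in range(int(180 / step_deg) + 1)]
    return list(product(alphas, betas, alphas))


def rot_matrix(alpha, beta, gamma):
    """Rotation matrix Rz(alpha) * Ry(beta) * Rz(gamma), angles in degrees."""
    a, b, g = (math.radians(x) for x in (alpha, beta, gamma))
    ca, sa, cb, sb, cg, sg = (math.cos(a), math.sin(a), math.cos(b),
                              math.sin(b), math.cos(g), math.sin(g))
    return [
        [ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
        [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
        [-sb * cg, sb * sg, cb],
    ]


def generate_poses(conformer, box_min, box_max, step=1.0, angle_step=90.0):
    """Yield rigid poses; the conformer is assumed centered at the origin."""
    grid = translation_grid(box_min, box_max, step)
    for angles in euler_rotations(angle_step):
        R = rot_matrix(*angles)
        rotated = [tuple(sum(R[i][j] * p[j] for j in range(3)) for i in range(3))
                   for p in conformer]
        for t in grid:
            yield [tuple(r[i] + t[i] for i in range(3)) for r in rotated]
```

Even this coarse grid multiplies quickly: the number of poses is the number of rotations times the number of grid points, per conformer, which is why a fast filter is applied before scoring.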
11Visualization of systematic search
[Figure labels: Rotations, Translations]
12Negative Image Filter
13Pose Generation: Negative Image Filter
Filter | Median Poses | Average Poses
Negative image | 50,000 | 170,000
14Pose Ensemble Results (best pose after negative image)
15The Gaussian Scoring Function
- Shape based (no chemical awareness)
- Smooth pairwise vdW-like potentials
- Potentials exist between heavy atoms
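To make "smooth pairwise vdW-like potentials between heavy atoms" concrete, here is a toy sketch. It builds a pair potential from two Gaussians, a repulsive core and an attractive well, and sums it over ligand-protein heavy-atom pairs. The functional form and every parameter value are assumptions for illustration; the published function [1] is not reproduced here.

```python
import math


def pair_potential(r, a_rep=10.0, s_rep=2.0, a_att=1.0, r0=4.0, s_att=1.0):
    """Toy smooth potential: Gaussian repulsion at short range plus a
    Gaussian attractive well centered at r0 (all values hypothetical)."""
    repulsion = a_rep * math.exp(-((r / s_rep) ** 2))
    attraction = a_att * math.exp(-(((r - r0) / s_att) ** 2))
    return repulsion - attraction


def shape_score(ligand_atoms, protein_atoms):
    """Sum the pair potential over all ligand-protein heavy-atom pairs.

    Each atom is an (x, y, z) tuple in Å; lower scores are better.
    """
    return sum(
        pair_potential(math.dist(l, p))
        for l in ligand_atoms
        for p in protein_atoms
    )
```

Because the potential is smooth everywhere, small pose changes produce small score changes, which is the practical appeal of a Gaussian form over hard-sphere contact counts.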
16Shape Surface
- Shape based surface matching (no chemical awareness)
- Smooth Gaussian potentials
- Potentials exist between points on a molecular surface
17PLP
- Chemically aware of
  - Hydrogen bond acceptors
  - Hydrogen bond donors
  - Non-polar atoms
  - Sulphurs
  - Metals
- Interaction potentials
  - Pairwise linear potentials
  - HB and metal interactions are partly angle based
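A piecewise linear pair potential of the kind PLP is named after can be sketched as below. The breakpoints A < B < C < D, the well depth E, and the repulsion height F follow the usual description of such potentials, but the specific numbers in `STERIC` and `HBOND` are illustrative assumptions, and the angle-dependent part of the hydrogen-bond and metal terms is omitted.

```python
def plp(r, A, B, C, D, E, F):
    """Piecewise linear potential of distance r (Å).

    Repulsive ramp below A, linear descent to the well between A and B,
    flat well of depth E between B and C, linear rise to zero between
    C and D, and zero beyond D.
    """
    if r < A:
        return F * (A - r) / A
    if r < B:
        return E * (r - A) / (B - A)
    if r < C:
        return E
    if r < D:
        return E * (D - r) / (D - C)
    return 0.0


# Illustrative parameter sets (assumed values, not taken from the slides).
STERIC = dict(A=3.4, B=3.6, C=4.5, D=5.5, E=-0.4, F=20.0)
HBOND = dict(A=2.3, B=2.6, C=3.1, D=3.4, E=-2.0, F=20.0)

steric_contact = plp(4.0, **STERIC)   # inside the flat steric well
hbond_contact = plp(2.8, **HBOND)     # inside the flat hydrogen-bond well
```

The function is continuous at every breakpoint but not differentiable there, which is the main contrast with the smooth Gaussian potentials used by the other functions in this talk.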
18Electrostatic Surface
- Shape based surface matching with electrostatic coloring
- Smooth Gaussian potentials
- Favorable between positive-negative and neutral surface points
- Negative potential between points without favorable interactions
19Chemical Gaussian Scoring Function
- Chemically aware of
  - Hydrogen bond acceptors
  - Lone-pair positions
  - Hydrogen bond donors
  - Shape
  - Aromatic rings
  - Metals
- Interaction potentials
  - Smooth Gaussian potentials
  - Shape interaction is the same as in the Gaussian scoring function
20Chemscore
- Chemically aware of
  - Hydrogen bond acceptors
  - Hydrogen bond donors
  - Lipophilic atoms
  - Frozen rotatable bonds
  - Metals
- Interaction potentials
  - Most potentials are simple linear potentials
  - Four-body hydrogen-bond potentials
21Potential Outcomes of Docking
[Figure: energy versus configuration of ligand in active site, with the correct structure marked; panels: Success!, Soft Docking Failure, Hard Docking Failure]
22#1-Ranked Pose with RMSD < 2.0 Å
23#1-Ranked Pose with RMSD < 1.5 Å
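The success criterion on these two slides can be computed as sketched below: take the heavy-atom RMSD between the top-ranked pose and the crystallographic pose, then count the fraction of complexes under the cutoff. This sketch assumes both poses list the same atoms in the same order and share one coordinate frame, so no superposition is done, and it ignores ligand symmetry. The function names are illustrative.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation (Å) between two equally ordered atom lists."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and have the same atom count")
    total = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(pose, reference)
    )
    return math.sqrt(total / len(pose))


def success_rate(top_pose_rmsds, cutoff=2.0):
    """Fraction of complexes whose #1-ranked pose is strictly under the cutoff."""
    if not top_pose_rmsds:
        raise ValueError("no results to summarize")
    return sum(1 for r in top_pose_rmsds if r < cutoff) / len(top_pose_rmsds)
```

Tightening the cutoff from 2.0 Å to 1.5 Å can only lower the reported rate, so the two slides bracket how sensitive the ranking of scoring functions is to the threshold.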
24M.A.S.C.
- Pre-dock ligand into multiple reference active sites
- Calculate score average and standard deviation in the reference sites
- Correct score in actual docking runs with the following formula
- Reduces systematic biases in scoring functions
- Pre-dock calculations are expensive but only need to be done once for a given database
- Can be used with any scoring function
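Assuming the correction takes the usual z-score form, corrected = (raw - mean) / standard deviation, with the mean and standard deviation taken over the ligand's scores in the reference sites, the procedure can be sketched as follows. The function name and the choice of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev


def masc_correct(raw_score, reference_scores):
    """Normalize a ligand's docking score by its own reference-site statistics.

    reference_scores: the same ligand's scores in the pre-docked reference
    active sites (at least two values are needed for a standard deviation).
    """
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)  # sample standard deviation (assumed)
    if sigma == 0:
        raise ValueError("reference scores have zero spread")
    return (raw_score - mu) / sigma


# A ligand scoring -10 in the target site and -4, -6, -8 in the reference sites.
corrected = masc_correct(-10.0, [-4.0, -6.0, -8.0])
```

A ligand that scores well everywhere, for example simply because it is large, ends up with a corrected score near zero, which is how this kind of normalization reduces systematic bias.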
25MASC vs. Non-MASC Enrichment
26Acknowledgements
- Anthony Nicholls: Zap Bind
- Roger Sayle: OEChem
- Matt Stahl: OEChem, MMFF, Omega
- Geoff Skillman: OEChem, Omega
- Bob Tolbert: OEChem, Zap Bind
- Stanislaw Wlodek: MMFF