Title: Computational Docking
1 Computational Docking
http://autodock.scripps.edu/faqs-help/manual/autodock-3-user-s-guide/AutoDock3.0.5_UserGuide.pdf
Inhibitor that blocks the protein
Protein (E) that does not work
Computational docking aims to find inhibitors for defective proteins
Diseases (cancers, cardiovascular diseases, Alzheimer's disease, ...)
2 Computational Docking
Ki
http://autodock.scripps.edu/faqs-help/manual/autodock-3-user-s-guide/AutoDock3.0.5_UserGuide.pdf
- Evaluation of the binding affinities of protein-ligand complexes to support the design of new drugs
- $\Delta G$ (relative free energy) calculation
- Ranking of ligands by predicted activity
- Helps guide experiments
- Saves time and money
- Understanding of the inhibition process
3 Docking Procedures
- Two kinds
- Virtual High-Throughput Screening (VHTS)
- Very fast programs
- Approximate binding affinities
- Ex.: Autodock4.0, Dock6.2
- Rational Drug Design (RDD)
- More accurate
- Ex.: EADock2.0, QM/MM
VHTS (millions of compounds) -> 100-1000 selected -> RDD (tens of compounds) -> 10-20 selected -> EXPERIMENT
4 VHTS
5 Levels of Approximation
- Before docking
- Reduced representation of the protein-ligand interactions
- Grid of points
- Example, Autodock4: at each grid point, the interaction of a probe atom (one ligand atom type) with the target is computed
- One grid per ligand atom type
- Grids are stored, then fetched during docking
- Tri-linear interpolation between grid points
- 100 times faster than computing atom-atom interactions
http://autodock.scripps.edu/faqs-help/manual/autodock-3-user-s-guide/AutoDock3.0.5_UserGuide.pdf
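The grid lookup described on this slide can be sketched as follows. This is a minimal illustration of tri-linear interpolation on a precomputed energy grid, not AutoDock's own code; the function name `trilinear`, the nested-list grid layout, and the tiny `demo_grid` are assumptions made for the example.

```python
from math import floor


def trilinear(grid, spacing, origin, point):
    """Interpolate a precomputed energy grid at an arbitrary point.

    Assumed layout: grid[i][j][k] holds the probe-atom energy at
    origin + spacing * (i, j, k).
    """
    # Fractional grid coordinates of the query point.
    fx, fy, fz = ((p - o) / spacing for p, o in zip(point, origin))
    i, j, k = floor(fx), floor(fy), floor(fz)
    # Clamp so the upper corner of the cell stays inside the grid.
    i = min(max(i, 0), len(grid) - 2)
    j = min(max(j, 0), len(grid[0]) - 2)
    k = min(max(k, 0), len(grid[0][0]) - 2)
    tx, ty, tz = fx - i, fy - j, fz - k
    energy = 0.0
    # Weighted sum over the 8 corners of the enclosing cell.
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                weight = ((tx if di else 1 - tx)
                          * (ty if dj else 1 - ty)
                          * (tz if dk else 1 - tz))
                energy += weight * grid[i + di][j + dj][k + dk]
    return energy


# Hypothetical 2x2x2 grid whose values follow E = x + 2y + 3z.
demo_grid = [[[i + 2 * j + 3 * k for k in range(2)]
              for j in range(2)] for i in range(2)]
demo_energy = trilinear(demo_grid, 1.0, (0.0, 0.0, 0.0), (0.5, 0.25, 0.75))
```

During docking, one such lookup per ligand atom replaces a loop over all protein atoms, which is where the speed-up quoted above comes from.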
6 During Docking
First docking: Ligand A - Protein
Second docking: Ligand B - Protein
Third docking: Ligand C - Protein
For each docking:
SEARCH: fast search algorithms
Best position/conformation in the active site
Fast scoring function
Binding free energy
Ranking by inhibitor power: B > C > A
7 Search space
Example of search space: the protein is fixed. The ligand moves to find the active site and the best position/orientation in the active site.
Initial position of the ligand -> translate -> orient -> change torsions -> global minimum (best position/orientation in the active site)
How can the global minimum be found as fast as possible?
8 Search algorithms
- Systematic search
- Graph algorithms
- Explore everything you can
- For low-dimensional problems (time-demanding)
- Molecular dynamics (MD)
- Deterministic algorithms
- An input X always gives the same output Y
- Problems: trapping in local minima, time-demanding
- Stochastic search
- Random search (e.g. the Monte Carlo algorithm)
- Random number generator
- Random changes
- Stop at convergence
- For larger problems, but the optimal solution is not guaranteed
- Has to be stringent (hybrid search?)
- May be a compromise between systematic search and MD
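The stochastic search bullets above can be illustrated with a minimal Metropolis Monte Carlo sketch. The one-dimensional `energy` function is a hypothetical stand-in for a docking score, and the step size, temperature factor `kT`, and step count are arbitrary assumptions; a real docking search moves in translation, orientation, and torsion space.

```python
import math
import random


def energy(x):
    # Hypothetical rugged 1-D landscape standing in for a docking score.
    return 0.1 * x * x + math.sin(3.0 * x)


def monte_carlo_search(start, steps=20000, step_size=0.5, kT=0.3, seed=0):
    """Random search with the Metropolis acceptance rule."""
    rng = random.Random(seed)  # random number generator
    x, e = start, energy(start)
    best_x, best_e = x, e
    for _ in range(steps):
        # Random change of the current state.
        trial = x + rng.uniform(-step_size, step_size)
        e_trial = energy(trial)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability so the walk can leave local minima.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = monte_carlo_search(4.0)
```

The search returns the best state visited, but nothing guarantees that this is the global minimum, which is the limitation noted in the list above.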
9 Examples
- Autodock4
- Stochastic
- The ligand begins outside the protein (for a large grid size)
- The ligand randomly explores translations, orientations, and conformations until the ideal site and position are found.
- Dock6.2
- Systematic
- The ligand begins inside the protein
- The atoms of the ligand are matched to many points in the active site until an ideal position is found
- Much faster?
Centers of spheres used to model the active site
Did I find the site? Check the complexes
http://dock.compbio.ucsf.edu/DOCK_6/tutorials/ligand_sampling_dock/ligand_sampling_dock.html
10 Search algorithms
- Autodock4
- Stochastic
- Lamarckian Genetic Algorithm or LGA (hybrid global-local search)
- Global search with a genetic algorithm (mimics Darwin's theory of evolution)
- Local search using energy minimization (mimics Lamarck's assertion)
- LGA > usual random search
- Dock6.2
- Systematic (reducing the space)
- Divide-and-conquer strategy (anchor-and-grow algorithm)
- Global search
- Rigid pieces of the ligand are docked one after another
- Pieces matched with a graph algorithm (systematic search)
- Local search using energy minimization
11 Autodock4: Lamarckian Genetic Algorithm
12 Chromosome (genes A, B, C)
Genotypes (a1,b1,c1; a2,b2,c2; ...)
Phenotypes
Phenotype evaluation
Darwin's theory of evolution:
Best phenotypes survive and reproduce
Offspring
Crossovers and mutations
Populations remain roughly the same size
Lamarck's assertion: phenotypic characteristics acquired during an individual's lifetime (environmental adaptation) can become heritable traits
13 i = number of energy evaluations; j = number of random changes
Chromosome (genes A, B, C) = one ligand docking (position, orientation, torsions)
Random generation
Genotypes (a1,b1,c1; a2,b2,c2; ...) = population of conformers (p1,o1,t1; p2,o2,t2; ...)
Phenotypes = atomic coordinates
Phenotype evaluation = inter- and intramolecular energy evaluation (scoring function); i is incremented
Best phenotypes survive and reproduce = proportional selection f(E) and elitism
If (i or j) < limit, loop again; otherwise return the free energy of the best conformer
Offspring = new translations, orientations, torsions (random generation); j is incremented
Crossovers and mutations = random changes; j is incremented
Populations remain roughly the same size: N = constant; new conformers replace the parents
Lamarck's assertion (phenotypic characteristics acquired during an individual's lifetime through environmental adaptation can become heritable traits) = random local search and replacement of the conformer by the result of the local search
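The loop on this slide can be sketched on a toy problem. This is only an illustration under stated assumptions: `score` is a hypothetical rugged energy function rather than a docking scoring function, the three genes stand in for position, orientation, and torsion values, and a simple random hill-climb is swapped in for AutoDock's own local search method. All parameter values are arbitrary.

```python
import math
import random


def score(genes):
    # Hypothetical rugged energy; lower is better, minimum 0 at the origin.
    return sum(g * g - 3.0 * math.cos(2.0 * math.pi * g) + 3.0 for g in genes)


def local_search(genes, rng, tries=20, step=0.1):
    # Simple random hill-climb (an assumption, not AutoDock's method).
    best, best_e = genes, score(genes)
    for _ in range(tries):
        trial = [g + rng.gauss(0.0, step) for g in best]
        e = score(trial)
        if e < best_e:
            best, best_e = trial, e
    return best


def lamarckian_ga(n_genes=3, pop_size=30, generations=60,
                  p_mut=0.1, p_local=0.3, seed=1):
    rng = random.Random(seed)
    # Random generation of the initial population of genotypes.
    pop = [[rng.uniform(-5, 5) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Lamarckian step: the locally optimised result overwrites
        # the genotype, so the acquired improvement is inherited.
        pop = [local_search(ind, rng) if rng.random() < p_local else ind
               for ind in pop]
        energies = [score(ind) for ind in pop]
        worst = max(energies)
        # Proportional selection: lower energy gives a larger weight.
        weights = [worst - e + 1e-9 for e in energies]
        elite = pop[energies.index(min(energies))]
        children = [elite[:]]  # elitism: the best individual survives
        while len(children) < pop_size:  # population size stays constant
            mum, dad = rng.choices(pop, weights=weights, k=2)
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = mum[:cut] + dad[cut:]
            child = [g + rng.gauss(0.0, 0.5) if rng.random() < p_mut else g
                     for g in child]  # mutation
            children.append(child)
        pop = children  # new individuals replace the parents
    best = min(pop, key=score)
    return best, score(best)


best_genes, best_energy = lamarckian_ga()
```

Because the elite individual is copied unchanged and the local search only accepts improvements, the best energy in the population never increases from one generation to the next.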
14 Lamarckian genetic algorithm
Potential energy surface of the interaction of one ligand with one protein
Local search
Random
Journal of Computational Chemistry, Vol. 19, No. 14, 1639-1662 (1998)
15 Dock6.2: Anchor-and-grow algorithm
16 One ligand docking
Divide into pieces
Anchor selection, f(size)
Anchor positioned in the model active site
Position search with a graph algorithm (the systematic search builds longer and longer lists of interaction possibilities)
Evaluate the matching that corresponds to each list (scoring functions)
Local search (minimization)
Add fragments (grow)
Conformational search
When no more fragments can be grown: free energy calculation
New anchor
http://dock.compbio.ucsf.edu/DOCK_6/dock6_manual.htm
17 Orientation algorithm
- max_h_pairs = y; nb_pairs = 0; h_atom_lig = a, b; h_site_rec = A, B
- While (nb_pairs < max_h_pairs)
- If (dist a-b = dist A-B) then
- Generate possible lists: aA, aB, bA, bB, aA,aB, bA,bB, aA,bA, aB,bB
- Endif
- nb_pairs++
- End
- Foreach list (aA aB ...)
- tot_nb_pairs_list = number of pairs in the list
- While (non-H atom/site pairs exist)
- Add pairs
- tot_nb_pairs_list++
- End
- If (tot_nb_pairs_list > 4) then
- Calculate the translation/rotation matrix
- Rotate/translate the anchor
- Check H-bond geometry
- Check steric overlap
- Check energy
- Endif
- End
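The distance test at the heart of the matching step can be sketched as below. This is a brute-force illustration, not DOCK's implementation: it enumerates every assignment of `n_pairs` ligand atoms to site points and keeps those whose internal distances agree within a tolerance. The function name, the tolerance, and the coordinates are hypothetical.

```python
from itertools import combinations, permutations
from math import dist


def compatible_matches(ligand_atoms, site_points, n_pairs=3, tol=0.5):
    """Return lists of (atom index, site index) pairs whose
    atom-atom and site-site distances agree within tol."""
    matches = []
    for atoms in combinations(range(len(ligand_atoms)), n_pairs):
        for sites in permutations(range(len(site_points)), n_pairs):
            pairs = list(zip(atoms, sites))
            # Every pair of matched atoms must be as far apart as the
            # pair of site points they are assigned to.
            ok = all(
                abs(dist(ligand_atoms[a1], ligand_atoms[a2])
                    - dist(site_points[s1], site_points[s2])) <= tol
                for (a1, s1), (a2, s2) in combinations(pairs, 2)
            )
            if ok:
                matches.append(pairs)
    return matches


# Hypothetical anchor: three atoms forming a 3-4-5 triangle.
anchor = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
# Hypothetical site: the same triangle shifted, plus one distant point.
site = [(10.0, 10.0, 10.0), (13.0, 10.0, 10.0),
        (10.0, 14.0, 10.0), (50.0, 50.0, 50.0)]
found = compatible_matches(anchor, site, n_pairs=3, tol=0.1)
```

Each surviving list of pairs would then be used to compute the translation/rotation that superimposes the anchor on the site, followed by the geometry, overlap, and energy checks listed above.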
18 Initial potential surface
Orientation search
Local search
Conformation search
Add fragments
Final potential surface (entire ligand)
19 During Docking
First docking: Ligand A - Protein
Second docking: Ligand B - Protein
Third docking: Ligand C - Protein
For each docking:
SEARCH: fast search algorithms
Best position/conformation in the active site
Fast scoring function
Binding free energy
Ranking by inhibitor power: B > C > A
20 Scoring functions
- Simple energy evaluation ($\Delta H$)
- More reliable: free energy calculation
- $\Delta G = \Delta H - T\Delta S$
- Accurate prediction
- Statistical mechanics
- Too slow
- Approximation
- Based on the thermodynamic cycle
- Fast (Autodock4.0, Dock6.2)
- Based on statistical mechanics (Lee and Seok, Proteins: Structure, Function, and Bioinformatics, 70 (2007) 1074)
- Little extra computational cost
21 Free energies based on the thermodynamic cycle
http://autodock.scripps.edu/faqs-help/manual/autodock-3-user-s-guide/AutoDock3.0.5_UserGuide.pdf
$\Delta G_{binding,solution} = \Delta G_{binding,vacuo} + \Delta G_{solvation}(EI) - \Delta G_{solvation}(E+I)$
Desolvation upon binding and solvent entropy change at the solute-solvent interface
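22 Free energies based on the thermodynamic cycle
$\Delta G_{binding,solution} = \Delta G_{binding,vacuo} + \Delta G_{desol}$
- $\Delta G_{binding,solution} = \Delta G_{vdw} + \Delta G_{hbond} + \Delta G_{elec} + \Delta G_{trans/rot/flex} + \Delta G_{desol}$ (no intramolecular term)
$\Delta G_{vdw}$, $\Delta G_{hbond}$, $\Delta G_{elec}$: molecular mechanics
$\Delta G_{trans/rot/flex}$: entropy, loss of degrees of freedom upon binding
In Autodock4, $\Delta G_{binding,solution}$ has been parameterized using experimental data. In Dock6.2, $\Delta G_{binding,solution}$ is AMBER force field-based.
$\Delta G_{solvent}$, $\Delta G_{complex}$: estimated through the solvent accessible surface area (SASA) combined with the Poisson-Boltzmann equation (PB) or the Generalized Born model (GB), i.e. the PBSA or GBSA model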
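The additive form of the binding free energy on this slide can be sketched as a weighted sum. The weights below are hypothetical placeholders, not the calibrated coefficients of any program, and modelling the entropic term as a penalty proportional to the number of rotatable bonds is an assumption made for the example.

```python
def binding_free_energy(e_vdw, e_hbond, e_elec, e_desolv,
                        n_rotatable, weights=None):
    """Sum of van der Waals, H-bond, electrostatic, desolvation and
    torsional-entropy contributions (all in the same energy unit)."""
    # Hypothetical placeholder weights; real programs fit such
    # coefficients to experimental binding data.
    w = weights or {"vdw": 1.0, "hbond": 1.0, "elec": 1.0,
                    "desolv": 1.0, "tors": 0.3}
    return (w["vdw"] * e_vdw
            + w["hbond"] * e_hbond
            + w["elec"] * e_elec
            + w["desolv"] * e_desolv
            + w["tors"] * n_rotatable)  # entropy loss upon binding


# Hypothetical term values for one docked pose.
example_dg = binding_free_energy(-6.0, -2.0, -1.0, 1.5, n_rotatable=4)
```

Passing a different `weights` dictionary shows how the ranking of poses depends on the parameterization, which is the practical difference between an empirically fitted and a force field-based score.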
23 Free energies based on statistical mechanics
Ki
http://autodock.scripps.edu/faqs-help/manual/autodock-3-user-s-guide/AutoDock3.0.5_UserGuide.pdf
- The free energy of a protein-ligand complex can be written as
- $\Delta G = -k_B T \ln K_i$
- $\Delta G = -k_B T \ln Z$
- $\Delta G = -k_B T \ln \sum_I Z_I$
- Approximation: the maximum $Z_I$ dominates (a single strong binding mode)
- $\Delta G = -k_B T \ln \max_I Z_I$
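The effect of keeping only the largest state partition function can be checked numerically. In this minimal sketch the partition function values are hypothetical, and the Boltzmann constant is expressed per mole in kcal/(mol K) as an assumed unit choice.

```python
import math

K_B = 0.0019872041  # Boltzmann constant per mole, kcal/(mol K)


def free_energy(partition_functions, temperature=298.15):
    """Return (free energy from the full sum, free energy from the
    dominant term only) for a list of state partition functions."""
    kT = K_B * temperature
    exact = -kT * math.log(sum(partition_functions))
    dominant = -kT * math.log(max(partition_functions))
    return exact, dominant


# Hypothetical case: one binding mode is 1000 times more populated.
exact_dg, dominant_dg = free_energy([1000.0, 1.0, 1.0])
```

With these values the two results differ by roughly 0.001 kcal/mol, so dropping the minor states costs almost nothing when a single binding mode dominates. When several states have comparable weights, the gap grows.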
24 Free energies based on statistical mechanics
- Free energy is predicted using the probability that pairs of conformations are in the same state I.
- State: ensemble of local energy minima (similar conformation and potential energy)
- The configuration space is partitioned into n states and the partition function is written as
$Z = \int e^{-\beta E(r)}\,dr = \sum_{I=1}^{n} Z_I$ (Eq 1)
$Z_I$: sample of a local minimum
Potential energy evaluated with a scoring function
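25 $Z = \int e^{-\beta E(r)}\,dr = \sum_{I=1}^{n} Z_I$ (Eq 1)
n: total number of states
$Z_I = \int_{V_I} e^{-\beta E(r)}\,dr$ (Eq 2)
$V_I$: configuration space volume
$Z_I = \sum_{j=1}^{N} \delta_{Ij} \int_{\Omega_j} e^{-\beta E(r)}\,dr$ (Eq 3)
$\Omega_j$: configuration space volume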
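26 $Z_I = \sum_{j=1}^{N} \delta_{Ij} \int_{\Omega_j} e^{-\beta E(r)}\,dr$ (Eq 3)
$\Omega_j$: configuration space volume
$Z_I = \sum_{j=1}^{N} \delta_{Ij}\, e^{-\beta E_j}\, \Omega_j$ (Eq 4)
$\delta_{Ij}$ = 1 or 0
$Z_I = \sum_{j=1}^{N} a_{i_{ref}(I)\,j}\, e^{-\beta E_j}\, \Omega_j$ (Eq 5)
$a_{i_{ref}(I)\,j}$: probability that conformations i and j belong to the same state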
27 $Z_I = \sum_{j=1}^{N} a_{i_{ref}(I)\,j}\, e^{-\beta E_j}\, \Omega_j$ (Eq 5)
$a_{i_{ref}(I)\,j}$: probability that conformations i and j belong to the same state
$\max_I Z_I = \sum_{j=1}^{N} a_{i_{ref}(I_{min})\,j}\, e^{-\beta E_j}\, \Omega_j$ (Eq 6)
The state I that contains $i_{min}$ corresponds to $I_{min}$ (for more details, see Proteins: Structure, Function, and Bioinformatics, 70 (2007) 1074)
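Eq 5 and Eq 6 can be sketched numerically as a weighted Boltzmann sum around the lowest-energy conformation. Everything specific here is an assumption made for illustration: conformations are reduced to a single coordinate, the same-state probability is modelled as a Gaussian of the distance to the reference conformation, and every conformation is given the same configuration space volume. The cited paper should be consulted for the actual definitions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant per mole, kcal/(mol K)


def same_state_probability(distance, width=1.0):
    # Hypothetical model: the probability decays as a Gaussian of the
    # distance between two conformations.
    return math.exp(-(distance / width) ** 2)


def dominant_state_free_energy(coords, energies,
                               temperature=298.15, volume=1.0):
    """Free energy of the state containing the lowest-energy
    conformation, following the form of Eq 6."""
    beta = 1.0 / (K_B * temperature)
    i_min = energies.index(min(energies))  # reference conformation
    z = sum(
        same_state_probability(abs(coords[j] - coords[i_min]))
        * math.exp(-beta * energies[j]) * volume
        for j in range(len(energies))
    )
    return -math.log(z) / beta


# Hypothetical docking results: two similar poses and one distant pose.
dg_state = dominant_state_free_energy([0.0, 0.1, 5.0], [-5.0, -4.8, -4.9])
```

Conformations close to the reference lower the free energy below its bare energy, while the distant pose contributes almost nothing because its same-state probability is near zero.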
28 Free energies based on statistical mechanics
- Compared with the original Autodock3.0.5 scoring function
- For 163 dockings
- 99/163: RMSD improved by 1.18 Å
- 39/163: unchanged
- 25/163: RMSD worsened by 0.41 Å
29 Approximate binding free energy calculations
- Allow the ligands to be ranked
Screening of millions of compounds
Ligands ordered by activity: Ligand X, Ligand Y, Ligand Z, Ligand T, ..., Ligand E
$\Delta G_{ligE} \gg \Delta G_{ligX}$
30 Methods Efficiency
Does the method find the real ligand among decoys? More specifically, does the method find the real ligand among decoys in the least computational time?
Compounds ordered by activity: Ligand X, Ligand Y, Decoy Z, Ligand T, Ligand Y, Ligand Z, ..., Decoy Y, Decoy Z, ..., Decoy Y, Decoy Z, Decoy E
$\Delta G_{ligE} \gg \Delta G_{ligX}$
31 Methods Efficiency
- Need a benchmark database with real ligands and decoys
- Ex.: the DUD database
- Similar physical properties
- Molecular weight
- Number of hydrogen bond acceptors and donors
- Number of rotatable bonds
- Log P
- Different chemical properties or topology (different functional groups)
- Amine, carboxyl, hydroxyl, ...
- Are Autodock4 and Dock6.2 sufficiently powerful to discriminate the real ligands from the decoys?
Real ligands
Decoys
32 Methods Efficiency
- Efficiency evaluation
- Enrichment factor (EF1)
- (Total computational time)
DUD
What proportion of the real ligands is found in the first 1% of the ranked database, compared to the proportion of real ligands in the whole database?
Compounds ordered by activity: Ligand X, Ligand Y, Decoy Z, Ligand T, Ligand Y, Ligand Z, ..., Decoy Y, Decoy Z, ..., Decoy Y, Decoy Z, Decoy E
EF1 = (a/n) / (A/N), where a is the number of real ligands among the n top-ranked compounds and A is the number of real ligands among the N compounds of the whole database
A good EF1 is >> 10 -> selection of the top 1% of the database for RDD
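The enrichment factor on this slide can be computed as in the following minimal sketch. The input is assumed to be a list of booleans ordered from best to worst score, with True marking a real ligand; the function name and the example database are hypothetical.

```python
def enrichment_factor(ranked_is_ligand, fraction=0.01):
    """EF = (a/n) / (A/N) for the top `fraction` of a ranked database."""
    total = len(ranked_is_ligand)            # N: whole database
    total_ligands = sum(ranked_is_ligand)    # A: real ligands in it
    top = max(1, round(total * fraction))    # n: size of the top slice
    top_ligands = sum(ranked_is_ligand[:top])  # a: real ligands in slice
    return (top_ligands / top) / (total_ligands / total)


# Hypothetical ranked database: 1000 compounds, 10 real ligands,
# 5 of which are ranked in the top 10 positions (the first 1%).
ranking = [True] * 5 + [False] * 5 + [True] * 5 + [False] * 985
ef1 = enrichment_factor(ranking)
```

In this example half of the top 1% are real ligands, against 1% in the whole database, so the enrichment factor is about 50. A value near 1 would mean the ranking is no better than random selection.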
33 Efficiency evaluation
- EF vs. time (preliminary results)
For 1 processor, 2.1 GHz
No conclusion