AUTODOCK An Automated Docking Software for Predicting Optimal ProteinLigand Interaction

1 / 46
About This Presentation
Title:

AUTODOCK An Automated Docking Software for Predicting Optimal ProteinLigand Interaction

Description:

Quaternion reduction factor = 1. Torsional reduction factor = 1 # rotatable bonds = 12 ... Initial quaternion = Random. Initial dihedrals = Random. Translation ... –

Number of Views:1045
Avg rating:3.0/5.0
Slides: 47
Provided by: C228
Category:

less

Transcript and Presenter's Notes

Title: AUTODOCK An Automated Docking Software for Predicting Optimal ProteinLigand Interaction


1
AUTODOCKAn Automated Docking Software for
Predicting Optimal Protein-Ligand Interaction
  • By
  • Susan McClatchy, Milind Misra,
  • Chandreyee Mukherjee, Indu Shrivastava

2
Introduction
  • Chandreyee Mukherjee

3
Automated Docking Importance
  • Interaction between biomolecules lie at the core
    of all metabolic processes and life activities
  • The number of solved protein structures available
    in the databases is expanding exponentially
  • To understand their functions it is essential to
    elucidate the interaction mechanisms between the
    different molecules
  • Primary importance lies in rational drug design
  • Depending upon the success of the docked
    molecules the docking ligand may be redesigned or
    its structure further refined.
  • Also important in the area of immunology to study
    antigen-antibody interaction.

4
Inhibitor bound to active site of HIVPR
Surface structure of HIVPR with bound inhibitor
5
What is docking?
  • Prediction of the optimal physical configuration
    and energy between two molecules
  • The docking problem optimizes
  • Binding between two molecules such that their
    orientation maximizes the interaction
  • Evaluates the total energy of interaction such
    that for the best binding configuration the
    binding energy is the minimum
  • The resultant structural changes brought about by
    the interaction

6
Categories of docking
  • Protein-Protein Docking
  • Both molecules are rigid
  • Interaction produces no change in conformation
  • Similar to lock-and key model
  • Protein-Ligand Docking
  • Ligand is flexible but the receptor protein is
    rigid
  • Interaction produces conformational changes in
    ligand

7
  • Protein-Protein Docking
  • Protein-Ligand Docking

8
Docking uses a search and score method
  • It involves
  • Finding useful ways of representing the molecules
    and molecular properties.
  • Exploration of the configuration spaces available
    for interaction between ligand and receptor.
  • Evaluate and rank configurations using a scoring
    system, in this case the binding energy
  • However, since it is difficult to evaluate the
    binding energy because the binding sites may not
    be easily accessible, the binding energy is
    modeled as follows
  • ?G bind ?Gvdw ?Ghbond ?Gelect ?G conform
    ?G tor ?G sol

9
The AutoDock Software
  • Developed by AJ Olsons group in 1990.
  • AutoDock uses free energy of the docking
    molecules using 3D potential-grids
  • Uses heuristic search to minimize the energy.
  • Search Algorithms used
  • Simulated Annealing
  • Genetic Algorithm
  • Lamarckian GA (GALS hybrid)

10
Algorithms Overview
  • Simulated Annealing
  • Based on temperature effects
  • Start with high temperature and global search
  • Lower temperature local search
  • Genetic Algorithm
  • Charles Darwins Theory of Evolution
  • Genotype Phenotype
  • Lamarckian Algorithm ( Jean Baptiste de Lamarck)
  • Phenotype Genotype

11
Project Goal
  • Study algorithms used to perform the searches and
    to calculate minimum energy
  • Discuss why GALS hybrid better than SA
  • Look at an example, i.e., dock a ligand to a
    protein molecule using latest AutoDock version

12
The Algorithms
  • Sue McClatchy

13
Simulated Annealing
  • Algorithm modeled after the cooling of a solution
    to form glass, though its better explained by
    crystal formation
  • Given a long enough cooling time, molecules will
    relax into their lowest energy state to form the
    largest crystals
  • Quick cooling - highly disordered system
  • Slow cooling - highly ordered crystal, with each
    molecule in its lowest energy state
  • Algorithm simulates either linear or proportional
    slow cooling

14
The SA Algorithm
  • Uses neighborhood operator N(s) to generate a set
    of solutions according to a fixed distribution
  • New solution compared to preceding solution, and
    is accepted if its energy is lower than that of
    previous solution
  • If new solution has higher energy, it is accepted
    probabilistically according to Boltzmann
    distribution (see figure above)
  • At high temperatures, many higher energy
    solutions will be accepted at low temps.,
    majority of probabilistic moves rejected
  • Boltzmann probability distribution e exp(delta
    E/T) where delta E energy difference between
    two solutions, T
    temperature
  • Boltzmann finds p(of finding a system with energy
    E at temp T)

15
Pseudocode for SA
  • Compute a random initial state s
  • n0, xn s // initialize best solution to s
    and first state to 0
  • Repeat i 1, 2, // specify number of
    temperatures to try
  • Repeat j 1, 2, , mi // no. of steps to
    perform for each temp. Ti
  • Compute a neighbor s N(s) // s new
    solution from N(s)
  • if (f(s) lt f(s)) then // if energy of s
    lt energy of s
  • s s // accept new solution s
  • if (f(s) lt f(xn)) then // if energy of
    new solution lt
  • xn s // energy of best solution of
  • n n 1 // state n, replace best with
    new
  • endif
  • else // otherwise replace s with s
    using
  • s s with probability e (f(s) - f(s))/Ti
    // Boltzmann dist.
  • endif
  • EndRepeat
  • EndRepeat

16
How Genetic Algorithms Work - A Simple Example
  • Initial population of binary creatures having 6
    genes
  • Each gene has two different alleles, either a 0
    or a 1
  • Three operators crossover, mutation and selection

1 1 1 1 0 0
0 0 0 0 0 1
1 0 0 0 0 1
0 0 0 0 0 0
17
Selection
Score
  • Selection based on a fitness function f(x)
  • This operator chooses those individuals with the
    lowest values
  • Those with higher values chosen with a very low
    probability

1 1 1 1 0 0
20
0 0 0 0 0 1
13
1 0 0 0 0 1
48
52
0 0 0 0 0 0
18
Crossover
0 0 0 1 0 0
1 1 1 1 0 0
1 1 1 0 0 1
0 0 0 0 0 1
1 1 1 1 0 1
1 1 1 1 0 0
0 0 0 0 0 0
0 0 0 0 0 1
19
Mutation
0 0 1 1 0 0
0 0 0 1 0 0
1 1 1 0 1 1
1 1 1 0 0 1
1 1 1 1 0 1
1 1 1 1 0 1
0 0 1 0 1 0
0 0 0 0 0 0
20
Replacement
offsp
Score
  • Lower scoring individuals create more offspring,
    higher scoring ones create fewer or none at all
  • Offspring replace parental generation
  • Elitism function allows best individual from
    parent generation to persist, if it is a better
    solution than new individuals created
  • Cycle of selection, mutation, crossover and
    replacement repeated

0 0 1 1 0 0
15 1
1 1 1 0 1 1
9 1
1 1 1 1 0 1
22 0
0 0 1 0 1 0
1 2
21
Pseudocode for GA
  • Select an initial population set xi0 x10 ,
    x20,, xM0
  • Determine fitness values f(xi0) for each
    individual
  • Repeat for g 1, 2, of generations
  • Perform selection
  • Perform crossover with probability ?
  • Perform mutation with probability ?
  • Determine fitness f(xig) for new individuals
  • xg argmini1,M f(xig) and yg f(xg)
  • Perform replacement
  • Until stopping criterion ( of generations) is
    reached

22
How GA works in AutoDock
  • Ligands genes are its x, y and z coordinates
  • These form a unit vector, which is given a random
    rotation angle between 0o and 360o to form a
    quaternion
  • Additional genes may represent torsion angles
    between bonds of the ligand

23
Mapping
  • In standard GA, the genotype (x,y,z coordinates
    plus rotation and any torsion angles) are mapped
    to the fitness function f(x)
  • The fitness function value corresponds to each
    individuals phenotype
  • According to the right hand side of the figure,
    genotypes of parents with high f(x) values are
    mutated to form genotypes of children with lower
    f(x) values

24
Selection, Crossover Mutation
  • Selection chooses ligands with the lowest fitness
    (energy) values
  • Crossover exchanges x, y, z coordinates, or
    rotations or torsions between these ligands
  • Example Two ligands with xyz coordinates Abc and
    aBc Crossover results in new individuals with
    coordinates abc and ABc
  • Mutation operator mutates coordinate or other
    angle values by adding a random real number
    according to a Cauchy distribution, which is
    similar to a Gaussian but has thicker tails

25
Replacement
  • Individuals with better-than-average fitness
    receive proportionally more offspring
  • no (fw fi)/(fw - ltfgt),
  • fw ! ltfgt
  • where
  • no number of offspring
  • fi fitness of individual (energy of ligand)
  • fw fitness of worst individual in last g
    generations (typically 10)
  • ltfgt mean fitness of population

26
Lamarckian Genetic Algorithm
  • According to left hand side of figure, LGA finds
    lowest fitness function (energy) values first,
    then maps these values to their respective
    genotypes
  • Genetic algorithm plus Solis and Wets local
    search
  • Better performance than either simulated
    annealing or genetic algorithm alone

27
The Application
  • Milind Misra

28
HIV-1 Protease and AHA006
  • HIV-1 Protease in complex with the cyclic
    sulfamide inhibitor, AHA006
  • Source Protein Data Bank
  • Authors K. Backbro, T. Unge
  • Exp. Method X-ray Diffraction (2 Å res.)
  • Primary Citation Backbro et al, J Med Chem 40
    pp. 898 (1997)
  • Polymer Chains A, B Residues 198 Atoms
    1632

29
Protein (HIV-1 Protease)
Ligand (AHA006)
(Source PDB)
30
HIV-1 Protease dimer
(Rasmol)
31
Initial X-Ray crystallographic positions of
protein and ligand
(SYBYL)
32
Docking Preparation Ligand
  • Assign charges
  • Define rotatable bonds
  • Rename aromatic carbons
  • Merge non-polar hydrogens
  • Write .pdbq ligand file

33
Docking Preparation Protein
  • Add essential hydrogens
  • Load charges
  • Merge lone-pairs
  • Add solvation parameters
  • Write .pdbqs protein file

34
Docking Preparation Grid
  • AutoDock uses grid-based docking
  • Ligand-protein interaction energies are
    pre-calculated and then used as a look-up table
    during simulation
  • Grid maps are constructed based on atoms of
    interest in ligand (here CANOSH)

35
(AutoDockTools)
36
Docking Simulated Annealing
  • Runs 100
  • Cycles 50
  • Initial Temp (RT) 1,000
  • Temp reduction factor .95
  • Linear temperature reduction
  • Translation reduction factor 1
  • Quaternion reduction factor 1
  • Torsional reduction factor 1
  • rotatable bonds 12
  • Initial coordinates Random
  • Initial quaternion Random
  • Initial dihedrals Random
  • Translation step 2.0 Å
  • Quaternion step 50 deg
  • Torsion step 50 deg
  • Results
  • 100 different clusters
  • Energy range -0.63 to 64,000
  • Conformation 81 -0.63
  • Conformation 67 20.02
  • Conformation 68 10.74
  • Lowest energy conf not close to position but
    similar to original
  • Conf 67 closest to position and conformation of
    original ligand higher energy
  • Conf 68 close to position but not conformation
    of original ligand not as high energy

37
Original ligand conf SA conformation 67
(SYBYL)
38
Original ligand conf SA conformation 67
Close-up of previous
(SYBYL)
39
Original ligand conf SA conformation 67
(SYBYL)
40
100 Clustered SA Conformations
(gOpenMol)
41
Docking Genetic Algorithm
  • Runs 50
  • Evaluations 250,000
  • Population size 50
  • Elitism count 1
  • Mutation rate 0.02
  • Crossover rate 0.8
  • Window size 10
  • Cauchy alpha 0
  • Cauchy beta 1
  • rotatable bonds 12
  • Initial coordinates Random
  • Initial quaternion Random
  • Initial dihedrals Random
  • Translation step 2.0 Å
  • Quaternion step 50 deg
  • Torsion step 50 deg
  • Results
  • 50 different clusters
  • Energy range -18.66 to 86.28
  • Conformation 39 -18.66
  • Conformation 9 -10.60
  • Lowest energy conformation overall closest to
    original ligand conformation
  • If only 10 runs had been used instead of 50, then
    conf 9 would have been the lowest energy
    conformation.

42
Docking Local Search
  • Results
  • 18 different clusters
  • Energy range 35.92 to 215,200
  • Confs 20, 21, 22, 23 35.92
  • Lowest energy conformation was most dissimilar to
    original ligand conformation
  • Better results could have been obtained by
    reducing the step sizes
  • Runs 50
  • Solis-Wets iterations 300
  • Consecutive successes 4
  • Consecutive failures 4
  • Rho 1
  • Lower bound on rho 0.01
  • LS frequency 0.06
  • rotatable bonds 12
  • Initial coordinates Random
  • Initial quaternion Random
  • Initial dihedrals Random
  • Translation step 2.0 Å
  • Quaternion step 50 deg
  • Torsion step 50 deg

43
Docking Lamarckian GA
  • Results
  • 10 different clusters
  • Energy range -18.10 to 8.38
  • Conformation 7 -18.10
  • Lowest energy conformation fairly similar to
    original ligand conformation
  • If the number of runs was restricted to 10 for
    both GA and LGA, LGA would have generated the
    best structure
  • Runs 10
  • Max Evaluations 250,000
  • Max Generations 27,000
  • Population size 50
  • Elitism count 1
  • Mutation rate 0.02
  • Crossover rate 0.8
  • Window size 10
  • Cauchy alpha 0
  • Cauchy beta 1
  • Solis-Wets iterations 300
  • Consecutive successes 4
  • Consecutive failures 4
  • Rho 1
  • Lower bound on rho 0.01
  • LS frequency 0.06
  • Gray options

44
Original ligand conf Best GA conf Best LGA
conf Best SA conf Best LS conf
(SYBYL)
45
Original ligand conf Best GA conf Best LGA
conf Best SA conf
(SYBYL)
46
References
http//cmgm.stanford.edu/biochem218/Projects20199
8/Apaydin.pdf http//www.biz.uiowa.edu/class/6K299
_menczer/PPT/Hart/sld018.html http//cs.felk.cvut.
cz/xobitko/ga/ http//www.bch.msu.edu/labs/kuhn/w
eb/projects/screening/solvation.html http//wwwcmc
.pharm.uu.nl/gillies/thesis/ http//www.chem.uidah
o.edu/honors/boltz.html S.Kumar et.al. Protein
Flexibility and Electrostatic Interactions. IBM
Journal of Research and Development Vol45. No ¾
2001. G. Morris et.al. Automated Docking Using a
Lamarckian Genetic Algorithm and an Empirical
Binding Free Energy Function. Journal of
Computational Chemistry, Vol. 19, No. 14,
1639-1662 (1998) C. Rosin et.al. A Comparison of
Global and Local Search Methods in Drug Docking.
UCSD CSE Technical Report CS97-522 (1997) C. A.
Sotriffer et.al. Automated Docking of Ligands to
Antibodies Methods and Applications. Methods
20, 280-291 (2000) M. Vieth et.al. Assessing
Search Strategies for Flexible Docking.
Practical Handbook of Genetic Algorithms.
Edited by Lance Chambers An Introduction to
Genetic Algorithms. Melanie Mitchell. Goodsell
and Olson Prot. Struct. Func. Genet, 8,
195(1990). Principals of Biochemistry
Lehninger R. Durbin, S Eddy, A. Krogh, G.
Mitchison Biological sequence analysis Wm. E.
Hart. A Theoretical Comparison of Genetic
Algorithms and Simulated Annealing Sandia
National Laboratories, www.cs.sandia.gov/wehart.
Write a Comment
User Comments (0)
About PowerShow.com