Surflex: Fully Automatic Flexible Molecular Docking Using a Molecular Similarity-Based Search Engine

Provided by: SusanT155
Learn more at: http://web.stanford.edu

1
Surflex: Fully Automatic Flexible Molecular
Docking Using a Molecular Similarity-Based Search
Engine
  • Ajay N. Jain
  • UCSF Cancer Research Institute and
    Comprehensive Cancer Center, University of
    California
  • Presentation by Susan Tang
  • CS 379a
  • January 23, 2006

2
Protein-Ligand Docking Overview
  • Goal
  • - To predict how well a given set of ligands will
    bind to a protein structure
  • - To predict the structure of bound
    protein-ligand complexes
  • Components
  • - Search method: explores different ways that a
    ligand can interact/fit with the protein
  • - Scoring function: assigns a quantitative value
    to each ligand/protein fit

3
Protein-Ligand Docking Overview
  • Criteria
  • 1) Docking accuracy
  • Measures ability to find a conformation and
    alignment (pose) of a ligand in the protein that
    is close to the experimentally observed one
  • 2) Scoring accuracy
  • Ability to rank a correct pose of a molecule
    higher than an incorrect one
  • 3) Screening utility
  • Ability to identify true ligands in a set that
    also contains non-binding molecules
  • 4) Speed
  • How fast the algorithm can screen a library of
    ligands

4
Surflex: A new docking methodology
  • Combines Hammerhead's empirical scoring function
    with a molecular similarity method to generate
    putative poses of ligand fragments
  • Like Hammerhead, Surflex has one mode that uses an
    incremental construction search approach. But
    Surflex also has another mode, a whole-molecule
    approach, that is faster and more accurate
  • Surflex is designed primarily as a screening tool
    for small molecule libraries

5
Surflex: Computational Design
  • Protomol Generation
  • First create an ideal active site ligand from
    the protein structure of interest
  • Input
  • (a) protein structure (b) list of residues to
    identify protein active site
  • Output
  • A protomol, or target to which potential ligands
    or ligand fragments are aligned based on
    molecular similarity
  • Procedure
  • Molecular fragments are put into the protein
    binding site in multiple positions → optimized
    for interaction with protein → select
    high-scoring nonredundant fragments → protomol
    formation
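
The selection step in the procedure above (keep high-scoring, nonredundant fragments) can be illustrated with a greedy filter. This is only a sketch under stated assumptions, not Surflex's actual implementation: the function name `select_nonredundant`, the (score, position) representation of a placed fragment, and the 1.5 angstrom redundancy cutoff are all hypothetical choices made for illustration.

```python
import math


def select_nonredundant(fragments, min_dist=1.5, max_keep=None):
    """Greedy pick: visit fragments from best score to worst and keep one
    only if it is at least min_dist away from every fragment kept so far.

    fragments: list of (score, (x, y, z)) tuples (hypothetical representation).
    min_dist:  assumed redundancy cutoff in angstroms.
    max_keep:  optional cap on the number of fragments kept.
    """
    kept = []
    for score, pos in sorted(fragments, key=lambda f: f[0], reverse=True):
        if all(math.dist(pos, kept_pos) >= min_dist for _, kept_pos in kept):
            kept.append((score, pos))
            if max_keep is not None and len(kept) == max_keep:
                break
    return kept


# Made-up probe placements: (interaction score, position in the binding site)
probes = [
    (5.0, (0.0, 0.0, 0.0)),
    (4.8, (0.5, 0.0, 0.0)),  # near-duplicate of the best probe, gets dropped
    (3.2, (4.0, 0.0, 0.0)),
    (1.1, (4.0, 3.0, 0.0)),
]
protomol = select_nonredundant(probes, min_dist=1.5, max_keep=3)
print([score for score, _ in protomol])
```

Sorting by score first means that, of any cluster of overlapping placements, the best-scoring one is the one that survives.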

6
Surflex: Computational Design
  • Protomol for streptavidin compared with the
    native pose of biotin (green)
  • The bond being pointed to is broken by Surflex to
    make fragments of biotin for docking.

7
Surflex: Computational Design
  • Docking
  • Ligands are docked into the protein to optimize
    scoring function
  • Input
  • (a) protein structure, (b) protomol, (c)
    ligand(s)
  • Output
  • The optimized poses of docked ligands along with
    corresponding scores
  • Procedure
  • Divide input ligand into 1-10 molecular
    fragments → search each fragment in terms of
    conformation → each conformation of each
    fragment is aligned to protomol to get poses with
    maximum molecular similarity to protomol → score
    aligned fragments and keep those with highest
    score and minimal protein interpenetration →
    construct full ligand molecule from the aligned
    fragments using either an incremental
    construction approach or whole molecule approach
    → highest scoring poses undergo further
    refinement of conformation and alignment

8
Surflex: Computational Design
  • Incremental Construction vs. Whole Molecule
    Algorithm
  • Incremental Construction
  • - Makes strong assumption that maximizing the
    similarity of tiny fragments to the protomol will
    generate good poses
  • Whole Molecule Algorithm
  • - bypasses the strong independence assumption
    made in incremental construction
  • - "dead" pieces are carried with the "live"
    piece during conformation search
  • - when aligning putative poses to the protomol,
    the dead pieces in their arbitrary initial
    conformation are carried into the molecular
    similarity computation → eliminate those with
    worst protein interpenetration
  • - for remaining poses, score on basis of
    individual fragments
  • - recursive search yields whole molecules that
    consist of fragments selected from different
    docked poses
  • - these whole molecules score well in total,
    over all fragments

9
Surflex: Computational Design
  • Illustrates the process of docking biotin to
    streptavidin (blue)
  • Gray indicates the live fragment
  • Magenta indicates the dead fragment
  • Green lines show the result of merging the two
    well-docked fragments at the atoms indicated by
    yellow circles
  • The merged pose closely follows the parent
    fragments' original configurations

10
Surflex Evaluation
  • Evaluation of reliability and accuracy of
    dockings
  • - Comparison with experimental results on 81
    protein/ligand pairs
  • - The pairs were selected to represent
    structural diversity
  • Evaluation of Surflex's utility as a screening
    tool
  • Performed on 2 protein targets (thymidine kinase
    and estrogen receptor)
  • Competing docking methods were tested side by
    side using the same data set for comparison
    purposes (GOLD, DOCK, FlexX)
  • Evaluation of Surflex's docking speed
  • - Investigate relationship between docking
    time and number of rotatable bonds

11
Surflex Evaluation: Data Set Construction
134 protein-ligand complexes → filter → 81 protein-ligand complexes
  • Filtering Criteria
  • 15 or fewer rotatable bonds
  • → Most small molecules have < 15 rotatable bonds
  • no covalent attachments between ligand and
    protein
  • → Since Surflex's scoring function was developed
    strictly on noncovalent complexes
  • ligands with no obvious errors in structure
  • → Undesirable to modify an existing
    protein-ligand complex prior to testing
  • data set used for the GOLD docking program
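
The three filtering criteria above amount to a simple predicate over each complex. The sketch below is illustrative only: the `Complex` record, its field names, and the example entries (including the placeholder IDs) are hypothetical and are not taken from the actual data set.

```python
from dataclasses import dataclass


@dataclass
class Complex:
    """Hypothetical record describing one protein-ligand complex."""
    complex_id: str
    rotatable_bonds: int
    covalent: bool          # ligand covalently attached to the protein?
    structure_errors: bool  # obvious errors in the ligand structure?


def passes_filter(c, max_rotatable=15):
    """Apply the three criteria listed on the slide."""
    return (
        c.rotatable_bonds <= max_rotatable
        and not c.covalent
        and not c.structure_errors
    )


# Made-up entries with placeholder IDs, purely for illustration
candidates = [
    Complex("cplx-a", 5, False, False),
    Complex("cplx-b", 22, False, False),  # too flexible
    Complex("cplx-c", 8, True, False),    # covalent attachment
    Complex("cplx-d", 15, False, False),  # exactly at the limit, kept
    Complex("cplx-e", 3, False, True),    # structural error
]
kept = [c.complex_id for c in candidates if passes_filter(c)]
print(kept)
```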

12
Surflex Evaluation: Results
  • 1) Evaluation of reliability and accuracy of
    dockings
  • Describes how thorough the search procedure is
    and to what extent the scoring function can
    recognize good dockings
  • Surflex returned a pose within 2.5 angstroms RMSD
    (94% of cases)
  • Surflex returned a BEST scoring pose that was
    within 2.5 angstroms (86% of cases)
  • With a single docking from a random initial pose,
    the chance of finding a correct or nearly correct
    pose averages 70%
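
For readers unfamiliar with the metric, the RMSD criterion above can be sketched as follows. The code assumes that the docked pose and the crystallographic reference list the same atoms in the same order and sit in the same protein coordinate frame, so no superposition is applied. The function names and coordinates are made up for illustration, and symmetry-equivalent atoms are not handled.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation between matched atom coordinates."""
    if len(pose) != len(reference):
        raise ValueError("pose and reference must have the same atom count")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pose, reference))
    return math.sqrt(squared / len(pose))


def success_rate(rmsds, cutoff=2.5):
    """Fraction of complexes whose pose is within cutoff angstroms."""
    return sum(r <= cutoff for r in rmsds) / len(rmsds)


# Made-up three-atom ligand and a copy shifted by 1 angstrom along x
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]

print(rmsd(shifted, reference))
print(success_rate([0.8, 1.9, 2.4, 6.0]))
```

Applying `success_rate` to the best pose found per complex, or to the top-scoring pose per complex, gives the two kinds of percentage reported on the slide.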

13
Surflex Evaluation: Results

14
Surflex Evaluation: Results
  • 2) Evaluation of Surflex's utility as a screening
    tool
  • Tests ability of program to detect true
    positives against a background of random
    molecules (sensitivity vs. specificity)
  • Surflex had a true positive rate of > 80% at a
    false positive rate of < 1%
  • Surflex had the best performance (lowest FP rate
    for a given TP rate) out of the different
    individual and combined methods assayed
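
A true positive rate at a fixed false positive rate can be read off from docking scores as sketched below. This is a generic illustration, not the evaluation code used in the study: the function name and all scores are invented, and higher scores are assumed to mean better predicted binding.

```python
import math


def tp_rate_at_fp_rate(active_scores, decoy_scores, max_fp_rate):
    """True positive rate at the strictest cutoff that lets through
    at most max_fp_rate of the decoys (higher score = better)."""
    # Number of decoys allowed above the cutoff; the small epsilon guards
    # against float round-off in products such as 0.29 * 100.
    allowed = math.floor(max_fp_rate * len(decoy_scores) + 1e-9)
    ranked = sorted(decoy_scores, reverse=True)
    if allowed >= len(ranked):
        return 1.0  # every decoy is tolerated, so every active counts
    cutoff = ranked[allowed]  # best-scoring decoy that must be rejected
    hits = sum(score > cutoff for score in active_scores)
    return hits / len(active_scores)


# Made-up scores: 100 decoys scoring 0..99 and five known actives
decoys = [float(i) for i in range(100)]
actives = [99.5, 98.5, 97.5, 50.0, 10.0]

print(tp_rate_at_fp_rate(actives, decoys, 0.01))
print(tp_rate_at_fp_rate(actives, decoys, 0.05))
```

Requiring actives to score strictly above the cutoff keeps the false positive budget honest when decoy scores tie.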

15
Surflex Evaluation: Results
  • 3) Evaluation of Surflex's docking speed
  • Docking speed becomes very important in
    screening large compound libraries.
  • Surflex demonstrated a docking time that was
    approx. linear in number of rotatable bonds
  • Rigid molecules took a few seconds and each
    additional rotatable bond took an additional 10
    seconds
  • Surflex yielded a mean running time of 44 seconds
    for the 81 protein-ligands in the test set used
    earlier
  • Docking speed ranges from 50-100 seconds per
    molecule for FlexX, DOCK, and GOLD (Surflex speed
    is comparable to these times)
  • Quantitative comparison across methods is
    difficult due to differences in hardware and
    methodology
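
The roughly linear timing behaviour described above can be turned into a back-of-the-envelope estimator. In this sketch the 10 seconds per rotatable bond comes from the slide, while the 3-second base cost is an assumption standing in for "a few seconds". The function names are hypothetical and the numbers should not be read as benchmarks.

```python
def estimate_seconds(rotatable_bonds, base=3.0, per_bond=10.0):
    """Linear model: time = base + per_bond * rotatable_bonds.

    base is an assumed value for 'a few seconds' for a rigid molecule;
    per_bond follows the roughly 10 seconds per bond quoted on the slide.
    """
    if rotatable_bonds < 0:
        raise ValueError("rotatable_bonds must be non-negative")
    return base + per_bond * rotatable_bonds


def estimate_library_seconds(bond_counts):
    """Total estimated docking time for a library, one molecule at a time."""
    return sum(estimate_seconds(n) for n in bond_counts)


# Made-up library: rotatable-bond counts for three molecules
library = [0, 2, 5]
print(estimate_library_seconds(library))
```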

16
Surflex Evaluation: Results

17
Conclusions
  • Surflex marks a step forward in flexible
    molecular docking programs
  • Compared to the best docking methods available,
    Surflex is
  • as fast
  • as accurate in terms of docked ligand RMSD
  • much more accurate in terms of scoring
  • Assaying the top-scoring 1% of compounds in the
    screening library should yield a large proportion
    of true positives
  • Potential areas of improvement
  • - scoring and penetration terms should be
    combined into a single score
  • - scoring function should include training on
    non-binding ligands (negative examples)
  • - effect of nonbonded self-interactions within
    ligands should be accounted for explicitly
  • - allow a degree of protein flexibility (side
    chain movement)