Title: DOCKING: Modeling Protein Complexes
1. DOCKING: Modeling Protein Complexes
- Dr. Victor Lesk
- 29th October 2008
2.
- Protein / small molecules
  - Enzyme / substrates
  - Enzyme / drug
- Protein / protein
  - Enzyme / inhibitor
  - Inhibitor / modulator
  - Macromolecular assemblies
- Protein / nucleic acid
  - RNA/DNA / polymerase
  - Ribosome / peptide
3.
- Docking two molecules means constructing the coordinates of the bound state.
- The bound state is called the complex.
- We require coordinates for the independent molecules as input.
- Molecules move towards each other and bind/dock.
- But the aim is to predict their docked configuration (not to describe their motion).
4. Hydrophobicity
- Hydrophobic molecules stick to other molecules
- Specificity of interactions tends to be conferred by polar areas, e.g. through salt bridges and hydrogen bonds
- Hydrophobic drugs are less quickly metabolized and go to places where they are not meant to
- Some hydrophobicity is required for good activity
5. Drug design
- Drugs typically affect a protein target's ability to bind substrate (competitively or non-competitively).
- Drugs require clinical trials: expensive, risky, require animals
- Use screening steps to eliminate unlikely candidates and end up with leads
- Start from a library: experimental screening using a physical library
- Or virtual screening using an electronic database
6. Virtual screening
- Start from an electronic library with millions of compounds: ZINC (free) has 16 million
- Take the fastest steps first: structural docking (slow) can be the final step when the protein target is known
- First, the library is decimated according to ADME criteria
  - Absorption
  - Distribution
  - Metabolism
  - Excretion
- Lipinski's rule of 5 implements ADME well
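As an illustration of how a rule-of-5 pre-filter might decimate a library, here is a minimal sketch. It assumes the four descriptors have already been computed for each compound; the `Descriptors` class, the two library entries and the `max_violations` tolerance are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for precomputed molecular descriptors."""
    mol_weight: float      # molecular weight in daltons
    log_p: float           # octanol-water partition coefficient (log P)
    h_bond_donors: int     # number of hydrogen-bond donors
    h_bond_acceptors: int  # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-5 limits are exceeded."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_filter(d: Descriptors, max_violations: int = 1) -> bool:
    """Keep a compound if it breaks at most `max_violations` limits
    (the tolerance of one violation is an assumption of this sketch)."""
    return lipinski_violations(d) <= max_violations


# Made-up descriptor values, for illustration only.
library = {
    "small_polar": Descriptors(180.2, 1.2, 1, 4),
    "large_greasy": Descriptors(720.0, 6.3, 2, 12),
}
kept = [name for name, d in library.items() if passes_filter(d)]
print(kept)
```

Because the check needs only four numbers per compound, it is cheap enough to run over millions of entries before any docking is attempted.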
7. Virtual screening II
- QSAR when the structure of the protein target is unknown
- When the protein structure is known, docking the drugs onto the protein can be tried (small-molecule docking).
- If the partner is also a protein and the binding site cannot be identified by experiment or bioinformatics, protein-protein docking may be used to help find it.
8. Small-Molecule Docking: not just for drug design
- For
  - Drug design
  - Lead optimization
  - Toxicology
  - Metabolism study
  - Development of tags for imaging
- Software, e.g.
  - Commercial
    - DOCK
    - GOLD
    - FlexX
  - Free for non-commercial use
    - AutoDock
9. Structural Drug Library
- Cambridge Structural Database has 450,000 organic compounds (and growing)
- Files are in CIF format (note mmCIF, macromolecular CIF, as proposed replacement for PDB)
- Databases also exist for inorganic, oligopeptide and oligonucleotide structures
10. Small molecule docking for drug design
- Try to dock a putative drug molecule onto the protein
- Each molecule has few atoms, so docking each one is computationally efficient, but there are many molecules.
- Search from a library of millions of compounds
- Pre-filtered using heuristics (Lipinski)
- Score with a pairwise energy function
- Protein remains rigid
- Torsion angles of the drug are allowed to rotate. Bonds are not generally allowed to stretch or flex.
11. MACROMOLECULE DOCKING (proteins and polynucleotides)
- Background
- Basic concepts
- Methods
- Assessment
- Summary
12. Background: why do protein-protein docking?
- Aside from helping with virtual screening:
- Protein-protein interaction networks are of widespread interest in systems biology
- Genome projects yield proteins about which there is no information
- And there are known proteins with as yet unknown interactions
- Structure prediction technology advances
13. What information do we want?
- Structures of complexes (within reach, particularly if there is not much conformational change)
- Do A and B bind?
- Affinity information (distant future)
14. Protein-Protein Docking
- Configuration space is large: a computationally intensive problem
- Even larger when one of the interactors is not rigid enough
- Throw away as many atoms as possible
- Search the remaining space efficiently
- And/or use high performance computing
15. Protein-protein docking: Methods
- The set of configurations must contain a good enough one
- The good enough configuration must have nearly the best score
- Use non-structural help where possible
- Use a series of methods, or protocol
16. Scores for selecting the best configuration
- Free energy
- Electrostatics
- Stereochemistry
- Solvation score
- Statistical scores
- Geometric scores
- Phylogenetic scores
- Weighted sums of the above.
17. Free energy
- Contribution from all pairs of atoms
- Same/opposite electric charges repel/attract
- Electron clouds exclude each other
- Atoms try to make glancing contact
- Hydrogen bonds are favourable (difficult to model and calculate, direction-dependent)
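The first four points can be sketched as a pairwise energy: a Coulomb term for the charges plus a 12-6 Lennard-Jones term whose minimum sits at contact distance. This is an illustrative sketch only; the parameter values (`epsilon`, `r_min`, `dielectric`) are assumptions, hydrogen bonds are not modelled, and it is not the energy function of any particular docking program.

```python
import math
from itertools import product

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2): converts q1*q2/r to kcal/mol


def pair_energy(q1, q2, r, epsilon=0.1, r_min=3.5, dielectric=4.0):
    """Coulomb term plus a 12-6 Lennard-Jones term for one atom pair.

    r is the separation in Angstroms; the Lennard-Jones well has depth
    epsilon at r = r_min and rises steeply when the atoms overlap.
    """
    coulomb = COULOMB_K * q1 * q2 / (dielectric * r)
    ratio6 = (r_min / r) ** 6
    lennard_jones = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
    return coulomb + lennard_jones


def interaction_energy(atoms_a, atoms_b):
    """Sum pair energies over every atom of A with every atom of B.

    Each atom is a tuple (x, y, z, charge).
    """
    total = 0.0
    for (xa, ya, za, qa), (xb, yb, zb, qb) in product(atoms_a, atoms_b):
        r = math.dist((xa, ya, za), (xb, yb, zb))
        total += pair_energy(qa, qb, r)
    return total


# Two made-up single-atom "molecules" with opposite partial charges.
atoms_a = [(0.0, 0.0, 0.0, -0.5)]
atoms_b = [(3.5, 0.0, 0.0, 0.5)]
print(interaction_energy(atoms_a, atoms_b))
```

The double loop makes the cost proportional to the product of the atom counts, which is one reason to throw away as many atoms as possible before searching.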
18. Solvation score
- Water attracts polar groups
- Non-polar groups buried by interface
- Atomic contact energy (Zhang)
19. Statistical scores
- Interacting residue or atom type profile
- Profile from known complex interfaces
20. Geometric scores
- Convex hull of surfaces
- Buried surface area
- Volume of intersection
21. Phylogenetic scores
- Needs homologues for all interactors
- Conservation score
- Correlation of mutations across interface
22. Protein-protein docking: Methods
- Fourier series methods
- Monte Carlo methods
- Surface methods
- Bioinformatics methods
- Normal modes
23. Fast correlation methods for protein-protein docking
- Only scores that can be written as a correlation (dot product) can be used
- Non-structural help cannot be used efficiently
- Conformational change not allowed
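To make the "dot product" restriction concrete, the sketch below scores a translation as the dot product of two grids. Fourier methods obtain these dot products for all translations at once; here each shift is evaluated directly for clarity, on a toy 2-D grid. The grid layout, the `CORE_PENALTY` weight and the function names are assumptions made for this example.

```python
from itertools import product


def correlation_score(receptor, ligand, shift):
    """Dot product of the receptor grid with the ligand grid moved by shift.

    Grids are dicts mapping (i, j) cells to weights; missing cells count 0.
    """
    di, dj = shift
    return sum(
        weight * receptor.get((i + di, j + dj), 0.0)
        for (i, j), weight in ligand.items()
    )


def best_translation(receptor, ligand, max_shift):
    """Score every translation in a square window and return the best one."""
    shifts = product(range(-max_shift, max_shift + 1), repeat=2)
    return max(shifts, key=lambda s: correlation_score(receptor, ligand, s))


# Toy receptor: a 3x3 core that must not be overlapped, with a rewarded
# layer of surface cells along its right-hand side.
CORE_PENALTY = -15.0  # assumed weight that punishes overlap with the core
receptor = {(i, j): CORE_PENALTY for i in range(3) for j in range(3)}
receptor.update({(3, j): 1.0 for j in range(3)})

# Toy ligand: a column of three cells with unit weight.
ligand = {(0, j): 1.0 for j in range(3)}

print(best_translation(receptor, ligand, max_shift=4))
```

Since the score depends only on fixed grid weights, both molecules have to stay rigid, which matches the last point on the slide.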
24. Monte-Carlo methods for protein-protein docking
- Make small random change
- Prefer to accept change with better score
- Random change may include non-rigidity
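A minimal sketch of this loop, using the Metropolis acceptance rule as one common way to "prefer" better scores. It assumes that a lower score is better; the toy score, the step size and the temperature are hypothetical choices, and a real docking move would change rigid-body position and possibly side chains rather than a single angle.

```python
import math
import random


def metropolis_search(score, start, propose, steps, temperature, rng):
    """Random walk that always accepts improvements and sometimes accepts
    worse configurations; lower score is assumed to be better."""
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = propose(current, rng)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


def toy_score(angle):
    """Hypothetical score with its minimum at 40 degrees."""
    return (angle - 40.0) ** 2 / 100.0


def small_random_change(angle, rng):
    """Perturb the configuration by at most 5 degrees."""
    return angle + rng.uniform(-5.0, 5.0)


rng = random.Random(0)
best, best_score = metropolis_search(
    toy_score, start=0.0, propose=small_random_change,
    steps=2000, temperature=0.5, rng=rng,
)
print(round(best, 1), round(best_score, 3))
```

Occasionally accepting a worse configuration lets the search escape local minima, at the price of needing many more score evaluations than a pure descent.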
25. Surface method for protein-protein docking
- Superpose a point on each protein's molecular surface
- Rotate to make normals antiparallel
- Surfaces created with marching cubes
26. Movie of surface method protein-protein docking
- 2 x Plasmodium vivax 25 kDa protein
- Homodimer complex
- Symmetry not imposed
- 1000 active triangles on each protein
- 60,000,000 configurations total
- Score rate 30 configurations per second
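The figures on this slide imply the running time of a full search, which the short calculation below works out. The split of the 60,000,000 configurations into triangle pairs times rotation angles is an assumption that happens to fit the numbers; the slide does not state it.

```python
triangles_per_protein = 1000
configurations = 60_000_000
rate_per_second = 30

# Assumption: every triangle on one protein is paired with every triangle
# on the other, and each pair is tried at a fixed number of angles.
triangle_pairs = triangles_per_protein ** 2
angles_per_pair = configurations // triangle_pairs

processor_hours = configurations / rate_per_second / 3600
print(triangle_pairs, angles_per_pair, round(processor_hours))
```

The result, roughly 556 processor hours, is in line with the 600 processor hours quoted for a full rigid search in the Summary.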
27. Marching cubes (Lorensen and Cline, 1987)
- Originally for medical imaging
28. Marching cubes: animated demonstration
- Molecular surface of P25 being constructed
- Low resolution
29. Marching cubes: properties
- Surface is constructed out of triangles (simplicial complex)
- All mathematical topologies ok
- Restrictable to specified patches
- Internal pockets must be eliminated
- Major flexibility requires refinement stage (although better than Fourier method)
30. Marching cubes: variables
- Molecular surface
- Solvent-accessible surface
31. Marching cubes: patches
- 800 sq. Å patch around GLU152.Oe2
- Sample from patch
32. Bioinformatics methods for protein-protein docking
- Mutagenesis effects on affinity
- Surface residue conservation
- Correlated mutation between interactors
- Homologous complexes
- Bioinformatics is auxiliary, not stand-alone
33. Normal Modes
- About 10000 normal modes (distinct oscillatory motion patterns with distinct frequencies) for an average protein
- The lowest-frequency normal modes may suggest conformational changes on complexation
- Can be used to restrict an otherwise impossible conformer search of the backbone
34. Normal Modes II
- Proteins that do not undergo large conformational change on binding, but remain rigid, can be identified by the absence of very low frequency normal modes.
40. Combined scores
- best/worst rank(s1, s2, s3)
- reverse: -s1
- s1 with configurations filtered out if s2 > 5.7
- weighted sum: a x s1 + b x s2
- weighted product: s1^a x s2^b x s3^c
- Automated combined score trials
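The operations listed on this slide can be sketched as small functions acting on lists of scores, one value per configuration. The function names, the convention that a lower score is better, and the example numbers are assumptions of this sketch.

```python
def ranks(scores):
    """Rank of each configuration (0 = best), assuming lower is better."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def best_rank(*score_lists):
    """Each configuration's best rank under any of the given scores."""
    return [min(r) for r in zip(*(ranks(s) for s in score_lists))]


def worst_rank(*score_lists):
    """Each configuration's worst rank under any of the given scores."""
    return [max(r) for r in zip(*(ranks(s) for s in score_lists))]


def reverse(s1):
    """Flip the sense of a score."""
    return [-v for v in s1]


def filtered(s1, s2, threshold=5.7):
    """Keep s1, but mark configurations with s2 above threshold as None."""
    return [None if b > threshold else a for a, b in zip(s1, s2)]


def weighted_sum(s1, s2, a, b):
    return [a * x + b * y for x, y in zip(s1, s2)]


def weighted_product(s1, s2, s3, a, b, c):
    """Assumes positive scores so that any exponent is well defined."""
    return [x ** a * y ** b * z ** c for x, y, z in zip(s1, s2, s3)]


# Made-up scores for three configurations.
s1 = [2.0, 1.0, 3.0]
s2 = [6.0, 4.0, 1.0]
s3 = [1.0, 2.0, 4.0]
print(best_rank(s1, s2), worst_rank(s1, s2))
print(filtered(s1, s2))
print(weighted_sum(s1, s2, 0.5, 2.0))
print(weighted_product(s1, s2, s3, 1, 1, 2))
```

An automated trial could then loop over such combinations and weights and keep whichever ranks known complexes highest on a benchmark.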
41. Assessment of docking methods
- Benchmark
- Assessment event
- CAPRI double-blind trial
42. Protein-protein docking benchmark
- 124 protein-protein complexes
- Unique structural family combinations
- Diverse biological roles
- Maintained by Prof. Z. Weng at Boston U.
43. Protein-DNA docking benchmark
- 40 protein-DNA complexes
- Maintained by Prof. A. Bonvin at Utrecht
44. CAPRI: Critical Assessment of PRedicted Interactions
- Every 6 months or so, irregular
- Set of 1 to 6 target complexes
- Centralized double-blind assessment
- International meetings
- Proteins journal special CAPRI edition
45. Interpretation of docking
- Not simulation (maybe Monte Carlo is)
- Energy functions are no more than inspired by physics
- With greater understanding, affinity prediction could become possible
46. Imperial structural bioinformatics group
- Virtual screening
  - Prof. Mike Sternberg
  - Dr. Ata Amini
  - Dr. Paul Shrimpton
- Protein-protein docking
  - Prof. Mike Sternberg
  - Dr. Suhail Islam
  - Dr. Victor Lesk
  - Philip Carter
  - Sara Dobbins
47. Glossary
- Complex
- Interactor
- Ligand/Receptor
- Bond torsion
- Torsion angle
- Bond angle
- Configuration
- Decoy
- Blind Trial
- PDB file
- Coordinate file
- Energy function
- Scoring function
- Fitness function
- Pairwise energy
- Electrostatics
- Solvation
- Dielectric
- Affinity
- Fourier Transform
- Mutagenesis
48. Summary
- Surface method: fast, versatile, flexibility ok
- 600 processor hours for full rigid search
- To be done
  - Score improvement
  - Fast sidechain flexibility
  - Backbone flexibility
49. Bibliography
- Virtual screening
  - Virtual Screening in Drug Discovery, Alvarez & Shoichet, CRC Press (2005)
  - Structure-based virtual screening: an overview, Lyne, DDT 7 20 1047 (2002)
- Lipinski's rule of 5
  - Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings, Lipinski et al., Adv. Drug Del. Rev. 26 1-3 3 (2001)
- Protein-protein docking: Fourier method
  - Molecular surface recognition: determination of geometric fit between proteins and their ligands by correlation methods, Katzir et al., PNAS 89 2195 (1992)
50.
- Protein-protein docking: Monte Carlo
  - Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations, Gray et al., JMB 331 1 281 (2003)
- Benchmark for protein-protein docking
  - Protein-Protein Docking Benchmark 2.0, Mintseris et al., Proteins 60 2 216 (2005)
- Solvation modeling for proteins
  - Determination of atomic desolvation energies from the structures of crystallized proteins, Zhang et al., JMB 267 3 707 (1997)
- Automated protein-protein docking server
  - CLUSPRO, Comeau et al., Bioinf. 20 1 45 (2004), http://nrc.bu.edu/cluster/
51.
- CAPRI docking assessment
  - Welcome to CAPRI, Janin, Proteins SFG 47 3 257 (2002)
  - CAPRI methods articles, Proteins SFG 52 1 (2003)
  - The CAPRI experiment, its evaluation and implications, Wodak & Mendez, Curr Opin Struct Biol 14 2 242
- Marching Cubes
  - Marching cubes: A high resolution 3D surface construction algorithm, Lorensen & Cline, Proc. ACM Siggraph Aug 1987 163
52. Critical
53.
- Scores and hybrid scores
- How to describe the success rate of a docking method
- Benchmark
- Changes in protein structure upon complexation
54. Basic Scores
- electrostatic energy
- hard core repulsion penalty score (e.g. Lennard-Jones)
- solvation score
55. Quality of method
- In x% of different protein-protein complexes, the best n guesses of method M contain at least one guess closer than r Angstroms to the known structure.
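This statement translates directly into a small calculation. In the sketch below, each complex has a list of distances between its guesses and the known structure, ordered with the best-scored guess first; the data layout, the complex names and the numbers are made up for illustration.

```python
def success_rate(distances_by_complex, n, r):
    """Fraction of complexes whose top-n guesses include one within r.

    distances_by_complex maps a complex name to the distances (Angstroms)
    of its guesses from the known structure, best-scored guess first.
    """
    hits = sum(
        1 for distances in distances_by_complex.values()
        if any(d < r for d in distances[:n])
    )
    return hits / len(distances_by_complex)


# Hypothetical results for three complexes.
results = {
    "complex_A": [12.0, 3.1, 8.4],
    "complex_B": [9.5, 14.2, 2.0],
    "complex_C": [1.8, 20.0, 6.6],
}
for n in (1, 2, 3):
    print(n, success_rate(results, n=n, r=5.0))
```

Reporting the rate for several values of n shows how much of a method's success depends on the scoring function ranking a good guess near the top.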
56. Benchmark
- Benchmark of 39 complexes, filtered for redundancy
- Ranked in 3 difficulty classes based on degree of conformational change
- Enzyme-inhibitor complexes: 11
- Antibody-antigen complexes: 11
- 27 others of unclassified functional role
57. Conformational change
- Proteins change shape a little (example) or a lot (example) upon complexation
- When conformational change is small, docking methods can ignore it.
58. Strategy
- Any docking method must work unfailingly in cases of zero conformational change.
- Methods are first tested with zero conformational change imposed.
59. Surface-based docking
- Automatically excludes a large subset of known undesirable conformations
- Can impose contact between specified surface patches
- N-fold rotational symmetry
- Provides alternative visualization
60. Construction of Surface from PDB file
- Step 1: read PDB file and identify atom types
- Step 2: replace points with overlapping clouds of density
- Step 3: apply marching cubes to generate a set of interlocking triangles representing the atomic surface
61. Generation of guesses for complexed structure
- A possible structure for the complex can be generated by specifying
  - A triangle from the surface representing interactor 1
  - A triangle from interactor 2
  - An angle
- Put triangle 1 against triangle 2 and rotate by the angle around the centre of the triangle.
- Score the configuration and record the structure. Repeat 1M times.
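One way to realise "put triangle 1 against triangle 2 and rotate" is sketched below: interactor 2 is moved so that the centres of the two triangles coincide, their normals are antiparallel, and the result is spun by the chosen angle about the shared normal. This is a geometric sketch using plain tuples; the function names and the handling of the degenerate case are assumptions, not the author's implementation.

```python
import math


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(a, s):
    return tuple(x * s for x in a)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    return scale(a, 1.0 / math.sqrt(dot(a, a)))


def rotate(p, axis, angle):
    """Rodrigues rotation of vector p about a unit axis."""
    c, s = math.cos(angle), math.sin(angle)
    return add(add(scale(p, c), scale(cross(axis, p), s)),
               scale(axis, dot(axis, p) * (1.0 - c)))


def centre_and_normal(tri):
    """Centroid and unit normal of a triangle given as three points."""
    a, b, c = tri
    centre = scale(add(add(a, b), c), 1.0 / 3.0)
    return centre, unit(cross(sub(b, a), sub(c, a)))


def placement(tri1, tri2, angle):
    """Return a function that moves interactor-2 points so that tri2 lies
    on tri1 with antiparallel normals, spun by angle about that normal."""
    c1, n1 = centre_and_normal(tri1)
    c2, n2 = centre_and_normal(tri2)
    target = scale(n1, -1.0)
    axis = cross(n2, target)
    sin_t = math.sqrt(dot(axis, axis))
    if sin_t < 1e-12:
        # n2 is parallel or exactly opposite to the target direction:
        # any axis perpendicular to n2 will do.
        helper = (1.0, 0.0, 0.0) if abs(n2[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = unit(cross(n2, helper))
        tilt = 0.0 if dot(n2, target) > 0 else math.pi
    else:
        axis = scale(axis, 1.0 / sin_t)
        tilt = math.atan2(sin_t, dot(n2, target))

    def move(p):
        q = rotate(sub(p, c2), axis, tilt)  # make the normals antiparallel
        q = rotate(q, n1, angle)            # spin about the shared normal
        return add(q, c1)                   # superpose the triangle centres
    return move


tri1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tri2 = [(4.0, 0.0, 0.0), (4.0, 1.0, 0.0), (4.0, 0.0, 1.0)]
move = placement(tri1, tri2, math.radians(30))
moved = [move(p) for p in tri2]
print(centre_and_normal(moved))
```

Applying `move` to every atom of interactor 2 gives one candidate complex; looping over triangle pairs and angles generates the full set of guesses to be scored.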
62. Ultimate aims of Protein-Protein Docking, in order of difficulty
- What is the complexed structure of proteins x, y of known structure which are known to form a complex? (Hard)
- Could proteins a, b of known structure form a stable complex in vivo? (Very Hard)
- What, approximately, is the chemical affinity for given interacting proteins? (Very Hard)
63. Hybrid scores
- Hybrid scores are scores produced by operations on basic scores s1, s2, s3
- reverse(s1)
- s1 with configurations filtered out if s2 > 5.7
- weighted sum: a s1 + b s2 + c s3
- weighted product: s1^a x s2^b x s3^c