Title: CCDC_Intro proposal for Industry day
1. Insight into Molecular Geometry and Interactions using Small Molecule Crystallographic Data
John Liebeschuetz
Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, UK
john_at_ccdc.cam.ac.uk
2. How much Data is Available?
Growth of the Cambridge Structural Database over 40 years
Predicted Growth to 2010
- >500,000 entries during 2009
- 419,768 entries in June 2007
3. CSD Data Content
- Literature Reference
  - G. Bringmann, M. Ochse, K. Wolf, J. Kraus, K. Peters, E-M. Peters, M. Herderich, L. Ake, F. Tayman
  - Phytochemistry 51, 1999, 271
- Other text
  - R-factor: 0.0506
  - Colour: pale yellow
  - Habit: acicular
  - Polymorph: Form IV
  - Source: Rothmannia longiflora
- 4-Oxonicotinamide-1-(1-beta-D-2,3,5-tri-O-acetyl-ribofuranoside)
- C17 H20 N2 O9
4. Molecular Interactions as well as Geometry
HEPPEX
5. Cambridge Structural Database System
6. Using Structural Data in molecular modelling for pharmaceutical design
- Intramolecular 3D geometry
  - Designing in the desired Conformer
  - Validation that models have correct geometry
- Intermolecular Interactions between molecules
  - Design of pharmacophores
  - Validation of interactions found during modelling
  - Identification of new ways to satisfy binding motifs
  - Knowledge-based scoring functions for docking
7. Designing in the right Conformation (1)
- Using ConQuest, it is possible to generate incidence histograms for any geometric feature of any substructure, provided that sufficient high-quality structures including that substructure are present in the CSD
Brameld, K.A., Kuhn, B., Reuter, D.C. and Stahl, M., J. Chem. Inf. Model., 48(1), 1-24 (2008)
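An incidence histogram of this kind can be sketched in a few lines once the torsion values have been exported. The following is a minimal illustration only: it is not ConQuest code, the function name `torsion_histogram` is hypothetical, and the sample angles are invented rather than taken from the CSD.

```python
from collections import Counter

def torsion_histogram(angles_deg, bin_width=10):
    """Count torsion angles (in degrees) in bins covering [-180, 180).

    Returns a dict mapping the lower edge of each occupied bin to its count.
    """
    if 360 % bin_width != 0:
        raise ValueError("bin_width must divide 360")
    counts = Counter()
    for angle in angles_deg:
        shifted = (angle + 180.0) % 360.0          # wrap into [0, 360)
        bin_start = int(shifted // bin_width) * bin_width - 180
        counts[bin_start] += 1
    return dict(sorted(counts.items()))

# Hypothetical torsion values for illustration (not CSD data).
sample = [-62.0, -58.5, -65.1, 61.0, 178.0, -179.0, 59.5, -60.2]
hist = torsion_histogram(sample, bin_width=30)
for start, n in hist.items():
    print(f"[{start:4d}, {start + 30:4d}) {'#' * n}")
```

Because torsions are periodic, the sketch wraps every value into [-180, 180) before binning, so that 180 and -180 fall in the same bin.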
8. Designing in the right Conformation (2)
- Sulphonamide is common in drug molecules. Its conformational behaviour is well captured by the CSD
- Ortho substitution (blue histogram) shifts the maximum
- Pyramidalisation of the N of the sulphonamide can also be explored. This is a common effect in sulphonamides (and piperidines) but is poorly reproduced by modelling software
Brameld, K.A., Kuhn, B., Reuter, D.C. and Stahl, M., J. Chem. Inf. Model., 48(1), 1-24 (2008)
9. Designing in the right Conformation (3): Example 1
CSD analysis indicates that the bioactive conformation is stable only for the most active structure
Brameld, K.A., Kuhn, B., Reuter, D.C. and Stahl, M., J. Chem. Inf. Model., 48(1), 1-24 (2008)
10. Validation of Model Geometry: Mogul
- Rapid access to geometric information from the CSD
- Incorporates pre-computed libraries of bond lengths, valence angles and torsion angles
- >20 million individual geometrical parameters derived entirely from the CSD and updated annually
- Sketch or import a molecule, then click on a feature of interest to view its distribution, mean values and statistics
Bruno et al., J. Chem. Inf. Comput. Sci., 44,
2133-2144, 2004
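The idea of checking a model value against a library distribution can be illustrated with a simple z-score. This is a sketch under stated assumptions, not Mogul's actual method: the function names, the threshold of 2.0 and the bond lengths are all hypothetical, and a z-score is only sensible for unimodal parameters such as bond lengths (torsions are periodic and often multimodal).

```python
from statistics import mean, stdev

def z_score(query_value, observed_values):
    """Number of sample standard deviations between the query and the mean."""
    mu = mean(observed_values)
    sigma = stdev(observed_values)   # needs at least two observations
    return (query_value - mu) / sigma

def is_unusual(query_value, observed_values, threshold=2.0):
    """Flag a value lying more than `threshold` standard deviations from
    the mean. The threshold is an illustrative assumption."""
    return abs(z_score(query_value, observed_values)) > threshold

# Hypothetical bond lengths in angstroms (not CSD data).
library = [1.52, 1.53, 1.54, 1.53, 1.52, 1.54, 1.53]
print(is_unusual(1.53, library))   # value at the centre of the distribution
print(is_unusual(1.60, library))   # value far outside the distribution
```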
13. Validation of PDB Ligand Geometry
- PDB structures suffer from less well-defined electron density
- Protein X-ray refinement force fields are often poorly parameterised to reproduce ligand geometries
- Sometimes protein crystallographers start with a poor ligand model
14. Validation of Ligand structures in the PDB via the Protein/Ligand analysis tool Relibase
www.ccdc.cam.ac.uk
15. Validation of ligand structures found in the PDB
Ligand from 1HAK: two abnormal torsions indicated
Further examination reveals the piperidine to be in the boat form
16. Validation of ligand structures found in the PDB using Mogul
- 15 of 100 recent PDB entries have ligand geometry that is almost certainly in significant error (in-house analysis using Relibase/Mogul)
- The good news: for structures deposited before 2000 the figure is 26
Chart labels: 2006, Pre 2000
17. Designing in the right Conformation (3): Large Rings
Brameld, K.A., Kuhn, B., Reuter, D.C. and Stahl, M., J. Chem. Inf. Model., 48(1), 1-24 (2008)
18. Validation of ligand structures found in the PDB
Ligand from 1HAK: two abnormal torsions indicated
Further examination reveals the piperidine to be in the boat form
19. Mogul 1.3: Ring Conformations
- Mogul currently holds data on bonds, angles and torsions.
- In the 2010 release of the Cambridge Structural Database System, Mogul will also contain a comprehensive ring knowledge base
- Ring libraries from a-Mogul 1.3 have been introduced into GOLD 4.1 to allow knowledge-based ring-flexing during docking
20. A Knowledge Base of Intermolecular Interactions: IsoStar
- Experimental data from
  - Cambridge Structural Database
  - Protein Data Bank (protein-ligand complexes only)
- Theoretical potential energy minima (DMA, IMPT)
- Typical Uses
  - Probability of an interaction occurring
  - Preferred geometries
  - Design Strategies
21. IsoStar Methodology
central group: -CONH2
contact group: NH
- Search CSD or PDB for structures containing the contact
- Superimpose hits and display the distribution
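The superposition step can be sketched by expressing every hit in a coordinate frame attached to its central group, so that all central groups coincide and the contact atoms form a scatter around them. This is an illustrative stand-in, not IsoStar's implementation: it uses a three-atom local frame rather than a least-squares fit, and the function names and coordinates are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(v):
    norm = math.sqrt(dot(v, v))
    return tuple(x / norm for x in v)

def to_local_frame(origin, x_atom, plane_atom, point):
    """Coordinates of `point` in a frame with `origin` at (0, 0, 0),
    the x axis pointing at `x_atom`, and `plane_atom` in the xy plane."""
    ex = unit(sub(x_atom, origin))
    ez = unit(cross(ex, sub(plane_atom, origin)))
    ey = cross(ez, ex)
    d = sub(point, origin)
    return (dot(d, ex), dot(d, ey), dot(d, ez))

def superimpose_hits(hits):
    """Each hit is (C, O, N, contact): three central-group atoms and one
    contact atom. Returns the contact positions in the shared frame."""
    return [to_local_frame(c, o, n, contact) for c, o, n, contact in hits]

# Two hypothetical hits: the second is the first rotated by 90 degrees
# about z and translated, so both contacts should land on the same spot.
hits = [
    ((0, 0, 0), (1.23, 0, 0), (-0.7, 1.1, 0), (3.0, 0.5, 0.2)),
    ((5, 5, 5), (5, 6.23, 5), (3.9, 4.3, 5), (4.5, 8.0, 5.2)),
]
for position in superimpose_hits(hits):
    print(tuple(round(x, 3) for x in position))
```

The resulting list of positions is the raw material for a scatterplot; binning it on a grid would give a density map.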
22. IsoStar Scatterplots vs. Density Maps: N-H donors around amide C=O
Scatterplot
Contour surface
23. IsoStar: indole and isoxazole interactions with faces of phenyl rings
24. Using Intermolecular information to build pharmacophores from proteins
- Use intermolecular information (IsoStar) to map a protein binding site (e.g. using SuperStar, an extra module of the CSDS)
- Create a pharmacophore from this information (possible in SuperStar)
c.f. GRID/FLAP
25. Motif searching
- Certain signature interaction motifs might be key to identifying inhibitor substructures of interest. Can we identify such motifs in the CSD and thereby uncover new ideas?
- Materials Mercury: a new tool for the drug development and crystal design community
- Most tools are specific to small molecule crystals ....
- However ...
26. Packing Feature Search
- Comparison of crystal structures (polymorphs, solvates etc.) can identify significant packing features.
- We can then search the CSD using Packing Feature Search
27. H-bonding Motif Search: Kinase Binding Motifs
Set up a Packing Feature Search around the Hinge Region
CDK2 Complex 1ke8
28. H-bonding Motif Search: Kinase Binding Motifs
MISTOX
WUSQAC
Provides ideas for new motifs: fragment-based design
29. Knowledge-based scoring using small molecule structural data
- Protein/Ligand Docking relies on a scoring function to rank binding poses
- Scoring functions may be Molecular Mechanics based, Empirical or Knowledge Based
- A Knowledge Based score is calculated as the sum of atom-atom potentials derived from a crystallographic database
- The atom-atom potential = -log (observed interactions / reference state)
- Knowledge-based scoring functions (PMF, Bleep, DrugScore, ASP) have been developed using protein-ligand data (PDB)
- The CSD contains better resolved structures and a much greater variety of chemical functionality than the PDB
- DrugScoreCSD has demonstrably improved performance over DrugScore (Velec, Gohlke & Klebe, J. Med. Chem., 48 (2005), 6296)
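The potential and its summation into a score can be sketched as follows. This is a minimal illustration, not the implementation of any of the named scoring functions: the natural logarithm, the keying of potentials by atom-type pair and distance bin, and all frequencies are assumptions made for the example.

```python
import math

def atom_atom_potential(observed, reference):
    """-log(observed / reference); negative when the contact is seen
    more often than the reference state predicts (favourable)."""
    if observed <= 0 or reference <= 0:
        raise ValueError("frequencies must be positive")
    return -math.log(observed / reference)

def knowledge_based_score(contacts, potentials):
    """Sum the tabulated potentials over all contacts in a pose."""
    return sum(potentials[contact] for contact in contacts)

# Hypothetical frequencies keyed by (atom type pair, distance bin).
frequencies = {
    ("N-H...O", "2.8-3.0"): (0.20, 0.10),   # over-represented contact
    ("C...O", "2.8-3.0"): (0.05, 0.10),     # under-represented contact
}
potentials = {key: atom_atom_potential(obs, ref)
              for key, (obs, ref) in frequencies.items()}

pose = [("N-H...O", "2.8-3.0"), ("N-H...O", "2.8-3.0"), ("C...O", "2.8-3.0")]
print(round(knowledge_based_score(pose, potentials), 3))
```

With this sign convention a lower score indicates a pose built from contacts that are more common in the database than in the reference state.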
30. Uses of Small Molecule Structural Data in Drug Design: Conclusions
- Use in Model Validation
  - Geometry of designed synthetic candidates
  - Geometry of X-ray derived Ligand Structures
  - Intermolecular interactions of a candidate structure with a model of the binding site
- Design of Pharmacophores
- Search for fragments fitting a binding motif
- Creation of robust and versatile Knowledge-Based scoring functions for docking
31. Acknowledgements
Jana Henneman, James Chisholm
Thank you for your attention