Title: Targeting Drug-like properties in Chemical Libraries
1Targeting Drug-like properties in Chemical
Libraries
- David Winkler, Frank Burden, Mitchell Polley
- Centre for Complexity in Drug Design
- CSIRO Molecular Science and Chemistry Department, Monash University
VICS
2Complexity in Drug Design Group
- Prof. Frank Burden - Scimetrics Ltd - consultant to CSIRO
- Dr. Mitchell Polley - CSS postdoctoral fellow
- Darryl Jones - CSS PhD top-up student - Flinders University (Physics)
- Prof. Dave Winkler - CSIRO Molecular Science and Monash University/VICS
3Overview of Project
- Aims to develop a method for evolving a chemical library of heterogeneous agents (molecules) using 'drug-like' fitness functions
- Chemical space is vast (> 10^80 possibilities)
- Method must explore drug-like chemical space and identify islands of activity and novelty
- Application in the discovery of novel bioactive agents such as drugs and crop care products
- Methodology applicable to design of new materials and nanomachines using different fitness functions
4Overview of Project
- Steps
- Devise sparse, informative mathematical representations of molecules
- Devise sparse methods of selecting these for models
- Use agent-based methods (Bayesian neural nets) to map representations to properties and use models as fitness functions
- Develop methods for evolving chemicals using mutation operators so that maximum chemical space can be traversed
- Evolve chemical libraries using drug-like fitness functions
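The last two steps amount to an evolutionary loop: mutate library members, score them with a fitness function, keep the fittest. The following is a minimal sketch only. It assumes molecules are stood in for by strings over a toy fragment alphabet, and `toy_fitness`, `mutate` and `evolve` are hypothetical names; none of this is the project's actual representation, operator set or fitness model.

```python
import random

FRAGMENTS = "CNOSF"  # toy alphabet standing in for building blocks (assumption)

def toy_fitness(molecule: str) -> float:
    """Hypothetical placeholder for a model-based 'drug-like' fitness function."""
    # Reward a length of 8 with 3 non-carbon fragments; purely illustrative.
    hetero = sum(ch != "C" for ch in molecule)
    return -abs(len(molecule) - 8) - abs(hetero - 3)

def mutate(molecule: str, rng: random.Random) -> str:
    """Apply one random mutation operator: substitute, insert or delete."""
    op = rng.choice(["substitute", "insert", "delete"])
    pos = rng.randrange(len(molecule))
    if op == "substitute":
        return molecule[:pos] + rng.choice(FRAGMENTS) + molecule[pos + 1:]
    if op == "insert":
        return molecule[:pos] + rng.choice(FRAGMENTS) + molecule[pos:]
    # Delete, but never shrink below a single fragment.
    if len(molecule) == 1:
        return molecule
    return molecule[:pos] + molecule[pos + 1:]

def evolve(library, fitness, generations=50, seed=0):
    """Each generation: add one mutant per member, then keep the fittest."""
    rng = random.Random(seed)
    size = len(library)
    for _ in range(generations):
        offspring = [mutate(m, rng) for m in library]
        pool = list(library) + offspring
        library = sorted(pool, key=fitness, reverse=True)[:size]
    return library
```

Because parents compete with their offspring, the best fitness in the library never decreases from one generation to the next. Swapping `toy_fitness` for a trained property model is where the drug-like (or materials) objective would enter.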
5Highlights
- Representations
- Novel charge fingerprint descriptor devised and tested
- Theory of eigenvalue descriptors cracked
- Momentum space descriptor work started
- Novel selectivity index developed
6Sparse Descriptors
- Many thousands of descriptors have been devised (e.g. CoMFA fields, DRAGON)
- Many are highly correlated with other descriptors, i.e. they contain the same information
- Some (e.g. molecular weight) are information-poor
- Models using sparse descriptors can be more predictive
- We work to the premise that it is possible to devise sparse, information-rich descriptors from which suitable subsets could be drawn for a wide variety of modelling problems
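To illustrate the point about correlated descriptors carrying the same information, here is a minimal sketch of greedy correlation pruning. It is one simple way to thin a descriptor pool, not the sparse selection method used in this project; the function names, the 0.95 threshold and the example descriptor names are assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant column: treated as uncorrelated
    return sxy / sqrt(sxx * syy)

def prune_correlated(columns, threshold=0.95):
    """Keep a descriptor only if |r| with every kept descriptor is below threshold.

    columns: dict mapping descriptor name -> list of values (one per molecule).
    Returns the names of the retained descriptors, in input order.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptor table for four molecules.
descriptors = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.1],  # almost a multiple of "a"
    "c": [4.0, 1.0, 3.0, 2.0],
}
retained = prune_correlated(descriptors)  # "b" is dropped as redundant with "a"
```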
7Charge fingerprints
- These are widely applicable, easily computed
descriptors calculated by binning charges on
atoms in different environments
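A minimal sketch of the binning idea, assuming the atomic partial charges have already been computed elsewhere and that the "environment" is simply the element symbol. The bin edges, the environment list and the name `charge_fingerprint` are illustrative assumptions, not the definitions used by the authors.

```python
from bisect import bisect_right

def charge_fingerprint(atoms, environments=("C", "N", "O"),
                       edges=(-0.3, -0.1, 0.1, 0.3)):
    """Count atoms per (environment, charge bin).

    atoms: iterable of (environment_label, partial_charge) pairs.
    Returns a flat list of counts of length len(environments) * (len(edges) + 1).
    """
    n_bins = len(edges) + 1
    index = {env: i for i, env in enumerate(environments)}
    fingerprint = [0] * (len(environments) * n_bins)
    for env, charge in atoms:
        if env not in index:
            continue  # environments outside the chosen set are ignored
        # bisect_right gives the bin number: 0 below the first edge, and so on.
        fingerprint[index[env] * n_bins + bisect_right(edges, charge)] += 1
    return fingerprint

# Made-up charges for a five-atom fragment.
example_atoms = [("C", -0.05), ("C", 0.2), ("O", -0.4), ("H", 0.1), ("N", -0.3)]
fp = charge_fingerprint(example_atoms)
```

The result is a fixed-length vector regardless of molecule size, which is what makes such a descriptor easy to feed into a regression or neural network model.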
8EEM-based property descriptors
- Density Functional Theory (DFT) proposes that knowledge of electron density allows computation of many other properties
- Electronegativity equalization methods (EEM; Mortier, Bultinck and others) are rapid, approximate DFT methods
- All work to date has concentrated on charges or a few other observables
- Main strength will probably lie with calculation of other molecular properties, when the method is generalized and parameterized for more atom types
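A minimal sketch of how EEM charges can be obtained, assuming the usual formulation in which each atom's effective electronegativity, A_i + B_i q_i + kappa * sum_{j != i} q_j / R_ij, is set equal to a common value, subject to the charges summing to the molecular charge. This gives a linear system in the charges and the equalized electronegativity. The helper name `eem_charges` is hypothetical and the parameter values in the example are arbitrary placeholders, not a published parameterization.

```python
import numpy as np

def eem_charges(coords, A, B, total_charge=0.0, kappa=1.0):
    """Solve the EEM linear system for atomic charges.

    coords: (n, 3) atomic positions; A, B: per-atom parameters
    (A_i plays the role of electronegativity, B_i of twice the hardness).
    Returns (charges, equalized_electronegativity).
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder to avoid dividing by zero

    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = kappa / dist                 # off-diagonal: kappa / R_ij
    M[np.arange(n), np.arange(n)] = B        # diagonal: B_i
    M[:n, n] = -1.0                          # minus the common electronegativity
    M[n, :n] = 1.0                           # charge conservation row

    rhs = np.append(-np.asarray(A, dtype=float), total_charge)
    solution = np.linalg.solve(M, rhs)
    return solution[:n], solution[n]

# Toy diatomic with placeholder parameters in arbitrary units.
q, chi = eem_charges([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], A=[1.0, 2.0], B=[2.0, 2.0])
```

In this toy case the atom with the larger A value ends up with the negative charge, as expected for the more electronegative partner. The cost is a single linear solve, which is why the approach is fast enough for library-scale work.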
9Generalized eigenvalue matrices
10Why do eigenvalue descriptors work?
Eigenvalue matrix
EEM matrix
$AT = T\Lambda \;\Rightarrow\; A = T\Lambda T'$ and $A^{-1} = T\Lambda^{-1}T'$, since $T' = T^{-1}$
for an orthogonal transformation, i.e. the inverse of
A is related to the eigenvalues
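The relationship can be checked numerically. The sketch below assumes A is real and symmetric, so that its eigenvector matrix T is orthogonal; the example matrix is arbitrary and is not an actual EEM matrix, and `inverse_from_eigen` is a hypothetical helper name.

```python
import numpy as np

def inverse_from_eigen(A):
    """Invert a symmetric matrix via its eigendecomposition A = T diag(lam) T'."""
    lam, T = np.linalg.eigh(A)           # columns of T are orthonormal eigenvectors
    return T @ np.diag(1.0 / lam) @ T.T  # A^-1 = T diag(1/lam) T'

A = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])  # arbitrary symmetric, invertible example
lam, T = np.linalg.eigh(A)
A_inv = inverse_from_eigen(A)
```

The eigenvalues of the inverse are the reciprocals of the eigenvalues of A, with the same eigenvectors, so a descriptor built from the eigenvalues carries the same information as one built from the inverse matrix.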
11Momentum space descriptors
- The more interesting part of the electron density distribution in terms of biological activity is located near the k-space origin. The corresponding r-space density distribution is associated with the outermost valence regions of the molecule
- k-space descriptions of electron density are more compact and simpler
12Optimum Selectivity Index So
13Highlights
- Sparse feature selection
- Automatic Relevance Determination (ARD) method refined
- Sparse Bayesian feature detection theory mastered
- Linear sparse feature detection using an EM algorithm and Jeffreys prior
- Nonlinear Bayesian feature detection achieved but needs more work
- Novel variable selection when the number of descriptors is much larger than the number of molecules in the data set
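A minimal sketch of linear sparse feature detection with an EM algorithm and a Jeffreys prior. It assumes a Gaussian prior on each coefficient with its own variance, a Jeffreys hyperprior on those variances, and a fixed, user-supplied noise variance `sigma2`. Under those assumptions the E-step replaces each inverse variance by 1/beta_j^2, and the M-step is a reweighted ridge solve. The slides do not say which variant the authors used; the function name and synthetic data are hypothetical.

```python
import numpy as np

def jeffreys_em(X, y, sigma2, n_iter=200, tol=1e-10):
    """EM estimate of sparse linear coefficients under a Jeffreys hyperprior.

    Update: beta <- U (sigma2 I + U X'X U)^-1 U X'y, with U = diag(|beta|).
    Coefficients of irrelevant descriptors are driven towards zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    G, b = X.T @ X, X.T @ y
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares starting point
    for _ in range(n_iter):
        U = np.diag(np.abs(beta))
        new = U @ np.linalg.solve(sigma2 * np.eye(p) + U @ G @ U, U @ b)
        converged = np.max(np.abs(new - beta)) < tol
        beta = new
        if converged:
            break
    return beta

# Synthetic data: 8 descriptors, only two of which influence the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
true_beta = np.zeros(8)
true_beta[1], true_beta[5] = 2.0, -1.5
y = X @ true_beta + 0.1 * rng.normal(size=60)
beta = jeffreys_em(X, y, sigma2=0.01)
```

Writing the update in terms of U rather than the inverse prior variances keeps the linear solve well conditioned even when some coefficients have already shrunk to zero.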
14Sparse Bayesian variable selection
[Figure: axis label "Descriptor"]
15Highlights
- Optimum nonlinear modelling
- Bayesian regularized neural networks working well
- Linear sparse feature detection and modelling
- Nonlinear Bayesian feature detection and modelling using radial basis function regression
- Use of sparse Bayesian methods in neural networks under study
16Highlights
- Models built
- Blood-brain barrier partitioning
- Drug intestinal absorption
- Acute toxicity
- Phase II metabolism - substrates and inhibitors (Flinders medical school collaboration) - SVM
- Several drug target models - e.g. farnesyl transferase
17Blood-brain barrier model
Topological descriptors - 3 hidden nodes
Training set 85 compounds, test set 21 compounds
18Intestinal absorption QSAR model
Property-based descriptors - 5 hidden nodes - optimum model
19Acute toxicity model
Burden index/binned charge descriptors - 8 hidden nodes
Training set 450 compounds, external test set 53 compounds
20Using SVM and EEM descriptors to model phase II
metabolism
21COX 1 and 2 QSAR and selectivity
- Built QSAR models for cyclooxygenase 1 and 2, and for the selectivity index So, using a large data set from Tom Stockfisch at Accelrys (454 compounds obtained from http://www.accelrys.com/references/datasets/)
- Used atomistic (A), Burden eigenvalue (B) and charge fingerprint (C) descriptors together with a Bayesian regularized neural net to build the model
- Compared MLR with a Bayesian neural net with 3 nodes in the hidden layer
22COX 1 and 2 QSAR and selectivity
Selectivity of cyclooxygenase 1 and 2 inhibitors
23Selectivity Index So QSAR Model
MLR: R^2 = 0.77, Q^2 = 0.69
BRANN (3 nodes): R^2 = 0.92, Q^2 = 0.74
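For readers unfamiliar with the two statistics, the sketch below shows how R^2 (fit to the training data) and Q^2 (predictive, here by leave-one-out cross-validation) can be computed for an MLR model. The slides do not state which validation scheme produced the Q^2 values above, so leave-one-out is an assumption, and the function name and data are made up for illustration.

```python
import numpy as np

def r2_and_loo_q2(X, y):
    """Return (R^2, leave-one-out Q^2) for ordinary least-squares regression."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])  # add intercept
    ss_tot = np.sum((y - y.mean()) ** 2)

    # R^2: residual sum of squares of the model fitted to all compounds.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    rss = np.sum((y - X @ beta) ** 2)

    # Q^2: each compound is predicted by a model fitted without it (PRESS).
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        b = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        press += (y[i] - X[i] @ b) ** 2

    return 1.0 - rss / ss_tot, 1.0 - press / ss_tot

# Made-up single-descriptor data with a little noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y_noisy = 2 * x + 1 + np.array([0.5, -0.4, 0.3, -0.6, 0.2, 0.1])
r2, q2 = r2_and_loo_q2(x, y_noisy)
```

With this definition Q^2 cannot exceed R^2 for a least-squares model, so the size of the gap between the two is a rough indicator of overfitting.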