Screen Ligand based virtual screening presented by - PowerPoint PPT Presentation

Slides: 51
Provided by: chem2

Transcript and Presenter's Notes

1
Screen
Ligand based virtual screening
presented by
maintained by Miklós Vargyas
Last update 13 April 2010
2
Screen
Virtual screening by topological descriptors
3
Screen
Description of the product
Screen performs high throughput virtual screening
of compound libraries using similarity
comparisons by various molecular descriptors.
Availability
  • JChemBase
  • JChem Oracle cartridge
  • Instant JChem
  • Server version
  • standalone command line application programs
  • KNIME
  • PipelinePilot

4
Key features
  • Various 2D descriptors
      • ChemAxon chemical fingerprint (CCFP)
      • PipelinePilot ECFP/FCFP
      • ChemAxon pharmacophore fingerprint (CPFP)
      • BCUT
      • Scalars (logP, logD, Szeged index)
      • custom descriptors, in-house fingerprints
  • Optimized similarity measures
      • Improves similarity prediction
      • depends on set of known actives
      • high enrichment ratios in virtual screening
  • Multiple queries
      • 3 types of hypotheses
      • combined hit lists

5
Benefits
  • Versatile
      • Use various descriptors in your well established model
      • Access your trusted in-house fingerprint in IJC, JCB, JCART
      • Easy integration into corporate discovery pipelines
      • Search chemical files directly, with no need to import structures into a database
      • New descriptors are pluggable into deployed systems
  • Optimal
      • Consistent similarity scores
      • Smaller hit set
      • More focused library

6
Benefits
More consistent similarity scores
optimized Tanimoto
0.20
regular Tanimoto
0.06
0.28
7
Benefits
  • High enrichment ratio
  • Fewer false hits
  • Known actives are true positive hits (ACE
    inhibitors)

8
Results
NPY-5 (pharmacophore similarity)
9
Results
β2-adrenoceptor (pharmacophore similarity)
10
Case study at Axovan
  • GPCR activity prediction
  • distinguishing between GPCR subclasses

Reference: Modest von Korff and Matthias Steger, "GPCR-Tailored Pharmacophore Pattern Recognition of Small Molecular Ligands", JCICS 2004, 44
11
Screen roadmap
  • New molecular descriptors
  • ECFP/FCFP (in 5.4)
  • Shape descriptors (in 5.4)
  • Hidden use of the optimiser
  • No-pain black-box approach
  • Simultaneous multi-descriptor search
  • Enhanced IJC integration
  • Easy descriptor configuration and generation
  • Similarity search type instead of descriptors,
    metrics and other unfriendly concepts

12
Screen roadmap
  • GUI
  • New web interface (HTML/AJAX)
  • Desktop application for descriptor generation
  • 3D shape similarity
  • fast pre-filtering by 3D fingerprint
  • Alignment based volumetric Tanimoto calculation
  • scaffold hopping by maximizing topological
    dissimilarity and spatial similarity

13
Supplementary slides
14
A typical approach
[Figure: the query is encoded as a query fingerprint and compared by a metric with the target fingerprints of the targets, giving the hits.]
15
ChemAxon's approach
[Figure: the queries are combined into a hypothesis fingerprint; a metric optimized on the queries compares it with the target fingerprints of the targets, giving the hits.]
16
Performance
  • Chemical fingerprint generation: 500/s
  • Pharmacophore fingerprint generation
      • calculated: 80/s
      • rule-based: 200/s
  • Screening: 12000/s
  • Optimization: 10 s/metric
  • Hardware/software environment
      • P4 3GHz, 1GB RAM
      • Red Hat Linux 9
      • Java 1.4.2

17
Implementations
Use of various fingerprints and metrics in JSP:
http://www.chemaxon.com/jchem/examples/jsp1_x/index.jsp
UGM presentation by Aureus Pharma: Improved Virtual Screening Strategies and Enrichment of Focused Libraries in Active Compounds Using Target-Oriented Databases
http://www.chemaxon.com/forum/viewpost2307.html
18
Molecular similarity
Chemical, pharmacological or biological
properties of two compounds match. The more the
common features, the higher the similarity
between two molecules.
Chemical
Pharmacophore
19
Similarity measures
  • Quantitative assessment of similarity of
    structures
  • need a numerically tractable form
  • molecular descriptors, fingerprints, structural
    keys

Sequences/vectors of bits, or numeric values that
can be compared by distance functions, similarity
metrics.
20
Standard metrics
21
Topological chemical fingerprint
  • hashed binary fingerprint
  • encodes topological properties of the chemical
    graph connectivity, edge label (bond type), node
    label (atom type)
  • allows the comparison of two molecules with
    respect to their chemical structure
  • Construction
      • find all 0, 1, …, n step walks in the chemical graph
      • generate a bit array for each walk with a given number of bits set
      • merge the bit arrays with a logical OR operation
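The construction steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not ChemAxon's CCFP implementation: the molecule is a hand-written labelled graph, simple paths stand in for general walks, SHA-256 is an arbitrary choice of stable hash, and the names `walks`, `fingerprint`, `n_bits`, `bits_per_walk` and `max_steps` are introduced here for illustration only.

```python
import hashlib

def walks(atoms, bonds, max_steps):
    """Enumerate simple paths of 0..max_steps bonds as label strings.

    atoms: list of atom type labels, e.g. ["C", "C", "O"]
    bonds: dict mapping an atom index pair (i, j) to a bond type label
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for (i, j), label in bonds.items():
        neighbours[i].append((j, label))
        neighbours[j].append((i, label))

    found = set()

    def extend(path, text):
        found.add(text)
        if len(path) - 1 == max_steps:
            return
        for nxt, label in neighbours[path[-1]]:
            if nxt not in path:  # simple paths only (an assumption of this sketch)
                extend(path + [nxt], text + label + atoms[nxt])

    for start in range(len(atoms)):
        extend([start], atoms[start])
    return found

def fingerprint(atoms, bonds, n_bits=64, bits_per_walk=2, max_steps=3):
    """Hash every walk to bits_per_walk positions and OR them together."""
    bits = [0] * n_bits
    for walk in walks(atoms, bonds, max_steps):
        for k in range(bits_per_walk):
            digest = hashlib.sha256(f"{k}:{walk}".encode()).digest()
            # Setting a bit that is already set is the logical OR merge.
            bits[int.from_bytes(digest[:4], "big") % n_bits] = 1
    return bits

# Heavy atoms of ethanol as a toy graph: C-C-O
ethanol = fingerprint(["C", "C", "O"], {(0, 1): "-", (1, 2): "-"})
print("".join(map(str, ethanol)))
```

A path and its reverse (for example "C-O" and "O-C") are hashed separately here; a production fingerprint would normally canonicalize them first.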

22
Construction of chemical fingerprint
23
Chemical similarity
01000101000101000100000000011010100110101000000101
00000000100000
01000101000101000100000000011010100110101000000001
00000000100000
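As an illustration of how two such fingerprints can be compared, the sketch below computes the Tanimoto coefficient of the two bit strings shown on this slide. Using Tanimoto here is an assumption (the slide does not name the metric), and the function name `tanimoto` is introduced only for this example.

```python
def tanimoto(a: str, b: str) -> float:
    """Tanimoto coefficient of two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    common = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    either = sum(1 for x, y in zip(a, b) if x == "1" or y == "1")
    # Two empty fingerprints are treated as identical.
    return common / either if either else 1.0

# The two fingerprints from the slide, each split over two lines there.
fp_a = "01000101000101000100000000011010100110101000000101" "00000000100000"
fp_b = "01000101000101000100000000011010100110101000000001" "00000000100000"

similarity = tanimoto(fp_a, fp_b)
print(f"similarity    = {similarity:.3f}")
print(f"dissimilarity = {1 - similarity:.3f}")
```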
24
Topological pharmacophore fingerprint
  • encodes pharmacophore properties of molecules as
    frequency counts of pharmacophore point pairs at
    given topological distance
  • allows the comparison of two molecules with
    respect to their pharmacophore
  • Construction
      • perceive pharmacophoric features
      • map pharmacophore point types to atoms
      • calculate the length of the shortest path between each pair of atoms
      • assign a histogram to every pharmacophore point pair and count the frequency of the pair with respect to its distance
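The construction above can be sketched as follows. This is a minimal illustration, assuming the pharmacophore types have already been perceived and are supplied as one label per atom; the function names, the type labels "A", "D", "H" and the `max_distance` cut-off are hypothetical choices made for this example.

```python
from collections import Counter, deque
from itertools import combinations

def shortest_paths(n_atoms, bonds):
    """All-pairs shortest path lengths (in bonds) by BFS from each atom."""
    adjacency = {i: [] for i in range(n_atoms)}
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)
    dist = {}
    for start in range(n_atoms):
        seen = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in adjacency[cur]:
                if nxt not in seen:
                    seen[nxt] = seen[cur] + 1
                    queue.append(nxt)
        dist[start] = seen
    return dist

def pharmacophore_fingerprint(types, bonds, max_distance=6):
    """Count pharmacophore point pairs per topological distance.

    types: one label per atom, e.g. "A" (acceptor), "D" (donor),
           "H" (hydrophobic), or None for atoms without a type.
    Returns a Counter keyed by (type1, type2, distance).
    """
    dist = shortest_paths(len(types), bonds)
    histogram = Counter()
    for i, j in combinations(range(len(types)), 2):
        if types[i] is None or types[j] is None:
            continue
        d = dist[i].get(j)
        if d is None or d > max_distance:
            continue  # disconnected, or beyond the histogram range
        pair = tuple(sorted((types[i], types[j])))
        histogram[pair + (d,)] += 1
    return histogram

# Toy chain of five atoms: D - H - H - H - A
toy_types = ["D", "H", "H", "H", "A"]
toy_bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
toy_fp = pharmacophore_fingerprint(toy_types, toy_bonds)
for key, count in sorted(toy_fp.items()):
    print(key, count)
```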

25
Pharmacophore perception
Rule based approach
  • Rule 1: The pharmacophore type of an atom is an acceptor, if
      • it is a nitrogen, oxygen or sulfur, and
      • it is not an amide nitrogen or sulfur, and
      • it is not an aniline nitrogen, and
      • it is not a sulfonyl sulfur, and
      • it is not a nitro group nitrogen.
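Rule 1 can be written as a chain of exclusions. The sketch below is illustrative only: the `Atom` record and its boolean flags are hypothetical and would in practice be filled in by a substructure perception step that is not shown, and "amide nitrogen or sulfur" is read here as "amide nitrogen or amide sulfur", which is an interpretation of the slide text.

```python
from dataclasses import dataclass

@dataclass
class Atom:
    """Hypothetical atom record with pre-computed environment flags."""
    element: str
    is_amide: bool = False
    is_aniline: bool = False
    is_sulfonyl: bool = False
    is_nitro: bool = False

def is_acceptor(atom: Atom) -> bool:
    """Rule 1 from the slide, written as a chain of exclusions."""
    if atom.element not in ("N", "O", "S"):
        return False
    if atom.is_amide and atom.element in ("N", "S"):
        return False
    if atom.is_aniline and atom.element == "N":
        return False
    if atom.is_sulfonyl and atom.element == "S":
        return False
    if atom.is_nitro and atom.element == "N":
        return False
    return True

print(is_acceptor(Atom("O")))                 # plain oxygen
print(is_acceptor(Atom("N", is_amide=True)))  # amide nitrogen
```

Every new exception to the rule becomes another flag and another branch, which is the maintenance cost that the following slides point out.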

26
Exceptions to simple rules
N-cyano-methyl piperidine
sp2 atom
exception → extra rules → large number of rules → maintenance, performance
27
Effect of pH
pH 7
pH 1
pH → pH-specific rules → large number of rules → maintenance, performance
28
Pharmacophore perception
Calculation based approach
Step 1: estimation of pKa
allows the determination of the protonation state for ionizable groups at the given pH
Step 2: partial charge calculation
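Step 1 relies on the standard relation between pKa, pH and protonation state. The sketch below applies the Henderson-Hasselbalch equation to a single ionizable group; it does not estimate the pKa itself, and the pKa value used in the example is a made-up illustrative number, not taken from the presentation.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a group in its protonated form (Henderson-Hasselbalch).

    For a base, pka is the pKa of its conjugate acid.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def is_protonated(pka: float, ph: float) -> bool:
    """Treat the group as protonated when that form is the major one."""
    return fraction_protonated(pka, ph) > 0.5

# Hypothetical basic amine with pKa 9.5 (illustrative value only)
for ph in (1.0, 7.0, 12.0):
    print(ph, round(fraction_protonated(9.5, ph), 4), is_protonated(9.5, ph))
```

The same group can therefore receive a different pharmacophore type at pH 1 and at pH 7 without any pH-specific rule being written.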
29
Pharmacophore perception
Calculation based approach
Step 3: hydrogen bond donor/acceptor recognition
Step 4: aromatic perception
Step 5: pharmacophore property assignment
[Figure legend: acceptor, negatively charged acceptor, acceptor and donor, hydrophobic, none]
30
Pharmacophore fingerprint
Pharmacophore type coloring: acceptor, donor, hydrophobic, none.
31
Fuzzy smoothing
32
Virtual screening using fingerprints
[Figure: the query is encoded as a query fingerprint and compared by a metric with the target fingerprints of the targets, giving the hits.]
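The screening loop on this slide can be sketched in a few lines. This is an illustration under assumptions, not the Screen implementation: fingerprints are stored as Python integers, the metric is Tanimoto dissimilarity, and the names `screen`, `threshold` and the toy fingerprints are hypothetical.

```python
def tanimoto_dissimilarity(a: int, b: int) -> float:
    """1 - Tanimoto for fingerprints stored as Python integers."""
    either = bin(a | b).count("1")
    if either == 0:
        return 0.0
    return 1.0 - bin(a & b).count("1") / either

def screen(query, targets, metric=tanimoto_dissimilarity, threshold=0.3):
    """Return (dissimilarity, name) for targets within the threshold, best first.

    targets: dict mapping a structure name to its fingerprint.
    """
    hits = []
    for name, fp in targets.items():
        d = metric(query, fp)
        if d <= threshold:
            hits.append((d, name))
    return sorted(hits)

query = 0b1011_0110
targets = {
    "t1": 0b1011_0110,  # identical to the query
    "t2": 0b1011_0100,  # one bit fewer than the query
    "t3": 0b0100_1001,  # no bits in common with the query
}
print(screen(query, targets))
```

The threshold decides the size of the hit list, which is why the later slides on optimization focus on making dissimilarity values consistent.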
33
Multiple query structures
[Figure: the queries are combined into a hypothesis fingerprint, which is compared by a metric with the target fingerprints of the targets, giving the hits.]
34
Hypothesis fingerprints
Advantages
  • allows faster operation
  • compiles features common to the individual actives
  • reduces noise

Hypothesis types
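The list of hypothesis types did not survive in this transcript. As a hedged illustration of how several query fingerprints can be merged into one, the sketch below builds three plausible kinds of binary hypothesis. The kinds "intersection", "union" and "majority", and the function name `hypothesis`, are assumptions of this example and are not confirmed as the three types used by Screen.

```python
def hypothesis(fingerprints, kind="majority"):
    """Combine equal-length bit lists from several actives into one.

    kind: "intersection" keeps bits set in every active,
          "union" keeps bits set in any active,
          "majority" keeps bits set in more than half of the actives.
    These three kinds are illustrative assumptions of this sketch.
    """
    if not fingerprints:
        raise ValueError("need at least one fingerprint")
    n = len(fingerprints)
    counts = [sum(column) for column in zip(*fingerprints)]
    if kind == "intersection":
        return [1 if c == n else 0 for c in counts]
    if kind == "union":
        return [1 if c > 0 else 0 for c in counts]
    if kind == "majority":
        return [1 if 2 * c > n else 0 for c in counts]
    raise ValueError(f"unknown hypothesis kind: {kind}")

actives = [
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 0],
]
for kind in ("intersection", "union", "majority"):
    print(kind, hypothesis(actives, kind))
```

Screening against one combined fingerprint needs a single comparison per target instead of one per active, which matches the "faster operation" advantage listed above.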
35
Hypothesis fingerprints
36
The need for optimization
Too many hits
37
The need for optimization
Inconsistent dissimilarity values
38
Parametrized metrics
asymmetry factor
scaling factor
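The formula on this slide did not survive in the transcript. To show how an asymmetry factor can enter a similarity metric, the sketch below uses the Tversky index, a standard generalisation of Tanimoto. It is swapped in for illustration and is not claimed to be the parametrized metric used by Screen; the scaling factor is not modelled, and the parameter names `alpha` and `beta` are the usual Tversky names rather than names from the presentation.

```python
def tversky(a: int, b: int, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Tversky index for integer-encoded bit fingerprints.

    alpha weights bits set only in a (the query), beta weights bits set
    only in b (the target). alpha == beta == 1 gives ordinary Tanimoto.
    """
    common = bin(a & b).count("1")
    only_a = bin(a & ~b).count("1")
    only_b = bin(b & ~a).count("1")
    denominator = common + alpha * only_a + beta * only_b
    if denominator == 0:
        # Nothing left to compare: identical inputs count as fully similar.
        return 1.0 if a == b else 0.0
    return common / denominator

query, target = 0b1100, 0b1111
print(tversky(query, target))                       # symmetric (Tanimoto)
print(tversky(query, target, alpha=1.0, beta=0.0))  # ignore extra target bits
print(tversky(target, query, alpha=1.0, beta=0.0))  # roles swapped
```

With unequal weights the result depends on which structure is the query, which is what makes the measure asymmetric.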
39
Optimization of metrics
Step 1: optimize parameters for maximum enrichment
Step 2: validate metrics over an independent test set
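The two steps can be sketched as a small grid search followed by a check on held-out data. Everything here is an assumption made for illustration: the parametrized metric is a Tversky-style dissimilarity, enrichment is defined as the active rate in the top-ranked fraction divided by the overall active rate, the search is an exhaustive grid, and all fingerprints are toy values. The optimizer actually used by Screen is not described in the slides.

```python
from itertools import product

def dissimilarity(a, b, alpha, beta):
    """1 - Tversky index; alpha and beta must be positive in this sketch."""
    common = bin(a & b).count("1")
    denominator = (common
                   + alpha * bin(a & ~b).count("1")
                   + beta * bin(b & ~a).count("1"))
    return 1.0 - common / denominator if denominator else 0.0

def enrichment(query, actives, decoys, params, top_fraction=0.25):
    """Active rate in the top-ranked fraction divided by the overall rate."""
    scored = [(dissimilarity(query, fp, *params), True) for fp in actives]
    scored += [(dissimilarity(query, fp, *params), False) for fp in decoys]
    # On ties decoys sort first (False < True), a pessimistic choice.
    scored.sort(key=lambda item: (item[0], item[1]))
    top = scored[: max(1, int(len(scored) * top_fraction))]
    top_rate = sum(1 for _, active in top if active) / len(top)
    base_rate = len(actives) / len(scored)
    return top_rate / base_rate

def optimize(query, actives, decoys, grid=(0.25, 0.5, 1.0)):
    """Step 1: pick the (alpha, beta) pair with the best enrichment."""
    return max(product(grid, repeat=2),
               key=lambda params: enrichment(query, actives, decoys, params))

query = 0b11110000
train_actives = [0b11110001, 0b11100000]
train_decoys = [0b00001111, 0b10001110, 0b01000111,
                0b00110011, 0b00001100, 0b11001100]
best = optimize(query, train_actives, train_decoys)
print("best parameters:", best)

# Step 2: validate on structures that were not used for optimization.
test_actives = [0b11110010]
test_decoys = [0b00000111, 0b00011100, 0b01010101]
print("test enrichment:", enrichment(query, test_actives, test_decoys, best))
```

Keeping the test set separate guards against parameters that only fit the particular actives used in Step 1.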
40
Optimization of metrics
Step 1: optimize parameters for maximum enrichment
query set
1111100010000100001000101000
query fingerprint
parametrized metric
41
Optimization of metrics
v1
v2
v3
vi
vn
42
Optimization of metrics
Step 2: validate metrics over an independent test set
query set
1111100010000100001000101000
optimized metric
query fingerprint
43
Results of Optimization
1. Similar structures get closer
0.20
0.06
0.28
44
Results of Optimization
2. Hit set size reduced
Active set: 18 mGlu-R1 antagonists
Target set: 10000 randomly selected drug-like structures
45
Results of Optimization
3. Higher enrichment
46
Results of Optimization
4. Top ranked structures are spikes
  • offers a more intuitive way to evaluate the
    efficiency of screening
  • based on sorting random set hits and known
    actives on dissimilarity values and counting the
    number of random set hits preceding each active
    in the sorted list
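The counting procedure described above can be sketched as follows. The function name and the toy dissimilarity values are hypothetical, and counting ties against the active is a choice made for this example rather than something the slide specifies.

```python
import bisect

def random_hits_before_actives(active_scores, random_scores):
    """For each known active (spike), count random-set hits ranked ahead of it.

    Both arguments are lists of dissimilarity values to the query or
    hypothesis; a lower value means a better rank. Ties are counted
    against the active, a conservative choice of this sketch.
    """
    ranked = sorted(random_scores)
    counts = []
    for score in sorted(active_scores):
        # Number of random-set values less than or equal to this score.
        counts.append(bisect.bisect_right(ranked, score))
    return counts

# Toy dissimilarity values (illustrative, not from the presentation)
spike_scores = [0.1, 0.2, 0.4]
random_set_scores = [0.15, 0.3, 0.5, 0.6]
print(random_hits_before_actives(spike_scores, random_set_scores))
```

Each entry pairs the i-th retrieved spike with the number of virtual hits that had to be accepted to reach it, which gives the points of a "spikes retrieved versus virtual hits" curve.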

[Figure: number of spikes retrieved versus number of virtual hits; dissimilarity values shown: 0.014 0.015 0.017 0.020 0.022 0.023 0.027 0.041 0.043]
47
Results
ACE (pharmacophore similarity)
48
Results
NPY-5 (pharmacophore similarity)
49
Results
β2-adrenoceptor (pharmacophore similarity)
50
3D flexible search
  • Expected top performance: 200 structures/s