Merq - PowerPoint PPT Presentation

About This Presentation
Title:

Merq

Description:

Merq. Yvonne Martin. Jeremy's Notes. merq.c - Merlin query filter. Similarity, superstructure, or smarts search on input smiles. ... – PowerPoint PPT presentation

Slides: 13
Provided by: Mart45
Tags: merlin | merq


Transcript and Presenter's Notes

Title: Merq


1
Merq
  • Yvonne Martin

2
Jeremy's Notes
  • merq.c - Merlin query filter. Similarity,
    superstructure, or smarts search on input smiles.
  • Normal output is smiles, number of hits, then
    each hit smiles, space separated.
  • HITSONLY omits the first two fields.
  • ONEHITPERLINE omits the first two fields and
    adds a newline after each hit smiles, to
    facilitate postprocessing (more smarts
    filtering, perhaps).
  • Author Jeremy Yang
  • Rev 10 Nov 2000
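
The normal output format described above (query smiles, number of hits, then each hit smiles, space separated) can be read back with a few lines of Python. This is a minimal sketch, not part of merq itself: the function name `parse_merq_line` is hypothetical, and it assumes one query per output line with no spaces inside any smiles.

```python
def parse_merq_line(line):
    """Split one line of normal merq output into (query, n_hits, hits).

    Assumed layout (from the notes above): query smiles, number of hits,
    then each hit smiles, all space separated on one line.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least a query smiles and a hit count")
    query = fields[0]
    n_hits = int(fields[1])
    hits = fields[2:]
    # Sanity check: the reported count should match the hits on the line.
    if len(hits) != n_hits:
        raise ValueError(f"hit count {n_hits} does not match {len(hits)} hits")
    return query, n_hits, hits


if __name__ == "__main__":
    # Made-up example line, for illustration only.
    print(parse_merq_line("c1ccccc1 2 Cc1ccccc1 Oc1ccccc1"))
```

With HITSONLY or ONEHITPERLINE the first two fields are absent, so a parser like this would not apply; each line (or each token) is then simply a hit smiles.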

3
What Does merq Do?
  • Reads a list of smiles or smarts
  • Performs a similarity, superstructure, or smarts
    search of a database for each one
  • Reports the number of hits and the smiles of the
    hits for each input smiles/smarts
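
The similarity case can be illustrated with a small Python sketch. The slides do not say which similarity measure merq uses, so the Tanimoto coefficient here is an assumption, as are the function names and the representation of each fingerprint as a set of on-bit positions. The 0.85 default mirrors the threshold used on later slides.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def similarity_hits(query_fp, database, threshold=0.85):
    """Return the smiles of database entries similar to the query.

    `database` is an iterable of (smiles, fingerprint) pairs; this layout
    is hypothetical and stands in for a real fingerprint database.
    """
    return [smi for smi, fp in database if tanimoto(query_fp, fp) >= threshold]


if __name__ == "__main__":
    # Toy fingerprints, for illustration only.
    db = [("A", {1, 2, 3, 4, 5, 6, 7}), ("B", {1, 2})]
    hits = similarity_hits({1, 2, 3, 4, 5, 6, 7, 8}, db)
    print(len(hits), hits)
```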

4
Why Did I Need merq?
  • Wondered about the similarity of one vendor's
    database to that of another: is there some magic
    about certain vendors' compounds?
  • Distrust of clustering
  • More about that later

5
What Have I Used merq For?
  • To check uniqueness of vendor databases
  • Are all the vendors selling the same compounds?
    This could happen because both the commercial
    reagents and the chemistries known to generalize
    are available to everyone.
  • If so, we don't need to worry about some
    unquantified "attractive hit" quality as part of
    the decision of which vendor to use.

6
0.85 Similarity to Maybridge
File                   Number of Structures in File   Percent Similar   Number to Buy
chemstar                                      59568             28.35           16890
timtec                                        28387             25.47            7230
asinex                                       134957             22.12           29848
chembridge                                    51945             20.77           10790
scientific exchange                           18501             19.38            3585
specs specs4                                 112595             18.25           20543
zelinsky                                     111418             16.93           18867
sherk                                          6869             16.92            1162
enamine                                       89010             14.99           13347
ibs                                          100634             12.77           12849
aventis                                       51140             11.48            5869
Total                                                                          140980

7
Cross-similarities of Vendor Compounds
8
Distribution of Number of 0.85 Similars within
Different Vendor Databases
[Figure: cumulative fraction of database (y axis, 0 to 1) versus number of similar structures (x axis, 0 to 100); plot labels: 92 75 65 60]
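
A curve of this kind can be computed from the per-compound hit counts that merq reports. The sketch below is one way to do it under stated assumptions: the function name is hypothetical, and it assumes the plotted quantity is the fraction of compounds with at most k similar structures (the slide does not say whether the curve counts "at most" or "at least").

```python
def cumulative_fraction(similar_counts, max_count):
    """Fraction of compounds with at most k similars, for k = 0..max_count.

    `similar_counts` holds one entry per compound: the number of database
    structures at or above the similarity threshold for that compound.
    """
    total = len(similar_counts)
    if total == 0:
        raise ValueError("no compounds given")
    return [
        sum(1 for count in similar_counts if count <= k) / total
        for k in range(max_count + 1)
    ]


if __name__ == "__main__":
    # Toy counts, for illustration only.
    print(cumulative_fraction([0, 0, 1, 3], 3))
```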
9
Distribution of Number of 0.85 Similars within
Different Vendor Databases
[Figure: x axis: Number of Similar Structures; plot labels: 92 75 65 60]
10
Distribution of Number of 0.85 Similars within
Different Vendor Databases
[Figure: x axis: Number of Similar Structures]
11
Ward's Clusters of the MAO Dataset of 1645
Compounds
[Figure: two panels, "Compounds with four similar compounds" and "Compounds with no similar compounds"; y axis: fraction at this cluster size (0 to 1); x axis: cluster size (ticks 1 to 10 in one panel, 1 to 11 in the other)]
12
Ward's Clusters of a Dataset of 19533 Compounds
[Figure: three histograms of cluster sizes. Panels: "Sizes of Clusters--Compounds with no Similars", n = 3633; "Sizes of Clusters--Compounds with 9 Similars", n = 39; "Sizes of Clusters--Compounds with One other Similar", n = 1148]