Title: ROCS Ligand Shape Similarity
1. ROCS Ligand Shape Similarity
- George Vacek
- August 25, 2004
2. Agenda
- Shape Theory
- Basic ROCS usage
- Using VIDA to visualize results
- Using the Color Force Field
- Examples
3. Shape Similarity: Rigorous Definition
- The difference in the characteristic function, χ, across all space
- That is, the distance apart in shape space
- Solving for Dmin yields maximal overlap
- D, Dmin are metric properties
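As a sketch of what these bullets describe, assuming the usual squared-difference (L2) form for the shape distance. The overlap symbols V_AA, V_BB and V_AB are introduced here for illustration and are not taken from the slide:

```latex
\begin{align}
D^2 &= \int \bigl(\chi_A(\mathbf{r}) - \chi_B(\mathbf{r})\bigr)^2 \, d\mathbf{r}
     = V_{AA} + V_{BB} - 2\,V_{AB},
\qquad V_{XY} = \int \chi_X(\mathbf{r})\,\chi_Y(\mathbf{r}) \, d\mathbf{r} \\
D_{\min}^2 &= V_{AA} + V_{BB} - 2 \max_{\text{poses}} V_{AB}
\end{align}
```

Under this assumption the self-overlaps V_AA and V_BB do not change when one molecule is rotated or translated, so minimizing D over rigid-body poses is the same as maximizing the cross-overlap V_AB.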
4. Characteristic Function: Gaussian Atoms
- Atoms are Gaussians, not hard spheres
- Gaussians capture the volume correctly
- Full Gaussian expansion not necessary
- Easily integrable, analytic derivatives
- Easily calculate overlap of two atoms
- Product of 2 Gaussians is another Gaussian
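The closed-form overlap of two atoms can be illustrated with a minimal sketch. It assumes each atom is a spherical Gaussian p·exp(−α|r − R|²) whose width α is chosen so the Gaussian integrates to the hard-sphere volume. The amplitude p = 2.7 and the function names are illustrative assumptions, not values taken from ROCS.

```python
import math

def gaussian_alpha(radius, p=2.7):
    """Width such that p*exp(-alpha*r^2) integrates to (4/3)*pi*radius^3.

    From p*(pi/alpha)^(3/2) = (4/3)*pi*radius^3.
    The amplitude p = 2.7 is an assumed, illustrative value.
    """
    return math.pi * (3.0 * p / (4.0 * math.pi)) ** (2.0 / 3.0) / radius ** 2

def gaussian_overlap(center_a, alpha_a, center_b, alpha_b, p=2.7):
    """Analytic overlap integral of two spherical Gaussian atoms.

    The product of the two Gaussians is a single Gaussian of width
    alpha_a + alpha_b, scaled by a factor that decays with distance.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    total = alpha_a + alpha_b
    prefactor = p * p * math.exp(-alpha_a * alpha_b * d2 / total)
    return prefactor * (math.pi / total) ** 1.5

# Two atoms of radius 1.7 placed 1.5 apart (same length unit for both).
alpha = gaussian_alpha(1.7)
print(gaussian_overlap((0.0, 0.0, 0.0), alpha, (1.5, 0.0, 0.0), alpha))
```

Because the result is a smooth analytic function of the atom positions, its derivatives with respect to those positions are also analytic, which is what makes gradient-based overlay optimization practical.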
5. Shape Tanimoto
- Tanimoto
- Shape Tanimoto uses 3D overlap instead of bit comparison
- Interesting values in different range than 2D
- Sensitive to large size differences between two structures
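A minimal sketch of the shape Tanimoto, assuming the standard form in which the cross-overlap is divided by the sum of the two self-overlaps minus the cross-overlap. The function and argument names are illustrative.

```python
def shape_tanimoto(o_aa, o_bb, o_ab):
    """Shape Tanimoto from self-overlaps (o_aa, o_bb) and cross-overlap o_ab."""
    return o_ab / (o_aa + o_bb - o_ab)

# Identical shapes: the cross-overlap equals both self-overlaps.
identical = shape_tanimoto(100.0, 100.0, 100.0)

# A small molecule sitting entirely inside one four times its size:
# the score is low even though the small shape is fully covered.
size_mismatch = shape_tanimoto(100.0, 400.0, 100.0)

print(identical, size_mismatch)
```

The second case shows the size sensitivity noted above: a perfect sub-shape match is still penalized by the large molecule's self-overlap in the denominator.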
6. Shape Tversky
- Tversky
- 2D version by John Bradshaw at Daylight
- Useful for small structure vs. large structure
- Weighting factor for size differences
- Typically asymmetric with α = 0.95, β = 0.05
- Field measure not strictly limited to [0, 1]
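A minimal sketch of an asymmetric shape Tversky, assuming the form overlap / (α·self-overlap of A + β·self-overlap of B) with the two weights taken as α = 0.95 and β = 0.05. Which molecule receives which weight in ROCS is not stated here, so the argument names are illustrative.

```python
def shape_tversky(o_aa, o_bb, o_ab, alpha=0.95, beta=0.05):
    """Tversky score weighting A's self-overlap by alpha and B's by beta."""
    return o_ab / (alpha * o_aa + beta * o_bb)

# Small A fully covered by large B: weighting A heavily gives a high score.
small_in_large = shape_tversky(100.0, 400.0, 100.0)

# Same pair with the weights swapped: the large molecule dominates.
swapped = shape_tversky(100.0, 400.0, 100.0, alpha=0.05, beta=0.95)

# With field overlaps the cross term can exceed the weighted denominator,
# so the score can go above 1.
above_one = shape_tversky(400.0, 100.0, 150.0, alpha=0.05, beta=0.95)

print(small_in_large, swapped, above_one)
```

The last case illustrates why a field-based Tversky is not strictly bounded by 1, unlike the bit-based 2D version.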
7. Similar Shapes
Shape Tanimoto > 0.7
8. Simple Example
- ROCS requires only 2 inputs
- a file containing shape query molecules
- a file containing the database of interesting structures
- corporate database, ACD-Screen, vendor databases, etc.
- Must already be 3D
- File format can be SDF, MOL2, PDB, XYZ, MMOD, OEB
- rocs -dbase vendor.oeb.gz -query 6cox.mol2
9. Other Default Settings
- -besthits 500
- number of results to keep in hitlist
- -cutoff 0.0
- minimum score to even consider
- -rankby tanimoto
- alternates: tverskyd, tverskyq, scaledcolor, combo
- -prefix rocs
- text prefix for output file names
- -oformat sdf
- format of output structure file
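For scripted runs, the command line can be assembled from these options. The sketch below only builds the argument list and does not invoke ROCS. The helper name `build_rocs_command` is hypothetical, and only flags and values listed on these slides are used.

```python
# Ranking modes named on the slide.
RANKBY_CHOICES = {"tanimoto", "tverskyd", "tverskyq", "scaledcolor", "combo"}

def build_rocs_command(dbase, query, besthits=500, cutoff=0.0,
                       rankby="tanimoto", prefix="rocs", oformat="sdf"):
    """Return the rocs argument list, spelling out the defaults explicitly."""
    if rankby not in RANKBY_CHOICES:
        raise ValueError("unknown -rankby value: " + rankby)
    return [
        "rocs",
        "-dbase", dbase,
        "-query", query,
        "-besthits", str(besthits),
        "-cutoff", str(cutoff),
        "-rankby", rankby,
        "-prefix", prefix,
        "-oformat", oformat,
    ]

cmd = build_rocs_command("vendor.oeb.gz", "6cox.mol2")
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd)` on a machine where ROCS is installed.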
10. Initial Orientation
- Default - Inertial Frame alignment
- Overlay center of mass
- Align largest moment of inertia, then second largest
- each 2-fold degeneracy yields 4 starting points
- Extra 2 or 4 axes for top symmetry
- Random
- -randomstarts N
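One reading of the four starting points is that principal axes are defined only up to sign, and exactly four sign choices keep the frame right-handed: the identity plus a 180-degree rotation about each axis. The sketch below enumerates them under that assumption. The coordinates are assumed to be already centered and expressed in the principal-axis frame, and the function names are illustrative.

```python
from itertools import product

def proper_axis_flips():
    """Sign patterns (sx, sy, sz) whose product is +1, i.e. proper rotations."""
    return [s for s in product((1, -1), repeat=3) if s[0] * s[1] * s[2] == 1]

def starting_poses(coords):
    """Apply each proper flip to centered, principal-frame coordinates."""
    return [
        [(sx * x, sy * y, sz * z) for (x, y, z) in coords]
        for (sx, sy, sz) in proper_axis_flips()
    ]

# Toy three-atom molecule, already in its inertial frame.
molecule = [(1.0, 0.5, 0.0), (-1.0, 0.5, 0.0), (0.0, -1.0, 0.2)]
for pose in starting_poses(molecule):
    print(pose)
```

Each pose would then serve as a separate starting point for overlap optimization, with the best result kept.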
11. More Simple Examples
- rocs -dbase spam.bin -query acetsali.sdf -prefix ACET -cutoff 0.5 -besthits 100 -outputquery
- rocs -dbase spam.bin -query aminopy.sdf -rankby tverskyd -cutoff 0.4 -besthits 10 -outputquery
12. Analyze Results
- ROCS outputs a structure file and a report file
- Structure file is SDF (by default)
- scores are stored in SD tags
- automatically loaded into VIDA's spreadsheet
- Report file is tab-delimited text
- load into any spreadsheet for analysis, or
- parse by splitting each line on tabs
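Parsing the tab-delimited report can be sketched with Python's standard `csv` module. The column names `Name` and `ShapeTanimoto` and the sample rows are hypothetical placeholders; the real report header should be checked before relying on them.

```python
import csv
import io

def read_report(handle):
    """Read a tab-delimited report into a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))

# Hypothetical report content; a real run would use open("<report file>").
sample = "Name\tShapeTanimoto\nmol_a\t0.82\nmol_b\t0.65\nmol_c\t0.71\n"
rows = read_report(io.StringIO(sample))

# Keep hits above the "similar shapes" threshold of 0.7.
hits = [row["Name"] for row in rows if float(row["ShapeTanimoto"]) > 0.7]
print(hits)
```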
13. Color Force Field
- Color atoms or groups of atoms with SMARTS to describe pharmacophores
- Report color score only
- Optimize overlay along color gradients
14. Color Definitions
TYPE donor
TYPE acceptor
TYPE cation
TYPE anion
TYPE rings
15. Color Definitions
- Define SMARTS pattern for each type
These definitions of donor and acceptor are the general definitions of Mills & Dean, JCAMD 10:607-622, 1996.
Donor: an electronegative atom with a proton (no S or C, see above).
PATTERN donor 7,8h,H
Acceptor: a lone pair on an electronegative atom (O or N; S was removed, see reference). Note, N in an amide or in an alkyl-aniline system is too conjugated to accept; however, anilinic NH2 is a potential acceptor.
PATTERN acceptor OD1(O-,,6,15,16)!(1,2,3)
PATTERN acceptor nH0,N,8!(nX3)((-, ,ee)-,,ee)! (NC)!(ND2,D3-a)!( 1,2,3)
PATTERN acceptor NX3((-,,ee)(-,,ee)-,,ee)!( 1,2,3)
16. Included Force-fields
- simple.cff: very simple color force field
- MillsDean.cff: more complete
- Mills-Dean definitions of donors and acceptors
- Mills & Dean, J. Comp.-Aid. Mol. Design, 10 (1996) 607-622
- rings, anions and cations
17. Color Force-field
- Define interactions between atom types
- Weight is strength of interaction, relative to shape gradients.
- Radius affects range of interaction
INTERACTION donor donor attractive gaussian weight=1.0 radius=1.0
INTERACTION acceptor acceptor attractive gaussian weight=1.0 radius=1.0
INTERACTION rings rings attractive gaussian weight=1.0 radius=1.0
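To inspect or sanity-check such interaction rules from a script, a small parser can be sketched. It assumes each rule has the shape `INTERACTION <type> <type> <sign> <function> weight<value> radius<value>`, with an optional `=` before each value. The function name and the dictionary keys are illustrative and not part of any ROCS API.

```python
import re

# Six fields per rule; "=" between key and value is treated as optional.
_INTERACTION = re.compile(
    r"INTERACTION\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+"
    r"weight=?([0-9.]+)\s+radius=?([0-9.]+)"
)

def parse_interactions(text):
    """Extract interaction rules from color force field text."""
    rules = []
    for match in _INTERACTION.finditer(text):
        type_a, type_b, sign, function, weight, radius = match.groups()
        rules.append({
            "types": (type_a, type_b),
            "sign": sign,
            "function": function,
            "weight": float(weight),
            "radius": float(radius),
        })
    return rules

text = (
    "INTERACTION donor donor attractive gaussian weight=1.0 radius=1.0\n"
    "INTERACTION acceptor acceptor attractive gaussian weight=1.0 radius=1.0\n"
    "INTERACTION rings rings attractive gaussian weight=1.0 radius=1.0\n"
)
for rule in parse_interactions(text):
    print(rule)
```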
18. Roadmap for ROCS 2.1
- General grid query
- New-and-improved hitlist
- Memory management for large runs
- Improved management for parallel jobs
- Other optimizations