Hugo O. Villar, Ph.D., MBA - PowerPoint PPT Presentation


Transcript and Presenter's Notes

Title: Hugo O. Villar, Ph.D., MBA


1
Hugo O. Villar, Ph.D., MBA & Mark R. Hansen, Ph.D.
Altoris, Inc., San Diego, CA
www.altoris.com www.chemapps.com
www.patentinformatics.com
2
Parameter Driven Data Management
(1,0,1,0,0,0,1,0,0,...) 3.25, 7.24, ... 3D
structural data
Toxicology, Affinity, Activity, Pharmacokinetics
3
Organizing Chemical Data
  • Different techniques
  • Classification of traditional data types
  • Molecular Properties (continuous, binary)
  • Clustering, Discriminant Analysis, etc.
  • Symbolic Data
  • Categorical
  • Ranges of properties

4
Grouping by Molecular Properties
  • Well defined scaffolds lead to well defined
    clusters
  • Neighboring structures are related

5
Same scaffold in multiple clusters
Cluster
Better defined scaffolds are found in fewer
clusters
6
Same scaffold in multiple clusters
Even less common scaffolds are in multiple
clusters
7
Not Systematic
Single cluster
Multiple clusters
8
Single cluster, multiple scaffolds
Fingerprint degeneracy, high density (ties), etc.
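Fingerprint degeneracy can be illustrated with a small sketch. The slides do not name a similarity measure, so the Tanimoto coefficient is assumed here, and the 9-bit fingerprints are hypothetical (the first repeats the visible bits of the vector on slide 2). Two different scaffolds that set the same bits are indistinguishable to any fingerprint-based clustering:

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two equal-length binary fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    # Two all-zero fingerprints are treated as identical.
    return both / either if either else 1.0


# Hypothetical fingerprints, for illustration only.
fp_x = (1, 0, 1, 0, 0, 0, 1, 0, 0)
fp_y = (1, 0, 1, 0, 0, 0, 1, 0, 0)  # a different scaffold that sets the same bits
fp_z = (1, 0, 0, 0, 1, 0, 1, 0, 0)

sim_xy = tanimoto(fp_x, fp_y)  # degenerate pair: similarity is exactly 1
sim_xz = tanimoto(fp_x, fp_z)  # shares 2 of 4 set bits
```

Because the degenerate pair has similarity 1, it always lands in the same cluster regardless of the clustering method.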
9
Substructure Enumeration
  • Alternative to molecular property based
    grouping
  • Possible for even large chemical databases
  • Mining the information is challenging
  • Large number of substructures
  • Multivariate statistics not useful
  • Tryptophan

675 Substructures
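To give a feel for why the number of substructures grows so quickly, here is a minimal sketch of exhaustive enumeration. It assumes a molecule is represented as a plain adjacency dictionary and counts connected sets of atoms; the function name and representation are illustrative, and the counting convention is not necessarily the one behind the 675 figure quoted for tryptophan.

```python
def connected_atom_sets(adjacency):
    """Return every connected set of atoms in a graph given as {atom: set(neighbors)}."""
    found = set()
    frontier = [frozenset([atom]) for atom in adjacency]
    while frontier:
        current = frontier.pop()
        if current in found:
            continue
        found.add(current)
        # Grow the substructure by one neighboring atom at a time.
        neighbors = set().union(*(adjacency[a] for a in current)) - current
        for n in neighbors:
            frontier.append(current | {n})
    return found


# Hypothetical toy graphs: a four-atom chain and a five-membered ring.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

n_chain = len(connected_atom_sets(chain))  # 4 + 3 + 2 + 1 = 10
n_ring = len(connected_atom_sets(ring))    # 5 * 4 + 1 = 21
```

Closing the ring with a single extra atom roughly doubles the count, which hints at how fused ring systems push the totals into the hundreds.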
10
Knowledge Based Substructure Enumeration
Tryptophan
(Bone & Villar, JCC, 1997)
  • Not all substructures are informative
  • Relationships add complexity but no information
  • Atom by Atom

11
Knowledge Based Chemical Browsing
  • Large number of substructures
  • Make computations challenging
  • Complicate browsing through data

  • Eliminate non informative scaffolds and relations
  • Omit trivial extensions
  • Optimize number of occurrences
  • Organize in parent child relationships
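The pruning and parent-child steps above can be sketched as follows. This is an illustration under strong assumptions, not the authors' implementation: each scaffold is reduced to a set of hypothetical fragment labels so that "is a substructure of" becomes a proper-subset test (a real system would need subgraph matching), and `min_count` is an assumed occurrence threshold.

```python
def build_tree(occurrence, min_count=2):
    """Link each kept scaffold to its immediate parents.

    occurrence maps a scaffold (frozenset of hypothetical fragment labels)
    to the number of molecules that contain it.
    """
    # Drop scaffolds that occur too rarely to be informative.
    kept = [s for s, n in occurrence.items() if n >= min_count]
    parents = {}
    for child in kept:
        ancestors = [s for s in kept if s < child]
        # An immediate parent has no other ancestor strictly between it and the child.
        parents[child] = [p for p in ancestors
                          if not any(p < q for q in ancestors)]
    return parents


# Hypothetical scaffolds and occurrence counts.
benzene = frozenset({"benzene"})
indole = frozenset({"benzene", "pyrrole"})
indole_ethyl = frozenset({"benzene", "pyrrole", "ethyl"})
rare = frozenset({"benzene", "pyrrole", "ethyl", "amine"})

occurrence = {benzene: 40, indole: 12, indole_ethyl: 5, rare: 1}
tree = build_tree(occurrence)
```

Here `tree[indole_ethyl]` points only at `indole`, not at `benzene`, so browsing moves one distinct extension at a time, and the scaffold seen in a single molecule is left out of the tree.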

12
Knowledge-Base Scaffold Tree
  • Organized as parent child
  • Nodes are distinct extensions
  • Trivial depends on use, flexibility is key
  • Correlate to molecules
  • Correlate to Bioactivity and Properties
13
Trees facilitate browsing
14
Scaffold expansion parameters
Library Size
  • Atom by Atom expansion grows explosively
  • Scaffold differentiation affects growth
  • Knowledge Based growth is controlled
15
Occurrence
  • Number of molecules with a given substructure (N)
  • Low occurrence substructures can add up significantly
  • Low occurrence substructures may contribute little information

[Figure: NCI Library. Axis and legend labels: Scaffolds, N, Mols, Diversity measure]
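The occurrence N of a scaffold can be computed with a simple counter. The sketch below assumes each molecule has already been reduced to a list of scaffold identifiers; the molecule names, scaffold names, and the threshold of 2 are all hypothetical.

```python
from collections import Counter


def scaffold_occurrence(molecule_scaffolds):
    """Count, for each scaffold, how many molecules contain it (N)."""
    counts = Counter()
    for scaffolds in molecule_scaffolds.values():
        counts.update(set(scaffolds))  # count each scaffold once per molecule
    return counts


# Hypothetical mini library: molecule id -> scaffolds found in it.
library = {
    "mol1": ["indole", "benzene"],
    "mol2": ["indole", "benzene", "pyrrole"],
    "mol3": ["benzene"],
    "mol4": ["pyridine"],
}

counts = scaffold_occurrence(library)
frequent = {s for s, n in counts.items() if n >= 2}
singletons = {s for s, n in counts.items() if n == 1}
```

In this toy library half of the distinct scaffolds are singletons, which mirrors the point that low occurrence substructures add up quickly while saying little about the collection as a whole.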
16
Scaffold Complexity
N3, Maybridge Chemicals
  • Limiting only to substructures with 2 and 3 rings can reduce tree complexity
  • Drug like molecules are already low complexity
17
Tree nodes and leaves
N10 , NCI library
Unique scaffolds can be repeated through the tree
in different nodes
18
Scaffolds Symbolic Analysis
  • Objects to numbers
  • Counts of objects are the simplest form
  • Odds ratios, Proportions (CI), etc.
  • Objects to Objects
  • Unions, Intersections, OR

19
Counts, Objects to numbers
Compute the odds that a scaffold is found in a
mutagenic compound
Kho et al., JMC, 2005
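One common way to turn scaffold counts into odds is a 2x2 contingency table with a Wald confidence interval on the log odds ratio. The sketch below is a generic statistical illustration, not the procedure used in the cited paper, and the counts are hypothetical.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: mutagenic with scaffold      b: non-mutagenic with scaffold
    c: mutagenic without scaffold   d: non-mutagenic without scaffold
    """
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction avoids division by zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(ratio) - z * se)
    high = math.exp(math.log(ratio) + z * se)
    return ratio, low, high


# Hypothetical counts for one scaffold in a 300-compound data set.
ratio, low, high = odds_ratio(a=30, b=10, c=70, d=190)
```

If the lower bound of the interval stays above 1, the scaffold is enriched among mutagenic compounds at the chosen confidence level (z = 1.96 corresponds to roughly 95%).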
20
Objects to Objects: Venn Diagrams
  • Given two libraries, identify scaffolds unique to one
  • Create a tree for each library
  • Identify the difference in scaffolds
Bioblocks, 350 mols
Maybridge, >50,000 mols
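Once each library has been reduced to a set of scaffold identifiers, the Venn diagram regions are plain set operations. The sketch assumes the identifiers are canonical strings, so that the same scaffold compares equal across libraries; the scaffold names and the function name are hypothetical.

```python
def compare_libraries(scaffolds_a, scaffolds_b):
    """Split two scaffold sets into the three regions of a Venn diagram."""
    return {
        "only_a": scaffolds_a - scaffolds_b,
        "shared": scaffolds_a & scaffolds_b,
        "only_b": scaffolds_b - scaffolds_a,
    }


# Hypothetical scaffold sets for a small and a large library.
small_library = {"indole", "benzimidazole", "quinoline"}
large_library = {"indole", "quinoline", "pyridine", "piperidine"}

regions = compare_libraries(small_library, large_library)
```

The `only_a` region lists scaffolds that the small library would add to the large one, which is the comparison of interest when evaluating a building-block set against an existing collection.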
21
Objects to Objects: Pharmacophores to Scaffolds
22
Identify Bioisosteric Replacements
23
Summary
  • Substructure enumeration
  • Alternative for chemoinformatics work
  • Large datasets can be handled efficiently
  • Large datasets can be organized and viewed
  • Correlation with properties
  • Object to object
  • Object to numerical or categorical
  • More research is needed.
24
Additional Information
www.altoris.com
www.patentinformatics.com
www.chemapps.com
Contact information: hugo_at_altoris.com
PatentInformatics
PatBLAST