1
Use of Chemical Information in Organic Synthesis
Reaction Information for the Practicing Synthetic
Chemist: The Search for Relevant Answers
Guenter Grethe, May 2006
2
Information Needs of Synthetic Organic Chemists
in Basic Research and Development
  • new preparation of intermediates and starting
    materials
  • well established, high yield preparations
    (experimental procedures)
  • new synthetic methodologies (new reagents,
    catalysts etc.)
  • information on starting materials (availability,
    price, physical data etc.)
  • physical properties of reagents, solvents and
    catalysts
  • access to the primary, secondary, and tertiary
    literature
  • spectral information of related compounds

General searching for information on molecules
precedes retrieval of synthetic methodology data.
3
Differences in Molecule vs. Reaction Searching
  • Query: Is this particular molecule or similar
    ones known? Specific data?
  • Answer: Yes or No from existing databases,
    including patents
  • Query: How to selectively reduce the nitrile
    group (transformation)?
  • Answer: Pointers to relevant examples in the
    literature
  • Criteria:
      • Efficient transformation
      • Functional group compatibility
      • Reaction conditions

[Figure labels: Molecules; Reaction Conditions?; Reactions]
4
Available Reaction Databases
  • online
  • CASREACT (CAS) (ca. 10.5 million, including Spresi
    database, 1985 - present)
  • Spresi (InfoChem) (ca. 4.5 million, 1974 - 2004)
  • CrossFireplusReactions (Elsevier MDL, STN) (ca.
    10 million, 1779 - present)
  • ChemInform RX on STN (FIZ Chemie) (ca. 0.8 million)
  • CCR (Thomson Scientific) (ca. 0.6 million)
  • inhouse
  • ChemInform Reaction Library (Elsevier MDL)
  • Spresi (InfoChem)
  • CrossFire Beilstein (Elsevier MDL)
  • Specialty Databases (several vendors)
  • Proprietary Databases

For a good review see Zass, E. "Reaction
Databases". In Encyclopedia of Computational
Chemistry; Schleyer, P. von R.; Allinger, N.L.;
Clark, T.; Gasteiger, J.; Kollman, P.A.;
Schaefer, H.F.; Schreiner, P.R., Eds.; Wiley:
Chichester, 1998; Vol. 4, pp. 2402-2420
(call number QD39.3.E46 E53 1998).
5
Use of Available Information in Synthesis
  • Preparation of a distinct compound requires:
      • access to information about new synthetic
        methodologies in journals and databases
      • experimental details for the preparation of known
        intermediates and starting materials from
        databases, journals and other sources
      • tools to plan syntheses and select optimal
        reaction conditions
  • Preparation of a library of diverse compounds
    requires:
      • all of the above
      • knowledge about the characteristics of functional
        groups
      • information about available building blocks
  • Process development requirements are defined by:
      • access to information about various reaction
        conditions of a reaction
      • knowledge about the characteristics of molecules
        or their fragments under the required reaction
        conditions
      • tools to calculate the behavior of reagents,
        solvents, and catalysts

6
Barriers Impeding the Use of Available
Information by Endusers
  • multiple access systems
  • different user interfaces
  • different modi operandi
  • difficult query formulation
  • substructure concept
  • keyword inconsistencies
  • limited post-search management of large hitlists
  • some integrated access to other information
    sources

Most importantly: failure of available systems to
recognize and to facilitate the integration of
the vast knowledge of synthetic chemists
7
Search Modes
  • Structure-Based Searches
      • Full structure
          • Only for reactions with known molecules (not very
            useful)
      • Reaction substructure (RSS)
          • Most frequently used mode (difficult for
            end-users to formulate an effective query)
      • Reaction similarity
          • Various methodologies using different parameters
            (results often vary greatly, good for browsing
            and idea generation)
      • Reaction classification
          • Several methodologies, mostly based on structural
            information about reaction centers and immediate
            environment (good indexing tool, improvement
            over reaction similarity)
      • Reagents, Solvents
          • Full structure and substructure searches for
            molecules (not available in all databases, used
            mostly in conjunction with other structural
            searches)
  • Data-Based Searches
      • Keywords
          • intellectually derived terms for name reactions,
            reaction types etc. (incomplete, not very useful)
      • Journal, author, title, yields, etc.
          • Text or numeric data searches (mostly used in
            conjunction with structural searches)

8
Problems with Reaction Searching
Synthetic Problem
Full Structure Search
No hits
Reaction Substructure Search (colored fragment)
119 hits
Class Code Search
672 hits (broad, reaction center only)
2972 hits
Keyword Search: Michael Addition
Results were obtained from Elsevier MDL's
combined reaction databases (ca. 1 million
reactions), 2006
9
Problems with Substructure Searching
Database size: ca. 1 million reactions
Narrowly defined query: 0 hits
Problems:
  • how to avoid excessively large hitlists
  • how to formulate reasonable search queries
Solutions:
  • combination of several queries (expert approach)
  • indexing of reactions (focusing on relevant
    reactions)
  • facilitating query building (non-expert
    approach, intuitive)

10
Goal for an Efficient Reaction Data Management
System
Create an environment that allows for combining
the intelligence and creativity of synthetic
chemists with the processing and simulating power
of computers and the wealth of information in
databases to meet the challenges in the
laboratory for developing efficient syntheses.
11
Requirements to Facilitate Enduser Searching
  • User interfaces based on users' tasks and
    capabilities
      • (e.g. CrossFire Web, DiscoveryGate, Reaction
        Browser, SciFinder)
      • (see "A Framework for the Evaluation of Chemical
        Structure Databases", Cooke, F.; Schofield, H.
        J. Chem. Inf. Comput. Sci. 2001, 41, 1131-1140)
  • Hierarchical thesauri for keywords and reaction
    types
  • Effective indexing of databases (e.g.
    classification)
  • Simplification of the querying process
    (natural, not rule dependent)
  • Efficient post-search management tools
    (e.g. clustering)
  • Seamless integration of various information
    sources
      • (web environment, point-and-click)
  • Most importantly: available tools must
    simulate the chemists' problem-solving process

12
Databases in DiscoveryGate
13
Reaction Classification as Indexing Tool
Do We Still Need a Classification of Organic
Reactions?
  • Reasons
  • alternate method for indexing databases -
    complement to structure-based retrieval systems
  • access to generic types of information in
    retrieval systems
  • post-search management of large hitlists
  • simplification of query generation
  • linking of reaction information from different
    sources
  • source for deriving knowledge bases for reaction
    prediction and synthesis design
  • automatic procedures for analyses and
    correlations, e.g. quality control and overlap
    studies

14
Reaction Classification as Indexing Tool
  • Examples of some recent work
  • Horace: An Automatic System for the Hierarchical
    Classification of Chemical Reactions.
    Rose, J.R.; Gasteiger, J. J. Chem. Inf. Comput.
    Sci. 1994, 34, 74
  • COGNOS: A Beilstein-Type System for Organizing
    Organic Reactions.
    Hendrickson, J.B.; Sander, T. J. Chem. Inf.
    Comput. Sci. 1995, 35, 251
  • Knowledge Discovery in Reaction Databases:
    Landscaping Organic Reactions by a
    Self-Organizing Neural Network.
    Chen, L.; Gasteiger, J. J. Am. Chem. Soc. 1997,
    119, 4033
  • Classification of Organic Reactions: Similarity
    of Reactions Based on Changes in the Electronic
    Features of Oxygen Atoms at the Reaction Sites.
    Satoh, H.; Sacher, O.; Nakata, T.; Chen, L.;
    Gasteiger, J.; Funatsu, K. J. Chem. Inf. Comput.
    Sci. 1998, 38, 210
  • Topology-Based Reaction Classification: An
    Important Tool for the Efficient Management of
    Reaction Information.
    Kraut, H.; Löw, P.; Matuszczyk, H.; Saller, H.;
    Grethe, G. Proceed. 5th Internat. Conf. Chem.
    Struct., Noordwijkerhout, The Netherlands 1999,
    26
  • Analysis of Reaction Information.
    Grethe, G. In Handbook of Chemoinformatics;
    Gasteiger, J. (Ed.); Wiley-VCH: Weinheim, 2003;
    Volume 4, 1407-1427

15
Reaction Indexing through Classification
Based on:
  • Keywords: Michael addition, Michael reaction,
    ring closure
  • Molecule type: N-heterocycle, isoquinoline,
    quinolizidine ...
  • Reaction type: reaction centers
16
Reaction Classification - Background
  • Classify v.2.5, developed by InfoChem, Munich
  • Based on InfoChem's reaction center perception
    algorithm
  • A bond is defined as a reaction center if it is
    made or broken
  • An atom is defined as a reaction center if it
    changes
      • number of implicit hydrogens
      • number of valencies
      • number of π-electrons
      • atomic charge
      • or the connecting bond is a reaction center

Rules and Definitions
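The bond and atom rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not InfoChem's algorithm: atoms are assumed to carry atom-map numbers, bonds are unordered pairs of those numbers, and the names `AtomState`, `center_bonds`, and `center_atoms` as well as the property values in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomState:
    """Properties the slide lists as relevant for reaction-center atoms."""
    implicit_h: int
    valencies: int
    pi_electrons: int
    charge: int


def center_bonds(reactant_bonds, product_bonds):
    """Bonds made or broken: those present on only one side of the reaction."""
    return set(reactant_bonds) ^ set(product_bonds)


def center_atoms(before, after, bonds):
    """Atoms whose state changes, plus atoms of any reaction-center bond."""
    changed = {a for a in before.keys() | after.keys()
               if before.get(a) != after.get(a)}
    on_center_bond = {a for bond in bonds for a in bond}
    return changed | on_center_bond


# Hypothetical mapped example: acyl chloride (C1, O2, Cl3) + amine N4 -> amide.
# The numeric property values are illustrative only.
reactant_bonds = {frozenset({1, 2}), frozenset({1, 3})}
product_bonds = {frozenset({1, 2}), frozenset({1, 4})}
before = {1: AtomState(0, 4, 2, 0), 2: AtomState(0, 2, 2, 0),
          3: AtomState(0, 1, 0, 0), 4: AtomState(2, 3, 0, 0)}
after = {1: AtomState(0, 4, 2, 0), 2: AtomState(0, 2, 2, 0),
         3: AtomState(0, 0, 0, -1), 4: AtomState(1, 3, 0, 0)}

bonds = center_bonds(reactant_bonds, product_bonds)
atoms = center_atoms(before, after, bonds)
```

In this example the C1-Cl3 and C1-N4 bonds are reaction centers, and so are atoms 1, 3, and 4; the carbonyl oxygen (atom 2) is not, because neither its properties nor its bond changes.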
17
Reaction Classification - Background
Rules and Definitions
  • Hashcodes are calculated for all reaction centers
    taking into account atom properties:
      • atom type
      • valence state
      • total number of bonded hydrogens (implicit plus
        explicitly drawn)
      • number of π-electrons
      • aromaticity
      • formal charges
      • reaction center information
  • The sum of all reaction center hashcodes of all
    reactants and one product of a reaction provides
    the unique reaction classification code
    (ClassCode)
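A rough sketch of the idea of summing per-atom hashcodes into one classification code follows. The slides do not give InfoChem's hash function, so this sketch substitutes a truncated SHA-1 digest as a stand-in; the function names, the 32-bit width, and the layout of the property tuple are all assumptions made for illustration.

```python
import hashlib


def atom_hashcode(props):
    """Map a tuple of atom properties to a 32-bit integer (stand-in hash)."""
    digest = hashlib.sha1(repr(props).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def class_code(reactant_centers, product_centers):
    """Sum the hashcodes of all reaction centers, wrapped to 32 bits."""
    total = sum(atom_hashcode(p) for p in reactant_centers)
    total += sum(atom_hashcode(p) for p in product_centers)
    return total % 2**32


# Property tuple layout (hypothetical): atom type, valence state, total H,
# pi-electrons, aromatic?, formal charge, is-reaction-center flag.
reactants = [("C", 4, 0, 2, False, 0, True), ("N", 3, 2, 0, False, 0, True)]
product = [("C", 4, 0, 2, False, 0, True), ("N", 3, 1, 0, False, 0, True)]

code = class_code(reactants, product)
same = class_code(list(reversed(reactants)), list(reversed(product)))
```

Because the code is a sum, it does not depend on the order in which atoms are listed, so `code` and `same` are equal. That property is what lets two differently drawn records of the same transformation receive the same code.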

18
Reaction Classification - Background
Rules and Definitions
  • Inclusion of atoms in the immediate environment
    (spheres)
  • reaction centers only (0-sphere, BROAD)
  • reaction centers + α-atoms (1-sphere, MEDIUM)
  • reaction centers + α- and β-atoms (2-sphere, NARROW)
  • inclusion of one sp3-atoms during sphere
    expansion
  • Atom equivalency
  • atoms in the same group of the periodic table,
    with the exception of row-2 elements, are
    considered equivalent
  • Multiple occurrences of identical transformations
    are handled as one
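Sphere expansion amounts to a breadth-first walk outward from the reaction centers. The sketch below assumes the molecule is given as a plain adjacency dictionary keyed by atom number; it covers only the 0-, 1-, and 2-sphere selection and deliberately leaves out the sp3 rule and the atom-equivalency rule from this slide. The function name `expand_spheres` is hypothetical.

```python
def expand_spheres(adjacency, centers, depth):
    """Return the reaction centers plus all atoms within `depth` bonds."""
    selected = set(centers)
    frontier = set(centers)
    for _ in range(depth):
        # Neighbors of the current frontier that were not selected yet.
        frontier = {nbr for atom in frontier
                    for nbr in adjacency.get(atom, ())} - selected
        selected |= frontier
    return selected


# Hypothetical linear chain 1-2-3-4-5 with atom 3 as the only reaction center.
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

broad = expand_spheres(chain, {3}, 0)
medium = expand_spheres(chain, {3}, 1)
narrow = expand_spheres(chain, {3}, 2)
```

Here `broad` contains only atom 3, `medium` adds atoms 2 and 4, and `narrow` covers the whole chain. The more atoms are included, the more specific the resulting classification code becomes.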

19
Reaction Classification - Background
Rules and Definitions
20
Reaction Classification - Clustering of Search
Results
  • Classification codes are data
  • stored in the database
  • usable for sorting (clustering)

RSS search query (in red)
Result: 156 hits
Clustered by classification code (MEDIUM): 72 clusters
Cluster 1 (20 rxns)
Cluster 2 (15 rxns)
Cluster 3 (13 rxns)
Cluster 4 (8 rxns)
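Since classification codes are stored as ordinary data, clustering a hitlist reduces to grouping by code and sorting the groups by size. A minimal sketch, assuming each hit is a pair of a reaction identifier and its stored code (the identifiers and codes below are made up):

```python
from collections import defaultdict


def cluster_by_class_code(hits):
    """Group (reaction_id, class_code) pairs; largest cluster first."""
    clusters = defaultdict(list)
    for reaction_id, code in hits:
        clusters[code].append(reaction_id)
    return sorted(clusters.items(),
                  key=lambda item: len(item[1]), reverse=True)


# Hypothetical hitlist from a reaction substructure search.
hits = [("r1", 10), ("r2", 20), ("r3", 10),
        ("r4", 30), ("r5", 10), ("r6", 20)]

clusters = cluster_by_class_code(hits)
```

With this input the result is three clusters of sizes 3, 2, and 1, which mirrors how the 156 hits on the slide collapse into 72 clusters led by the most populated ones.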
21
Classification by Reaction Names
  • Chemists are familiar with Name Reactions
    (Diels-Alder, Michael etc.)
  • Papers in one issue of JOC (22, 2004) mentioned
    20 name reactions, known and lesser known, some
    multiple times
  • e.g., Mitsunobu reaction, Nazarov reaction, Wolff
    rearrangement etc.
  • Several books dealing exclusively with Name
    Reactions (ca. 700 reactions)
  • Use of Name Reactions facilitates reaction
    retrieval
  • Complementary to other searches
  • Used in combination with other data
  • Easier alternative to formulating complex RSS
    queries
  • Excellent browsing tool
  • Overview of scope and limitations of a given
    reaction, e.g. Aldol reaction
  • Combining different reaction types leading to
    same compound class
  • Hantzsch pyridine synthesis from dihydropyridines
    or β-keto esters
  • Fischer Indole synthesis from hydrazines or
    hydrazones
  • Darzens reaction of epoxides from esters, amides,
    sulfones, or nitriles

References
  • Named Organic Reactions, Laue, T. and Plagens,
    A., Eds., John Wiley & Sons, 1st Edition 1999,
    2nd Edition 2005
  • Organic Syntheses Based on Name Reactions,
    Hassner, A. and Stumer, C., Eds., Elsevier
    Science, 1st Edition 1994, 2nd Edition 2002
  • Name Reactions, Li, J. J., Ed., Springer, 2002
  • Strategic Applications of Named Reactions,
    Kürti, L. and Czakó, B., Eds., Elsevier, 2005
  • Name Reactions and Reagents in Organic
    Synthesis, Mundy, B.P., Ellerd, M.G. and
    Favaloro, F.G., Jr., Wiley Interscience, 2005
Note: The work on classification by reaction
names is being developed at InfoChem (Munich) in
consultation with G. Grethe
22
Classification by Reaction Names - Requirements
  • Established electronically, not intellectually
  • NOW: Intellectually derived
      • Inclusion of intellectually derived keywords
        varies greatly from database to database and
        depends on abstractors; the keywords are either
        too inclusive or not comprehensive
      • Example: Michael addition, 184 hits (keywords)
        vs. 89 hits (RSS search) vs. 52 hits (reaction
        name keywords)
  • FUTURE: Electronically derived
      • Assignments based on single or multiple RSS
        searches
      • Boolean logic is applied to combine and/or
        subtract search results (queries)
      • Assignments are pre-processed and added as data
        to database(s)
  • Name reactions are aligned in hierarchical order
  • Based on main reaction categories (addition,
    substitution, rearrangements, eliminations,
    oxidations, reductions)
  • Reactions can be listed in multiple categories,
    e.g.
  • Baeyer-Villiger oxidation in Oxidation and
    Rearrangement
  • Hierarchy must be able to accommodate non-name
    reactions (future project)
  • Reactions containing n reactions (e.g., tandem
    reactions) are listed in n categories
  • Individual name reactions have to be recognizable
  • Otherwise, stored under Miscellaneous
  • Queries and corresponding names are stored in
    spreadsheet
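One way to picture the hierarchy requirement, including reactions that sit in more than one category, is a mapping from each name reaction to a list of category paths. The sketch below is an assumed data layout for illustration, not the structure used at InfoChem; `ASSIGNMENTS` and `reactions_in` are hypothetical names, and the three entries follow the categories named in these slides.

```python
# Each name reaction maps to one or more category paths (main category first).
ASSIGNMENTS = {
    "Baeyer-Villiger oxidation": [("Oxidations",), ("Rearrangements",)],
    "Michael reaction": [("Addition", "1,4-Addition")],
    "Diels-Alder reaction": [("Addition", "Cycloaddition")],
}


def reactions_in(category_prefix, assignments):
    """List name reactions filed under a category path or any sub-path."""
    prefix = tuple(category_prefix)
    return sorted(
        name for name, paths in assignments.items()
        if any(path[:len(prefix)] == prefix for path in paths)
    )


additions = reactions_in(("Addition",), ASSIGNMENTS)
rearrangements = reactions_in(("Rearrangements",), ASSIGNMENTS)
oxidations = reactions_in(("Oxidations",), ASSIGNMENTS)
```

The Baeyer-Villiger oxidation appears in both `rearrangements` and `oxidations`, while browsing the top-level "Addition" category returns both the Michael and the Diels-Alder reaction.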

23
Classification by Reaction Names - Hierarchy
Levels: Main category > First level > Second level > Third level
  • Addition > 1,2-Addition > Darzens condensation > Sulfones
  • Addition > 1,4-Addition > Michael reaction > Intermolecular
  • Addition > Cycloaddition > Diels-Alder reaction > [4+2] Cycloadditions
  • Substitution > Aromatic electrophilic > Friedel-Crafts acylation > Intramolecular
  • Substitution > Aliphatic nucleophilic > Schotten-Baumann reaction
  • Substitution > Free radical > Gomberg-Bachmann reaction > Intermolecular
  • Rearrangements > Nucleophilic > Hofmann rearrangement > Alkyl
  • Rearrangements > Sigmatropic > [3,3] Sigmatropic rearrangement > Claisen rearrangement
  • Rearrangements > Radical
  • Elimination > Cope reaction
  • Elimination > Chugaev reaction
  • Reductions > Cannizzaro reaction > Intermolecular
  • Oxidations > Baeyer-Villiger oxidation > Lactones
  • Heterocyclic Synthesis > Hantzsch pyridine synthesis > Modified
  • Miscellaneous > Alper reaction > Cyclocarbonylation
24
Classification by Reaction Names - Keyword
Generation
Example: Intermolecular Mannich reaction with
CH-acidic compounds
Procedure:
  • generate query for general search
  • check hitlist for non-relevant hits
  • formulate queries to eliminate negatives
  • combine queries using Boolean operators
Queries:
  • Query Q1: Mannich reaction
  • Query Q2: Biginelli reaction (elimination of
    negative hits)
  • Query Q3: Aza Diels-Alder reaction (elimination
    of negative hits)
Query set for intermolecular Mannich reaction
with CH-acidic compounds: Q1 NOT (Q2 OR Q3)
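The Boolean step of this procedure, keeping the hits of the general query and subtracting the hits of the queries that capture false positives, maps directly onto set operations. A minimal sketch, assuming each query result is available as a set of reaction identifiers (the identifiers below are invented, and `combine_queries` is a hypothetical helper):

```python
def combine_queries(general_hits, *negative_hits):
    """Keep hits of the general query that no negative query matched."""
    excluded = set().union(*negative_hits)
    return set(general_hits) - excluded


# Hypothetical hit sets of reaction IDs for the three queries.
q1_mannich = {1, 2, 3, 4, 5, 6}
q2_biginelli = {2, 3}
q3_aza_diels_alder = {3, 5, 9}

mannich_only = combine_queries(q1_mannich, q2_biginelli, q3_aza_diels_alder)
```

Reactions 2, 3, and 5 are removed because a negative query also matched them, leaving reactions 1, 4, and 6 to be tagged with the name-reaction keyword. Reaction 9 never matched the general query, so it has no effect on the result.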
25
Classification by Reaction Names
Example of query menu (partial view) from
InfoChem's SpresiWeb
26
"The design of organic syntheses by chemists
without the help of computers proceeds in
anything but a systematic stepwise manner from
the target molecule to available starting
materials. A systematic stepwise approach is
more the exception than the rule. The human
mind solves problems by lateral thinking,
jumping from one idea to the next, from one
question to a different one, from retrosynthetic
thinking to considering the course and outcome of
a reaction, etc." Gasteiger, J.; Ihlenfeldt,
W.D.; Roese, P. Recl. Trav. Chim. Pays-Bas 1992,
111, 270.
The paradigm in an ideal electronic world
[Diagram labels: Journals; Major Reference Works;
Books; Databases; E-Labjournal; Knowledge,
Intuition, and Experience of Synthetic Chemist]
27
Integrated Major Reference Works (iMRW)
[Diagram, present status. Labels: Reaction
Databases, DiscoveryGate (Elsevier MDL, Third
Party, Proprietary etc.); ClassCodes;
LinkFinderPlus (citations); Tertiary Sources;
Primary Journals; Major Reference Works (MRWs);
iMRW links; Future links]
28
Integrated Major Reference Works - Concept
  • Simulating the chemist's approach of gathering
    information from various sources (lateral
    approach) for solving synthetic problems through
    a simple point-and-click mechanism
  • Assisting chemists with the synthesis of new
    compounds by providing complementary information
  • With examples for synthetic methodologies from
    reaction databases
  • From summaries, critically evaluated by experts,
    describing
  • reaction mechanisms
  • principles of stereo-controlled reactions
  • applications, preparations, and properties of
    reagents
  • and other information generally not found in
    reaction databases
  • Through one-click linking to the primary
    literature when combined with LinkFinderPlus

29
Integrated Major Reference Works - Summary
iMRW
  • is a unique collaboration between Elsevier MDL,
    InfoChem and leading scientific publishers
    (Elsevier Science, Georg Thieme Verlag, and
    Springer-Verlag)
  • provides one-click, bi-directional linking based
    on reaction type between synthetic methodology
    databases and electronic versions of major
    reference works (MRWs) or between individual
    MRWs, i.e., a true integration of information
  • allows text and (sub)structure searching over
    multiple major reference works from a single
    user interface

30
Major Reference Works in iMRW
  • Detailed information about methodologies based
    on reaction type
  • Information about scope and limitations of
    reactions
  • Evaluated experimental procedures
  • Information about reaction mechanism,
    stereo-control, effect of substituents and
    ligands, and other factors influencing a
    reaction
  • Information about reagents and catalysts, their
    preparation and properties
  • Updates for each of them are planned or under
    consideration by the publishers and will be added
    when available

31
Comprehensive Asymmetric Catalysis (CAC) - Summary
Editors: Eric N. Jacobsen, Andreas Pfaltz,
Hisashi Yamamoto (1999)
CAC is an innovative reference work that reviews
in three volumes catalytic methods for asymmetric
organic synthesis, a major challenge in synthetic
chemistry today. Illustrated by over 6,000
reactions critically evaluated by 60 leading
experts in the field, the basic principles,
mechanisms, basis for stereoinduction, and scope
and limitations of asymmetric reactions are
covered in-depth.

32
Comprehensive Organic Functional Group
Transformations (COFGT) - Summary
Editors-in-Chief: Alan R. Katritzky, Otto
Meth-Cohn, Charles W. Rees (1995)
COFGT covers in 40,000 reactions and seven
volumes the vast subject of organic synthesis in
terms of the introduction and interconversion of
functional groups. The editors have adopted a
rather rigorous, logical and formal treatment on
the basis of structure, which enables a detailed
analysis of all known, and indeed of some as yet
unknown, functional groups. Therefore, the
treatise deals rationally and comprehensively
with the method of their construction.
33
Science of Synthesis - Summary
Houben-Weyl Methods of Molecular Transformations
Editorial Board: D. Bellus, S. V. Ley, R.
Noyori, M. Regitz, P. J. Reider, E. Schaumann, I.
Shinkai, E. J. Thomas, B. M. Trost (2001)
  • Science of Synthesis is the authoritative and
    comprehensive reference work for the entire field
    of organic and organometallic synthesis. The
    series of 48 volumes will be published over a
    period of 8 years. It will present 15,000
    selected synthetic methods for all classes of
    compounds, illustrated by 150,000 reactions, and
    it includes:
  • Methods critically evaluated by leading
    scientists
  • Background information and detailed
    experimental procedures
  • Schemes and tables which illustrate the
    reaction scope

34
Collecting Information for the Synthesis of a new
Compound
Target molecule
Muray, E.; Rifé, J.; Branchadell, V.; Ortuno,
R.M. J. Org. Chem. 2002, 67, 4520-4525
(The paper describes the syntheses of
cyclopropyl nucleosides as potential antiviral
and antitumor agents)
35
Synthesis Plan
Retrosynthetic analysis: N1-alkylation of adenine
  • Step 1: general information about the alkylation
    reaction
  • Step 2: information about the preparation of A,
    including stereochemistry
  • Step 3: information about scope and limitations,
    effect of substituents, applicable reagents etc.
36
Reaction Substructure Data Search in
DiscoveryGate
37
38
39
Search for Similar Reactions in iMRW
40
Literature Linking
COFGT chapter
41
Text Search in iMRW
42
Information about Enantioselective
Cyclopropanation from CAC
43
Text Search Results from COFGT and Linking to
Literature
44
Integration of iMRW with Reaction Database
45
Conclusion
  • DiscoveryGate provides chemists, in a single
    system, with the relevant information from
    different sources required for solving synthetic
    problems, and lets the user work with it
    interactively.
  • Access is provided from an intuitive
    user interface by a simple point-and-click
    mechanism.
  • The system very closely simulates the lateral
    information-gathering process of synthetic
    chemists.