ChEBI - PowerPoint PPT Presentation

About This Presentation
Title:

ChEBI

Description:

thiamine(1+) chloride. INN: thiamine. CHEBI:49105 thiamine(2+) dichloride. aka thiamine chloride hydrochloride. aka thiamine hydrochloride ...

1
ChEBI
Kirill Degtyarenko, EMBL-EBI / EPO
2
The team
  • Rafael Alcántara
  • Michael Ashburner
  • Volker Ast
  • Michael Darsow
  • Paula de Matos
  • Marcus Ennis
  • Janna Hastings
  • Alan McNaught
  • Inma Spiteri
  • Christoph Steinbeck
  • Martin Zbinden

3
ChEBI: What is it?
Chemical Entities of Biological Interest: an
EBI database/dictionary of biochemical compounds
4
What are the biochemical compounds?
Can be defined as consisting of molecules not
directly encoded by the genome ... that are
either the products of nature or are synthetic
products used ... to intervene in the processes
of living organisms (Michael Ashburner)
5
Molecular entity
Any constitutionally or isotopically distinct
atom, molecule, ion, ion pair, radical, radical
ion, complex, conformer etc., identifiable as a
separately distinguishable entity (IUPAC Gold Book)
6
In fact, ChEBI contains
  • Molecular entities (e.g. trans-vaccenic acid)
  • Groups (e.g. trans-vaccenoyl group)
  • Classes (e.g. fatty acids)

7
Small molecules?
  • Yes, but big molecules as well!
  • alumina
  • amylose
  • metaborate
  • poly(vinyl alcohol)

8
Current status (17.12.08)
9
1-D ChEBI
  • Numeric ID
  • Carefully checked terminology
  • Unambiguous ChEBI name
  • IUPAC names
  • Cross-references to free resources

10
Unambiguous ChEBI name
  • CHEBI:28918
  • L-adrenaline
  • not just adrenaline

11
Systematic Name (IUPAC)
2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid
12
Common Name
  • flufenamic acid (INN English)
  • acide flufénamique (INN French)
  • ácido flufenámico (INN Spanish)
  • acidum flufenamicum (INN Latin)
  • Flufenaminsäure (German)

13
The Unpronounceables
CHEBI:48935 (E)-roxithromycin
IUPAC name: (3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-
4-[(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-
hexopyranosyl)oxy]-14-ethyl-7,12,13-trihydroxy-10-
{[(2-methoxyethoxy)methoxy]imino}-6-{[3,4,6-trideoxy-
3-(dimethylamino)-β-D-xylo-hexopyranosyl]oxy}-
3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one
14
What is the common name of roxithromycin?
15
Roxithromycin (2)
CHEBI:48844 roxithromycin
(E)-roxithromycin
(Z)-roxithromycin
16
What is thiamine?
17
Need for 2-D
  • Better to see the face than to hear the name
    (Zen proverb)
  • Structures and identifiers based on structures
    offer new ways of crosslinking to other databases
  • Structure search

18
Connection table
ChEBI
  9 10  0  0  0  0            999 V2000
   11.8219   -7.2713    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   11.8219   -8.0922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   12.6074   -7.0165    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   11.1072   -6.8574    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   12.6039   -8.3505    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   11.1072   -8.5027    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   13.0886   -7.6818    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   10.3923   -7.2713    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   10.3888   -8.0922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  1  3  1  0  0  0  0
  1  4  1  0  0  0  0
  2  5  1  0  0  0  0
  2  6  1  0  0  0  0
  3  7  1  0  0  0  0
  4  8  2  0  0  0  0
  6  9  2  0  0  0  0
  5  7  2  0  0  0  0
  8  9  1  0  0  0  0
M  END
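A minimal Python sketch of how such a connection table can be read, assuming the whitespace-separated V2000 layout shown on this slide (real molfiles are fixed-width, so splitting on whitespace is a simplification). The function name parse_ctab is hypothetical, and the header line is left out of the sample data.

```python
from collections import Counter

# Counts line, atom block and bond block from the slide.
CTAB = """
9 10 0 0 0 0 999 V2000
11.8219 -7.2713 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
11.8219 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6074 -7.0165 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
11.1072 -6.8574 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6039 -8.3505 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
11.1072 -8.5027 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
13.0886 -7.6818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
10.3923 -7.2713 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
10.3888 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0 0 0 0
1 3 1 0 0 0 0
1 4 1 0 0 0 0
2 5 1 0 0 0 0
2 6 1 0 0 0 0
3 7 1 0 0 0 0
4 8 2 0 0 0 0
6 9 2 0 0 0 0
5 7 2 0 0 0 0
8 9 1 0 0 0 0
M END
"""


def parse_ctab(text):
    """Return (atoms, bonds) from a whitespace-tokenised V2000 table."""
    tokens = text.split()
    n_atoms, n_bonds = int(tokens[0]), int(tokens[1])
    pos = tokens.index("V2000") + 1  # first token of the atom block
    atoms = []
    for _ in range(n_atoms):
        x, y, z, symbol = tokens[pos:pos + 4]
        atoms.append((symbol, float(x), float(y), float(z)))
        pos += 16  # x, y, z, symbol plus 12 property columns
    bonds = []
    for _ in range(n_bonds):
        first, second, order = (int(t) for t in tokens[pos:pos + 3])
        bonds.append((first, second, order))
        pos += 7  # two atom indices, bond order, four property columns
    return atoms, bonds


atoms, bonds = parse_ctab(CTAB)
heavy_atoms = Counter(atom[0] for atom in atoms)
double_bonds = [bond for bond in bonds if bond[2] == 2]
```

The table holds nine heavy atoms (five C, four N) and ten bonds, which matches the formula C5H4N4 in the InChI slide once the implicit hydrogens are added.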
19
2-D ChEBI
  • One or more 2-D (or 3-D) connection tables
  • One is default
  • Autogenerated images (PNG)
  • Default diagrams should be unambiguous

20
The Fine Art of chemical drawing
21
Linear forms of monosaccharides
22
Pyranose forms of monosaccharides
23
Fused systems
(R)-camphor
ambiguous
unambiguous
24
Square planar geometry
cisplatin
transplatin
25
From 2-D back to 1-D
  • SMILES
  • InChI

26
SMILES (1)
  • Simplified Molecular Input Line Entry
    Specification
  • Developed by David Weininger in 1988
  • Extended by others (e.g. Daylight)
  • String of standard ASCII characters
  • A number of valid SMILES can be produced for the
    same molecule

27
SMILES (2)
  • N1C=NC2=C1C=NC=N2
  • c1ncc2ncnc2n1
  • C1N\CN/C\2N/CN\C1/2
  • c1ncnc2/NC\Nc12
  • n1cc2c(nc1)ncn2
  • [H]c1nc([H])c2n([H])c([H])nc2n1
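
As a rough illustration of the point that several SMILES strings can describe one molecule, the Python sketch below counts heavy atoms in two of the aromatic forms listed above. It assumes only unbracketed organic-subset atoms and ignores bonds and ring closures, so equal counts are a necessary check, not proof of identity; a real comparison would canonicalise the strings with a cheminformatics toolkit. The helper name heavy_atom_counts is hypothetical.

```python
import re
from collections import Counter

# Unbracketed organic-subset atoms: two-letter symbols first, then
# one-letter aliphatic and lowercase aromatic symbols.
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def heavy_atom_counts(smiles):
    """Count heavy atoms per element, ignoring bonds and ring digits."""
    counts = Counter()
    for token in ATOM.findall(smiles):
        element = token if len(token) == 2 else token.upper()
        counts[element] += 1
    return dict(counts)


variants = ["c1ncc2ncnc2n1", "n1cc2c(nc1)ncn2"]
compositions = [heavy_atom_counts(s) for s in variants]
same_composition = all(c == compositions[0] for c in compositions)
```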

28
InChI (1)
  • IUPAC International Chemical Identifier or InChI
  • Open source
  • Developed by Stein, Heller, Tchekhovskoi and
    McNaught
  • Used by NIST, PubChem, CML and ChEBI

29
InChI (2)
InChI=1/C5H4N4/c1-4-5(8-2-6-1)9-3-7-4/h1-3H,(H,6,7,8,9)/f/h7H
InChIKey=KDCGOANMDULRCW-QDQILVOLCG
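An InChI string is a sequence of layers separated by "/". The Python sketch below splits the purine InChI above, written with its `InChI=` prefix, into version, formula and the remaining prefixed layers (c for connectivity, h for hydrogens, f for the fixed-hydrogen layer). The function names are hypothetical and the parsing is deliberately naive; it is not a validator.

```python
import re

INCHI = "InChI=1/C5H4N4/c1-4-5(8-2-6-1)9-3-7-4/h1-3H,(H,6,7,8,9)/f/h7H"


def split_inchi(inchi):
    """Return (version, formula, layers) for an 'InChI=...' string."""
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI" or not body:
        raise ValueError("not an InChI string")
    version, formula, *layers = body.split("/")
    return version, formula, layers


def formula_counts(formula):
    """Turn a simple formula such as 'C5H4N4' into element counts."""
    pairs = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    return {element: int(number or 1) for element, number in pairs}


version, formula, layers = split_inchi(INCHI)
composition = formula_counts(formula)
layer_prefixes = [layer[0] for layer in layers]
```

Here the layers after "/f" belong to the fixed-hydrogen part of the identifier, which pins the mobile hydrogen to atom 7.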
30
Limitations (1)
  • Stereochemistry other than sp3 tetrahedral and
    sp2 trigonal planar
  • Polymers
  • Conformers
  • Radicals/different spin state
  • Topological isomers
  • Mixtures
  • Markush structures

31
Limitations (2)
cisplatin
transplatin
InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2
32
3-D ChEBI
cisplatin
33
Uncertainty and ambiguity in chemistry
  • Compositional uncertainty
  • Positional uncertainty
  • Configurational uncertainty
  • Conformational uncertainty

34
Compositional uncertainty
  • Examples
  • an alkali metal cation
  • vanadate(V) anion
  • [2H]ethanol

35
Positional uncertainty
  • Examples
  • L-bromohistidine residue
  • pteroic acid (several tautomers)

36
Configurational uncertainty
  • Examples
  • androstane
  • rel-(2R,3R)-2-amino-3-methylpentanoic acid
  • tetradec-11-enoic acid

37
Conformational uncertainty
  • Examples
  • cyclohexane: chair, boat, twist
  • protein secondary structure ?, ?, ?

38
ChEBI ontology
  • Molecular structure ontology
  • Subatomic particle ontology
  • Role ontology: biological role and application

39
L-adrenaline
  • Molecular structure ontology: catecholamines
  • Biological role: hormone
  • Application: antiglaucoma, bronchodilator, cardiostimulant

40
The family relations
L-cystein-S-yl
L-cysteine()
L-cysteine zwitterion
cysteine
D-cysteine
L-cysteino
L-cysteine
L-cysteinium
L-cysteinyl
L-cysteinate(1-)
L-cysteine residue
L-cysteinate(2-)
L-cysteinate residue
41
Relationships in ChEBI
Symbol  Relationship               Scope
?       Is A                       generic
?       Has Part                   generic
?       Is Conjugate Acid Of       specific
?       Is Conjugate Base Of       specific
?       Is Enantiomer Of           specific
?       Is Tautomer Of             specific
R       Is Substituent Group From  specific
H       Has Parent Hydride         specific
F       Has Functional Parent      specific
?       Has Role                   generic?
42
Is A relationship
?
L-cysteine
cysteine
is a
43
Is Enantiomer Of
?
L-cysteine
D-cysteine
is enantiomer of
44
Has Part
has part
?
L-cysteinium
L-cysteine hydrochloride
is part of
45
Is Conjugate Acid Of
L-cysteinium
L-cysteinate(2-)
L-cysteine
L-cysteinate(1-)
is conjugate acid of
46
Is Conjugate Base Of
L-cysteinium
L-cysteinate(2-)
L-cysteine
L-cysteinate(1-)
47
Acid/base relationships
L-cysteinium
L-cysteinate(2-)
?
?
L-cysteine
L-cysteinate(1-)
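The acid/base relationships on the last three slides form a chain of protonation states. A small Python sketch, assuming the order runs from most to least protonated and writing charges as in ChEBI names such as L-cysteinate(1-), derives both relationship directions from that one ordered list. The function name conjugate_pairs is hypothetical.

```python
# Protonation states of L-cysteine, most protonated first.
STATES = [
    "L-cysteinium",
    "L-cysteine",
    "L-cysteinate(1-)",
    "L-cysteinate(2-)",
]


def conjugate_pairs(states):
    """Build (subject, relationship, object) triples for adjacent states."""
    neighbours = list(zip(states, states[1:]))
    acid_of = [(a, "is conjugate acid of", b) for a, b in neighbours]
    base_of = [(b, "is conjugate base of", a) for a, b in neighbours]
    return acid_of + base_of


triples = conjugate_pairs(STATES)
```

Each adjacent pair yields one "is conjugate acid of" and one "is conjugate base of" triple, so the two relationships are always stored as inverses of each other.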
48
Is Tautomer Of
L-cysteine
L-cysteine zwitterion
49
Is Tautomer Of
1H-pyrrole
3H-pyrrole
2H-pyrrole
50
Has Parent Hydride
is parent hydride of
H
salutaridinol
has parent hydride
morphinan
51
Has Functional Parent
is functional parent of
F
7-O-acetylsalutaridinol
has functional parent
salutaridinol
52
Is Substituent Group From
L-cysteine


L-cysteinyl
L-cysteino


L-cysteine residue
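The relationship slides above can be read as subject/relationship/object triples. The Python sketch below stores the examples from those slides and adds the reverse direction for each one, using the inverse names shown on the slides ("is part of", "is parent hydride of", "is functional parent of"). Treating enantiomer and tautomer links as symmetric is an assumption made for illustration, and all function and variable names are hypothetical.

```python
# Relationship names with a differently named inverse (from the slides).
INVERSE = {
    "has part": "is part of",
    "has parent hydride": "is parent hydride of",
    "has functional parent": "is functional parent of",
}
# Relationships assumed to read the same in both directions.
SYMMETRIC = {"is enantiomer of", "is tautomer of"}

TRIPLES = [
    ("L-cysteine", "is a", "cysteine"),
    ("L-cysteine", "is enantiomer of", "D-cysteine"),
    ("L-cysteine", "is tautomer of", "L-cysteine zwitterion"),
    ("L-cysteine hydrochloride", "has part", "L-cysteinium"),
    ("salutaridinol", "has parent hydride", "morphinan"),
    ("7-O-acetylsalutaridinol", "has functional parent", "salutaridinol"),
]


def with_inverses(triples):
    """Return the triples plus their symmetric or inverse counterparts."""
    graph = set(triples)
    for subject, relationship, obj in triples:
        if relationship in SYMMETRIC:
            graph.add((obj, relationship, subject))
        elif relationship in INVERSE:
            graph.add((obj, INVERSE[relationship], subject))
    return graph


def related(graph, subject):
    """List (relationship, object) pairs for one subject, sorted."""
    return sorted((rel, obj) for s, rel, obj in graph if s == subject)


graph = with_inverses(TRIPLES)
salutaridinol_links = related(graph, "salutaridinol")
```

"Is a" gets no reverse entry here because the slides do not name one.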
53
The family relations
L-cysteine()
L-cysteinium
L-cystein-S-yl
cysteine
L-cysteine zwitterion
L-cysteine
D-cysteine
L-cysteino
L-cysteinyl
L-cysteinate(1-)
L-cysteine residue
L-cysteinate(2-)
L-cysteinate residue
54
Ontology of L-cysteine
55
Ontology of L-cysteine (1)
56
Ontology of L-cysteine (2)
57
Thank you