Title: Overview of Biochemical Pathways
1Overview of Biochemical Pathways
Academia Sinica Taiwan International Graduate
Program
- Chun-nan Chi
- Ph.D. Student of TIGP
- Academia Sinica
- cnchi_at_iis.sinica.edu.tw
2Outlines
- Biological Backgrounds
- Study Fields in Biochemical Pathways
- Competitors
- Trends of Pathway Research
- Reference
- Q & A
3Biological Backgrounds
- What Are Biochemical Pathways?
- Types of Biochemical Pathways
- How Pathways work Together
- Why Pathway Studies Are Important
4What Are Biochemical Pathways?
- A map of the biological mechanism of an organism
Fig 1. Metabolic Pathways. ExPASy, http://www.expasy.org/cgi-bin/show_thumbnails.pl, founded by Roche Applied Science.
5Types of Biochemical Pathways
6How Pathways work Together
Signal Transduction
1 Glucose + 2 ATP
Metabolism
4 ATP + 2 Pyruvate + 2 NADH
Regulation
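The quantities on this slide read like the usual glycolysis bookkeeping: 1 glucose and 2 ATP go in, 4 ATP, 2 pyruvate and 2 NADH come out. That reading is an assumption, since the slide does not name the pathway. Under it, a minimal Python sketch of the net-yield arithmetic could look like this (all names are illustrative):

```python
from collections import Counter

# Quantities as listed on the slide, per molecule of glucose.
consumed = Counter({"Glucose": 1, "ATP": 2})
produced = Counter({"ATP": 4, "Pyruvate": 2, "NADH": 2})

def net_yield(produced, consumed):
    """Return produced minus consumed for every molecule that appears."""
    molecules = set(produced) | set(consumed)
    # A Counter returns 0 for missing keys, so no special cases are needed.
    return {m: produced[m] - consumed[m] for m in molecules}

net = net_yield(produced, consumed)
```

Here `net["ATP"]` is 2 (4 produced minus 2 invested) and `net["Glucose"]` is -1.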
7Why Pathway Studies Are Important
8Study Fields in Biochemical Pathways
- Overviews of Pathway Studies
- Literature Mining
- Pathway Diagrams
- Graphical Modelers
- Pathway Databases
- Pathway Representation
- Management Tools for Pathway
- Database Integration
- Standard for Exchanging Pathways
- Pathway Querying & Reasoning
- Simulation
9Overviews of Pathway Studies
10Literature Mining
ACTIVATOR activate ACTIVATEE
- RAF6 activates NF-kappaB.
- Lck is activated by autophosphorylation at Tyr 394.
- Anandamide induces vasodilation by activating vanilloid receptors.
- the activation of Rap1 by C3G
- the GTPase-activating protein rhoGAP
- the stress-activated group of MAP kinases
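The "ACTIVATOR activate ACTIVATEE" template above can be illustrated with a few regular expressions. This is only a sketch under simple assumptions: entity names are single tokens of letters, digits and hyphens, and only three surface forms are covered. The pattern list and the function name `extract_activation` are hypothetical, not taken from any of the cited systems.

```python
import re

# Each pattern names the two roles explicitly, whatever their surface order.
PATTERNS = [
    re.compile(r"(?P<activator>[\w-]+) activates (?P<activatee>[\w-]+)"),
    re.compile(r"(?P<activatee>[\w-]+) is activated by (?P<activator>[\w-]+)"),
    re.compile(r"activation of (?P<activatee>[\w-]+) by (?P<activator>[\w-]+)"),
]

def extract_activation(sentence):
    """Return (activator, activatee) for the first matching pattern, else None."""
    for pattern in PATTERNS:
        match = pattern.search(sentence)
        if match:
            return match.group("activator"), match.group("activatee")
    return None
```

With these patterns, "RAF6 activates NF-kappaB." yields `("RAF6", "NF-kappaB")`, while "the GTPase-activating protein rhoGAP" yields `None`. The harder examples on the slide show why fixed templates alone tend not to be enough.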
11Literature Mining
Text Mining
Alexander S. Yeh et al. (2003) Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup. Bioinformatics, Vol. 19 Suppl. 1 2003, pages i331-i339
12Literature Mining
A very good seminar you shouldn't miss!
Title: Literature mining for biomedical text
Speaker: Dr. Ting-Yi Sung
Date: 2004/11/08 (Monday), 10:30 AM - 12:00 PM
Location: Auditorium 106 at new IIS Building
13Pathway Diagrams
- Static Images of Pathways (Primary)
- BBID 2001 (Biological Biochemical Image Database)
  - A search engine that finds figures from many textbooks and journals.
- BPC 2001 (Biochemical Pathway Chart)
  - A poster-like, handmade static image that describes the metabolic pathways and cellular / molecular processes.
- BioCarta 2001
  - A database that stores pathway information as static images. It can be browsed by category and searched by keyword. Hosted by a commercial company.
- KEGG 2001 (Kyoto Encyclopedia of Genes and Genomes)
  - A very well-known database that focuses on storing metabolic pathway information. All proteins are categorized by function using the EC (Enzyme Commission) number system.
- BRITE 2001 (Biomolecular Relations in Information Transmission and Expression)
  - A binary relation search engine based on the KEGG database. It can search for binary interactions between genes, proteins, or other biological molecules.
14Pathway Diagrams
- Static Images of Pathways (Minor)
- SPAD 2001 (Signal PAthway Database)
  - A database for genetic information and signal transduction pathways. The site is unfinished and its construction appears to have stopped.
- KMIM 1999 (Kohn Molecular Interaction Map)
  - An interaction map for representing the network of multi-protein complexes, protein modifications, and enzymes for reactions.
15Pathway Diagrams
- Dynamic Images of Pathways
- BioCyc 2002
  - A database that contains metabolic pathway information for 17 species. It presents a pathway chart dynamically.
- STKE (Signal Transduction Knowledge Environment)
  - A site hosted by Science Magazine. It collects pathways related to signal transduction and is searchable by keywords.
16Pathway Diagrams
Ogata 1998: Computation with the KEGG pathway database
Ogata H, Goto S, Fujibuchi W, Kanehisa M. Computation with the KEGG pathway database. Biosystems. 1998 Jun-Jul;47(1-2):119-28.
http://igs-server.cnrs-mrs.fr/ogata/Paper/ogata98BioSys.html
Karp 2000: The EcoCyc and MetaCyc databases
Peter D. Karp, Monica Riley, Milton Saier, Ian T. Paulsen, Suzanne M. Paley, Alida Pellegrini-Toole. "The EcoCyc and MetaCyc databases." Nucleic Acids Research, Vol. 28, No. 1, pp. 56-59.
http://nar.oupjournals.org/cgi/reprint/28/1/56
17Pathway Diagrams
- Important Sites for Pathway Diagrams
Metabolic Pathways
Signal Transduction Pathways
18Graphical Modelers
- What is the Graphical Modeler?
19Graphical Modelers
CellDesigner
BioUML
JDesigner
PathwayBuilder
20Graphical Modelers
- Output Format for those Graphical Modelers
21Graphical Modelers
- Have we solved the problems of modeling pathways?
Problem 1
- No graphical modeler can represent different knowledge granularities of pathways.
- No graphical modeler can represent incomplete knowledge of pathways.
Solution 1
- Use Compound Graph (Fukuda 2001)
- Use recursive representation of pathways (Demir 2002)
Fig 2. TGF-β / Smad Signal Transduction Pathway. Fukuda et al. (2001) Knowledge Representation of Signal Transduction Pathways. Bioinformatics, Vol. 17 no. 9, 829-837
22Graphical Modelers
- Have we solved the problems of modeling pathways?
Problem 2
- No way to annotate a pathway with metadata such as:
  - Species
  - Tissue or cell type
  - Environmental conditions (pH, temperature, etc.)
  - Developmental stage of the organism
Fig 3. AKT Signaling Pathway. BioCarta, http://www.biocarta.com/pathfiles/h_aktPathway.asp
23Graphical Modelers
- Have we solved the problems of modeling pathways?
Problem 3
- Hard to represent the following architectures of pathways:
  - Pre-conditions and post-conditions (especially for signal transduction pathways)
  - Concurrency
Solution 3
- Petri Net could be a good solution
  - BioPNML (Bio Petri Net Markup Language)
  - Chen 2003: Quantitative Petri Net model
  - Matsuno 2000: Hybrid Petri Net
  - Hofestädt 1998: Quantitative Modeling
Fig 3. An Example of Petri Net. Hubert Becker (2001) Logistik - Ein Überblick (Logistics: An Overview). http://home.t-online.de/home/becker2/petrinet.gif
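To make the Petri net suggestion concrete, here is a minimal place/transition sketch in Python. It is not BioPNML or any of the cited models. The transitions `bind` and `phosphorylate` and all place names are hypothetical, chosen only to show how a pre-condition (enough tokens in every input place) gates a reaction.

```python
def enabled(marking, transition):
    """Pre-condition: every input place holds at least the required tokens."""
    return all(marking.get(place, 0) >= n
               for place, n in transition["inputs"].items())

def fire(marking, transition):
    """Return the post-condition marking; the input marking is not modified."""
    if not enabled(marking, transition):
        raise ValueError("pre-condition not met: transition is not enabled")
    new_marking = dict(marking)
    for place, n in transition["inputs"].items():
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking

# Hypothetical signalling fragment: binding must happen before phosphorylation.
bind = {"inputs": {"ligand": 1, "receptor": 1}, "outputs": {"complex": 1}}
phosphorylate = {"inputs": {"complex": 1, "substrate": 1},
                 "outputs": {"complex": 1, "substrate_P": 1}}

m0 = {"ligand": 1, "receptor": 1, "substrate": 2}
m1 = fire(m0, bind)            # the complex now exists
m2 = fire(m1, phosphorylate)   # the complex acts as a catalyst and is returned
```

In the initial marking `m0`, `phosphorylate` is not enabled because no complex exists yet. Concurrency appears whenever two transitions with disjoint input places are enabled in the same marking.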
24Graphical Modelers
- Have we solved the problems of modeling pathways?
Problem 4
- No tool has a good representation for modeling sequence information, including:
  - DNA sequence
  - RNA sequence
  - Protein sequence
Fig 4. DNA Double Strand Structure. Ian Stansfield (2001) Web resources for sequence analysis. http://www.abdn.ac.uk/mbi094/dna.gif
Have we solved the modeling problems?
Not Yet!!
25Graphical Modelers
- What do we need to solve the modeling problems?
26Pathway Databases
- Most famous metabolic pathway databases
27Pathway Databases
- Other metabolic pathway databases
28Pathway Databases
- Some signal transduction pathway databases
29Pathway Databases
- Some gene regulatory network databases
DBTBS
SPAD
30Pathway Representation
- Can we create a model to represent all the
situations of pathways?
?
31Pathway Representation
- What is a good representation model for pathways?
Pathway
- Attributes: List of reactions, Synonyms, Species
- Characteristics: Concurrence, Cyclic, Incomplete
Substrate
- Types: Reactants, Products, Catalysts, Inhibitors, Sequences
- Attributes: Synonyms, Location, Concentration, other Substrates
Reaction
- Types: Synthesis, Degrade, Transport, Denatured, Transform, Decomposition
- Attributes: List of reactants, products, modifiers; Conditions before / after the reaction; Kinetic Functions (Forward / Backward); Synonyms
Compartment
- Types: Nucleus, Cytosol, Inner membrane, cell, organ, body
- Attributes: Synonyms, Temperature, Size, Local Time
- Characteristics: Pass-thru (e.g. trans-membrane)
Event
Function
Timer
All have the Class-Instance concept
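One way to picture the entities listed on this slide is as a handful of Python dataclasses. This is a rough sketch only: the field names are taken loosely from the slide, Event, Function and Timer are left out, and the example instances at the bottom are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Compartment:
    name: str
    kind: str                      # e.g. "Nucleus", "Cytosol"
    synonyms: List[str] = field(default_factory=list)

@dataclass
class Substrate:
    name: str
    compartment: Optional[Compartment] = None   # the "Location" attribute
    concentration: Optional[float] = None
    synonyms: List[str] = field(default_factory=list)

@dataclass
class Reaction:
    name: str
    kind: str                      # e.g. "Synthesis", "Transport"
    reactants: List[Substrate] = field(default_factory=list)
    products: List[Substrate] = field(default_factory=list)
    modifiers: List[Substrate] = field(default_factory=list)

@dataclass
class Pathway:
    name: str
    species: str
    reactions: List[Reaction] = field(default_factory=list)
    incomplete: bool = False       # the "Incomplete" characteristic

    def substrate_names(self):
        """Names of all substrates taking part in any reaction of the pathway."""
        names = set()
        for reaction in self.reactions:
            for s in reaction.reactants + reaction.products + reaction.modifiers:
                names.add(s.name)
        return names

# Hypothetical instances: the classes play the "Class" role, these the "Instance" role.
cytosol = Compartment("cytosol", "Cytosol")
glucose = Substrate("Glucose", compartment=cytosol)
pyruvate = Substrate("Pyruvate", compartment=cytosol)
step = Reaction("lumped conversion", "Transform",
                reactants=[glucose], products=[pyruvate])
pathway = Pathway("example pathway", "Homo sapiens", reactions=[step])
```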
32Pathway Representation
- The important papers for pathway modeling
- Fukuda 2001
- Compound Graph
- Demir 2002
- Incomplete Pathways
33Pathway Representation
- How to implement the pathway model?
Relational Database is not enough! We need
stronger relations between entities.
Use Ontology instead!!
34Pathway Representation
- What is Ontology?
- A set of vocabularies and their relationships
Mary
Mary ate an apple
apple
eat
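The "Mary ate an apple" example can be written as a subject-predicate-object triple, which is one common way to store vocabulary terms and their relationships. The sketch below is illustrative only and does not use any ontology library; the helper `objects_of` is a hypothetical name.

```python
# Each statement is a (subject, predicate, object) triple.
triples = [
    ("Mary", "eat", "apple"),
]

def objects_of(subject, predicate, triples):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]
```

Asking `objects_of("Mary", "eat", triples)` returns `["apple"]`.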
35Pathway Representation
- The Compound Graph (Fukuda 2001)
- Compound Graph: CG = (G, T)
  - G: Interaction Graph
  - T: Decomposition Tree
- Interaction Graph: G = (V, E_G)
  - V: Vertices that represent concepts of knowledge
  - E_G: Edges in the Interaction Graph
- Decomposition Tree: T = (V, E_T, r)
  - V: Vertices that represent concepts of knowledge
  - E_T: Edges in the Decomposition Tree
  - r: The root node of the decomposition tree T
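The definition above can be sketched as a small Python class that keeps one shared vertex set, a set of interaction edges (E_G), and a child-to-parent map for the decomposition tree (E_T, rooted at r). This is an illustration of the data structure only, not Fukuda's implementation. The vertex names in the example are hypothetical.

```python
class CompoundGraph:
    """CG = (G, T): interaction edges E_G plus a decomposition tree (E_T, r)."""

    def __init__(self, root):
        self.root = root
        self.vertices = {root}
        self.parent = {}             # E_T stored as child -> parent
        self.interactions = set()    # E_G stored as (source, target) pairs

    def add_child(self, parent, child):
        """Add a tree edge: `child` is a finer-grained part of `parent`."""
        if parent not in self.vertices:
            raise KeyError(parent)
        self.vertices.add(child)
        self.parent[child] = parent

    def add_interaction(self, source, target):
        """Add an interaction edge between two existing vertices."""
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must already be vertices")
        self.interactions.add((source, target))

    def ancestors(self, vertex):
        """Walk the decomposition tree from `vertex` up to the root."""
        path = []
        while vertex in self.parent:
            vertex = self.parent[vertex]
            path.append(vertex)
        return path

# Hypothetical example: an interaction stated at the coarse "complex" level.
cg = CompoundGraph("pathway")
cg.add_child("pathway", "receptor_complex")
cg.add_child("receptor_complex", "receptor_I")
cg.add_child("receptor_complex", "receptor_II")
cg.add_child("pathway", "smad")
cg.add_interaction("receptor_complex", "smad")
```

Because interactions may attach to any level of the tree, coarse and fine knowledge can sit side by side, which is the granularity problem raised earlier.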
36Pathway Representation
- The Compound Graph Fukuda 2001
Petri Net
Decomposition Tree
Interaction Graph
37Management Tools for Pathway
- A well-known ontology editor
- Protégé 2000
  - A tool that allows users to construct a domain ontology
  - It can import existing ontologies in RDFS, OWL, or CLIPS format.
  - It provides a set of APIs for developers to create their own plug-ins
  - It was developed by Stanford Medical Informatics (SMI) at Stanford University
38Management Tools for Pathway
- The screenshot of Protégé 2000
39Database Integration
- Current public databases related to pathways
Gene Database
Metabolic Pathways
Signaling Pathways
Regulatory Pathways
Transcription Factors Database
Disease Database
Microarray Database
Protein-Protein Interaction
40Database Integration
- Which databases are we going to integrate?
Gene Symbols
Enzymes
Enzymes
DNA Transcription Factors
Transcription Splicing
Translation Protein Info
41Database Integration
- Interesting Problem 1
- Predict new pathways according to the genome databases
- Paley 2002
- PathoLogic
- Predict new pathways from a genome database
based on the known model pathway database
- Yamanishi 2004
- Predict new pathways from 4 different genomic
databases
42Database Integration
- Interesting Problem 2
- Find the temporal order of pathways from time-course microarray data
43Standard for Exchanging Pathways
- Too many specific file formats to exchange the
knowledge of pathways
44Standard for Exchanging Pathways
- Current well-known formats for pathways
45Standard for Exchanging Pathways
Systems Biology Markup Language (SBML)
46Standard for Exchanging Pathways
- The main structure for SBML
<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model>
    <listOfParameters> </listOfParameters>
    <listOfUnitDefinitions> </listOfUnitDefinitions>
    <listOfCompartments> </listOfCompartments>
    <listOfSpecies> </listOfSpecies>
    <listOfReactions> </listOfReactions>
    <listOfFunctionDefinitions> </listOfFunctionDefinitions>
    <listOfRules> </listOfRules>
    <listOfEvents> </listOfEvents>
  </model>
</sbml>
- Variables
- Units
- Compartments
- Substances
- Reactions
- Functions
- Rules
- Triggers
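As a rough illustration of how a program might walk this structure, the Python sketch below parses a small SBML-like document with the standard library and counts the entries in each section. The sample document, its ids (`cytosol`, `glucose`, `pyruvate`) and the function name `section_counts` are made up for the example. The document is not validated against the SBML schema, and a real tool would more likely rely on a dedicated SBML library.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2"

# Hypothetical, unvalidated sample following the skeleton on the slide.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model>
    <listOfCompartments>
      <compartment id="cytosol"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="glucose" compartment="cytosol"/>
      <species id="pyruvate" compartment="cytosol"/>
    </listOfSpecies>
    <listOfReactions/>
  </model>
</sbml>
"""

def section_counts(sbml_text):
    """Map each listOf... section under <model> to its number of child elements."""
    root = ET.fromstring(sbml_text.encode("utf-8"))
    model = root.find(f"{{{SBML_NS}}}model")
    counts = {}
    for section in model:
        tag = section.tag.split("}", 1)[1]   # strip the "{namespace}" prefix
        counts[tag] = len(section)
    return counts

counts = section_counts(sample)
```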
47Standard for Exchanging Pathways
48Standard for Exchanging Pathways
- Advantages of SBML
- Good for quantitative analysis
- Good for simulation
- Disadvantages of SBML
- Cannot represent the following relationships between reactions (which is bad for representing pathways):
  - Sequential relationships
  - Conditional branches
  - Iterative relationships
49Standard for Exchanging Pathways
- How many tools support SBML?
50Standard for Exchanging Pathways
Biological Pathway Exchange (BioPAX)
51Standard for Exchanging Pathways
52Standard for Exchanging Pathways
- The Class Hierarchy of BioPAX
53Standard for Exchanging Pathways
- Who is in the BioPAX Project?
54Standard for Exchanging Pathways
- Advantages of BioPAX
- Good for pathways knowledge management
- Good for qualitative inference
- Disadvantages of BioPAX
- Only supports metabolic pathways
- No quantitative information can be stored in
BioPAX
55Standard for Exchanging Pathways
- How many tools support BioPAX?
- Protégé 2000 with the OWL Plug-in
56Standard for Exchanging Pathways
57Pathway Querying & Reasoning
Protein Name
Protein Sequence
Neighborhood Navigation
Route Suggestion
58Pathway Querying & Reasoning
59Pathway Querying & Reasoning
- Other reasoning idea 1: Find all crosstalk pathways in a specific cell type
- Given a set of genes, tell me all the pathways that will influence these genes, including the following information:
  - Species
  - Cell line / Cell type
  - From what tissue?
  - From what age of the organism?
  - Treatment
  - Reacting period
  - All crosstalk pathways of the candidate pathways
Literature Mining
Time-course Microarray Data
Computational Method
60Pathway Querying & Reasoning
- Other reasoning idea 2: Pathway Construction Helper
Pathway Database
A ? C
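One simple reading of the "A ? C" picture is: given a database of known directed interactions, suggest every B with A -> B and B -> C. The sketch below assumes that reading. The edge list and the function name `suggest_intermediates` are hypothetical.

```python
def suggest_intermediates(edges, start, end):
    """Return every B such that start -> B and B -> end are both known edges."""
    successors = {target for source, target in edges if source == start}
    predecessors = {source for source, target in edges if target == end}
    return sorted(successors & predecessors)

# Hypothetical pathway database contents.
known_edges = [("A", "B"), ("B", "C"), ("A", "D"), ("E", "C")]
candidates = suggest_intermediates(known_edges, "A", "C")
```

Here only `B` qualifies. `D` is reachable from `A` but has no known edge into `C`, and `E` feeds `C` but is not reachable from `A`.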
61Simulation
62Simulation
- Previous works in Simulation
- Simulate the Kinetics of Biochemical Metabolic Pathways
  - METAMODEL (Cornish-Bowden and Hofmeyr, 1991)
    - Contains up to 20 reactions (enzymes) and up to 30 metabolites, together with their concentrations
  - SCAMP (Sauro, 1993)
    - A general-purpose simulator of metabolic and chemical networks
  - GEPASI (Mendes, 1993, 1997)
    - Simulates the kinetics of biochemical reactions
  - KINSIM (Barshop et al., 1993; Dang & Frieden, 1997)
    - A good tool for simulating the kinetics of biochemistry
  - MIST (Ehlde & Zacchi, 1995)
    - A software package which can be used for dynamic simulations, stoichiometric calculations and control analysis of metabolic pathways
63Simulation
- Previous works in Simulation
- Simulate Gene Regulation & Expression
  - Meyers and Friedland, "Knowledge-based simulation of genetic regulation in bacteriophage lambda", Nucleic Acids Research, January 1984
  - K. Koile and G. C. Overton, "A qualitative model for gene expression." In Proceedings of the 1989 Summer Computer Simulation Conference, pages 415-421, 1989
  - Peter D. Karp (1993), "A Qualitative Biochemistry and Its Application to the Regulation of the Tryptophan Operon"
  - Arita, M., Hagiya, M. and Shiratori, T. (1994), "GEISHA SYSTEM: an environment for simulating protein interaction"
  - McAdams and Shapiro (1995), "Circuit Simulation of Genetic Networks", Science, 269, 650-656
64Simulation
- Previous works in Simulation
- Simulate Cell Division Cycle
  - Tyson, J.J. (1991), "Modeling the cell division cycle: cdc2 and cyclin interactions"
  - Novak, B. and Tyson, J.J. (1995), "Quantitative analysis of a molecular model of mitotic control in fission yeast"
- Simulate Signal Transduction Mechanisms
  - Bray D, Bourret R B, Simon M I. "Computer Simulation of the Phosphorylation Cascade Controlling Bacterial Chemotaxis." Molecular Biology of the Cell, Vol. 4, pp. 469-482 (1993)
65Simulation
- Previous works in Simulation
- The whole cell simulation
66Competitors
- Peter Karp (USA)
- Minoru Kanehisa (Japan)
67Peter Karp (USA)
Peter D. Karp
Director, Bioinformatics Research Group
Artificial Intelligence Center, Stanford Research Institute
68Minoru Kanehisa (Japan)
Minoru Kanehisa
Director, Bioinformatics Center
Institute for Chemical Research, Kyoto University
69Trends of Pathway Research
- DB Represent.
- Integration
- Exchange
- Text mining
- Graphical Modeler
Pathway Database
Annotation
Data Acquisition
Mechanism Understanding
Function Prediction
Re-annotation
DB Management
Ouzounis CA, Karp PD. (2002) The past, present and future of genome-wide re-annotation. Genome Biol. 2002;3(2):COMMENT2001. Epub 2002 Jan 31.
70Reference
- Alexander S. Yeh et al. (2003) "Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup." Bioinformatics, Vol. 19, Suppl. 1, 2003, pp. i331-i339
- Ogata H, Goto S, Fujibuchi W, Kanehisa M. "Computation with the KEGG pathway database." Biosystems. 1998 Jun-Jul;47(1-2):119-28.
- Peter D. Karp, Monica Riley, Milton Saier, Ian T. Paulsen, Suzanne M. Paley, Alida Pellegrini-Toole, "The EcoCyc and MetaCyc databases." Nucleic Acids Research, Vol. 28, No. 1, pp. 56-59
- Ken-ichiro Fukuda and Toshihisa Takagi, "Knowledge representation of signal transduction pathways." Bioinformatics, Vol. 17, No. 9, 2001
- E. Demir, O. Babur, U. Dogrusoz, A. Gursoy, G. Nisanci, R. Cetin-Atalay, M. Ozturk, "PATIKA: an integrated visual environment for collaborative construction and analysis of cellular pathways." Bioinformatics, Vol. 18, No. 7, 2002, pp. 996-1003
- Natalya Fridman Noy, Ray W. Fergerson, Mark A. Musen, "The Knowledge Model of Protege 2000: combining interoperability and flexibility." 12th International Conference on Knowledge Engineering and Knowledge Management (EKAW'2000), Juan-les-Pins, France, 2000.
- Yoshihiro Yamanishi, Jean-Philippe Vert, Minoru Kanehisa, "Protein network inference from multiple genomic data: a supervised approach." Bioinformatics, Vol. 20, Suppl. 1, 2004, pp. i363-i370
- Suzanne M. Paley, Peter D. Karp, "Evaluation of Computational Metabolic-Pathway Predictions for Helicobacter pylori." Bioinformatics, Vol. 18, No. 5, 2002, pp. 715-724
- Ouzounis CA, Karp PD. (2002) "The past, present and future of genome-wide re-annotation." Genome Biol. 2002;3(2):COMMENT2001. Epub 2002 Jan 31.
- Tomita, M., Hashimoto, K., Shimizu, T.S., Matsuzaki, Y., Miyoshi, F., Saito, K., Tanida, S., Yugi, K., Venter, J.C. and Hutchison III, C.A. (1999) "E-CELL: software environment for whole-cell simulation." Bioinformatics, 15, 72-84
- Schaff, J.C. and Loew, L.M. (1999) "The virtual cell." In Altman, R.B., Dunker, A.K., Hunter, L. and Klein, T.E. (eds), Pacific Symposium on Biocomputing, volume 4, World Scientific, Singapore, pp. 228-239
71Q & A