Title: Amarnath Gupta
1Department of Computer Science and Engineering
University of California, San Diego
CSE-291: Ontologies in Data Integration, Spring 2004
Ontologies and Biological Pathways
2So, What is an Ontology Again?
- From previous classes
- Sowa: The subject of ontology is the study of the categories of things that exist or may exist in some domain. The product of such a study, called an ontology, is a catalog of the types of things that are assumed to exist in a domain of interest D from the perspective of a person who uses a language L for the purpose of talking about D. A formal ontology is specified by a collection of names for concept and relation types organized in a partial ordering by the type-subtype relation.
- Guarino: Theory of formal distinctions
- among things
- among relations
- Basic tools
- Theory of parthood
- Theory of integrity
- Theory of identity
- Theory of dependence
Is this good enough to characterize all concepts
and relations?
3Description Logics as Ontology Frameworks
- You have learnt about Description Logics
- DLs allow you to do the following
4Property Frames in DLs
- Some Description Logics like SHOQ(D) [1], a progenitor of OWL, allow
- Roles or properties to be more powerful
- If R and S are roles, one can specify a role box that contains
- role equivalence axioms, e.g. ∃component_of.⊤ ≡ ∃part_of.⊤
- role inverses (not present in SHOQ, but present in SHIQ)
- role inclusion axioms R ⊑ S
- role transitivity axioms Trans(R)
- Thus one can construct role hierarchies in addition to concept lattices
[1] Ian Horrocks and U. Sattler. Ontology Reasoning for the Semantic Web. In B. Nebel, editor, Proc. of the 17th Int. Joint Conf. on Artificial Intelligence (IJCAI'01), Morgan Kaufmann, pages 199-204, 2001.
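The effect of a role box can be illustrated with a small sketch. The code below is not a DL reasoner; it only saturates a set of role assertions under role inclusion (R ⊑ S) and transitivity (Trans(R)) axioms. The function name close_roles and the piston/engine/car data are hypothetical and chosen purely for illustration. A role equivalence can be encoded as two inclusions, one in each direction.

```python
from itertools import product


def close_roles(assertions, inclusions, transitive):
    """Saturate role assertions under R ⊑ S and Trans(R) axioms.

    assertions: set of (role, subject, object) triples
    inclusions: set of (sub_role, super_role) pairs, read as sub ⊑ super
    transitive: set of role names declared transitive
    """
    facts = set(assertions)
    changed = True
    while changed:
        changed = False
        new = set()
        # Role inclusion: R(a, b) together with R ⊑ S gives S(a, b).
        for (role, a, b) in facts:
            for (sub, sup) in inclusions:
                if role == sub:
                    new.add((sup, a, b))
        # Transitivity: R(a, b) and R(b, c) give R(a, c) when Trans(R).
        for (r1, a, b), (r2, c, d) in product(facts, repeat=2):
            if r1 == r2 and r1 in transitive and b == c:
                new.add((r1, a, d))
        if not new <= facts:
            facts |= new
            changed = True
    return facts


# Hypothetical example data, not taken from the slides.
inclusions = {("component_of", "part_of")}
transitive = {"part_of"}
assertions = {
    ("component_of", "piston", "engine"),
    ("part_of", "engine", "car"),
}
closed = close_roles(assertions, inclusions, transitive)
print(("part_of", "piston", "car") in closed)
```

Here component_of is not declared transitive, so only part_of(piston, car) is derived, which is the kind of distinction a role hierarchy is meant to capture.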
5Thing-Centric Ontologies
- Now let's try these
- sky
- blue_sky ≡ sky ⊓ ∃has_color.blue
- cloudy_sky ≡ sky ⊓ ∃covered_by.cloud
- rain
- acid_rain ⊑ rain
- acid_rain_from_cloudy_sky ≡ acid_rain ⊓ ∃drops_from.cloudy_sky
- Is this reasonable?
- How about these?
- year
- quarter ? ??4 part_of.year
- mid_term ? exam ? final_test ? ? occurs_in.quarter
- Is it working? Why?
Not every concept and relation is thing-centric!!
6Ontologies for Processes, Events, Time
- Temporal Description Logic [2]
- Allen's interval relations
[2] A. Artale and E. Franconi. A temporal description logic for reasoning about actions and plans. Journal of Artificial Intelligence Research, 9:463-506, 1998.
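As a reminder of what Allen's interval relations are, the following sketch classifies the relation between two intervals given as (start, end) pairs. The abbreviations (b, m, o, s, d, f and their inverses a, mi, oi, si, di, fi) follow the usual convention; the function name allen_relation and the numeric intervals are illustrative assumptions.

```python
def allen_relation(x, y):
    """Return the Allen relation that holds from interval x to interval y.

    Intervals are (start, end) pairs with start < end.
    """
    xs, xe = x
    ys, ye = y
    if xe < ys:
        return "b"    # x before y
    if xe == ys:
        return "m"    # x meets y
    if ye < xs:
        return "a"    # x after y
    if ye == xs:
        return "mi"   # x met-by y
    if xs == ys and xe == ye:
        return "="    # x equals y
    if xs == ys:
        return "s" if xe < ye else "si"   # starts / started-by
    if xe == ye:
        return "f" if xs > ys else "fi"   # finishes / finished-by
    if ys < xs and xe < ye:
        return "d"    # x during y
    if xs < ys and ye < xe:
        return "di"   # x contains y
    # Remaining case: proper overlap without containment.
    return "o" if xs < ys else "oi"


print(allen_relation((1, 3), (3, 5)))
print(allen_relation((1, 4), (2, 6)))
```

The thirteen cases are mutually exclusive, so exactly one label is returned for any pair of proper intervals.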
7Temporal Description Logic
- Ingredients
- non-temporal concepts E
- temporal concepts C
- things that change their state
- temporal qualifier C@X where X is a temporal variable
- temporal constraints Tc
- (X (R) Y) where
- X is any temporal variable or the NOW interval
- R can be one of Allen's interval relations or an expression composed from them
- existential quantifiers
- ◇(X) Tc. C
- selections p : E where p is
- an atomic feature f
- a parameterized feature ⋆f
8Applying Temporal DL
[Figure: timeline with in-cytoplasm(protein) over interval x, translocation, and in-nucleus(protein) over interval y]
- Translocation of a protein
- translocation ≐ ◇(x y)(x m NOW)(NOW m y). ((Protein : InCytoplasm)@x ⊓ (Protein : InNucleus)@y)
- Protein is the formal parameter of this action
- States of the Protein are treated as though they are different type assignments for the same variable
- The above is a definition of the term translocation
- Now we can have an assertion (meaning data) of the form
- translocation(tp1, MAPK-translocation), i.e., of the form translocation(Interval, Action) to designate a specific case, thus implying
- translocation(i, a) ↔ ∃p. Protein(a, p) ∧ ∃j,l. (InCytoplasm(j,p) ∧ InNucleus(l,p) ∧ m(j,i) ∧ m(i,l))
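The first-order reading of the definition can be checked against data with a few lines of code. This is a minimal sketch, assuming intervals are (start, end) pairs and state assertions are stored as (predicate, interval, protein) triples; the function names and the MAPK intervals are hypothetical.

```python
def meets(i, j):
    """Allen's m relation: interval i ends exactly where interval j starts."""
    return i[1] == j[0]


def is_translocation(action_interval, protein, states):
    """Check the condition for translocation(i, a) on a set of state facts.

    states: iterable of (predicate, interval, protein) triples.
    True if some InCytoplasm interval meets the action interval and the
    action interval meets some InNucleus interval for the same protein.
    """
    cyto = [iv for (pred, iv, p) in states
            if pred == "InCytoplasm" and p == protein]
    nuc = [iv for (pred, iv, p) in states
           if pred == "InNucleus" and p == protein]
    return (any(meets(j, action_interval) for j in cyto)
            and any(meets(action_interval, l) for l in nuc))


# Hypothetical observations for one protein.
states = [
    ("InCytoplasm", (0, 10), "MAPK"),
    ("InNucleus", (12, 30), "MAPK"),
]
print(is_translocation((10, 12), "MAPK", states))
print(is_translocation((10, 11), "MAPK", states))
```

The second call fails because the interval (10, 11) does not meet the InNucleus interval, which is exactly the m(i, l) conjunct of the formula.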
9Applying Temporal DL
- Some identities
- ◇x (x a NOW). C@x ≡ ◇(x y)(y mi NOW)(x mi y). C@x
- ◇x (x d NOW). C@x ≡ ◇(x y)(y s NOW)(x f y). C@x
- ◇x (x o NOW). C@x ≡ ◇(x y)(y s NOW)(x fi y). C@x
- We only really need the relations s, f and mi
- A little more complex case
[Figure: interval diagram with the labels GRB2_secondary_response, GRB2_binding (w), PTK_ligand_binding (z), tyrosin_phosphorylated (y), autophosphorylation, and tyrosin (x)]
tyrosin_p ≐ ◇x (x o NOW). (tyrosin@x ⊓ autophosphorylation)
GRB2_s_r ≐ ◇(y z w)(y b w)(z b w). (tyrosin_p@y ⊓ PTK_l_b@z ⊓ GRB2_b@w)
10Applying Temporal DL
- More features of the temporal DL
- path p ∘ q
- Protein ∘ bound should be interpreted as
- ∃ a,p,i,o1. Protein(a, p, i) ∧ bound(i, p, o1)
- Agreement operator ↓
- (Protein ∘ bound ↓ Receptor)@y means that at the interval y the object to which Protein is bound is Receptor
- Substitution
- Suppose A ≐ ◇(x y z w)(…) is an axiom and
- B ≐ ◇(x u v)(…) is another axiom whose body is a part of A
- The temporal substitutive qualifier (B[x]@v) renames within the defined B action the variable x to w and it is a way of making coreference between two temporal variables, while the temporal constraints peculiar to the renamed variable x are inherited by the substituting interval w. This will eliminate x from A.
- This can be used to define one temporal concept in terms of another
11And now on to Biological Pathways
- The goals are
- to comprehend what we need to represent before we think about how to represent it
- what computations we can do with them
12What are Pathways?
- A pathway is a set of linked biological components interacting with each other over time to generate a biological effect
- A component in a pathway can often be broken down into a finer level of interacting components, finally reaching single biochemical reactions
- When people talk about pathways they refer to
- signal transduction networks
- metabolic pathways
- gene regulatory pathways
- protein-protein interaction networks
13Signal Transduction Networks
- What is Signal Transduction?
Process by which a cell converts one kind of
signal or stimulus into another
14The Big Picture
- How do organisms communicate with their environment?
- How do cells exchange information?
- What information needs to be exchanged?
- What is the currency of information?
15Events
- Stimuli
- Synthesis of signaling molecule by the signaling cell.
- Release of signaling molecule by the signaling cell.
- Transport of the signal to the target cell.
- Detection of the signal by a specific receptor protein.
- Responses
- Reception: the first messenger, an extracellular molecule (signal), binds to a receptor.
- Transduction
- Amplification: binding activates the receptor protein, which then activates a relay protein.
- Conversion: the relay protein stimulates another membrane protein, which acts as an effector (effects changes in the cell).
- Induction/Response: the effector protein is an enzyme that produces a secondary messenger (a cytoplasmic molecule that triggers metabolic and/or structural responses within the cell).
- Removal of the signal, often terminating the cellular response.
16Types of Signals
- Extracellular
- Signal molecules are specific to their receptors
- Receptors, usually proteins, have their N-terminus facing outward and their C-terminus inside the cell.
- When bound to a signal molecule, a receptor changes its conformation
17Types of Signals
- Intracellular
- Mostly triggered by the extracellular signal
- Converts the extracellular signal into an intracellular signal
- E.g. G proteins, GTPases, cAMP, Ca2+, kinases, phosphatases and many more
- Also called second messengers
18Types of Signals
- Intercellular
- Extracellular signalling
- Endocrinology
- Types
- Endocrine: travel through blood
- Paracrine: in the vicinity
- Autocrine: same cell type
- Juxtacrine: along cell membranes
19Types of Signals
- Hormones
- Between cells or tissues within an individual
- Process
- Synthesis → Storage and secretion → Transport → Recognition of hormone by its receptor → Change in receptor shape → Relay and amplification of signal → Response
- The sending cell is a specialized cell, while the receiving cell can be of any type
- A single hormone can have many receptors for different pathways, or many hormones can share the same receptor to invoke the same pathway
- Two classes of hormone receptors
- Membrane associated
- Cytoplasmic
20Cellular Response
- depends on the particular signaling pathways
- may involve changes in
- cell cycle progression
- gene expression
- protein trafficking
- cell migration
- cytoskeleton architecture
- adhesion
- metabolism
- cell survival
21Example RAS-RAF-MEK-MAPK pathways
- It should be noted that the RAS-RAF-MEK-MAPK pathway is only one example of the so-called MAPK (Mitogen-Activated Protein Kinase) pathways.
- Two other mammalian MAPK pathways, involving JNK1 and p38, are involved in stress responses.
22RAS-RAF-MEK-MAPK
- Ligand binds receptor PTK
- Autophosphorylation on tyrosine
- GRB2 (an SH2- and SH3-containing protein) binds to the receptor phosphotyrosine motif Y-V/L-N-X via its SH2 domain
- The SH3 of GRB2 binds constitutively to the proline-rich sequence in the C-terminus of SOS (a guanine nucleotide exchange factor for RAS).
26RAS-RAF-MEK-MAPK
Recruitment of SOS to the close proximity of
RAS in the membrane
27RAS-RAF-MEK-MAPK
RAS becomes activated by exchanging GDP for GTP
28RAS-RAF-MEK-MAPK
The RAS-GTP effector domain interacts with the
N-terminal regulatory region of the RAF
(serine/threonine protein kinase), hence
recruiting RAF to the membrane
29RAS-RAF-MEK-MAPK
Activation of RAF (most likely by
phosphorylation of RAF and binding to the
scaffold protein 14-3-3)
31RAS-RAF-MEK-MAPK
Activated RAF in turn activates MEK (also called MAPK kinase, a dual-specificity kinase) by phosphorylation on two conserved serine residues in MEK.
33RAS-RAF-MEK-MAPK
Activated MEK activates MAPK (a
serine/threonine protein kinase) by
phosphorylation of conserved threonine and
tyrosine residues.
35RAS-RAF-MEK-MAPK
Activated MAPK phosphorylates a number of
substrates in the plasma membrane and the
cytoplasm
40RAS-RAF-MEK-MAPK
- Activated MAPK phosphorylates a number of substrates in the plasma membrane and the cytoplasm
- It is also translocated into the nucleus (within minutes), where it phosphorylates nuclear transcription factors.
- This leads to transcription of genes important for cell proliferation.
41Metabolic Pathways
- What is metabolism?
- The sum of all the chemical and physical changes
that take place within the body and enable its
continued growth and functioning. Metabolism
involves the breakdown of complex organic
constituents of the body with the liberation of
energy, which is required for other processes,
and the building up of complex substances, which
form the material of the tissues and organs.
42Chemical reactions
- Reactants and products
- together called metabolites
- Free energy change ($\Delta G$) of a reaction
- $A + B \rightleftharpoons C + D$
- $\Delta G = \Delta G^\circ + RT \ln \frac{[C][D]}{[A][B]}$
- depends on concentrations and nature of metabolites
- $\Delta G < 0$ for a spontaneous (exergonic) reaction
- $\Delta G > 0$ for an endergonic reaction
- Chemical equilibrium
- Same rate of forward and backward reactions
- $\Delta G = 0$; let $K_{eq} = \frac{[C][D]}{[A][B]}$, the ratio of products to reactants at equilibrium
- $\Delta G^\circ = -RT \ln K_{eq}$
- $K_{eq} = e^{-\Delta G^\circ / RT}$
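The relations on this slide can be checked numerically. The sketch below evaluates the free energy change for given concentrations and the equilibrium constant for a given standard free energy change; the function names and the sample value of -5000 J/mol are assumptions made for illustration, not data from the lecture.

```python
import math

R = 8.314  # gas constant in J/(mol K)


def delta_g(delta_g0, temperature, a, b, c, d):
    """Free energy change for A + B <-> C + D at the given concentrations."""
    q = (c * d) / (a * b)  # ratio of products to reactants
    return delta_g0 + R * temperature * math.log(q)


def k_eq(delta_g0, temperature):
    """Equilibrium constant from the standard free energy change."""
    return math.exp(-delta_g0 / (R * temperature))


# Hypothetical reaction with a standard free energy change of -5000 J/mol.
T = 298.15
keq = k_eq(-5000.0, T)
# When the concentration ratio equals Keq, the free energy change vanishes.
print(abs(delta_g(-5000.0, T, 1.0, 1.0, keq, 1.0)) < 1e-6)
```

With all concentrations equal the logarithm is zero and the free energy change reduces to its standard value.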
43Rate Law
- Consider a reaction of overall stoichiometry $A \rightarrow P$
- The rate, or velocity, v of this reaction is the amount of P formed or the amount of A consumed per unit time. Thus $v = \frac{d[P]}{dt} = -\frac{d[A]}{dt}$
- Rate law states that $v = k[A]$
- where k is the rate constant. v is a function of [A] to the first power, i.e. the reaction is first order, and k is called the first-order rate constant.
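For a first-order reaction in which A is converted to P, the rate law has a closed-form solution: the concentration of A decays exponentially. The sketch below assumes that simple case; the names and the rate constant 0.1 are illustrative only.

```python
import math


def rate(k, a):
    """First-order rate law: velocity is proportional to the concentration of A."""
    return k * a


def concentration(a0, k, t):
    """Concentration of A at time t, starting from a0 (solution of d[A]/dt = -k[A])."""
    return a0 * math.exp(-k * t)


k = 0.1                        # hypothetical first-order rate constant, per second
half_life = math.log(2) / k    # time after which half of A is consumed
print(concentration(1.0, k, half_life))
```

The half-life depends only on k and not on the starting concentration, which is characteristic of first-order kinetics.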
44Equilibrium constant and equation rates
- For a reversible reaction $A + B \rightleftharpoons C + D$
- the rate will be the difference between the forward and reverse rates
- $\frac{d[C]}{dt} = k_f[A][B] - k_r[C][D]$
- At equilibrium,
- $k_f[A][B] = k_r[C][D]$
- $K_{eq} = \frac{k_f}{k_r} = \frac{[C][D]}{[A][B]}$
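The link between the rate constants and the equilibrium constant can be seen by integrating the rate equation until the concentrations stop changing. This is a rough sketch using fixed-step Euler integration; the step size, the rate constants, and the initial concentrations are arbitrary assumptions.

```python
def simulate_reversible(kf, kr, a, b, c, d, dt=0.001, steps=10000):
    """Integrate d[C]/dt = kf[A][B] - kr[C][D] with explicit Euler steps."""
    for _ in range(steps):
        net = kf * a * b - kr * c * d  # forward rate minus reverse rate
        a -= net * dt
        b -= net * dt
        c += net * dt
        d += net * dt
    return a, b, c, d


# Hypothetical rate constants and initial concentrations.
a, b, c, d = simulate_reversible(kf=2.0, kr=1.0, a=1.0, b=1.0, c=0.0, d=0.0)
ratio = (c * d) / (a * b)
print(ratio)  # approaches kf / kr
```

Once the net rate is zero the concentrations no longer move, and the ratio of products to reactants settles at kf / kr.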
45Enzymes
- usually proteins. A small number of enzymes are
made of RNA (ribozymes).
- are usually quite big (compared to the portions
of the reactants or substrates which are modified
in the reaction to be catalyzed).
Ribozyme (self-splicing intron)
Enzyme(hexokinase)
46 Enzymes have a substrate binding site which
binds the reaction substrates and brings them
together in the orientations appropriate for the
reaction.
This binding is usually highly specific. Often,
one enzyme catalyses only one type of reaction
between a specific set of substrates.
47 Enzymes have an active site: a specialized configuration of side-chain and main-chain atoms located at the substrate binding site which assist in the chemical steps of the reaction.
Active site
Triosephosphate isomerase
48Active sites
- 3-dimensional cleft
- can be formed by faraway residues
- Lysozyme's active site includes residues at positions 35, 52, 62, 63, 101, 108 (out of a total of 129 residues)
- Small fraction of the total volume of an enzyme
- Substrates are bound to enzymes through multiple
weak attractions
49Regulation of enzymes
- Reversible and irreversible inhibition
- Competitive and allosteric regulation
- Allosteric regulation can be activation or inhibition
- Tense (T) and relaxed (R) states
- Activator binds to R state
- Inhibitor binds to T state
- Different kinetics for each
50Rate of reactions
51Regulatory control of enzymes
- Alteration of enzyme activity
- Enzyme modification
- Covalent modification
- Protein-protein interaction
- Substrate control
- Product control
- Allosteric control
52Regulatory control of enzymes
- Alteration of number of enzyme molecules
- Transcription
- Translation
- Control of enzyme degradation
- Compartmentalization
- Example hexokinase in brain and liver
53Enzyme Nomenclature
- Oxidoreductases (EC Class 1)
- Transfer electrons (RedOx reactions)
- Transferases (EC Class 2)
- Transfer functional groups between molecules
- Hydrolases (EC Class 3)
- Break bonds by adding H2O
- Lyases (EC Class 4)
- Elimination reactions to form double bonds
- Isomerases (EC Class 5)
- Intramolecular rearrangements
- Ligases (EC Class 6)
- Join molecules with new bonds
54Example entry from the Enzyme Database at http://www.expasy.ch/enzyme/
ID   2.3.1.43
DE   Phosphatidylcholine--sterol O-acyltransferase.
AN   Lecithin--cholesterol acyltransferase.
AN   LCAT.
AN   Phospholipid--cholesterol acyltransferase.
CA   Phosphatidylcholine + sterol = sterol ester +
CA   1-acylglycerophosphocholine.
CC   -!- Palmitoyl, oleoyl, and linoleoyl can be transferred; a number of
CC       sterols, including cholesterol, can act as acceptor.
CC   -!- The bacterial enzyme also catalyses the reactions of EC 3.1.1.4 and
CC       EC 3.1.1.5.
DI   Norum disease; MIM:245900.
DI   Fish-eye disease; MIM:136120.
PR   PROSITE; PDOC00110;
DR   BRENDA 2.3.1.43.
DR   EMP/PUMA 2.3.1.43.
DR   WIT 2.3.1.43.
DR   KYOTO UNIVERSITY LIGAND CHEMICAL DATABASE 2.3.1.43.
DR   P10480, GCAT_AERHY;  P53760, LCAT_CHICK;  P04180, LCAT_HUMAN;
DR   P16301, LCAT_MOUSE;  Q08758, LCAT_PAPAN;  P30930, LCAT_PIG;
DR   P53761, LCAT_RABIT;  P18424, LCAT_RAT;
//
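Entries like this one are easy to read programmatically because every line starts with a two-letter line code. The following is a minimal sketch of such a reader, assuming only that convention and the terminating // line; the function name parse_enzyme_entry is hypothetical and the sample text is a shortened copy of the entry above.

```python
def parse_enzyme_entry(text):
    """Group the lines of one flat-file entry by their two-letter line code."""
    entry = {}
    for line in text.splitlines():
        # Skip blank lines and the entry terminator.
        if not line.strip() or line.startswith("//"):
            continue
        code, _, value = line.partition(" ")
        entry.setdefault(code, []).append(value.strip())
    return entry


SAMPLE = """\
ID   2.3.1.43
DE   Phosphatidylcholine--sterol O-acyltransferase.
AN   Lecithin--cholesterol acyltransferase.
AN   LCAT.
AN   Phospholipid--cholesterol acyltransferase.
CC   -!- The bacterial enzyme also catalyses the reactions of EC 3.1.1.4 and
CC       EC 3.1.1.5.
//
"""

entry = parse_enzyme_entry(SAMPLE)
print(entry["ID"][0], entry["AN"])
```

Continuation lines keep their line code, so multi-line fields such as CC simply become several list items that a caller can join as needed.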
55Enzyme Catalytic Mechanisms
- Fundamentally familiar reactions from Organic Chemistry
- Acid-base catalysis: donation or abstraction of protons
- Covalent catalysis: covalent (co)enzyme-substrate intermediate
- Metal ion: substrates and metals positioned for reaction
- Electrostatic: charge complementarity to the transition state
- Proximity and orientation: substrates aligned for reaction
- Transition state stabilization: activation free energy (ΔG‡) reduced
56Metabolic networks
- Each enzyme/reaction can be a path between nodes
- Each node is an enzyme substrate (product or reactant)
- Converting individual reactions to paths and nodes
- Produces directed graphs
- Classification of biochemical reactions
- EC numbering system (Enzyme Commission)
- Hierarchical numerical system, e.g. 1.5.3.1
- Based on the organic chemistry involved, not on the proteins
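The conversion of reactions into a directed graph can be sketched as follows: metabolites become nodes and each reaction contributes edges from its reactants to its products, labeled with the EC number of the enzyme. The two reactions used here (hexokinase, EC 2.7.1.1, and glucose-6-phosphate isomerase, EC 5.3.1.9) are chosen for illustration, and the function names are assumptions.

```python
# Top-level EC classes, as listed on the Enzyme Nomenclature slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_class(ec_number):
    """Return the top-level class name for an EC number such as '2.7.1.1'."""
    return EC_CLASSES[int(ec_number.split(".")[0])]


def build_graph(reactions):
    """Build a directed graph: metabolite -> list of (product, EC number)."""
    graph = {}
    for ec, reactants, products in reactions:
        for r in reactants:
            for p in products:
                graph.setdefault(r, []).append((p, ec))
    return graph


# Illustrative reactions: (EC number, reactants, products).
REACTIONS = [
    ("2.7.1.1", ["glucose", "ATP"], ["glucose-6-phosphate", "ADP"]),
    ("5.3.1.9", ["glucose-6-phosphate"], ["fructose-6-phosphate"]),
]

graph = build_graph(REACTIONS)
print(graph["glucose-6-phosphate"], ec_class("5.3.1.9"))
```

This encoding treats every reactant-product pair as an edge, which is simple but also links cofactors such as ATP and ADP to the main metabolites; real pathway databases usually separate main compounds from side compounds.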
57Pain: the Boehringer-Mannheim wallcharts
58more pain
59A Pathway Example
60Gene Regulatory Networks
- What is gene regulation?
- The primary role of a gene is transcription, which produces mRNA, a copy of a single strand of the gene. Different proteins can control the transcription process by activating, inhibiting, or competitively binding to the promoter region of genes.
61Protein Synthesis
- Transcription
- Before the synthesis of a protein begins, the corresponding RNA molecule is produced by RNA transcription. One strand of the DNA double helix is used as a template by the RNA polymerase to synthesize a messenger RNA (mRNA).
- This mRNA migrates from the nucleus to the cytoplasm. During this step, mRNA goes through different types of maturation, including one called splicing, in which the non-coding sequences are eliminated. The coding mRNA sequence can be described as a series of units of three nucleotides, each called a codon.
62Protein Synthesis
- Translation
- The ribosome binds to the mRNA at the start codon, which is recognized only by the initiator tRNA.
- The ribosome proceeds to the elongation phase of protein synthesis. During this stage, complexes composed of an amino acid linked to tRNA sequentially bind to the appropriate codon in the mRNA by forming complementary base pairs with the tRNA anticodon.
- The ribosome moves from codon to codon along the mRNA. Amino acids are added one by one, forming the polypeptide sequence dictated by DNA and represented by mRNA.
- At the end, a release factor binds to the stop codon, terminating translation and releasing the complete polypeptide from the ribosome.
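The codon-by-codon reading described above can be mimicked in a few lines. This sketch uses only a small excerpt of the standard genetic code, so it handles just the codons listed in the table; the function name and the example sequence are assumptions for illustration.

```python
# Partial codon table (a small excerpt of the standard genetic code).
# None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": None, "UAG": None, "UGA": None,
}


def translate(mrna):
    """Translate from the first start codon (AUG) up to the first stop codon."""
    start = mrna.find("AUG")
    if start < 0:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = CODON_TABLE[codon]  # KeyError for codons outside the excerpt
        if residue is None:           # stop codon: release the polypeptide
            break
        peptide.append(residue)
    return peptide


print(translate("GGAUGUUUGGCAAAUAAGC"))
```

The two nucleotides before AUG and the two after UAA are ignored, mirroring the fact that translation starts at the start codon and ends at the stop codon rather than at the ends of the mRNA.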
63Control of Gene Expression
- Gene expression is a term indicating the act of protein synthesis by a gene
- not all genes produce proteins in all cells or in all phases of a cell's life cycle
- Many control points
- transcription, mRNA processing, mRNA transport, translation, post-translational modifications
- Each gene has its own control regions
- all genes differ slightly in the exact locations of control and the exact set of transcription factors (proteins that control transcription)
- Different combinations of transcription factors, and the relative timing of their binding, create a large space of control signals
- some control signals may control the transcription of more than one gene
64Transcription Regulation
65Transcription-Initiation Complex
66Events Leading to Transcription Initiation
67Enhancers can be equally complex
68A sense of the data: the molecular neighborhood of IME1
69Types of Interactions
70Ontologies and Databases for Biological Pathways
71BioPax
[Figure: scope of BioPAX relative to other exchange formats. Labels: Database Exchange Formats; Simulation Model Exchange Formats; BioPAX; Small Molecules (CML); SBML, CellML; PSI; Molecular Interactions (Pro:Pro, All:All); Biochemical Reactions; Genetic Interactions; Rate Formulas; Metabolic Pathways (Qualitative, Quantitative); Interaction Networks (Molecular, Non-molecular; Pro:Pro, TF:Gene, Genetic); Regulatory Pathways (Qualitative, Quantitative); Enzymes]
72Design Goals
- Encapsulation: an entire pathway in one record
- Compatible: use existing standards wherever possible
- Computable: from file reading to logical inference
- OWL (Web Ontology Language)
- Fast
- Complete: all conclusions are guaranteed to be computed
- Decidable: all computations will finish in finite time (with OWL Lite, in a short amount of time)
73Requirements Specification
- Accommodate existing database representations: BioCyc, BIND, WIT, aMAZE, KEGG, etc.
- Compatible as a superset of representations
- Support different pathway types
- Metabolic pathways
- Signaling pathways
- Protein-protein interactions
- Gene regulatory pathways
- OWL is used for encoding the ontology
74Implementation of BioPAX
- Implemented using the OWL language
- OWL is
- the Web Ontology Language
- XML based
- a W3C standard (www.W3C.org)
- Example of a BioPAX Class and Instance in OWL
75Example Class def in OWL
<owl:Class rdf:ID="protein">
  <rdfs:subClassOf>
    <owl:Class rdf:about="#physicalEntity"/>
  </rdfs:subClassOf>
  <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
    A protein (e.g. The EGFR protein sequence. See Swiss-Prot for more examples.)
  </rdfs:comment>
</owl:Class>
76Example Instance in OWL
<bpx:protein rdf:ID="biopax-L1v0.5_Instance_42">
  <bpx:NAMES>
    <bpx:namesType rdf:ID="biopax-L1v0.5_Instance_43">
      <bpx:SHORTLABEL>phosphoglucose isomerase</bpx:SHORTLABEL>
    </bpx:namesType>
  </bpx:NAMES>
</bpx:protein>
78BioPAX Ontology
- Current structure of
- class hierarchy
- Level 1 v0.9 (Dec. 2003)
79Annotation with BioPax
80Metabolic Data in BioPAX
EcoCyc Reaction
BioPAX Biochemical Reaction
81Metabolic Data in BioPAX
EcoCyc Enzyme-Catalyzed Reaction
BioPAX Catalysis
82Metabolic Data in BioPAX
EcoCyc Pathway
BioPAX Class Pathway
83Signal Transduction Data in BioPAX
CSNDB Signaling Pathway Step
84Signal Transduction Data in BioPAX
CSNDB Pathway
85Descriptions of some databases
- Name: KEGG (Kyoto Encyclopedia of Genes and Genomes)
- Web: http://www.genome.ad.jp/kegg/
- Owner: Institute for Chemical Research, Kyoto University
- Description: KEGG is an effort to computerize current knowledge of molecular and cellular biology in terms of the information pathways that consist of interacting molecules or genes, and to provide links from the gene catalogs produced by genome sequencing projects. The KEGG project is undertaken in the Bioinformatics Center, Institute for Chemical Research, Kyoto Univ.
- Name: PathDB
- Web: http://www.ncgr.org/pathdb/index.html
- Owner: National Center for Genomic Resources
- Description: PathDB is a functional prototype research tool for biochemistry and functional genomics. One of the key underlying philosophies of the project is to capture discrete metabolic steps. This allows its developers to build tools to construct metabolic networks de novo from a set of defined steps. PathDB is not simply a data repository but a system around which tools can be created for building, visualizing, and comparing metabolic networks.
86List of Pathway Database/Tools (cont.)
- Name: GenMAPP (Gene MicroArray Pathway Profiler)
- Gladstone Institute, UCSF.
- GenMAPP is a computer application designed to visualize gene expression data on maps representing biological pathways and groupings of genes. The first release, GenMAPP 1.0 beta, is available with over 50 mouse and human pathways. It also provides hundreds of functional groupings of genes derived from the Gene Ontology Project for the human, mouse, Drosophila, C. elegans, and yeast genomes. GenMAPP seeks collaborators in the biological community to assist in the development of a library of pathways that will encompass all known genes in the major model organisms.
- Name: SPAD (Signaling PAthway Database)
- Graduate School of Genetic Resources Technology, Kyushu University.
- In living organisms there are multiple signal transduction pathways, cascades of information from the plasma membrane to the nucleus in response to an extracellular stimulus. An extracellular signal molecule binds a specific intracellular receptor and initiates the signaling pathway. There is now a large amount of information about the signaling pathways which control gene expression and cellular proliferation. The SPAD developers have built an integrated database to give an overview of signal transduction. SPAD is divided into four categories based on the extracellular signal molecules (Growth factor, Cytokine, and Hormone) that initiate the intracellular signaling pathway. SPAD is compiled to describe information on protein-protein and protein-DNA interactions, as well as information on the sequences of DNA and proteins.
87Specific Pathway Databases
- Cytokine Signaling Pathway DB. Dept. of Biochemistry, Kumamoto Univ.
- The database contains information on signaling pathways of cytokines. It is designed for researchers who work with cytokines and their receptors, and provides biochemical data and references about signaling molecules as well as ligand-receptor relationships.
- EcoCyc and MetaCyc: Stanford Research Institute
- The EcoCyc database describes the genome and the biochemical machinery of E. coli. The database contains up-to-date annotations of all E. coli genes. EcoCyc describes all known pathways of E. coli small-molecule metabolism. Each pathway and its component reactions and enzymes are annotated in rich detail, with extensive references to the biomedical literature. The Pathway Tools software provides query and visualization services.
- BIND (Biomolecular Interaction Network Database): UBC, Univ. of Toronto
- BIND is a database designed to store full descriptions of interactions, molecular complexes and pathways, including interactions between any two molecules composed of proteins, nucleic acids and small molecules. Chemical reactions, photochemical activation and conformational changes can also be described. Abstraction is made in such a way that graph theory methods may be applied for data mining. The database can be used to study networks of interactions, to map pathways across taxonomic branches and to generate information for kinetic simulations.
88Objectives of the KEGG Project
- Pathway Database: computerize current knowledge of molecular and cellular biology in terms of the pathway of interacting molecules or genes.
- generic metabolic pathways (143)
- inferred pathways for all sequenced genomes (2706)
- Genes Database: maintain gene catalogs of all sequenced organisms and link each gene product to a pathway component
- Ligand Database: organize a database of all chemical compounds in living cells and link each compound to a pathway component
- Pathway Tools: develop new bioinformatics technologies for functional genomics, such as pathway comparison, pathway reconstruction, and pathway design
89Data Representation in KEGG
- Entity: a molecule or a gene
- Binary relation: a relation between two entities
- Network: a graph formed from a set of related entities
- Pathway: metabolic pathway or regulatory pathway
94KEGG Model
96KEGG query capabilities
- Searching and browsing
- Clickable maps
- Map coloring
- user provides a family of genes from gene expression data
- matching pathways are listed
- genes are colored on pathway maps
- Path finding between compounds
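Path finding between compounds is, at its simplest, a shortest-path search in a directed graph of compounds. The sketch below uses breadth-first search on a small hand-written graph; it is not KEGG's actual implementation or API, and the graph contents and function name are illustrative assumptions.

```python
from collections import deque


def find_path(graph, source, target):
    """Breadth-first search for a shortest path of compounds, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Illustrative compound graph: compound -> compounds reachable in one reaction.
COMPOUNDS = {
    "glucose": ["glucose-6-phosphate"],
    "glucose-6-phosphate": ["fructose-6-phosphate", "6-phosphogluconolactone"],
    "fructose-6-phosphate": ["fructose-1,6-bisphosphate"],
}

print(find_path(COMPOUNDS, "glucose", "fructose-1,6-bisphosphate"))
```

Because the graph is directed, a path from glucose to fructose-1,6-bisphosphate does not imply a path in the opposite direction.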
97Pathway models
98Concluding remarks
- We focused on what needs to be represented
- New kinds of queries
- Graph queries
- Comparison of models and traces
- Is flux q possible in steady state for network N?
- Similarity of networks based on the similarity of their flux cones
- Compare networks based on
- Their structure
- Their flux cone
- Their dynamic behavior
- What-if queries
- We did not cover logics for simulation
- linear logic, computation tree logic
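The steady-state flux query mentioned above has a simple core: a flux vector is feasible in steady state if the stoichiometric matrix times the flux vector is zero and no irreversible reaction runs backwards. The sketch below checks exactly that for a toy network; the matrix, the reaction layout, and the function name are assumptions for illustration.

```python
def is_steady_state_flux(S, v, reversible, tol=1e-9):
    """Check S v = 0 and non-negative flux through irreversible reactions.

    S: stoichiometric matrix as a list of rows (one row per metabolite)
    v: flux vector (one entry per reaction)
    reversible: list of booleans, one per reaction
    """
    # Mass balance: net production of every internal metabolite must be zero.
    for row in S:
        if abs(sum(s * f for s, f in zip(row, v))) > tol:
            return False
    # Irreversible reactions may only carry non-negative flux.
    return all(f >= -tol or rev for f, rev in zip(v, reversible))


# Toy network: R1 produces A, R2 converts A to B, R3 consumes B.
S = [
    [1, -1, 0],   # metabolite A
    [0, 1, -1],   # metabolite B
]
irreversible = [False, False, False]

print(is_steady_state_flux(S, [2, 2, 2], irreversible))
print(is_steady_state_flux(S, [2, 1, 1], irreversible))
```

The set of all flux vectors passing this check is the flux cone of the network, so comparing networks by their flux cones amounts to comparing the solution sets of these constraints.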