Title: BioDeducta
1BioDeducta
KnowOS Knowledge Operating System
Astro-Biology Road-Map: Fundamental Questions
- How does life begin and evolve?
- Does life exist elsewhere in the universe?
- What is the future of life on Earth and beyond?
2BioDeducta
The Cyanobacteria and Cyanobacterial Extremophiles
- Heavily influenced the early and modern atmosphere.
- Occupy a very wide range of niches, from deserts to oceans to Antarctica, and everywhere in between!
- Cyanobacteria lived happily at the Earth's surface before 3.85 billion years ago, despite heavy bombardment.
nai.arc.nasa.gov /...
J Waterbury, Woods Hole/NAI
3BioDeducta
KnowOS Knowledge Operating System
Computational tools are critical complements to experiment
- Even simple organisms are too complex to reason about.
- Most experiments cannot be run.
Molecular biology so far has only seen the tip of the iceberg
- Probably over 100 species of cyanobacteria in the oceans alone.
- Many more on land, in nearly every niche.
The biocomputing community can't provide off-the-shelf tools
- Most tools are designed to understand medically-related problems.
4How do cells control response to light?
I.e., What genes are related to the adaptation
to high light?
Prochlorococcus MED4
Prochlorococcus MIT9313
5Hihara, Kamei, Kanehisa, Kaplan, and Ikeuchi
(2001) DNA microarray analysis of cyanobacterial
gene expression during acclimation to high light.
Plant Cell, 13(4)
6How do cells control response to light?
I.e., What genes are related to the adaptation
to high light?
Outline Protocol
Look for:
- Gene present in Prochlorococcus MED4
  (MED4 is naturally adapted to grow in high light.)
- Ortholog absent in Prochlorococcus MIT9313
  (MIT9313 is naturally adapted to grow in low light.)
- Ortholog present in Synechocystis PCC 6803
  (In order to make contact with annotation and microarray data.)
- Synechocystis PCC 6803 ortholog responds to high light
  (Gene turns on by a factor > 2 in response to high light.)
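The four criteria above amount to a filter over the MED4 genes. The following is a minimal Python sketch of that filter, not the tool used in the talk. The table layout, the function name, and every gene ID and ratio are made-up placeholders chosen only to exercise each criterion.

```python
# Hypothetical ortholog table: MED4 gene -> {other organism: ortholog ID}.
# All IDs are placeholders, not real annotations.
ORTHOLOGS = {
    "med4_gene_a": {"PCC6803": "syn_gene_a"},
    "med4_gene_b": {"MIT9313": "mit_gene_b", "PCC6803": "syn_gene_b"},
    "med4_gene_c": {"PCC6803": "syn_gene_c"},
    "med4_gene_d": {},
}

# Hypothetical high-light / normal-light expression ratios for
# Synechocystis PCC 6803 genes (invented numbers, not microarray data).
EXPRESSION_RATIO = {"syn_gene_a": 3.1, "syn_gene_b": 4.0, "syn_gene_c": 1.2}


def high_light_candidates(orthologs, ratios, threshold=2.0):
    """Return (MED4 gene, PCC 6803 ortholog) pairs that pass the protocol."""
    hits = []
    for gene, by_organism in orthologs.items():
        # Criterion 2: no ortholog in the low-light strain MIT9313.
        if "MIT9313" in by_organism:
            continue
        # Criterion 3: an ortholog must exist in Synechocystis PCC 6803.
        syn_gene = by_organism.get("PCC6803")
        if syn_gene is None:
            continue
        # Criterion 4: that ortholog turns on by more than `threshold`.
        if ratios.get(syn_gene, 0.0) > threshold:
            hits.append((gene, syn_gene))
    return hits


print(high_light_candidates(ORTHOLOGS, EXPRESSION_RATIO))
```

Criterion 1 (gene present in MED4) is implicit here because the table is keyed by MED4 genes.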
7BioDeducta
KnowOS Knowledge Operating System
Goals: To provide a means for scientists (esp. biologists) to:
1. Express conjectures in a formal language
2. Test these conjectures against data and background knowledge
3. Discover and manipulate computable models
4. Share conjectures with others in the community
8BioDeducta
KnowOS
- Web Based Listener
- Object Persistence
- Security v. Collaboration
- Process Management
- Efficient Scripting
- Integrated Knowledge
- Toolset Integration
- Ability to read Upside Down
9BioDeducta
KnowOS Knowledge Operating System
BioBike
Mnemotheque
CACHE
Other Apps...
KnowOS (Knowledge Operating System)
Snark Theorem Prover
Integrated Knowledge Base
Allegro Common Lisp
Linux (standard tools)
AllegroCache OODB
Relational DB
11BioDeducta
Language for Expressing Conjectures, and Platform for Analysis
A. First Order Logic (FOL) representation
B. Subject Domain Theory
C. Biological Process (and entities) Ontology
D. Visual query language
Figure labels: Goal Query, Subject Domain Theory
15BioDeducta
16BioDeducta
Goal Query
Result:
  ?gene = PMED4.PMM0817
  ?organism2 = prochlorococcus_marinus_mit9313
  ?experiment = HIHARA
  ?organism3 = synechocystis_pcc6803
  ?gene3 = S6803.ssr2595
I.e., a low-light organism that has no ortholog to ?gene is Prochlorococcus marinus MIT9313. Experiments were performed by Hihara on the organism Synechocystis PCC 6803, and a high regulation ratio was discovered in those experiments on gene S6803.ssr2595, which is an ortholog of PMM0817.
The annotation for PMM0817 reads "possible high-light inducible protein".
(Matches the results from Bhaya, Dufresne, Vaulot, and Grossman, Analysis of the hli gene family in marine and freshwater cyanobacteria. FEMS Letters, 2002, 205(2). PMM0817 is called hli17 in this paper.)
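The bindings above can be read as one solution to a conjunctive query over a set of facts. The sketch below is a toy backtracking matcher in Python that illustrates the idea. It is not the Snark theorem prover and not the actual BioDeducta query. The predicate names, the fact encoding, and the identifier prochlorococcus_marinus_med4 are assumptions made for illustration. Only the gene, organism, and experiment names in the result come from the slide.

```python
def is_var(term):
    """Variables are strings that start with '?', as on the slide."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if new.setdefault(p, f) != f:  # already bound to something else
                return None
        elif p != f:
            return None
    return new


def solve(goals, facts, bindings=None):
    """Yield every binding set that satisfies all goals in order.

    A goal of the form ("not", pattern) succeeds when no fact matches
    the pattern under the current bindings (negation as failure).
    """
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    first, rest = goals[0], goals[1:]
    if first[0] == "not":
        if not any(match(first[1], f, bindings) is not None for f in facts):
            yield from solve(rest, facts, bindings)
        return
    for fact in facts:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, facts, extended)


# Assumed fact encoding; predicate names are hypothetical.
FACTS = [
    ("gene-of", "PMED4.PMM0817", "prochlorococcus_marinus_med4"),
    ("adapted-to", "prochlorococcus_marinus_med4", "high-light"),
    ("adapted-to", "prochlorococcus_marinus_mit9313", "low-light"),
    ("ortholog", "PMED4.PMM0817", "synechocystis_pcc6803", "S6803.ssr2595"),
    ("up-regulated", "HIHARA", "S6803.ssr2595"),
]

GOALS = [
    ("gene-of", "?gene", "?organism1"),
    ("adapted-to", "?organism1", "high-light"),
    ("adapted-to", "?organism2", "low-light"),
    ("not", ("ortholog", "?gene", "?organism2", "?any")),
    ("ortholog", "?gene", "?organism3", "?gene3"),
    ("up-regulated", "?experiment", "?gene3"),
]

for solution in solve(GOALS, FACTS):
    print(solution)
```

The negated goal is placed after the goals that bind ?gene and ?organism2, since negation as failure only gives the intended answer once those variables have values.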
18BioDeducta
Verbal Model
When "awake" (day), the cell regulates its photosystem (PS) genes so as to match photosynthetic output to energy demands. When the available light exceeds its needs, the PS is down-regulated, leading to an "M" pattern of expression. At night, the cell "sleeps", leading to another drop in expression.
Graphical Model
19BioDeducta
Knowledge-Based Microarray Causal Analysis
Constraints
20BioDeducta
An average model that satisfies the constraints
in the given constraint list
21Get the model in computable form.
BioDeducta
22BioDeducta
Computable Model
(photosynthesis isa process with
  inputs (chloroplast-inside.water everywhere.light
          chloroplast-outside.nadph chloroplast-outside.adp
          chloroplast-outside.pi)
  outputs (chloroplast-outside.atp chloroplast-outside.nadph
           everywhere.o2)
  implemented-by photosystem)
(photosystem composition (psii antenna-array atpase pq-pool))
(light-absorption isa process with
  inputs (everywhere.light)
  outputs (chlorophyll.energy)
  function absorption
  implemented-by chlorophyll)
(light-energy-concentration isa process with
  outputs psii.energy
  driver chlorophyll.energy
  function concentration
  implemented-by antenna-array)
(psii-water-breakdown isa process with
  inputs (chloroplast-inside.water)
  driver psii.energy
  outputs (psii.e- psii.e- chloroplast-inside.h chloroplast-inside.o2)
  function molecular-splitting
  implemented-by psii)
(psii-pq-reduction isa process with
  inputs (psii.e- chloroplast-membrane.h
          chloroplast-membrane.plastoquinone)
  outputs (chloroplast-membrane.plastoquinol)
  function reduction
  implemented-by psii
  inhibited-by dcmu)
24BioDeducta
Explanation by Pathway Tracing
(track-object 'chloroplast-inside.water)
Tracking CHLOROPLAST-INSIDE.WATER
 -> PHOTOSYNTHESIS
  Tracking CHLOROPLAST-OUTSIDE.ATP
  Tracking CHLOROPLAST-OUTSIDE.NADPH
  Tracking EVERYWHERE.O2
 -> PSII-WATER-BREAKDOWN
  Tracking PSII.E-
   -> PSII-PQ-REDUCTION
    Tracking CHLOROPLAST-MEMBRANE.PLASTOQUINOL
     -> E-FUNNLING-PSII-TO-PSI
      Tracking PSI.E-
       -> PSI-NADPH-FORMATION
  Tracking CHLOROPLAST-INSIDE.H
   -> ATP-FORMATION
  Tracking CHLOROPLAST-INSIDE.O2
   -> O2-DIFFUSSION
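A trace like the one above can be produced by following each object into every process that consumes it, then recursing on that process's outputs. The Python sketch below is an illustrative re-implementation under that assumption, not the original track-object function. It encodes only the three processes from the Computable Model slide that list inputs, so the later steps of the trace (E-FUNNLING-PSII-TO-PSI, PSI-NADPH-FORMATION, ATP-FORMATION, O2-DIFFUSSION) do not appear. The dictionary layout and the visited-process rule are assumptions.

```python
# Subset of the computable model, transcribed as plain dictionaries.
PROCESSES = {
    "photosynthesis": {
        "inputs": ["chloroplast-inside.water", "everywhere.light",
                   "chloroplast-outside.nadph", "chloroplast-outside.adp",
                   "chloroplast-outside.pi"],
        "outputs": ["chloroplast-outside.atp", "chloroplast-outside.nadph",
                    "everywhere.o2"],
    },
    "psii-water-breakdown": {
        "inputs": ["chloroplast-inside.water"],
        "outputs": ["psii.e-", "psii.e-", "chloroplast-inside.h",
                    "chloroplast-inside.o2"],
    },
    "psii-pq-reduction": {
        "inputs": ["psii.e-", "chloroplast-membrane.h",
                   "chloroplast-membrane.plastoquinone"],
        "outputs": ["chloroplast-membrane.plastoquinol"],
    },
}


def track_object(obj, processes, depth=0, visited=None, log=None):
    """Return trace lines for obj and everything produced downstream of it."""
    visited = set() if visited is None else visited
    log = [] if log is None else log
    indent = "  " * depth
    log.append(f"{indent}Tracking {obj.upper()}")
    for name, spec in processes.items():
        # Follow each consuming process once, so cycles terminate
        # (photosynthesis lists nadph as both an input and an output).
        if obj in spec["inputs"] and name not in visited:
            visited.add(name)
            log.append(f"{indent} -> {name.upper()}")
            # dict.fromkeys drops repeated outputs while keeping order.
            for out in dict.fromkeys(spec["outputs"]):
                track_object(out, processes, depth + 1, visited, log)
    return log


print("\n".join(track_object("chloroplast-inside.water", PROCESSES)))
```

Adding the remaining processes to the dictionary would extend the trace without changing the function.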
25BioDeducta
Bayes Community model
26BioDeducta
Bayes Community model of analysis integration
A. Each analyst forms her/his own ACH matrix.
B. Conjectures from one matrix can be used as evidence in another.
C. Bayes Community mechanism maintains the projected influence network.
Figure labels: One Scientist, Another Scientist, Scientific Community
http://www.dkfz.de/tbi/projects/images/BayesianNetwork.jpg
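Point B can be made concrete with a small numerical sketch. The slide does not say how one analyst's conjecture enters another's matrix, so the Python below shows only one possible reading: analyst A's posterior belief in a conjecture is treated as uncertain evidence for analyst B and combined using Jeffrey's rule. All hypothesis names and probabilities are invented for illustration.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(h | e) from priors P(h) and likelihoods P(e | h)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: value / total for h, value in joint.items()}


def soft_update(priors, likelihoods, p_evidence):
    """Jeffrey's rule: the evidence is only true with probability p_evidence.

    The result mixes the posterior given the evidence and the posterior
    given its negation, weighted by p_evidence and 1 - p_evidence.
    """
    if_true = posterior(priors, likelihoods)
    if_false = posterior(priors, {h: 1.0 - p for h, p in likelihoods.items()})
    return {
        h: p_evidence * if_true[h] + (1.0 - p_evidence) * if_false[h]
        for h in priors
    }


# Analyst A (hypothetical): does the gene respond to high light?
a_posterior = posterior(
    priors={"responds": 0.5, "no-response": 0.5},
    likelihoods={"responds": 0.8, "no-response": 0.2},  # P(data | h), made up
)

# Analyst B (hypothetical): two competing models. The likelihoods give
# P(A's conjecture is true | model); both numbers are made up.
b_posterior = soft_update(
    priors={"model-x": 0.5, "model-y": 0.5},
    likelihoods={"model-x": 0.9, "model-y": 0.3},
    p_evidence=a_posterior["responds"],
)

print(a_posterior)
print(b_posterior)
```

Because B's result depends on A's posterior, re-running A's update with new data and then B's update propagates the change, which is the kind of influence network point C refers to.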
27BioDeducta
KnowOS Knowledge Operating System
Possible spin-offs:
- Bugs on the Moon?
- KnowOS use in other science missions? (Climate CACHE)
www.biobike.org or www.knowos.org
28BioDeducta
KnowOS Knowledge Operating System
Participants:
- Jeff Shrager, Steve Racunas, Pat Langley, Stephen Bay at Inst. for the Study of Learning and Expertise
- Andrew Pohorille, Astrobiology at NASA Ames Res. Ctr.
- Richard Waldinger at SRI (supported by NSF)
Other contributors: JP Massar, Mike Travers, Mark Slupesky, Bob Haxo, Pat Langley, Marc Stickel, Franz, Inc., Lispworks.
Collaboration with experimentalists:
- Reysenbach lab (Portland State)
- Grossman lab (Carnegie Inst., DPB)
- Nigam Shah and Amar Das (Stanford)
- Jeff Elhai (VCU) - Cyanobacteria
www.biobike.org or www.knowos.org