Title: What next after the Human Genome Project
1What next after the Human Genome Project?
Ehud Shapiro Weizmann Institute of Science
2The talk will have three parts
- Molecular Biology in One Slide
- Why are computers so successful in some parts of
biology?
- Why not (yet) in systems biology, and what should
be done about it?
3Part I: Molecular Biology in One Slide
4A computer scientist's view of Molecular Biology
in one slide
- Sequence: Sequence of DNA and Proteins
7To a computer scientist: Labeled 3D graphs
8Molecular Biology in One Slide
- Sequence: Sequence of DNA and Proteins
- Structure: 3D Structure of Proteins and other
biomolecules and molecular complexes
10E. coli metabolism
11S. cerevisiae protein-protein interactions
14Molecular Biology in One Slide
- Sequence: Sequence of DNA and Proteins
- Structure: 3D Structure of Proteins and other
biomolecules and biomolecular complexes
- Systems: Function, activity and interaction of
molecular systems in cells (and beyond)
15Part II: Why are computers so successful in
sequence and structure biology?
16Computers are the means for consolidating
sequence biology
- Computers are key to sequence identification
- Computer databases store accumulated sequence
information
- Computer algorithms are used for sequence analysis
17Computers are the means for consolidating
sequence biology
- Computers are used to share, compare,
criticize and correct sequence information
- The result: Scientists converge to consensus
sequences quickly and effectively
18Computers are the means for consolidating
structural biology
- Computers are key to structure identification
- Computer databases store accumulated structure
information
- Computer algorithms are used for structure
analysis
19Computers are the means for consolidating
structural biology
- Computers are used to share, compare,
criticize and correct structure information
- The result: Scientists converge to consensus
structures quickly and effectively
20Computer-based consolidation of systems biology?
- Tens of thousands of articles a year about the
function, activity and interaction of molecular
systems in cells
- Knowledge is encapsulated in prose, pictures and
diagrams
- Where are the computers?
21Computer-based consolidation of systems biology
would allow
- Handling the huge amount of accumulated knowledge
- An objective knowledge repository
- Sharing, comparing, criticizing and correcting
accumulated knowledge
- Converging to a consensus quickly and effectively
22Computer-based consolidation of systems biology
- The deep reason: Good abstractions are in use
for sequence and structure knowledge, but there is no
successful abstraction (yet) for systems biology
23What is an abstraction?
- a mapping from a real-world domain to a
mathematical domain (homomorphism)
- highlights essential properties while ignoring
other, complicating, ones.
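One way to read the "homomorphism" remark, offered as a sketch only: the symbols below (the map alpha, the domains R and M, and the operations f_R and f_M) are notation introduced here for illustration and do not appear in the talk. The abstraction maps real-world objects to mathematical ones so that operating in the real world and then abstracting gives the same result as abstracting first and then operating in the mathematical domain.

```latex
\alpha : R \to M,
\qquad
\alpha\bigl(f_R(x, y)\bigr) = f_M\bigl(\alpha(x), \alpha(y)\bigr)
\quad \text{for all } x, y \in R
```

Under this reading, "ignoring other, complicating, properties" corresponds to alpha not being one-to-one: distinct real-world objects may map to the same mathematical object.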
24Sequence biology uses the DNA-as-string
abstraction
- Relevant: Captures sequence information, ignoring
many biochemical properties
- Computable: Enables string algorithms and
efficient databases
- Understandable: A string over {A, T, C, G} is the
universal format for genetic information
- Extensible: E.g., the addition of a fifth symbol
denoting methylated cytosine.
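The four properties above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names, the choice of "M" as the fifth symbol for methylated cytosine, and the example sequences are not taken from the talk.

```python
# DNA-as-string: a sequence is just a string over a small alphabet.
DNA_ALPHABET = frozenset("ATCG")

# Extensible: a hypothetical fifth symbol "M" stands for methylated cytosine.
EXTENDED_ALPHABET = DNA_ALPHABET | {"M"}

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_valid(seq, alphabet=DNA_ALPHABET):
    """Return True if every symbol of seq belongs to the given alphabet."""
    return set(seq) <= alphabet


def reverse_complement(seq):
    """Computable: a plain string algorithm over the base alphabet."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def strip_methylation(seq):
    """Project an extended-alphabet string back onto A, T, C, G."""
    return seq.replace("M", "C")


def find_all(seq, motif):
    """Return all 0-based start positions of motif in seq (overlaps allowed)."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions
```

For example, `is_valid("GATMACA")` is false under the base alphabet but true when `EXTENDED_ALPHABET` is passed, and `strip_methylation` recovers a string that the unmodified algorithms accept.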
25Structural biology uses the Protein-as-3D-labeled
graph abstraction
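A labeled 3D graph can be sketched as a small data structure: nodes carry a label and a position in space, and edges carry a label such as a bond type. The class and method names below are hypothetical, and what a node stands for (an atom, a residue) is left open; the talk does not specify a representation.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    label: str      # e.g. an element symbol or a residue name
    xyz: tuple      # (x, y, z) position in 3D space


@dataclass
class LabeledGraph3D:
    nodes: dict = field(default_factory=dict)   # node id -> Node
    edges: dict = field(default_factory=dict)   # frozenset({a, b}) -> edge label

    def add_node(self, node_id, label, xyz):
        self.nodes[node_id] = Node(label, tuple(xyz))

    def add_edge(self, a, b, label):
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges[frozenset((a, b))] = label

    def neighbors(self, node_id):
        """Return the sorted ids of nodes sharing an edge with node_id."""
        return sorted(
            other
            for edge in self.edges
            if node_id in edge
            for other in edge
            if other != node_id
        )

    def distance(self, a, b):
        """Euclidean distance between two nodes, in the units of xyz."""
        return math.dist(self.nodes[a].xyz, self.nodes[b].xyz)
```

With this in place, graph algorithms (neighborhood queries, path search) and geometric queries (distances) can operate on the same object, which is the point of combining the labels, the graph and the 3D coordinates.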
26Part III: Why are computers not (yet) as useful in
systems biology, and what should be done about
it?
27Which abstraction for systems biology?
"Which abstraction?" may be the most crucial
question facing systems biology today.
28Further reading: Cellular Abstractions: Cells
as Computation, Regev and Shapiro, Nature,
September 26th, 2002
29A human engineering history perspective
- Sequence: plowing a field
- Structure: building a house
- Systems: building and flying an airplane
30A computer science perspective
- Sequence (strings): 1936 to the 50s
- Structure (graphs): 50s to the 70s
- Systems (processes): 60s to the present and future
31The New Biology
- The cell as an information processing device
- Cellular information processing and passing are
carried out by networks of interacting molecules
- Ultimate understanding of the cell requires an
information processing model
- Which?
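As one possible illustration of molecules as interacting, information-passing processes, the sketch below models two molecules as concurrent Python coroutines that interact by exchanging messages over channels. This is an assumption made for illustration, not the model proposed in the talk: the process names, the channel names and the "-P" tag marking a modified molecule are all hypothetical.

```python
import asyncio


async def enzyme(bind, release):
    """Hypothetical enzyme process: accept a molecule, emit a modified form."""
    while True:
        molecule = await bind.get()           # interaction = receive on a channel
        await release.put(molecule + "-P")    # "-P" is an illustrative tag only


async def substrate(name, bind, release):
    """Hypothetical substrate process: offer itself, then wait for the result."""
    await bind.put(name)
    return await release.get()


async def run_cell():
    bind = asyncio.Queue()
    release = asyncio.Queue()
    enzyme_task = asyncio.create_task(enzyme(bind, release))
    try:
        return await substrate("S", bind, release)
    finally:
        enzyme_task.cancel()   # the enzyme loops forever, so stop it explicitly


def simulate():
    """Run the two-process system to completion and return the final state."""
    return asyncio.run(run_cell())
```

Here `simulate()` returns the state of the substrate after one interaction. The design choice worth noting is that neither process refers to the other directly; they are coupled only through the shared channels, so more processes can be added to the network without changing existing ones.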
32Conclusions
- The most advanced tools for describing computer
systems may also be the best tools for
describing biomolecular systems
- It would be great if we could build on the decades-long effort
in the study of concurrency in computer science
- An essential foundation for a virtual (in silico)
cell project