Title: Computational Biology and Bioinformatics in Computer Science
1 Computational Biology and Bioinformatics in Computer Science
Lenwood S. Heath
Department of Computer Science
2160J Torgersen Hall
Virginia Tech
Department Seminar Series
September 9, 2005
2 Overview
- Computational biology and bioinformatics (CBB)
- What is it?
- History at VT
- Some biological terminology
- CBB faculty and projects
- Education in CBB
- Bioinformatics option
- GBCB
- Conclusion
3 Computational Biology and Bioinformatics (CBB)
- Computational biology: computational research
inspired by biology
- Bioinformatics: application of computational
research (computer science, mathematics,
statistics) to advance basic and applied research
in the life sciences
- Agriculture
- Basic biological science
- Medicine
- Both ideally done within multidisciplinary
collaborations
4 CBB History (Part I)
- Biological modeling (Tyson, Watson): > 20 years
- Computational biology, genome rearrangements
(Heath): > 10 years
- Fralin Biotechnology sponsored faculty advisory
committee centered on bioinformatics: 1998-2000
- Biochemistry, biology, CALS, computer science
(Heath, Watson), statistics, VetMed
- Provost provided $1 million seed money
- First VT bioinformatics hire (Gibas, biology,
1999)
5 CBB History (Part II)
- Outside initiative submitted to VT for a campus
bioinformatics center: 1998
- Discussions of bioinformatics advisory committee
contributed to a proposal to the Gilmore
administration: 1999
- Governor Gilmore puts plans and money for
bioinformatics center in budget: 1999-2000
- Virginia Bioinformatics Institute (VBI)
established July 2000; housed in CRC
6 Virginia Bioinformatics Institute (VBI)
- Established by the state in July 2000; high
visibility
- Applies computational and information technology
in biological research
- Research faculty (currently, about 18); expertise
includes
- Biochemistry
- Comparative Genomics
- Computer Science
- Drug Discovery
- Human and Plant Pathogens
- Mathematics
- Physics
- Simulation
- Statistics
- More than $43 million in funded research
7 CBB History (Part III)
- Bioinformatics course and curriculum development
began with faculty subcommittee: 1999
- Courses supporting bioinformatics now in many
life science and computational science
departments, including
- Biology
- Biochemistry
- Computer Science
- Plant Pathology, Physiology, and Weed Science
(PPWS)
- Mathematics
- Statistics
8 Some Molecular Biology
- The encoded instruction set for an organism is
kept in DNA molecules.
- Each DNA molecule contains 100s or 1000s of
genes.
- A gene is transcribed to an mRNA molecule.
- An mRNA molecule is translated to a protein
(molecule).
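The two steps above (gene to mRNA, mRNA to protein) can be sketched in a few lines of Python. This is a simplified illustration, not part of the original slides: it assumes the input is the coding strand of a gene, so transcription reduces to replacing T with U, and it ignores splicing and regulation. The function names and the example sequence are hypothetical; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (simplification: T -> U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG start codon up to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no protein in this simple model
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence, chosen only for illustration.
example_gene = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
example_protein = translate(transcribe(example_gene))
```

Each amino acid is written with its one-letter code, so the result is a plain string that later analysis steps (for example, sequence comparison) can consume.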
9 Elaborating Cellular Function
[Diagram: DNA, mRNA, and Protein linked by Transcription, Translation (Genetic Code), Reverse Transcription, Regulation, and Degradation]
- Protein functions
- Structure
- Catalyze chemical reactions
- Regulate transcription
Thousands of Genes!
10 Chromosomes
- Large molecules of DNA: 10^4 to 10^8 base pairs.
- Human chromosomes: 22 matched pairs plus X and
Y.
- A gene is a subsequence of a chromosome that
encodes a protein.
- Proteins associated with regulation are present
in chromosomes.
- Every gene is present in every cell.
- Only a fraction of the genes are in use
(expressed) at any time.
11 Genomics
Genomics: Discovery of genetic sequences and
the ordering of those sequences into individual
genes, into gene families, and into chromosomes.
Identification of sequences that code for gene
products/proteins and sequences that act as
regulatory elements.
12 Functional Genomics
Functional Genomics: The biological role of
individual genes, mechanisms underlying the
regulation of their expression, and regulatory
interactions among them.
13 Challenges for Computer Science
- Analyzing and synthesizing complex experimental
data
- Representing and accessing vast quantities of
information
- Pattern matching
- Data mining
- Gene discovery
- Function discovery
- Modeling the dynamics of cell function
14 CBB Faculty in CS
- Chris Barrett (VBI, CS)
- Vicky Choi
- Roger Ehrich
- Edward A. Fox
- Lenny Heath
- Madhav Marathe (VBI, CS)
- T. M. Murali
- Chris North
- Alexey Onufriev
- Naren Ramakrishnan
- Adrian Sandu
- Eunice Santos
- João Setubal (VBI, CS)
- Cliff Shaffer
- Anil Vullikanti (VBI, CS)
- Layne Watson
- Liqing Zhang
15 Established CBB Faculty
- Layne Watson
- Lenny Heath
- Cliff Shaffer
- Naren Ramakrishnan
- Eunice Santos
16 Layne Watson
- Professor of Computer Science and Mathematics
- Expertise: algorithms, image processing, high
performance computing, optimization, scientific
computing
- Computational biology: has worked with John Tyson
(biology) for over 20 years
- JigCell: cell-cycle modeling environment with
Tyson, Shaffer, Ramakrishnan, Pedro Mendes of VBI
- Expresso: microarray experimentation with Heath,
Ramakrishnan
17 Lenny Heath
- Professor of Computer Science
- Expertise: algorithms, theoretical computer
science, graph theory
- Computational biology: worked in genome
rearrangements 10 years ago
- Bioinformatics: concentration in past 5 years
- Expresso: microarray experimentation with
Ramakrishnan, Watson
- Multimodal networks
- Computational models of gene silencing
18 Cliff Shaffer
- Associate Professor of Computer Science
- Expertise: algorithms, problem solving
environments, spatial data structures
- JigCell: cell-cycle modeling environment with
Ramakrishnan, Tyson, Watson
19 Naren Ramakrishnan
- Associate Professor of Computer Science
- Expertise: data mining, machine learning, problem
solving environments
- JigCell: cell-cycle modeling problem solving
environment with Shaffer, Watson
- Expresso: microarray experimentation with Heath,
Watson
- Proteus: inductive logic programming system for
biological applications
- Computational models of gene silencing
20 Eunice Santos
- Associate Professor of Computer Science
- Expertise: algorithms, computational biology,
computational complexity, parallel and
distributed processing, scientific computing
- Relevant bioinformatics project: modeling
progress of breast cancer
21 New CBB Faculty
- T. M. Murali (2003) CS bioinformatics hire
- Alexey Onufriev (2003) CS bioinformatics hire
- Adrian Sandu (2004) CS hire
- João Setubal (Early 2004) VBI and CS
- Vicky Choi (2004) CS bioinformatics hire
- Liqing Zhang (2004) CS bioinformatics hire
- Chris Barrett, Madhav Marathe (Fall 2004) VBI and
CS
- Anil Vullikanti (Fall 2004) VBI and CS
- Yang Cao (January, 2006) CS bioinformatics hire
22 T. M. Murali
- Assistant Professor of Computer Science
- Hired in 2003 for bioinformatics group
- Expertise: algorithms, computational geometry,
computational systems biology
- Projects
- Functional gene annotation
- xMotif: find patterns of coexpression among
subsets of genes
- RankGene: rank genes according to predictive
power for disease
23 Alexey Onufriev
- Assistant Professor of Computer Science
- Hired in 2003 for bioinformatics group
- Expertise: computational and theoretical
biophysics and chemistry, structural
bioinformatics, numerical methods, scientific
programming
- Projects
- Biomolecular electrostatics
- Theory of cooperative ligand binding
- Protein folding
- Protein dynamics: how does myoglobin take up
oxygen?
- Computational models of gene silencing
24 Adrian Sandu
- Associate Professor of Computer Science
- Hired in 2003
- Expertise: computational science, numerical
methods, parallel computing, scientific and
engineering applications
- Computational science
- New generation of air quality models
- Computational tools for assimilation of
atmospheric chemical and optical measurements
into atmospheric chemical transport models
25 João Setubal
- Research Associate Professor at VBI
- Associate Professor of Computer Science
- Joined in early 2004
- Expertise: algorithms, computational biology,
bacterial genomes
- Comparative genomics
26 Vicky Choi
- Assistant Professor of Computer Science
- Hired in 2004 for bioinformatics group
- Expertise: computational biology, algorithms
- Projects
- Algorithms for genome assembly
- Protein docking
- Biological pathways
27 Liqing Zhang
- Assistant Professor of Computer Science
- Hired in 2004 for bioinformatics group
- Expertise: evolutionary biology, bioinformatics
- Research interests
- Comparative evolutionary genomics
- Functional genomics
- Multi-scale models of bacterial evolution
28 Selected CBB Research Projects
- JigCell
- Expresso
- Multimodal Networks
- Computational Modeling of Gene Silencing
29 JigCell: A PSE for Eukaryotic Cell Cycle Controls
Marc Vass, Nick Allen, Jason Zwolak, Dan Moisa,
Clifford A. Shaffer, Layne T. Watson, Naren
Ramakrishnan, and John J. Tyson
Departments of Computer Science and Biology
30 Cell Cycle of Budding Yeast
[Figure: wiring diagram of the budding yeast cell-cycle control network. Labeled components include Cln2, Cln3 and Bck2, Clb2, Clb5, CDKs, SBF, MBF, Mcm1, Swi5, Cdc20, Cdh1, APC, SCF, Cdc14, Cdc15, Net1, Net1P, RENT, Tem1, Lte1, Bub2, Mad2, Pds1, Esp1, and PPX. Labeled events include growth, budding, DNA synthesis, unaligned chromosomes, and sister chromatid separation.]
31 JigCell Problem-Solving Environment
32 Why do these calculations?
- Is the model yeast-shaped?
- Bioinformatics role: the model organizes
experimental information.
- New science: prediction, insight
- JigCell is part of the DARPA BioSPICE suite of
software tools for computational cell biology.
33 Expresso: A Next Generation Software System for
Microarray Experiment Management and Data Analysis
34 Scenarios for Effects of Abiotic Stress on Gene
Expression in Plants
35 The Expresso Pipeline
36 Proteus: Data Mining with ILP
- ILP (inductive logic programming): a data mining
algorithm for inferring relationships or rules
- Proteus: efficient system for ILP in
bioinformatics context
- Flexibly incorporates a priori biological
knowledge (e.g., gene function) and experimental
data (e.g., gene expression)
- Infers rules without explicit direction
37 Fusion (Chris North)
- Snap together visualization environment
- Interactively linked data from multiple sources
- Data mining in the background
38 Sequence Analysis
- Evolution implies changes in genomic sequence
through mutations and other mechanisms
- Genomic or protein sequences that are similar
because of shared ancestry are called homologous
- Algorithms to detect homology provide access to
evolutionary relationships and perhaps function
conservation through genomic data.
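One common starting point for detecting sequence similarity is a global alignment score. The sketch below is an illustration added here, not a method named on the slide: it uses the classic Needleman-Wunsch dynamic program with a simple linear gap penalty. The scoring values (match, mismatch, gap) and the function name are assumptions chosen for readability; practical tools use substitution matrices and statistical significance estimates.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Only the optimal score is returned, so two rows of the dynamic
    programming table are enough.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a prefix of `a` against the empty prefix of `b`.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


# Hypothetical toy sequences: the second has one extra T inserted.
score_identical = global_alignment_score("GATTACA", "GATTACA")
score_one_insertion = global_alignment_score("GATTACA", "GATTTACA")
```

A higher score means the sequences can be aligned with fewer mismatches and gaps. Similarity alone is only evidence for homology, so scores like this are usually compared against what unrelated sequences would produce.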
39 Networks in Bioinformatics
- Mathematical Model(s) for Biological Networks
- Representation: What biological entities and
parameters to represent and at what level of
granularity?
- Operations and Computations: What manipulations
and transformations are supported?
- Presentation: How can biologists visualize and
explore networks?
40 Reconciling Networks
Munnik and Meijer, FEBS Letters, 2001
Shinozaki and Yamaguchi-Shinozaki, Current
Opinion in Plant Biology, 2000
41 Multimodal Networks
- Nodes and edges have flexible semantics to
represent
- Time
- Uncertainty
- Cellular decision making; process regulation
- Cell topology and compartmentalization
- Rate constants
- Phylogeny
- Hierarchical
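The idea of nodes and edges with flexible semantics can be made concrete with a small data structure. The sketch below is purely hypothetical and is not the representation used in the project: it assumes each node and edge carries a free-form attribute dictionary (for time, uncertainty, compartment, rate constants, and so on), and that an edge may connect several sources to several targets. All class, method, and attribute names, and the example values, are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str                      # e.g. "gene", "protein" (illustrative)
    attrs: dict = field(default_factory=dict)


@dataclass
class Edge:
    sources: tuple                 # names of source nodes
    targets: tuple                 # names of target nodes
    relation: str                  # e.g. "regulates" (illustrative)
    attrs: dict = field(default_factory=dict)


class MultimodalNetwork:
    """Hypothetical attributed network; semantics live in the attrs dicts."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, name, kind, **attrs):
        self.nodes[name] = Node(name, kind, attrs)

    def add_edge(self, sources, targets, relation, **attrs):
        # Reject edges that mention nodes the network does not contain.
        for name in (*sources, *targets):
            if name not in self.nodes:
                raise KeyError(f"unknown node: {name}")
        self.edges.append(Edge(tuple(sources), tuple(targets), relation, attrs))

    def edges_where(self, **conditions):
        """Return edges whose attributes match every given key/value pair."""
        return [
            edge for edge in self.edges
            if all(edge.attrs.get(k) == v for k, v in conditions.items())
        ]


# Made-up entities and values, for illustration only.
net = MultimodalNetwork()
net.add_node("geneA", "gene", compartment="nucleus")
net.add_node("proteinB", "protein", compartment="cytoplasm")
net.add_edge(["proteinB"], ["geneA"], "regulates", confidence=0.7, time="early")
early_edges = net.edges_where(time="early")
```

Keeping the semantics in attribute dictionaries means a new kind of annotation (for example, phylogeny) needs no change to the classes, at the cost of no schema checking.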
42 Using Multimodal Networks
- Help biologists find new biological knowledge
- Visualize and explore
- Generating hypotheses and experiments
- Predict regulatory phenomena
- Predict responses to stress
- Incorporate into Expresso as part of closing the
loop
43 Computational Modeling of Gene Silencing (CMGS)
Lenwood S. Heath, Richard Helm, Alexey Onufriev,
Naren Ramakrishnan, and Malcolm Potts
Departments of Computer Science and Biochemistry
44 RNA Interference (RNAi)
45 CMGS System
46 Other CBB Research Projects
- Bacterial genomics: Setubal
- xMotif: Murali
- Plant Orthologs and Paralogs (POPS): Heath,
Murali, Setubal, Zhang, Ruth Grene (plant
physiology)
- Protein structure and docking: Choi
- Whole-genome functional annotation: Murali
- Modeling biomolecular systems: Onufriev
47 CBB Education at VT
- CS has been training CS graduate students in CBB
since 2000
- Graduate bioinformatics option established in a
number of participating departments: 2003
- Ph.D. program in Genetics, Bioinformatics, and
Computational Biology (GBCB): 2003
- First GBCB students arrived Fall 2003; now in
third year
48 CBB Education in CS
- A key department of the Ph.D. program in
Genetics, Bioinformatics, and Computational
Biology (GBCB)
- Computation for the Life Sciences I, II
- Algorithms in Bioinformatics
- Systems Biology
- Structural Bioinformatics and Computational
Biophysics
- Databases for Bioinformatics
49 Conclusions
- Important research area in department
- Close collaboration between life scientists and
computational scientists from the beginning of
CBB research at VT
- Educational approach insists on adequate
multidisciplinary background
- Multidisciplinary collaborators work closely on a
regular basis
- Contributions to biology or medicine are
essential outcomes
50 Supported by
Next Generation Software
Information Technology Research
NSF