Title: Bacterial Comparative Genomics and Bioinformatics
1Bacterial Comparative Genomics and Bioinformatics
Research Seminar
- Yair Motro
- Center for Bioinformatics and Biological Computing
- http://cbbc.murdoch.edu.au
2An Introduction to CBBC
- Integrated Systems
- YABI client/server system for project management, hypothesis-driven bioinformatics analysis, pipeline creation, audit trails, comparative analysis
- MASV and Multiblast genome viewer
- ERIC interactive genome viewer
- GRENDEL resource management
- MUMSDB Microsatellite database (allele assignment)
- UTILITIES (EST clustering, mirroring data, custom database formatting)
- Bioinformatics Analysis Audit trails and Ontologies
- Comparative Genomic Analysis
- Mammalian species
- Human full-length cDNA annotation
- Cereal genome analysis
- Legume species analysis
- Bacterial genome annotation (Bacterial Bioinformatics Group)
- Genomics Based Vaccine Design and Development (Bacterial Bioinformatics Group)
- Computation
- Sequence alignment (SANKOFF)
- Feature-based Sequence Alignment
- Other
- Protein profile classification (Neural Networks)
- DNA fingerprint profiling
- Microarray data analysis
3Bacterial Bioinformatics Group
- Collaboration between CBBC (Head: Professor Matthew Bellgard) and the Murdoch University Bacteriology group (Head: Professor David Hampson)
- Currently involved in
- Annotation of two bacterial genomes
- Analysis of these two bacterial genomes (Comparative Genomics)
- Genomics based vaccine design and development (Reverse Vaccinology)
4Bacterial Genomics
- Genomics is the study of all the nucleotide sequences, including structural genes, regulatory sequences, and non-coding DNA segments, in the chromosomes of an organism.
- Comparative Bacterial Genomics is the analysis of multiple bacterial genomes.
7Arising Issues and Focus of PhD Research
- The discussed bioinformatics analysis pipeline relies heavily on comparative analyses, the essence of my research
- The research focuses on the use of comparative genomics in answering the question:
- What makes a good potential drug/vaccine candidate?
- Focus on key areas in bioinformatics
- Development of software solutions for general problems, with particular focus on visualisation issues
- Methodological limitations in comparative genomics-based analysis
8Research Questions
- What species should be compared to generate clusters?
- From these clusters, what genes would be considered unique/conserved?
- Is visualisation the limiting factor in such comparative analysis?
- Is it valid to find unique genes by comparing genomes of
- Related species?
- Unrelated species?
- Conservative evolution of genes, metabolic pathways, etc.
- Why are some proteins, pathways and characteristics conserved in some species (even those that are not taxonomically closely related)?
- Visualisations of these conservations?
- Hierarchy of conservation and uniqueness
- By using different types of comparisons, different levels/aspects of gene uniqueness and conservation are observed
- To what extent can the reverse vaccinology process be automated?
9Bacterial Genome Statistics
- First complete genome sequence available in 1995
- Now, more than 190 completely sequenced genomes publicly available
- Provides extensive data for comparative analysis
- All available statistics of completely sequenced bacterial genomes to date have been compiled into a single table (present study)
- Allowing for an initial comparative genomics analysis
10Bacterial Genome Statistics
11Comparative genomics of C. tetani, C. perfringens
and C. acetobutylicum identifies 1506 ORFs that
are homologous in all three genomes and 516 ORFs
that are unique to C. tetani (Bruggemann et al.,
2003). This type of comparative work can lead to
the identification of particular ORFs that are
unique to the microbe and could thus become
drug targets or vaccine candidates.
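Counts of shared and unique ORFs of this kind can be illustrated with a minimal Python sketch. It assumes the ORFs have already been grouped into homology clusters by an upstream step (for example a similarity search). The function name `classify_orfs`, the cluster layout and the ORF identifiers are hypothetical and are not taken from Bruggemann et al.

```python
def classify_orfs(clusters, target):
    """Split homology clusters into 'core' clusters and target-unique ORFs.

    clusters: dict mapping a cluster id to a list of (genome, orf_id) pairs.
    target:   name of the genome whose unique ORFs are wanted.
    Returns (core_cluster_ids, unique_orf_ids), both sorted.
    """
    # Every genome that appears anywhere in the input.
    genomes = {genome for members in clusters.values() for genome, _ in members}

    core, unique = [], []
    for cluster_id, members in clusters.items():
        present = {genome for genome, _ in members}
        if present == genomes:
            # At least one homologue in every genome.
            core.append(cluster_id)
        elif present == {target}:
            # Cluster contains ORFs from the target genome only.
            unique.extend(orf_id for _, orf_id in members)
    return sorted(core), sorted(unique)


# Toy data with made-up ORF identifiers (assumption, for illustration only).
toy_clusters = {
    "c1": [("C. tetani", "CT0001"), ("C. perfringens", "CP0001"),
           ("C. acetobutylicum", "CA0001")],
    "c2": [("C. tetani", "CT0002")],
    "c3": [("C. perfringens", "CP0002"), ("C. acetobutylicum", "CA0002")],
    "c4": [("C. tetani", "CT0003"), ("C. tetani", "CT0004")],
}
core_ids, unique_ids = classify_orfs(toy_clusters, "C. tetani")
```

Whether paralogues within one genome (cluster `c4` here) should count as unique is a design choice; this sketch counts them, which may or may not match the criteria used in the cited study.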
12Gene conservation among related species can be
depicted through complete genome sequence
alignments, such as this one between C. tetani
and C. botulinum (causative agents of tetanus and
botulism, respectively) (Bruggemann et al.,
2003). This is a powerful comparative analysis
tool, as particular segments of conservation can
be visually picked out and further analysed.
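As a rough illustration of the data behind such a view, the sketch below lists the coordinates of k-mers shared between two sequences, which is what a simple dot plot displays. This is a stand-in technique and not the alignment method used by Bruggemann et al. The function name `shared_kmer_matches`, the toy sequences and the value of k are assumptions, and only the forward strand is compared.

```python
from collections import defaultdict


def shared_kmer_matches(seq_a, seq_b, k):
    """Return sorted (i, j) pairs where seq_a[i:i+k] == seq_b[j:j+k]."""
    if k <= 0:
        raise ValueError("k must be a positive integer")

    # Index every k-mer of seq_a by its start positions.
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)

    # Look up each k-mer of seq_b in that index.
    matches = []
    for j in range(len(seq_b) - k + 1):
        for i in index.get(seq_b[j:j + k], ()):
            matches.append((i, j))
    return sorted(matches)


# Toy sequences (assumption): seq_b is a rotation of a repeat of seq_a's motif.
toy_matches = shared_kmer_matches("ACGTAC", "GTACGT", 3)
```

Runs of consecutive pairs such as (0, 2), (1, 3) form the diagonals that stand out visually as conserved segments.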
13From the genomic analysis, the ori (origin of
replication) for C. tetani was predicted. 81%
of all ORFs are transcribed in the same direction as
replication proceeds (termed co-directionality),
which was also found in other Gram-positive
bacteria (in contrast to Gram-negative bacteria).
This comparative analysis demonstrates that both
specific characteristics of the chromosome
(e.g. the location of the ori) and general
characteristics (e.g. co-directionality) can be
determined (Bruggemann et al., 2003).
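Co-directionality can be made concrete with a small Python sketch. It assumes a circular chromosome, bidirectional replication from the ori to a known terminus, and ORFs given as (midpoint, strand) pairs. The function names, the terminus coordinate and the toy ORFs are hypothetical and not values from the C. tetani genome.

```python
def on_clockwise_replichore(position, ori, ter, genome_length):
    """True if position lies on the arc running from ori to ter
    in the direction of increasing coordinates (wrapping around)."""
    return (position - ori) % genome_length < (ter - ori) % genome_length


def codirectional_fraction(orfs, ori, ter, genome_length):
    """Fraction of ORFs transcribed in the direction the fork travels.

    orfs: list of (midpoint, strand) pairs with strand '+' or '-'.
    On the clockwise replichore the fork moves with the '+' strand;
    on the other replichore it moves with the '-' strand.
    """
    if not orfs:
        raise ValueError("no ORFs supplied")
    codirectional = 0
    for position, strand in orfs:
        clockwise = on_clockwise_replichore(position, ori, ter, genome_length)
        if (strand == "+") == clockwise:
            codirectional += 1
    return codirectional / len(orfs)


# Toy chromosome of 1000 bp with ori at 100 and terminus at 600 (assumptions).
toy_orfs = [(200, "+"), (300, "-"), (700, "-"), (50, "-"), (900, "+")]
toy_fraction = codirectional_fraction(toy_orfs, ori=100, ter=600, genome_length=1000)
```

In this toy case three of the five ORFs are co-directional. Applied to a real annotation, the same calculation would yield the kind of percentage quoted on the slide.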
14The diagram demonstrates the use of comparative
genomics on the collagenase protein (involved in
the hydrolysis of collagen, thus potentially
contributing to the loss of tissue integrity in
the infected host). Part A diagrammatically
presents a segment comparison of collagenases
from a number of species of the
Clostridium/Bacillus group. This is useful in
determining insertions and deletions between
species, which can be used to predict
phylogenetic relationships among the species
(Part B) (Bruggemann et al., 2003).
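One simple way to turn insertion/deletion patterns into a measure of relatedness is sketched below in Python. It assumes each species has been scored for the presence (1) or absence (0) of the same ordered set of segments and counts the differences between every pair. This is an illustrative distance only, not the phylogenetic method of Bruggemann et al.; `indel_distances`, the species labels and the profiles are hypothetical.

```python
from itertools import combinations


def indel_distances(profiles):
    """Pairwise count of differing segments between presence/absence profiles.

    profiles: dict mapping a species label to a string of '0'/'1'
              characters, one character per segment, same length for all.
    Returns a dict mapping (label_a, label_b) to the number of differences.
    """
    if len({len(profile) for profile in profiles.values()}) > 1:
        raise ValueError("all profiles must cover the same segments")

    distances = {}
    for a, b in combinations(sorted(profiles), 2):
        distances[(a, b)] = sum(x != y for x, y in zip(profiles[a], profiles[b]))
    return distances


# Made-up profiles over four segments (assumption, for illustration only).
toy_profiles = {"species_A": "1100", "species_B": "1101", "species_C": "0011"}
toy_distances = indel_distances(toy_profiles)
closest_pair = min(toy_distances, key=toy_distances.get)
```

Pairs with few differences would be placed close together in a tree; a real analysis would normally feed such a matrix, or the underlying alignment, into a dedicated tree-building method.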
15Genome annotation can lead to the identification
of metabolic pathways, which may be essential to
the microbe's existence, and may also
differentiate it from its host. C. tetani has a
great number of genes that encode sodium
ion-dependent systems, which may be the reason
for the organism's successful invasion of
infected tissue (Bruggemann et al., 2003).
16References
- Bruggemann, H., et al., The genome sequence of Clostridium tetani, the causative agent of tetanus disease. Proc Natl Acad Sci U S A, 2003. 100(3): p. 1316-21.
- Bruggemann, H. and G. Gottschalk, Insights in metabolism and toxin production from the complete genome sequence of Clostridium tetani. Anaerobe, 2003. In press.
- Adu-Bobie, J., et al., Two years into reverse vaccinology. Vaccine, 2003. 21(7-8): p. 605-10.
- Motro, Y., Dunn, D., La, T., Hampson, D., Bellgard, M., A decade of bacterial genome sequencing. 2004. In progress.