Title: Computational Microbiology
1Computational Microbiology
- Microb 343
- David Wishart
- david.wishart_at_ualberta.ca
2Objectives
- Definition of computational microbiology
- Raise awareness of new kinds of database
resources for learning about the molecular
aspects of microbes
- Discussion of CCDB, BacMap, BASys and other tools
(MIPS, CMR)
3What is Computational Microbiology?
- Microbiology on a laptop
- Using computers and databases to aid in
microbiological research and discovery
- Includes computational biology, bioinformatics,
systems biology and biological simulation
- An (indispensable) adjunct to basic microbiology
and key to microbial genomics, proteomics and
metabolomics
4The CCDB
http://redpoll.pharmacy.ualberta.ca/CCDB
Or just type "CCDB" and "coli" into Google
5The CCDB
- Assembled using a combination of both manual and
automatic annotation methods
- Covers DNA, RNA, proteins, metabolites
- Uses many of the tools (sequence alignment,
protein property prediction) we've already seen
to help infer name, function, structure and other
properties
- Builds from other databases on E. coli
6E. coli Alliance
7The CyberCell Database (CCDB)
- Most complete, current, quantitative collection
of molecular data on E. coli
- Web accessible, Web browsable, self-updating
- Supports many kinds of query, viewing and
browsing options
- Structured using ColiCards as in the GeneCards
database
- Informatic foundation for coordinating and
integrating data collected for Project CyberCell
9CCDB ColiCard
10ColiCard Contents
- Functional info (predicted or known)
- Sequence information (sites, modifications, pI,
MW, cleavage)
- Location information (in chromosome and cell)
- Interacting partners (known and predicted)
- Structure (2°, 3°, 4°, predicted)
- Enzymatic rate and binding constants
- Abundance, copy number, concentration
- Links to other sites and viewing tools
- Integrated version of all major DBs
- 70 fields for each entry
11CCDB Annotation
12CCDB Annotation
13Searching Capabilities
- Text search, BLAST search, SQL search
- Example: "Show all membrane proteins that are essential
and have more than 6 membrane-spanning regions"
- Chemical structure search
- Example: "Find all metabolites similar to this prospective
drug structure"
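The membrane-protein question above maps naturally onto an SQL query. The sketch below is only an illustration: the slides do not show the real CCDB schema, so the table name `protein`, its columns and the four rows are all hypothetical, and an in-memory SQLite database stands in for the CCDB server.

```python
import sqlite3

# Hypothetical schema: the real CCDB table and column names are not given in the slides.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE protein (
        name       TEXT PRIMARY KEY,
        location   TEXT,     -- e.g. 'inner membrane', 'cytoplasm'
        essential  INTEGER,  -- 1 = essential, 0 = not essential
        tm_regions INTEGER   -- number of membrane-spanning regions
    )
    """
)

# Made-up rows, for illustration only.
rows = [
    ("protA", "inner membrane", 1, 12),
    ("protB", "inner membrane", 1, 4),
    ("protC", "cytoplasm", 1, 0),
    ("protD", "outer membrane", 0, 8),
]
conn.executemany("INSERT INTO protein VALUES (?, ?, ?, ?)", rows)


def essential_multispan_membrane_proteins(connection, min_regions=6):
    """Return names of essential membrane proteins with more than min_regions spans."""
    cursor = connection.execute(
        "SELECT name FROM protein "
        "WHERE location LIKE '%membrane%' AND essential = 1 AND tm_regions > ? "
        "ORDER BY name",
        (min_regions,),
    )
    return [row[0] for row in cursor.fetchall()]


hits = essential_multispan_membrane_proteins(conn)
print(hits)
```

The point is that each condition in the plain-English question becomes one clause of the `WHERE` statement, which is why this kind of query is awkward with a simple text search.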
14How To Use CCDB
- Click on the Search button and type the name of
your favorite E. coli protein
- Explore what's known about it by clicking on the
corresponding links
- Learn about E. coli's molecular details by
clicking on "Stats" or "E. coli" in the menu bar
at the top
- Key is to explore and click links; not
everything on E. coli is in CCDB proper
15BacMap
- Picking up where we left off with the CCDB
(Google "bacmap")
- Idea is to generate a visual atlas of all (not
just Escherichia coli) bacterial chromosomes and
plasmids but with links to extensive genome
annotation
- Attempt to re-use annotation and graphing tools
originally developed for the CCDB
16BacMap
http://wishart.biology.ualberta.ca/BacMap/
17BacMap
18Text Search Tools
19Sequence Search Tools
20Bacterial Biography Card
21Genome Statistics
22Proteome Statistics
23BacMap
- Each genome has a short description of the
organism and sequence data
- Supports zoomable, hyperlinked, clickable map
views of the genome
- Supports text search of gene names, protein names
and synonyms
- Supports BLAST search and supplies genome-wide
stats
Stothard P, et al. BacMap: an interactive picture
atlas of annotated bacterial genomes. Nucleic
Acids Res. 2005 Jan 1;33(Database issue):D317-20.
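As a rough idea of what a genome-wide statistic involves, here is a minimal sketch that computes sequence length and GC content. The slides do not list which statistics BacMap actually reports, so these two are assumed as typical examples, and the short sequence is a toy string, not real genome data.

```python
def genome_stats(sequence):
    """Return the length and GC percentage of a DNA sequence string."""
    seq = sequence.upper()
    length = len(seq)
    gc_count = sum(1 for base in seq if base in "GC")
    gc_percent = 100.0 * gc_count / length if length else 0.0
    return {"length": length, "gc_percent": gc_percent}


# Toy sequence for illustration only (not taken from any genome).
toy_sequence = "ATGCGCGATATTGCCA"
stats = genome_stats(toy_sequence)
print(stats)
```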
24When To Use BacMap?
- Any time you need to find molecular biological
information about a bacterium (DNA, RNA or
protein)
- Any time you need to find general information
about a bacterium
- Does not provide the same detailed information as
CCDB and does not support the advanced querying
found in CCDB (no metabolic information)
25What if Your Organism or Genome isn't in BacMap?
http://wishart.biology.ualberta.ca/basys/
26BASys
- Bacterial Annotation System
- A publicly available web server that performs
automated annotation of bacterial genomes given
only the gene sequence of a chromosome or plasmid
- Takes about 24 hrs for an average genome (4
megabases)
- Output includes images and annotation text (about
70 fields for each gene)
27BASys
- Reads either a genome sequence (finds genes using
Glimmer, performs translation) or a proteome
sequence (multi-FASTA format)
- Performs BLAST search against CCDB to see if it
can transfer annotations from this database first
- Performs BLAST search against SwissProt to see if
it can transfer annotations from this DB second
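The "CCDB first, SwissProt second" order can be sketched as a simple fall-through loop. This is an illustration under stated assumptions, not BASys code: the exact-match lookup is a stand-in for BLAST, the E-value cutoff of 1e-10 is an arbitrary assumed threshold, and the sequences and annotation strings are made up.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A search function takes a protein sequence and returns
# (annotation, e_value) for its best hit, or None if there is no hit.
SearchFn = Callable[[str], Optional[Tuple[str, float]]]


def make_toy_search(database: Dict[str, str]) -> SearchFn:
    """Stand-in for BLAST: exact-match lookup with a fake E-value of 0.0."""
    def search(sequence: str) -> Optional[Tuple[str, float]]:
        if sequence in database:
            return database[sequence], 0.0
        return None
    return search


def transfer_annotation(sequence: str,
                        sources: List[Tuple[str, SearchFn]],
                        max_evalue: float = 1e-10) -> dict:
    """Try each source in order; keep the first hit that passes the cutoff."""
    for source_name, search in sources:
        hit = search(sequence)
        if hit is not None and hit[1] <= max_evalue:
            return {"annotation": hit[0], "source": source_name}
    # Nothing transferable: left for the prediction programs.
    return {"annotation": None, "source": "unannotated"}


# Toy databases with made-up sequences and annotations.
ccdb = make_toy_search({"MKT": "toy CCDB annotation"})
swissprot = make_toy_search({"MKT": "toy SwissProt annotation",
                             "MLS": "another toy SwissProt annotation"})
sources = [("CCDB", ccdb), ("SwissProt", swissprot)]

for protein in ("MKT", "MLS", "AAA"):
    print(protein, transfer_annotation(protein, sources))
```

Recording the source name alongside each annotation is what lets a pipeline like this report where an annotation came from.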
28BASys
- For all remaining proteins and all remaining
fields (MW, pI, structure, etc.) BASys calls on
various internal prediction programs to predict
or calculate these values
- A BASys text card is generated and a series of
maps are prepared and posted on the website.
Source and reliability of annotations are also
provided
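Of the calculated fields, molecular weight (MW) is the simplest to show. The sketch below uses the textbook approach of summing average residue masses and adding one water; the slides do not say which program BASys uses for this, so treat it as a generic illustration. The function name and the test peptide are hypothetical.

```python
# Average residue masses in daltons (each amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def molecular_weight(sequence):
    """Return the average molecular weight (Da) of an unmodified protein chain."""
    seq = sequence.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"Unsupported residue codes: {sorted(unknown)}")
    if not seq:
        return 0.0
    return sum(RESIDUE_MASS[residue] for residue in seq) + WATER_MASS


# Hypothetical short peptide, for illustration only.
print(round(molecular_weight("MKTAYIAK"), 2))
```

This ignores post-translational modifications and disulfide bonds, so it gives the mass of the plain translated sequence only.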
29Typical BASys Result
30BASys Output
31Other Microbial Resources
32MIPS (Munich Information Center for Protein
Sequences)
http://mips.gsf.de/
34Programs Used By PEDANT
- HMMER
- PSORT
- PREDATOR
- COILS
- FGENESH
- pI
- PROSEARCH
- TargetP
- SAPS
- NCBI-BLAST
- SEG
- InterProScan
- SignalP
- TMHMM
- tRNAscan-SE
- GENSCAN
35Databases Used By PEDANT
- EMBL
- PIR-PSD
- SWISS-PROT
- Functional Catalogue (FunCat)
- PROSITE
- TrEMBL
- Blocks
- PDB
- SCOP
- COGs
- Pfam
- STRIDE
36Navigating in PEDANT
37TIGR CMR
http://www.tigr.org/tigr-scripts/CMR2/CMRHomePage.spl
38The CMR
- A database and searching system that allows
researchers to access all of the bacterial genome
sequences completed to date
- Two kinds of annotation are displayed: the
Primary annotation taken from the genome
sequencing center and the TIGR annotation
generated by an automated annotation process at
TIGR
39The CMR Genome Page
40CMR Page for A. pernix
41CMR Annotation
42Conclusion
- Content in these microbial resources is
equivalent to an encyclopedia that is 8000 pages
long, all about that one microbe
- Far more visual and far more detailed than what
you can get in textbooks
- Still not quite as thorough as a well-written
book, but we're getting there!