GEBA: A genomic encyclopedia of bacteria and archaea

1
GEBA: A genomic encyclopedia of bacteria and
archaea
  • Jonathan A. Eisen
  • JGI User Meeting 2009

2
"Nothing in biology makes sense except in the
light of evolution." T. Dobzhansky (1973)
3
(No Transcript)
4
rRNA Tree of Life
5
The Tree is not Happy
6
From http://genomesonline.org
7
As of 2002
  • At least 40 phyla of bacteria

Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Based on Hugenholtz, 2002
8
As of 2002
  • At least 40 phyla of bacteria
  • Genome sequences are mostly from three phyla

Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Based on Hugenholtz, 2002
9
As of 2002
  • At least 40 phyla of bacteria
  • Genome sequences are mostly from three phyla
  • Some other phyla are only sparsely sampled

Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Based on Hugenholtz, 2002
10
As of 2002
  • At least 40 phyla of bacteria
  • Genome sequences are mostly from three phyla
  • Some other phyla are only sparsely sampled
  • Same trend in Archaea

Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Based on Hugenholtz, 2002
11
Need for Tree Guidance Well Established
  • Common approach within some eukaryotic groups
  • Many small projects funded to fill in some
    bacterial or archaeal gaps
  • Phylogenetic gaps in bacterial and archaeal
    projects commonly lamented in literature

12
  • NSF-funded Tree of Life Project
  • A genome from each of eight phyla
  • At least 40 phyla of bacteria
  • Genome sequences are mostly from three phyla
  • Some other phyla are only sparsely sampled
  • Solution I: sequence more phyla

Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Eisen, Ward, Badger, Wu, Wu, et al.
13
Bacterial AToL Project: Aims
  • Improve resolution of deep branches in the
    bacterial tree
  • Launch biological studies of these phyla and
    discover functional novelty
  • Leverage data for interpreting environmental
    surveys

14
T. roseum genome
15
The Tree of Life is Still Angry
16
Within Phyla Diversity Immense
  • Each phylum represents billions of years of
    evolution
  • Some have hundreds of major lineages
  • New lineages are being discovered all the time
  • Most branches within most phyla have few or no
    genomes

17
Major Lineages of Actinobacteria
18
Additional Impetus for Tree Guided Projects
  • Suggestion to sequence all bacteria and archaea
    in Bergey's Manual (Stevens et al.)
  • Success in sequencing genomes from across the
    tree in animals
  • Multiple government reports suggest a more
    systematic approach to sequencing is needed

19
  • At least 100 phyla of bacteria
  • Genome sequences are mostly from three phyla
  • Most phyla with cultured species are sparsely
    sampled
  • Lineages with no cultured taxa even more poorly
    sampled
  • Solution - use tree to really fill gaps

Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Well sampled phyla
20
http://www.jgi.doe.gov/programs/GEBA/pilot.html
21
GEBA Pilot Project Overview
  • Select 200 organisms using tree
  • Develop high throughput pipeline for strain
    growth and DNA preparation
  • Sequence and finish 100
  • Annotate, analyze, release data
  • Assess benefits of tree guided sequencing

22
GEBA Pilot I: Selecting Targets
23
(No Transcript)
24
(No Transcript)
25
(No Transcript)
26
(No Transcript)
27
(No Transcript)
28
GEBA Pilot II: The Importance of Project
Management
29
  • GEBA Project Flowchart

[Flowchart. Stages: Project Initiation, Sequencing, Annotation. Three "OK?" decision points. Superscript key: 1 PGF, 2 LANL, 3 ORNL.]
Boxes: GEBA Proposal; Scientific and Technical Review (1); Negotiate Scope of Work; Receive Starting Material (1); Draft Sequencing and Assembly (1); Finish Sequencing and Assembly (2); Gene-QA (1); Draft Annotation (3); Finish Annotation (3); IMG (1); IMG ER (1, shown twice); Shotgun Genome GenBank Submission (1); Complete Genome GenBank Submission (1)
David Bruce, Lynne Goodwin et al.
30
GEBA Pilot III: Partnership with DSMZ
31
GEBA Biggest Challenge: Getting DNA
  • Getting quality DNA is the biggest bottleneck
  • Solution: Beg, Borrow and Steal
  • DSMZ offered to do it for free
  • ATCC is doing a small number for a fee
  • In discussions with PCC and other
    collections

32
(No Transcript)
33
Quantification gel of the genomic DNA isolated
from Conexibacter woesei (DSM 14684T)
Lane 1: c(λ-Marker) 15 ng
Lane 2: c(λ-Marker) 30 ng
Lane 3: c(λ-Marker) 50 ng
Lane 4: DNA Molecular Weight Marker II (Roche 236250)
Lane 5: DSM 13279, Collinsella stercoris
Lane 6: DSM 43043, Intrasporangium calvum
Lane 7: DSM 18053, Dyadobacter fermentans
Lane 8: DSM 20476, Slackia heliotrinireducens
Lane 9: DSM 18081, Patulibacter minatonensis
Lane 10: DSM 14684, Conexibacter woesei
Lane 11: DSM 11002, Dethiosulfovibrio peptidovorans
Lane 12: DSM 11551, Halogeometricum borinquense
Lane 13: DNA Molecular Weight Marker II (Roche 236250)
Lane 14: c(λ-Marker) 125 ng
Lane 15: c(λ-Marker) 250 ng
Lane 16: c(λ-Marker) 500 ng
Conexibacter woesei (DSM 14684T) was taken from
the German Collection of Microorganisms and Cell
Cultures (DSMZ). The genomic DNA was isolated
using the Qiagen Genomic 500 DNA Kit (Qiagen
10262). The genomic DNA was 10-250 kb in size as
determined by Pulsed Field Gel Electrophoresis
(PFGE). The bulk of DNA had a size of 50-250 kb
(see attached PFGE image). The DNA concentration
is 500 ng/µl as estimated from the gel.
Spectrophotometric measurements yielded a DNA
concentration of 450 µg/ml. 300 µl of genomic DNA
are shipped (150 µg).
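
The shipped amount follows from concentration times volume. A minimal sketch of that arithmetic, with a hypothetical helper name; it suggests the 150 µg figure corresponds to the gel estimate (500 ng/µl), while the spectrophotometric reading (450 µg/ml, i.e. 450 ng/µl) would give a slightly lower mass for the same 300 µl.

```python
def dna_mass_ug(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Total DNA mass in micrograms from concentration (ng/µl) and volume (µl)."""
    return conc_ng_per_ul * volume_ul / 1000.0  # 1000 ng per µg


# Values taken from the slide text; the helper name is illustrative only.
gel_estimate = dna_mass_ug(500, 300)      # gel-based concentration estimate
spectro_estimate = dna_mass_ug(450, 300)  # 450 µg/ml is the same as 450 ng/µl

print(f"gel: {gel_estimate} µg, spectrophotometer: {spectro_estimate} µg")
```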
34
GEBA Pilot IV: Sequencing, Annotation, Data
Release
35
Current Status
  • >100 in progress
  • GEBA 56 (focus of first paper)
  • 34 finished genomes
  • 55 submitted to GenBank
  • Released to IMG-GEBA page and JGI-FTP site
  • All data is completely open for anyone to use

36
IMG/GEBA
http://img.jgi.doe.gov/cgi-bin/geba/main.cgi
37
Adopt a Microbe
38
GEBA Pilot IV: Assess Benefits of GEBA56
  • All genomes have some value
  • But what, if any, is the benefit of tree-guided
    sequencing over other selection methods?

39
Why Increase Taxonomic Coverage II?
  • Gene discovery
  • Annotation, functional prediction
  • Metagenomic analysis
  • Mechanisms of diversification
  • Species phylogeny and classification

40
(No Transcript)
41
Value of diverse genomes I: Gene discovery
  • Premise
      • New genomes frequently contain genetic novelty
      • Phylogenetic diversity of a genome should be
        correlated with novelty
  • Caveat
      • Does lateral gene transfer wipe out the contribution
        of phylogenetic diversity to novelty?

42
Protein Family Rarefaction Curves
  • Take data set of multiple complete genomes
  • Identify all protein families using MCL
  • Plot # of genomes vs. # of protein families
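
The rarefaction procedure in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the pipeline used for GEBA: it assumes protein families have already been assigned upstream (for example by MCL clustering), and the function name, the toy data, and the choice to average over random genome orderings are all assumptions.

```python
import random
from statistics import mean


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct protein families after adding 1..N genomes.

    genome_families: dict mapping genome name -> set of family IDs.
    Family assignment (e.g. via MCL) is assumed to be done already.
    """
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [[] for _ in genomes]
    for _ in range(n_permutations):
        rng.shuffle(genomes)            # random order of genome addition
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]
            totals[i].append(len(seen))  # cumulative family count
    return [mean(counts) for counts in totals]


# Hypothetical toy input: two overlapping genomes and one divergent genome.
toy = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam2", "fam3", "fam4"},
    "genome_C": {"fam5", "fam6", "fam7"},
}
curve = rarefaction_curve(toy)
for n, families in enumerate(curve, start=1):
    print(n, families)
```

Plotting the returned values against the genome count gives the curve; a curve that keeps climbing steeply indicates that each added genome still contributes new families.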

43
(No Transcript)
44
[Plot; axis labels: Number of proteins, Total Gene Number, Genome Number]
45
Novelty 2 - Structural Novelty
  • Of the 17,000 protein families in the GEBA56, 1,800
    are novel in sequence (Wu)
  • Structural modeling suggests many are
    structurally novel too (D'haeseleer)
  • 372 being crystallized by the PSI (Kerfeld)

46
Novelty 3
  • Diversity within known families

47
Transporter Profiles
Sebaldella termitidis ATCC 33386 has twice as many
sugar PTS transporters as any other genome
48
Novelty 4
  • Unusual distribution patterns

49
Shotgun Sequencing Detects More Diversity than
PCR-methods
50
First Bacterial Actin Related Protein
First found by V. Kunin, Structure Analysis by
Patrik D. et al
51
Most Closely Related to ARP8
52
Value of 100 diverse genomes II: Annotation
  • Premise
      • Increased phylogenetic coverage should improve
        our ability to annotate genes in other genomes
        (e.g., reference/model genomes)

53
Annotation Improves
  • Conversion of hypotheticals into conserved
    hypotheticals
  • Linking distantly related members of protein
    families
  • Non-homology functional prediction methods

54
Linking Protein Families Improved
55
Fusion Based Predictions Improved
56
Improving Rosetta Stone Predictions
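
The slides do not spell out the method, so the following is only a generic sketch of the Rosetta Stone idea: two separate proteins in a query genome are predicted to be functionally linked when their families appear fused in a single protein in some other genome. The function name, input layout, and toy data are assumptions for illustration.

```python
from itertools import combinations


def rosetta_stone_links(query_proteins, reference_proteins):
    """Predict functional links between separate query proteins.

    Both arguments map protein ID -> set of family (or domain) IDs.
    A pair of query proteins is linked if a family from each one
    co-occurs in a single reference protein (a fusion).
    """
    # Collect every family pair observed fused in one reference protein.
    fused_pairs = set()
    for families in reference_proteins.values():
        for a, b in combinations(sorted(families), 2):
            fused_pairs.add((a, b))

    links = set()
    for p1, p2 in combinations(sorted(query_proteins), 2):
        for a in query_proteins[p1]:
            for b in query_proteins[p2]:
                if a != b and tuple(sorted((a, b))) in fused_pairs:
                    links.add((p1, p2))
    return links


# Hypothetical toy data: famA and famB are fused in reference protein r1.
query = {"q1": {"famA"}, "q2": {"famB"}, "q3": {"famC"}}
reference = {"r1": {"famA", "famB"}, "r2": {"famC"}}
links = rosetta_stone_links(query, reference)
print(links)
```

Under this sketch, adding more diverse reference genomes can only add fusion pairs, which is one way broader sampling could raise the number of predicted links.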
57
Value of 100 diverse genomes III: Metagenomics
  • Premise
      • Increased sampling of diverse genomes should
        improve many aspects of metagenomic analysis
  • To test
      • Annotation
      • Binning

58
Metagenomic Annotation Improves (Slightly)
59
Compositional Binning Improves (Slightly)
60
Phylogenetic Binning Improves Slightly
61
Value of 100 diverse genomes V: Phylogeny
62
16S Says Hyphomonas is in Rhodobacterales
Badger et al. 2005
63
WGT Says It's Related to Caulobacterales
Badger et al. 2005
64
(No Transcript)
65
(No Transcript)
66
GEBA - After the Pilot
67
PD of sequenced organisms
68
PD with GEBA
69
(No Transcript)
70
  • At least 40 phyla of bacteria
  • Genome sequences are mostly from three phyla
  • Most phyla with cultured species are sparsely
    sampled
  • Lineages with no cultured taxa even more poorly
    sampled

Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Well sampled phyla
Poorly sampled
No cultured taxa
71
As of 2002
  • At least 40 phyla of bacteria
  • Genome sequences are mostly from three phyla
  • Some other phyla are only sparsely sampled
  • Same trend in Viruses

Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Based on Hugenholtz, 2002
72
As of 2002
  • At least 40 phyla of bacteria
  • Genome sequences are mostly from three phyla
  • Some other phyla are only sparsely sampled
  • Same trend in Microbial Eukaryotes

Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Based on Hugenholtz, 2002
73
Need experimental studies from across the tree too
Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
[Scale bar: 0.1]
Tree based on Hugenholtz (2002) with some modifications.
74
(No Transcript)
75
MICROBES
76
A Happy Tree of Life