Integrated Microbial Community Genomes (IMCG) Data Management System

1
Integrated Microbial Community Genomes (IMCG)
Data Management System
eGenomics Meeting
Sep 7-9, 2005
Victor M. Markowitz (BDMTC, CRD)
Nikos Kyrpides (MGAP, JGI)
Natalia Ivanova (MGAP, JGI)
Phil Hugenholtz (MEP, JGI)
2
Synopsis
  • Problem
    • Metagenomics needs new analysis methods, such as
      for:
      • Determining gene functions and metabolic capacity
        of microbial communities and member species
        • Example: metabolic pathways involved in biomass
          conversion and biofuel production in termite
          hindguts
      • Studying intra-population variants and their
        correlation with environmental parameters
    • Metagenome analysis is:
      • Data and computation intensive
      • Iterative: involves evolving data sets and methods
  • Solution
    • Integrated Microbial Community Genomes (IMCG)
      system
      • Collect and manage metagenome data
      • Support metagenome analysis in an integrated
        context

3
Significance
4
Metagenome Data Collection Issues
Sargasso Sea
Soil
Acid mine drainage
Data Quality, Precision
Gene classification
Environmental attributes
Existing data repositories are not designed to record
metagenome data
5
Metagenome Data Analysis Issues
Hypothesis
Genome
  • Challenges
    • Individual organism genomes are poorly
      characterized
      • Breadth, depth, quality, precision
    • Diversity of data sources required for analysis:
      data integration
    • Microbial community genomes
      • May require different concepts for modeling,
        analysis

6
Example Functional Characterization
7
Rationale
  • Premise
    • Effective metagenome analysis requires a
      comprehensive data management system
  • Strategy
    • Develop IMCG system supporting systematic
      collection, management, and maintenance of
      metagenome data in the context of:
      • Integrated isolate microbial genome data (IMG)
      • Environmental, geographical, geochemical data
  • Impact
    • Accelerate pace of metagenome projects at LBNL,
      JGI, etc.
    • Serve as community resource for metagenome data

8
Opportunity
  • Context
    • A metagenome data management system, recognized
      as a critical resource for the past several
      years, is not available
  • Expertise
    • Microbial genome analysis and microbial ecology
      (JGI)
    • Biological data management system development
      (BDMTC)
    • Large-scale data storage and computing (NERSC)
  • IMG System
    • High-quality, evolving repository for microbial
      genome data
    • Critical foundation for a metagenome system
    • Provides glimpse into value of IMCG

9
IMCG IMG Relationship
Data Analysis
Data Repository
Data Processing
Data Acquisition
  • Community diversity
  • Unique genes in community
  • Marker genes
  • Pathway analysis
  • COG analysis
  • Comparative analysis
  • ...
  • Location (long/lat)
  • Morphology
  • Env conditions
  • Physiological cond.
  • Temperature
  • pH conditions
  • Amt collected/used
  • Est biomass
  • ...
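
The acquisition attributes listed on this slide could be captured in a simple record type. The following is a minimal sketch, not the IMCG schema: the class name SampleRecord, every field name, the units, and the range check are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SampleRecord:
    """Hypothetical record for the sample attributes named on the slide."""
    latitude: float                          # decimal degrees (assumed unit)
    longitude: float                         # decimal degrees (assumed unit)
    morphology: Optional[str] = None
    env_conditions: Optional[str] = None
    physiological_conditions: Optional[str] = None
    temperature_c: Optional[float] = None    # degrees Celsius (assumed unit)
    ph: Optional[float] = None
    amount_collected: Optional[str] = None   # free text, e.g. "2 L"
    est_biomass: Optional[str] = None

    def problems(self) -> List[str]:
        """Return coordinate range violations; an empty list means none found."""
        issues = []
        if not -90.0 <= self.latitude <= 90.0:
            issues.append("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            issues.append("longitude out of range")
        return issues


record = SampleRecord(latitude=32.0, longitude=-64.0, temperature_c=20.5)
print(record.problems())  # []
```

Only the coordinates are range-checked here; pH is deliberately left unchecked because extreme habitats such as acid mine drainage can fall outside the usual scale.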

10
Metagenome Data in IMG/M
  • Next
  • New
    • 2 sludge data sets
    • Gutless worm
    • Termite hindgut
  • Existing
    • AMD (acid mine drainage)
    • 4-7 Sargasso Sea data sets, reassembled
    • Soil

11
Metagenome Data in IMG
Genes in comparative context
Gene similarity with respect to isolate genomes
12
Roadmap
  • Start
    • Official start of project: LBNL R&D funding, Oct
      2005
  • Milestones
    • Apr 2006: IMCG alpha/preview version
    • Oct 2006: first public version
  • Strategy
    • Load selected data sets: gutless worm, termite
      hindgut, etc.
    • Emphasis on data quality, precision
    • System will evolve based on community feedback
  • Funding
    • Grants, collaborations
  • Challenge
    • Long-term funding for maintaining a community
      resource

13
16S Clone Library Sequence Submission Survey
14
16S Clone Library Sequence Submission Survey
  • Select Habitat Type
    • Water
    • Wastewater
    • Soil
    • Extreme Habitats
    • Host-associated
    • Anthropogenic
    • Sediment
    • New Habitat

15
Example
16
Survey - Response
17
Survey - Response
Votes
SAI group          Field                          Relevant for sample type   Yes   No
Submitter SAI      Contact details                all                        105   13
                   Contact email address          all                        118    6
Clone Library SAI  DNA / RNA extraction method    all                         85   26
                   Forward primer name            all                        107   10
                   Forward primer sequence        all                        102   12
                   Reverse primer name            all                        105    9
                   Reverse primer sequence        all                        102   12
                   Annealing temperature          all                         92   20
                   PCR cycle number               all                         81   29
                   Cloning vector                 all                         82   29
http://www.jgi.doe.gov/16s/
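
One way to read the vote counts on this slide is as the share of respondents who answered Yes for each field. The following is a minimal sketch using counts copied from the slide; the helper name yes_share and the ranking step are illustrative additions, not part of the survey.

```python
# (yes, no) vote counts copied from the survey response slide
votes = {
    "Contact details": (105, 13),
    "Contact email address": (118, 6),
    "DNA / RNA extraction method": (85, 26),
    "Forward primer name": (107, 10),
    "Forward primer sequence": (102, 12),
    "Reverse primer name": (105, 9),
    "Reverse primer sequence": (102, 12),
    "Annealing temperature": (92, 20),
    "PCR cycle number": (81, 29),
    "Cloning vector": (82, 29),
}


def yes_share(yes: int, no: int) -> float:
    """Fraction of Yes votes among all votes cast for a field."""
    return yes / (yes + no)


# Rank fields from most to least supported by respondents
ranked = sorted(votes, key=lambda field: yes_share(*votes[field]), reverse=True)
for field in ranked:
    yes, no = votes[field]
    print(f"{field:<30}{yes_share(yes, no):6.1%}")
```

On these counts the contact email address has the highest Yes share and the PCR cycle number the lowest.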
18
Sequence-Associated Information (SAI)
Votes
SAI group   Field                              Relevant for sample type   Yes   No
Sample SAI  Latitude and longitude             all                         66    9
            Sampling location description      all                         36    1
            Sample size                        all                         88   22
            URL for additional information     all                         92   18
            Sample type                        all                        169    0
            Temperature                        all                         84   10
            Sample treatment and preservation  all                         17    2
(Examples)  pH                                 Water                       31    8
            Salinity                           Water                       35    5
            Dissolved organic carbon           Water                       27   12
            Moisture                           Soil                        32    4
            Ground cover / vegetation          Soil                        16    3
            Agricultural use                   Soil                        17    3
            Host species                       Host-associated             21    1
            Anatomical site                    Host-associated             22    2
            Association type                   Host-associated             20    2
http://www.jgi.doe.gov/16s/
19
Some quotes from the survey
"The best is to make as many menus as possible,
each free text fields gives more chances for
errors and inconsistencies."
"entisol, andisol, inceptisol, gelisol,
histisol, aridisol, vertisol, alfisol, mollisol,
ultisol, spodosol, oxisol"
"Even Marine / Freshwater can be broken down
further into Freshwater Lake, Stream, River,
Pond, Artificial environment. My point is that it
may be difficult to request all of this..."
(so this participant is for free text fields!)
=> Trade-off between versatility and increased
chance of inconsistencies!
http://www.jgi.doe.gov/16s/
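
The trade-off in these quotes, menus for consistency versus free text for versatility, is often handled with a menu plus a free-text fallback. The following is a minimal sketch under that assumption: the menu entries come from the survey's "Select Habitat Type" slide, while the function name normalize_habitat, the matching rule, and the use of "New Habitat" as the fallback bucket are hypothetical.

```python
from typing import Optional, Tuple

# Menu entries from the survey's "Select Habitat Type" slide
HABITAT_MENU = [
    "Water", "Wastewater", "Soil", "Extreme Habitats",
    "Host-associated", "Anthropogenic", "Sediment",
]


def normalize_habitat(raw: str) -> Tuple[str, Optional[str]]:
    """Map user input to (menu entry, free-text detail).

    Input that matches a menu entry, ignoring case and surrounding
    spaces, is returned in its canonical spelling. Anything else is
    filed under "New Habitat" with the original text kept for curation.
    """
    cleaned = raw.strip()
    for entry in HABITAT_MENU:
        if cleaned.lower() == entry.lower():
            return entry, None
    return "New Habitat", cleaned


print(normalize_habitat("  soil "))          # ('Soil', None)
print(normalize_habitat("Freshwater Lake"))  # ('New Habitat', 'Freshwater Lake')
```

Menu matches stay consistent across submitters, and unmatched text is preserved rather than rejected, so curators can later decide whether to promote it to a new menu entry.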