Title: Genboree Discovery Process Integration
1. Genboree Discovery Process Integration
- Aleksandar Milosavljevic, PhD
- Baylor College of Medicine
- January 10th, 2008; modified April 1st, 2008
2. Genboree design has evolved to support diverse discovery processes
Authors: Gonzalez-Garay, M.L. (1,2); Morgan, M.B. (2); Miller, C.A. (1,3); Jackson, A.R. (1,2); Tong, M. (1,2); Gibbs, R.A. (2,3); Milosavljevic, A. (1,2,3)
(1) Bioinformatics Research Laboratory
(2) Human Genome Sequencing Center, Department of Molecular and Human Genetics
(3) Program in Structural and Computational Biology and Molecular Biophysics
Baylor College of Medicine, Houston, Texas, USA
3. Genboree design has evolved to support diverse discovery processes
- Genome sequencing and annotation
  - human chromosomes 3 and 12; sea urchin; Tribolium; Nasonia; bacterial
- Genome comparison
  - rat, mammals, rhesus macaque, primates
- High-throughput microchip analysis
  - array CGH, BCM cytogenetics laboratory
- DNA sequencing
  - PCR-based resequencing
  - mapping rearrangements in cancer genomes using next-generation sequencing technologies
4. Genboree software development
- Understanding of users' needs and opportunities
- Focus on complete process
- Identify recurrent use patterns
- Generalize, minimize assumptions for widest use
- Design generic components for reuse across projects
- Fast prototyping
- Frequent feedback
- Extensive optimization of key components
5. Usage of Genboree.org as of Dec 31, 2007
- 923 registered users
- 100 users each working day
- 10,000 clicks per day
- 112 collaborating groups with 2 or more researchers, with a median of 7 and an average of 23.5 members per group
- Many collaborating groups still maintain private access to their pre-publication data
- 23 databases have been made widely accessible via the internet by the collaborating groups
6. Genboree's current focus: translational studies of genome variation
7. Example 1: research at the clinical cytogenetics laboratory at BCM
- Agilent array design for array CGH (version 7)
  - identifying hypermutable Low Copy Repeat regions for deeper probing
- Analysis of a single case
  - Upload Agilent array data
  - Perform segmentation (invoke Bioconductor tool)
  - Subtract polymorphisms (databases, current literature)
  - Establish preliminary diagnosis/hypothesis
  - Integrate downstream data (paternal testing, FISH, etc.)
  - Share diagnosis and data view with reporting physician
- Identify recurrent events (based on BCM database)
- Identify new syndromes
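The "subtract polymorphisms" step of the single-case analysis can be illustrated with a minimal sketch. It assumes segmentation has already produced segments with a mean log ratio, and that known polymorphic regions are available as plain intervals. The names `Segment`, `Region`, `subtract_polymorphisms` and both thresholds are hypothetical, chosen only for illustration; they are not Genboree or Bioconductor APIs.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One segment from array CGH segmentation (hypothetical layout)."""
    chrom: str
    start: int        # 0-based, inclusive
    end: int          # 0-based, exclusive
    log_ratio: float  # mean log2 ratio over the segment


class Region(NamedTuple):
    """A known polymorphic region, e.g. taken from a CNV database."""
    chrom: str
    start: int
    end: int


def overlap_fraction(seg: Segment, region: Region) -> float:
    """Fraction of the segment's length that the region covers."""
    if seg.chrom != region.chrom:
        return 0.0
    overlap = min(seg.end, region.end) - max(seg.start, region.start)
    return max(0, overlap) / (seg.end - seg.start)


def subtract_polymorphisms(
    segments: List[Segment],
    known_regions: List[Region],
    min_abs_log_ratio: float = 0.3,  # assumed threshold, not from the source
    max_overlap: float = 0.5,        # assumed threshold, not from the source
) -> List[Segment]:
    """Keep aberrant segments not explained by a single known polymorphism."""
    candidates = []
    for seg in segments:
        # Skip segments whose signal is too close to the baseline.
        if abs(seg.log_ratio) < min_abs_log_ratio:
            continue
        # Each known region is checked individually; combined coverage
        # by several regions is not accumulated in this sketch.
        if any(overlap_fraction(seg, r) > max_overlap for r in known_regions):
            continue
        candidates.append(seg)
    return candidates


segments = [
    Segment("chr1", 100, 200, 0.6),
    Segment("chr1", 500, 600, -0.8),
    Segment("chr2", 100, 200, 0.05),
]
known = [Region("chr1", 90, 180)]
for seg in subtract_polymorphisms(segments, known):
    print(seg)
```

The segments that survive this filter would be the starting point for the preliminary diagnosis/hypothesis step; a production pipeline would likely also weigh probe counts and reciprocal overlap.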
8. Translational studies of genome variation
High-volume anchoring (by Pash 2.0) integrated with Genboree
9. Genboree overview
10. Integration of Genboree into caBIG infrastructure and caGRID
Current state
Integrated state
11. Example 3: TCGA pilot project
- Target selection (Target Selection Group, Mr. Chris Miller, Dr. Anna Lapuk)
- Data integration (Level 3, 4 data)
- Collaborative target selection
- Target allocation across centers (Manuel)
- PCR assay design: integration with the HGSC LIMS system for probe design; hole filling, polymorphisms, repeats, tiling (Dr. Manuel Gonzalez-Garay)
- Quarterly Report for PCR resequencing (Dr. Manuel Gonzalez-Garay)
- (Mutation annotation, sharing of PCR assays, prioritization of mutations for collaborative downstream studies)
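The "tiling" and "hole filling" items in the PCR assay design step can be sketched at the coordinate level. This is a simplified illustration only: it assumes fixed-length amplicons with a fixed overlap and ignores primer placement, repeats and polymorphisms, which a real design would have to handle. The function names and default lengths are hypothetical and do not describe the HGSC LIMS.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based, end exclusive


def tile_amplicons(
    target_start: int,
    target_end: int,
    amplicon_len: int = 500,  # assumed amplicon length
    overlap: int = 50,        # assumed overlap between neighbours
) -> List[Interval]:
    """Cover [target_start, target_end) with overlapping fixed-length tiles."""
    if amplicon_len <= overlap:
        raise ValueError("amplicon_len must be larger than overlap")
    if target_end <= target_start:
        raise ValueError("target must have positive length")
    step = amplicon_len - overlap
    tiles = []
    pos = target_start
    while True:
        tiles.append((pos, pos + amplicon_len))
        # Stop once the current tile reaches the end of the target;
        # the last tile may extend into flanking sequence.
        if pos + amplicon_len >= target_end:
            break
        pos += step
    return tiles


def find_holes(
    target_start: int, target_end: int, covered: List[Interval]
) -> List[Interval]:
    """Return the parts of the target not covered by any existing amplicon."""
    holes = []
    cursor = target_start  # leftmost position not yet known to be covered
    for start, end in sorted(covered):
        if start > cursor:
            holes.append((cursor, min(start, target_end)))
        cursor = max(cursor, end)
        if cursor >= target_end:
            break
    if cursor < target_end:
        holes.append((cursor, target_end))
    return holes


print(tile_amplicons(1000, 2000))
print(find_holes(0, 1000, [(0, 300), (250, 500), (700, 900)]))
```

Each hole returned by `find_holes` could then be passed back to `tile_amplicons` to propose additional assays for the uncovered stretch.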
12. TCGA Project Quarterly Resequencing Report
Annotation object - viewer - editor - PCR primer
designer - other tools
13. TCGA Project Quarterly Resequencing Report
14. Key challenge: lowering the threshold to adoption of genomic technologies by clinical researchers
- Key challenge: integration of discovery processes.
- -> Integration of data and tools required but not sufficient.
- Genboree follows an internet-based hosting model (Software as a Service).
- -> No installation or maintenance needed.
- -> Open source release in Q2 2008.
- -> Commercial launch of the Genboree Early Access Hosting program (www.genboree.com) on March 1st, 2008.