Title: GNARE-GADU: An Environment for Grid-Based High-Throughput Genome Analysis
1. GNARE-GADU: An Environment for Grid-Based High-Throughput Genome Analysis
- Mathematics and Computer Science Division
- Argonne National Laboratory
- Natalia Maltsev, Dina Sulakhe, Alex Rodriguez (Bioinformatics group)
- Mike Wilde, Nika Nefedova, Jens Voeckler, Ian Foster (the Globus Group)
2. MCS Argonne Bioinformatics Group Projects (http://compbio.mcs.anl.gov)
PUMA2: a Grid-enabled interactive environment for analysis of the evolution of metabolism (http://compbio.mcs.anl.gov/puma2)
- Contains analysis of 1032 genomes
- Integrates information from over 20 sequence and metabolic databases
- Supports user annotations
- Over 3500 unique individual users in April 2005
- Application projects depending on PUMA2/GADU/GNARE:
- Pathos: Microbial Informatics Core for the NIH Great Lakes Center of Excellence in Biodefense and Emerging Infections
- TarGet: NIH Midwest Structural Biology Center
- MetaGenomes: DOE Microbial Genomes program
- Sentra: database of prokaryotic signal transduction
- SubUnit database
- Physiological Profiles
- PUMA2 tools:
- CHISEL: a workbench for evolutionary analysis of enzymes, assignment of functions to genes, PhyloBlocks, BloBla, SVMMER, etc.
- GADU/GNARE provides PUMA2 with:
- Grid-based computational backend
- Controlled automated workflows for data acquisition and analysis
3. Biggest Challenge: Site Selection
GADU runs jobs dynamically on different Grid sites without reserving any nodes. It is difficult to maintain the status of all the sites and to select a site that is appropriate for a given job.
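The site selection problem described above can be illustrated with a minimal sketch. It is not GADU's actual mechanism: the `SiteStatus` fields, the site names, and the `max_age` and `max_failures` thresholds are all hypothetical, chosen only to show why stale status information and unreserved nodes make the choice hard. The sketch discards sites whose status is too old or that have failed repeatedly, then prefers the site reporting the most free nodes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SiteStatus:
    """Hypothetical snapshot of one Grid site (not a GADU data structure)."""
    name: str
    free_nodes: int        # nodes reported idle at the last check
    last_checked: float    # time of the last status check, in epoch seconds
    recent_failures: int   # jobs that failed at this site since the last success


def select_site(sites: List[SiteStatus], now: float,
                max_age: float = 600.0, max_failures: int = 3) -> Optional[str]:
    """Return the name of the most promising site, or None if none qualifies.

    Assumed policy: ignore sites with stale status, too many recent failures,
    or no free nodes; among the rest, pick the one with the most free nodes,
    breaking ties in favor of fewer recent failures.
    """
    candidates = [
        s for s in sites
        if now - s.last_checked <= max_age
        and s.recent_failures < max_failures
        and s.free_nodes > 0
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda s: (s.free_nodes, -s.recent_failures))
    return best.name


# Example with made-up sites: site_b is stale and site_c fails too often.
example_sites = [
    SiteStatus("site_a", free_nodes=40, last_checked=1000.0, recent_failures=0),
    SiteStatus("site_b", free_nodes=120, last_checked=100.0, recent_failures=0),
    SiteStatus("site_c", free_nodes=80, last_checked=990.0, recent_failures=5),
    SiteStatus("site_d", free_nodes=10, last_checked=995.0, recent_failures=1),
]
chosen = select_site(example_sites, now=1000.0)
```

Because no nodes are reserved, the free-node count is only a hint: a site that looks idle at check time may be full by the time the job arrives, which is why the sketch also tracks recent failures.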
4. GADU/GNARE Future Plans and Use
- Alpha testing with the individual projects.
- Ongoing projects:
- SEED
- PNNL Microbial Genome Program
- Northern Illinois University Rice Genome Project
- Northwestern University
- Web Services
- Use of Grid infrastructure
- Goal: 2 database updates per month (including BLAST and Blocks)
- Each database update entails 20,000 jobs, each running for about 2 hours on one node.
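The figures above imply a rough compute budget, which can be worked out as a back-of-the-envelope sketch. The job count, job duration, and update frequency come from the slide; the 30-day month and the assumption that work is spread evenly over the month are ours, and the function names are illustrative only.

```python
JOBS_PER_UPDATE = 20_000      # from the slide
HOURS_PER_JOB = 2.0           # from the slide (approximate)
UPDATES_PER_MONTH = 2         # from the slide (stated goal)
HOURS_PER_MONTH = 30 * 24     # assumption: a 30-day month


def node_hours_per_update(jobs: int = JOBS_PER_UPDATE,
                          hours_per_job: float = HOURS_PER_JOB) -> float:
    """Total single-node compute time needed for one database update."""
    return jobs * hours_per_job


def node_hours_per_month(updates: int = UPDATES_PER_MONTH) -> float:
    """Total single-node compute time needed to meet the monthly goal."""
    return updates * node_hours_per_update()


def average_nodes_needed(hours_available: float = HOURS_PER_MONTH) -> float:
    """Nodes that must be busy continuously, assuming perfectly even load."""
    return node_hours_per_month() / hours_available


per_update = node_hours_per_update()   # 40,000 node-hours
per_month = node_hours_per_month()     # 80,000 node-hours
avg_nodes = average_nodes_needed()     # roughly 111 nodes
```

Under these assumptions the goal corresponds to about 111 nodes kept busy around the clock, before accounting for failed jobs, queue waits, or data transfer.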