Title: XML-based Data Management Support for Biomedical Applications
Ohio State University Comprehensive Cancer Center
Department of Biomedical Informatics
Multiscale Computing Laboratory
www.multiscalecomputing.org
www.projectmobius.org
Scott Oster, Stephen Langella, Shannon Hastings,
Tahsin Kurc, Joel Saltz
Large Scale Biomedical Image Analysis
State-of-the-art biomedical imaging studies make
use of very large image datasets, potentially stored
at multiple institutions. It is necessary to carry
out feature-based analysis of both morphological
and functional data and to link this information to
clinical data.
- Functional Imaging of Tumors. Use of static and
dynamic image information to determine anatomic
microstructure and to characterize physiological
behavior.
- Digitized Microscopy. Remote viewing and analysis
of microscopy specimens.
One of the main challenges is that a typical study
may involve thousands of images distributed across
multiple sites. The large size of the image data
(up to 20-50 GB per image) makes storing, querying,
and sharing digitized microscopy images a
significant challenge.
Mobius
- Mobius provides a set of generic grid services
and protocols to support
  - distributed creation, versioning, and management
    of data models and data instances,
  - on-demand creation of databases and federation of
    existing databases, and
  - querying of data in the Grid.
- Its design is motivated by the requirements of
Grid-wide data access and integration.
- Global Model Exchange (GME)
  - Publish, Version, Retrieve, and Query Schemas
  - Schema Discovery
  - Hierarchical service instances
    - Each has an authority (excluding root)
    - Each is the authority of a set of namespaces
- Federated Ad hoc Storage Service (Mako)
  - Federated Framework for Managing Data
  - Data indexed from GME-published schemas
  - Management of Data: Store, Update, Retrieve,
    Delete, Query (via XPath)
  - Provides an XML Realization for an underlying
    data resource
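
The store, retrieve, delete, and XPath query operations listed for Mako can be illustrated with a small in-memory sketch. This is not the Mako API: the class `TinyXmlStore`, its method names, and the `http://example.org/image` namespace are hypothetical, and Python's `xml.etree.ElementTree` implements only a subset of XPath.

```python
import xml.etree.ElementTree as ET


class TinyXmlStore:
    """Hypothetical in-memory stand-in for an XML data store (not the Mako API)."""

    def __init__(self):
        self._docs = {}      # document id -> parsed root element
        self._next_id = 1

    def store(self, xml_text):
        """Parse and keep a document; return its generated id."""
        root = ET.fromstring(xml_text)
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = root
        return doc_id

    def retrieve(self, doc_id):
        """Return the stored document serialized back to XML text."""
        return ET.tostring(self._docs[doc_id], encoding="unicode")

    def delete(self, doc_id):
        del self._docs[doc_id]

    def query(self, xpath, namespaces=None):
        """Run an XPath expression (ElementTree subset) over every document."""
        hits = []
        for doc_id, root in sorted(self._docs.items()):
            for element in root.findall(xpath, namespaces or {}):
                hits.append((doc_id, element))
        return hits


NS = {"img": "http://example.org/image"}  # hypothetical schema namespace

store = TinyXmlStore()
first = store.store(
    '<img:image xmlns:img="http://example.org/image">'
    '<img:modality>MRI</img:modality></img:image>'
)
second = store.store(
    '<img:image xmlns:img="http://example.org/image">'
    '<img:modality>microscopy</img:modality></img:image>'
)

# Find documents whose modality element has the text "MRI".
matches = store.query("./img:modality[.='MRI']", NS)
matching_ids = [doc_id for doc_id, _ in matches]
```

Keying queries on a namespace mirrors the idea that stored data is indexed from published schemas; a real service would also validate each document against its schema before accepting it.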
An Image Archival and Analysis System
- Client front-end implementing the functionality to
submit queries in a uniform way against
distributed image databases.
- Support for an extensible metadata schema for
images that can be used to represent 2D, 3D, and
time-dependent images with optional
application-specific metadata.
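
As a rough illustration of an extensible image metadata record, the sketch below builds a fixed core (dimensions, optional time points) plus an optional application-specific block. The element and attribute names, the function `build_image_metadata`, and the sample values are all illustrative assumptions, not the schema used by the system described here.

```python
import xml.etree.ElementTree as ET


def build_image_metadata(image_id, dims, timepoints=1, extra=None):
    """Build a metadata record for a 2D or 3D image, optionally time dependent.

    dims is (width, height) or (width, height, depth). All element and
    attribute names are illustrative assumptions.
    """
    if len(dims) not in (2, 3):
        raise ValueError("dims must have 2 or 3 entries")
    root = ET.Element("image", id=image_id)
    size = ET.SubElement(root, "size", dimensions=str(len(dims)))
    for axis, extent in zip(("x", "y", "z"), dims):
        ET.SubElement(size, "axis", name=axis, extent=str(extent))
    if timepoints > 1:
        ET.SubElement(root, "time", points=str(timepoints))
    if extra:
        # Optional application-specific block; core readers can skip it.
        app = ET.SubElement(root, "applicationMetadata")
        for key, value in sorted(extra.items()):
            ET.SubElement(app, "entry", name=key).text = str(value)
    return root


# A plain 2D image and a time-dependent 3D series with extra metadata.
slide = build_image_metadata("slide-001", (4096, 4096))
series = build_image_metadata(
    "series-007", (256, 256, 32), timepoints=40,
    extra={"contrastAgent": "gadolinium"},
)
print(ET.tostring(series, encoding="unicode"))
```

Keeping the application-specific entries in their own element lets generic tools query the core fields uniformly while individual applications attach whatever else they need.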
Synthesis of Information for Phenotype-Genotype
Analyses
- Genotype-phenotype correlation analysis can be
used to identify polymorphisms in candidate genes
that correlate with disease-related phenotypes
and to help achieve a better understanding of
complex diseases such as Coronary Artery Disease
(CAD). Such analysis can involve integrating SNP,
gene, and phenotypic data from public
repositories and local datasets, BLAST searches,
and phylogenetic analysis.
- Support for creation of materialized views (or
local caches) of external data sources on storage
clusters.
- Web spiders to download data from external data
sources.
  - Currently, spiders exist for Genbank, BLAST, and
    MatchMiner.
- Unified, extensible data models for SNP data,
phylogenetic analysis output, strains and
phenotype data from the Mouse Phenome Database,
and BLAST and MatchMiner output.
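
The materialized-view idea (a local cache filled by spiders) can be sketched as follows. This is a minimal illustration under stated assumptions: `MaterializedView`, `fake_spider`, and the record key `rs0001` are made up for the example, keys are assumed to be safe as file names, and no real Genbank, BLAST, or MatchMiner interface is used.

```python
import json
import tempfile
from pathlib import Path


class MaterializedView:
    """Local on-disk cache of records fetched from an external source.

    `fetch` is any callable mapping a record key to a JSON-serializable
    value; in a real system it would be a spider for one data source.
    """

    def __init__(self, cache_dir, fetch):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def _path(self, key):
        return self.cache_dir / f"{key}.json"

    def get(self, key):
        """Return the cached record, downloading it on the first request."""
        path = self._path(key)
        if path.exists():
            return json.loads(path.read_text())
        record = self.fetch(key)
        path.write_text(json.dumps(record))
        return record

    def refresh(self, key):
        """Drop the cached copy and fetch the record again."""
        self._path(key).unlink(missing_ok=True)
        return self.get(key)


calls = []


def fake_spider(key):
    # Stand-in for a remote download; records each call for inspection.
    calls.append(key)
    return {"id": key, "source": "example"}


view = MaterializedView(tempfile.mkdtemp(), fake_spider)
first = view.get("rs0001")
second = view.get("rs0001")   # served from the local cache, no new download
```

Repeated analyses then read from local storage, and `refresh` gives an explicit way to pick up changes in the external source.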
- Virtualized and Federated Data Access. Multiple
image servers can be grouped to form a collective
which can be queried as if it were a single,
centralized server entity.
- Active Storage. Invocation of user-defined
procedures is supported on ensembles of images in
a distributed environment.
- On-demand Database Creation. User-defined data
types and image datasets conforming to a given
schema can be automatically manifested as custom
databases at runtime.
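
The collective and active-storage ideas above can be sketched in a few lines. The classes `ImageServer` and `Collective`, the record fields, and the sample data are hypothetical; in this toy version the user-defined procedure runs in the caller's process, whereas an active-storage system would run it where the images are stored.

```python
class ImageServer:
    """Hypothetical single image server holding metadata records."""

    def __init__(self, name, records):
        self.name = name
        self.records = list(records)

    def query(self, predicate):
        return [r for r in self.records if predicate(r)]


class Collective:
    """Group of servers queried as if they were one server."""

    def __init__(self, servers):
        self.servers = list(servers)

    def query(self, predicate):
        # Fan the query out and tag each hit with its origin server.
        results = []
        for server in self.servers:
            for record in server.query(predicate):
                results.append({**record, "server": server.name})
        return results

    def apply(self, predicate, procedure):
        # Run a user-defined procedure on every matching record.
        return [procedure(r) for r in self.query(predicate)]


site_a = ImageServer("site-a", [{"id": "a1", "modality": "MRI", "size_gb": 2}])
site_b = ImageServer("site-b", [
    {"id": "b1", "modality": "microscopy", "size_gb": 30},
    {"id": "b2", "modality": "MRI", "size_gb": 1},
])
grid = Collective([site_a, site_b])

mri = grid.query(lambda r: r["modality"] == "MRI")
total_gb = sum(grid.apply(lambda r: True, lambda r: r["size_gb"]))
```

The client writes one query and receives merged results, without needing to know how many servers sit behind the collective.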
Collaborators
- Dan Janies, Biomedical Informatics
- Wolfgang Sadee, Pharmacology
- Gustavo Leone, Human Cancer Genetics Program
- Michael Knopp, Radiology
- Tony Pan, Biomedical Informatics
- Kun Huang, Biomedical Informatics
Integration of data collected in basic research
with clinical lab and outcome data is needed to
translate basic biomedical research into
successful clinical applications.
Linking to clinical outcome data in enterprise
data warehouses through Mobius and XQuark Bridge.