Title: Dynamic Data Driven Applications Systems
2. Dynamic Data Driven Applications Systems
- Joel Saltz
- Chair and Professor
- Biomedical Informatics Department
- The Ohio State University
3. Parameter Study Application Scenarios
- Clinical imaging studies
- Determine tumor characteristics via segmentation and texture analysis of medical imagery
- Test and refine algorithms by invoking test algorithms on distributed datasets of >1000 dynamic contrast MR studies
- Simulation parameter studies
- 1000s of oil reservoir simulations used to determine how to optimize oil production
4. Parameter Study Data Analyses
- Compare dataset contents
- Compare features
- Spatially based comparisons
- Map datasets between mesh/coordinate systems
MicroCT Osteoporosis Study: Kim Powell, Cleveland Clinic; Don Stredney, OSC
5. Canonical Services
- Component Framework for Combined Task/Data Parallelism
- Data Aggregation, generalized reductions
- Crucial and ubiquitous in data analysis
- Integrated with Globus/NWS/SRB etc. (NPACkage); OGSA integration underway
- Canonical services carried out by Data Parallel Components
- Data Cluster/Decluster/Spatial Indexing/Range Query Service (inherited from Active Data Repository)
- Super-Semantic Data Cache: when carrying out parameter studies, use caching to eliminate redundant computations (Andrade SC2002)
- Grid Generalized Reduction (Ferreira ICS2002)
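To make the caching idea concrete, here is a minimal sketch of how a parameter study can reuse an expensive intermediate result. It uses plain exact-match memoization from the Python standard library, which is much simpler than the semantic cache cited above. The functions `segment` and `texture_score`, their parameters, and the stand-in computations are all hypothetical.

```python
from functools import lru_cache

calls = {"segment": 0}  # counts how often the expensive step really runs


@lru_cache(maxsize=None)
def segment(image_id: int, threshold: float) -> tuple:
    """Hypothetical expensive intermediate step, cached by its arguments."""
    calls["segment"] += 1
    # Stand-in computation: a fake "mask" derived from the inputs.
    return tuple((image_id * k) % 7 > threshold for k in range(1, 6))


def texture_score(image_id: int, threshold: float, window: int) -> int:
    """Hypothetical final step; reuses the cached segmentation for every window."""
    mask = segment(image_id, threshold)
    return sum(mask) * window


def parameter_study(image_ids, thresholds, windows):
    """Evaluate every parameter combination; returns {(id, threshold, window): score}."""
    return {
        (i, t, w): texture_score(i, t, w)
        for i in image_ids
        for t in thresholds
        for w in windows
    }


results = parameter_study([1, 2], [2.0, 4.0], [3, 5, 7])
```

In this toy study there are 12 parameter combinations, but the segmentation step depends only on the image and the threshold, so it runs 4 times rather than 12.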
6. Clinical Studies using Dynamic Contrast Imaging
- 1000s of dynamic images per research study
- Iterative investigation of image quantification, image registration and image normalization techniques
- Assess the techniques' ability to correctly characterize anatomy and pathophysiology
- Ground truth assessed by
- Biopsy results
- Changes in tumor structure and activity over time with treatment
- Images from many sites including NIH, Heidelberg, Oklahoma, Ohio State
- Collaboration with Michael Knopp, MD
7. [Figure: image panels labeled 1370 (prior to therapy), 1421 (after 2 cycles), and 1438 (after 4 cycles). Source: Knopp M, OSU Radiology / dkfz]
8. DCE-MR Analyses
- Fit pharmacokinetic model ODEs
- Tumor characterization using texture analysis and feature detection techniques
- Register images from consecutive studies
- Register images within a single time-dependent study to correct for patient motion
- Images obtained with varying time/space resolution -- interpolate onto common time/space mesh
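The last step above, putting studies acquired at different time resolutions onto a common time mesh, can be sketched as follows. This is a minimal illustration that assumes simple linear interpolation of a single intensity curve; the interpolation scheme used in the actual analyses is not given in the slides, and the sample values are made up.

```python
from bisect import bisect_right


def interp_linear(times, values, t):
    """Linearly interpolate (times, values) at t; clamp outside the sampled range."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    j = bisect_right(times, t)  # now times[j - 1] <= t < times[j]
    t0, t1 = times[j - 1], times[j]
    w = (t - t0) / (t1 - t0)
    return (1 - w) * values[j - 1] + w * values[j]


def resample(times, values, common_times):
    """Resample one time series onto the shared time mesh."""
    return [interp_linear(times, values, t) for t in common_times]


# Two hypothetical studies sampled at different time points (seconds, intensity).
study_a = ([0.0, 10.0, 20.0], [0.0, 1.0, 0.5])
study_b = ([0.0, 5.0, 15.0, 20.0], [0.0, 0.4, 0.8, 0.6])

common = [0.0, 5.0, 10.0, 15.0, 20.0]
a_on_common = resample(*study_a, common)
b_on_common = resample(*study_b, common)
```

Once both curves live on the same mesh they can be compared point by point. The same idea extends to the spatial axes, where each voxel would be interpolated rather than a single curve.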
10. A Data Intense Challenge
- The Instrumented Oilfield of the Future
- Participants
- University of Texas at Austin
- CSM: Wheeler, Dawson, Peszynska
- IG: Sen, Stoffa
- PGE: Torres-Verdin
- University of Chicago, CS: Stevens, Papka
- University of Maryland, CS: Sussman
- Ohio State, CS: Saltz, Kurc
- Rutgers, ECE: Parashar
- MIT, Engineering: Haines
11. A Data Intense Challenge
- The Instrumented Oilfield of the Future
- Industrial Support (Data)
- British Petroleum (BP)
- Chevron
- International Business Machines (IBM)
- Landmark
- Shell
- Schlumberger
12. Production Simulation via Reservoir Modeling
- Monitor production by acquiring time-lapse observations of seismic data
- Revise knowledge of the reservoir model via imaging and inversion of seismic data
- Modify production strategy using an optimization criterion
13. Example Scenario (SC2001)
14. Software Support
- Component Framework for Combined Task/Data Parallelism
- User defines sequence of pipelined components -- filter group
- User directive tells preprocessor/runtime system to generate and instantiate copies of filters
- Many filter groups can be simultaneously active
- Integration proceeding with Globus/Network Weather Service
- SC 2002, HCW2002, Parallel Computing 2001
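The filter-group idea (a pipelined sequence of components, with several copies of a filter instantiated at once) can be sketched with threads and queues from the Python standard library. This is only an illustration of the pattern and not the framework's API. The names `start_stage` and `run_filter_group` and the `(function, copies)` stage description are assumptions made for this example.

```python
import queue
import threading

DONE = object()  # end-of-stream marker passed along each stream


def start_stage(fn, inbox, outbox, copies):
    """Start `copies` copies of one filter; return a thread that closes the output stream."""

    def worker():
        while True:
            item = inbox.get()
            if item is DONE:
                inbox.put(DONE)  # let sibling copies see the marker too
                return
            outbox.put(fn(item))

    workers = [threading.Thread(target=worker) for _ in range(copies)]
    for w in workers:
        w.start()

    def closer():
        for w in workers:
            w.join()
        outbox.put(DONE)  # all copies finished: close the downstream stream

    c = threading.Thread(target=closer)
    c.start()
    return c


def run_filter_group(source, stages):
    """Run items through pipelined stages given as (function, copies) pairs."""
    streams = [queue.Queue() for _ in range(len(stages) + 1)]
    closers = [
        start_stage(fn, streams[k], streams[k + 1], copies)
        for k, (fn, copies) in enumerate(stages)
    ]
    for item in source:
        streams[0].put(item)
    streams[0].put(DONE)
    for c in closers:
        c.join()
    out = []
    while True:
        item = streams[-1].get()
        if item is DONE:
            return out
        out.append(item)


# Two-stage filter group: three copies of a "square" filter, two of an "add one" filter.
outputs = sorted(run_filter_group(range(5), [(lambda x: x * x, 3), (lambda x: x + 1, 2)]))
```

Because several copies of a filter consume from the same stream, output order is not deterministic, so the example sorts the results before using them.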
15. DataCutter
- Components
- Embarrassingly Parallel
- Generalized Reduction
- Wrapped MPI
- Flow control between components
- Schedulers place filters on grid processors (scheduler API)
- Stream based communication being upgraded to OGSA model
- Data Parallel Compiler Prototype
- NPACkage
16. Integrating DataCutter with existing Grid toolkits: SRB (done), Globus, NWS (ongoing)
- SRB integration: subset and filter datasets
- Globus integration: DataCutter uses Globus resource discovery, resource allocation, authentication, and authorization services
- Network Weather Service (NWS) integration: NWS is used for system monitoring
17. Canonical Services
- Canonical services carried out by Data Parallel Components
- Data Cluster/Decluster/Spatial Indexing/Range Query Service (inherited from Active Data Repository)
- Super-Semantic Data Cache (Andrade SC2002)
- Grid Generalized Reduction (Ferreira ICS2002)
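As a rough illustration of the generalized reduction pattern, the sketch below has each node fold its local data chunks into an accumulator, then merges the partial accumulators with an associative, commutative combine step. The three "nodes" are plain Python lists, and the function names are hypothetical. The grid implementation cited above is not shown.

```python
from functools import reduce


def init():
    """Empty accumulator: (count, sum, max)."""
    return (0, 0.0, float("-inf"))


def accumulate(acc, x):
    """Fold one data element into a node-local accumulator."""
    return (acc[0] + 1, acc[1] + x, max(acc[2], x))


def combine(a, b):
    """Merge two partial accumulators; order of merging does not matter."""
    return (a[0] + b[0], a[1] + b[1], max(a[2], b[2]))


def local_reduce(chunks):
    """Reduce every element of every chunk held by one node."""
    acc = init()
    for chunk in chunks:
        for element in chunk:
            acc = accumulate(acc, element)
    return acc


# Hypothetical placement: each inner list of chunks lives on a different node.
node_data = [
    [[1.0, 2.0], [3.0]],
    [[4.0]],
    [[5.0, 6.0]],
]

partials = [local_reduce(chunks) for chunks in node_data]
count, total, largest = reduce(combine, partials)
mean = total / count
```

Only the small accumulators move between nodes while the data chunks stay where they are stored, which is what makes the pattern attractive for large distributed datasets.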
18. Clustering/Declustering Datasets
- Partition dataset into data chunks -- each chunk contains a set of data elements
- Each chunk is associated with a bounding box
- DataCutter Data Loading Service
- Distributes chunks across the disks in the system
- Constructs an R-tree index using bounding boxes of the data chunks
Disk Farm
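The chunking, declustering, and range-query steps above can be sketched as follows. Two simplifications are assumed: chunks are placed round-robin (the slides do not state the declustering policy), and the query does a linear scan over bounding boxes instead of building an R-tree. `Chunk`, `decluster`, and `range_query` are hypothetical names, not the DataCutter Data Loading Service API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: int
    bbox: tuple  # (xmin, ymin, xmax, ymax)


def decluster(chunks, num_disks):
    """Assumed policy: assign chunks to disks round-robin."""
    disks = [[] for _ in range(num_disks)]
    for k, chunk in enumerate(chunks):
        disks[k % num_disks].append(chunk)
    return disks


def intersects(a, b):
    """True if two axis-aligned boxes overlap (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_query(disks, query_box):
    """Per disk, list the ids of chunks whose bounding box meets the query box.

    Linear scan stand-in for the R-tree lookup described in the slides.
    """
    return {
        d: [c.chunk_id for c in chunks if intersects(c.bbox, query_box)]
        for d, chunks in enumerate(disks)
    }


# A 4x4 grid of unit-square chunks spread over four disks.
chunks = [
    Chunk(4 * row + col, (col, row, col + 1, row + 1))
    for row in range(4)
    for col in range(4)
]
disks = decluster(chunks, 4)
hits = range_query(disks, (0.5, 0.5, 1.5, 1.5))
```

Here the query touches four chunks that sit on two different disks, so they could be read in parallel. Spreading spatially adjacent chunks across disks is the point of declustering.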
19. Super-Semantic Data Cache
20. Advantage of Using Cached Intermediate Results (Virtual Microscope)
21. Grid Generalized Reduction
22. Other Biomedical Grid Applications
- Virtual Microscope
- Grid based clinical research support
- 1000s of clinical research sites
- Different studies involve different subsets of sites
- Ad-hoc federated databases
- Lots of data naming issues
- Support for anonymization
- Role based data access
- Support for authentication, encryption
- Support for image analysis
- NCI Cancer Center Support
23. DataCutter Development Group
- University of Maryland: Alan Sussman, Henrique Andrade, Christian Hansen
- Ohio State University: Joel Saltz, Tahsin Kurc, Umit Catalyurek, Gagan Agrawal, Renato Ferreira