Title: Center for Subsurface Sensing
1. Center for Subsurface Sensing and Imaging Systems
Overview of Research Thrust R3
- R3 Fundamental Research Topics
- R3A Parallel Processing
- Middleware/Parallelization Tools
- FPGA Acceleration
- R3B Solutionware Development
- Subsurface Toolboxes
- Image and Sensor Data Databases
David Kaeli (NU), Miriam Leeser (NU), Wilson Rivera (UPRM)
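2. Overview of the Strategic Research Plan
[Figure: strategic research plan diagram. Labels: Bio-Med, Enviro-Civil; levels L1 (Fundamental Science), L2 (Validating TestBEDs) and L3; system projects S1-S5; research thrusts R1, R2 and R3 (Image and Data Information Management)]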
3. CenSSIS Barriers Addressed by R3 Projects
- Barrier 4: Lack of Computationally Efficient, Realistic Models
- Barrier 6: Lack of Rapid Processing and Management of Large Image Databases
- Barrier 7: Lack of Validated, Integrated Processing and Computational Tools
4. Center for Subsurface Sensing and Imaging Systems
Middleware and Parallelization Tools
David Kaeli (NU), Wilson Rivera (UPRM), Carmen Carvajal (UPRM), Magda El-Shenawee (U Arkansas), Geoff Krapf (NU, undergrad), Waleed Meleis (NU), Craig Shaffer (NU, undergrad), Karen Tompko (U Cincinnati), Yijian Wang (NU), Juemin Zhang (NU)
5. CenSSIS Middleware Tools
- Parallelization of MATLAB, C/C++ and Fortran codes using Message Passing Interface (MPI), a software pathway to exploiting GRID-level resources
- Presently utilizing local Beowulf clusters, NCSA resources (Mariner Center at BU) and Internet-2
- Profile-guided program instrumentation/optimization
- Utilizing MPI-2 to address barriers in I/O performance
- Building on existing Grid Middleware such as Globus Toolkit, MPICH-G2 and GridPort
[Diagram labels: MATLAB, MPI, C/C++, MPICH-G2, Parallelization, Fortran, UPC]
6. Impact on CenSSIS Applications
- Reduced the runtime of a single-body Steepest Descent Fast Multipole Method (SDFMM) application by 74% on a 32-node Beowulf cluster
  - Hot-path parallelization
  - Data restructuring
- Reduced the runtime of a Monte Carlo scattered light simulation by 98% on a 16-node Silicon Graphics Origin 2000
  - MATLAB-to-C compilation
  - Hot-path parallelization
- Obtained superlinear speedup of the Ellipsoid Algorithm run on a 16-node IBM Super-Parallel (SP2) system
  - MATLAB-to-C compilation
  - Hot-path parallelization
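The runtime reductions above can be restated as speedup factors. The short sketch below does that arithmetic, assuming each percentage is measured against the original (pre-optimization) runtime, which the slide does not state explicitly. The function name is illustrative only.

```python
def speedup_from_reduction(reduction_percent):
    """Return the speedup factor implied by a percentage runtime reduction."""
    remaining_fraction = 1.0 - reduction_percent / 100.0
    return 1.0 / remaining_fraction


# Reductions and node counts quoted on the slide.
cases = [
    ("SDFMM, 32-node Beowulf", 74, 32),
    ("Monte Carlo, 16-node Origin 2000", 98, 16),
]

for name, reduction, nodes in cases:
    speedup = speedup_from_reduction(reduction)
    print(f"{name}: {speedup:.1f}x speedup, {speedup / nodes:.2f}x per node")
```

Under this reading, a 74% reduction is roughly a 3.8x speedup and a 98% reduction is a 50x speedup. The second figure exceeds the node count, which is consistent with the slide crediting MATLAB-to-C compilation in addition to parallelization.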
7. Techniques for Parallelizing MATLAB
- Manage completely independent MATLAB processes distributed over different processors
- Message passing within MATLAB (e.g., MultiMATLAB)
- MATLAB calls to parallel libraries (multi-threaded LAPACK, PLAPACK)
- Backend compilers can convert MATLAB to C and automatically insert MPI calls (e.g., RTExpress)
[Diagram labels: Multiple MATLAB sessions; A Single MATLAB session]
Our Approach: MATLAB Code -> MATLAB C compiler -> C Code -> Use MPI -> Parallel Code
8. Our Approach for Parallelizing MATLAB
- Convert MATLAB to C using the MATLAB mcc compiler
- Convert array structs (generated by mcc) to pointer-based structs where needed
- Profile the C program to capture both data flow and control flow
- Parallelize the hot regions of the application using MPI
[Figure: call-graph profile. Each node shows Function and Self/Children Runtime; each edge shows Number of Calls]
- Main: 0/2502
- Tfqmr: 1/1604
- Mult1: 1/1602
- Mult: 4/1599
- Multdifl: 706/0
- Multfaflone4m: 523/6
- Multfaflone4: 359/5
- Unlabeled nodes: 18/0, 3/35, 16/0 and several 0/0
- Edge call counts shown: 1, 273, 54K, 0.6M, 2.4M
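To illustrate how a profile like this one points at the regions worth parallelizing, the sketch below ranks the named functions by self time as a fraction of the total. The numbers are the self/children values from the slide (units are not stated there). The 10% threshold and the function name are assumptions made for illustration, not part of the original workflow.

```python
# (self, children) runtimes as listed in the slide's call-graph profile.
# Units are not stated on the slide.
PROFILE = {
    "Main": (0, 2502),
    "Tfqmr": (1, 1604),
    "Mult1": (1, 1602),
    "Mult": (4, 1599),
    "Multdifl": (706, 0),
    "Multfaflone4m": (523, 6),
    "Multfaflone4": (359, 5),
}


def hot_functions(profile, root="Main", threshold=0.10):
    """Return (name, share) pairs whose self time is at least `threshold` of the total."""
    root_self, root_children = profile[root]
    total = root_self + root_children  # inclusive time of the root
    hot = [
        (name, self_time / total)
        for name, (self_time, _children) in profile.items()
        if self_time / total >= threshold
    ]
    return sorted(hot, key=lambda item: item[1], reverse=True)


for name, share in hot_functions(PROFILE):
    print(f"{name}: {share:.1%} of total runtime")
```

With these values, Multdifl, Multfaflone4m and Multfaflone4 together account for about 63% of the total, so they are the natural candidates for MPI parallelization, while the wrapper functions above them contribute almost no self time.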
9. Eliminating I/O Barriers in Parallel Subsurface Applications
- Many SSI applications tend to be file bound or memory bound (or both)
- While we can use MPI to parallelize processing and MPI collective I/O to accelerate I/O, we are still limited to accessing a file on a single disk
- Our present work looks at parallelizing I/O by partitioning files associated with MPI processes
- We attempt to utilize slower, commodity (IDE) local secondary storage
10. Eliminating I/O Barriers in Parallel SSI Applications
- Parallelize computation using MPI
- Profile chunk access frequencies and temporal access patterns on a per-process basis
- Use the profile to guide partitioning, reducing overall execution time by 27-82%
- Presently targeting both file-bound and memory-bound applications
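One way to picture profile-guided partitioning is a greedy placement that stores each file chunk on the local disk of the process that accesses it most often. The sketch below is a hypothetical heuristic written for illustration. The slides do not describe the actual partitioning algorithm, and the data layout (`(rank, chunk) -> count`) and function names are assumptions.

```python
from collections import defaultdict


def partition_chunks(access_counts):
    """Place each chunk with the rank that accesses it most (ties go to the lowest rank).

    access_counts maps (process_rank, chunk_id) -> number of accesses observed
    while profiling.
    """
    per_chunk = defaultdict(dict)
    for (rank, chunk), count in access_counts.items():
        per_chunk[chunk][rank] = per_chunk[chunk].get(rank, 0) + count
    return {
        chunk: min(by_rank, key=lambda r: (-by_rank[r], r))
        for chunk, by_rank in per_chunk.items()
    }


def local_fraction(access_counts, placement):
    """Fraction of all profiled accesses served from the accessing rank's own disk."""
    total = sum(access_counts.values())
    local = sum(
        count
        for (rank, chunk), count in access_counts.items()
        if placement[chunk] == rank
    )
    return local / total


profile = {(0, "a"): 10, (1, "a"): 2, (0, "b"): 1, (1, "b"): 7, (1, "c"): 5}
placement = partition_chunks(profile)
print(placement, local_fraction(profile, placement))
```

A real partitioner would also need to weigh the temporal access patterns mentioned above and the capacity of each local disk. This sketch uses access frequency only.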
11. Some Recent Publications
- Profile-based Characterization and Tuning Subsurface Sensing Applications, M. Ashouei, D. Jiang, W. Meleis, D. Kaeli, M. El-Shenawee, E. Mizan, M. and C. Rappaport, Special issue of the SCS Journal, November 2002.
- Parallel Implementation of the Steepest Descent Fast Multipole Method (SDFMM) on a Beowulf Cluster for Subsurface Sensing Application, D. Jiang, W. Meleis, M. El-Shenawee, E. Mizan, M. Ashouei, and C. Rappaport, IEEE Microwave and Wireless Components Letters, January 2002.
- Electromagnetics Computations Using MPI Parallel Implementation of the Steepest Descent Fast Multipole Method (SDFMM), M. El-Shenawee, C. Rappaport, D. Jiang, W. Meleis, and D. Kaeli, Applied Computational Electromagnetics Society Journal, August 2002.
- An efficient parallel algorithm for solving unsteady nonlinear equations, W. Rivera, J. Zhu, and D. Huddleston, Proc. International Conference on Parallel Processing, IEEE Computer Society, 2002.
- Mapping and characterization of applications in heterogeneous distributed systems, J. Yeckle and W. Rivera, to appear in Proc. of the 7th World Multiconference on Systemics, Cybernetics and Informatics (SCI2003).
- Profile-Guided I/O Partitioning, Y. Wang and D. Kaeli, submitted to ICS03.
12. Grid Computing
- Solving Subsurface Barriers using Grid Computing
- Deployment of distributed CenSSIS applications
- Development of adaptive middleware
- Profile-guided parallelization/optimization
- Multi-language support
- Profile-guided I/O partitioning
- Interaction with distributed image database resources
- CenSSIS/HP Industrial Relations
- Strong links with Latin American Universities interested in Grid Computing
- Student and/or faculty interchange program with CenSSIS schools
- Leadership in leading IEEE/ACM GRID Computing Workshops
13. GRID Resources are needed for key CenSSIS modeling applications
[Figure labels: computable on parallel systems; targeted for GRID systems]
14. Grid Computing: Experimental Grid @ UPRM
[Figure: Grid Community Model with Application, Middleware, Common Infrastructure and Resource layers. Other labels: Storage, Sensors, Campus Backbone, Internet 2, IA64 Cluster]
15. Grid Computing: Pattern Categorization
- Hyperspectral Images
- Computational methods for ensembles of nonparametric supervised classifiers
- Feedback algorithm
- Parallelization (MATLAB to MPI/C)
- Intrusion Detection
- Countermeasure design problems
16. Grid Computing: LATAM Task Force
- Create a LATAM Task Force on Grid Computing.
  - Universidad de Chile, Chile: Ricardo Baeza, PhD in Computer Science, University of Waterloo
  - Universidad de los Andes, Venezuela: Herbert Hoeger, PhD in Computer Science, University of Iowa
  - Universidade de São Paulo, Brasil: Marcio Lobo, PhD in Computer Science, TUD, Germany
  - Instituto Tecnologico de Monterrey, Mexico: Cesar Vargas, PhD in Electrical Engineering, Louisiana State University
  - Universidad del Valle, Colombia: Angel Garcia, PhD in Telecommunications, UPV-Spain
- Hold a Grid Workshop for these researchers/educators at UPRM, and invite both CenSSIS and HP people to serve as reviewers and panelists (slated for Nov. 2003).
- Provide tutorials and short courses on Grid-level computing.
- We will utilize CenSSIS problems as the motivating examples that will be parallelized. Implementations will be prototyped at UPRM, NU and BU.
17. Center for Subsurface Sensing and Imaging Systems
Field Programmable Gate Arrays for Subsurface Imaging
Miriam Leeser (NU), Wang Chen (NU), Srdjan Coric (NU), Shawn Miller (NU), Seth Molloy (NU, undergrad), Josh Noseworthy (NU, undergrad), Haiqian Yu (NU)
18. Field Programmable Gate Arrays for Subsurface Imaging
- Backprojection for Computed Tomography image reconstruction
  - Sponsored by Mercury Computer
- Accelerating Finite Difference Time Domain (FDTD) in hardware
  - Collaboration with Carey Rappaport, NU
- Retinal Vascular Tracing in real time
  - Collaboration with Badri Roysam and Chuck Stewart, RPI
- Diverse problems, similar solutions
  - FPGAs are particularly well suited for accelerating image processing algorithms
19. Backprojection
- Backprojection algorithm used in medical imaging
- Traditionally performed by custom hardware
  - Application specific integrated circuits and/or custom board designs
- New systems require greater flexibility
  - Algorithms under development for 3D reconstruction
  - Application specific integrated circuits viewed as costly both in time and in NRE (non-recurring engineering)
- FPGA implementation offers significant advantages
  - Algorithm flexibility and re-use
- Fixed point and quantization effects matter
  - Difference between fixed and floating point must be small
20. Projection Parallelism for Performance
- Parallelism implemented in FireBird (max 16-way parallel)
- 1024 projections x 1024 samples/projection, each used to reconstruct a 512 x 512 image
[Figure: data dependency for backprojection processing. Axes labeled Projections, Image rows and Image columns]
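The idea behind projection parallelism can be sketched in software: every projection contributes independently to every pixel, so the projections can be split among parallel units and the partial images summed. The code below is a minimal, unfiltered, nearest-neighbour backprojection written for illustration. It is not the FireBird hardware design, and the function names, geometry conventions and interleaved split are all assumptions.

```python
import math


def backproject(sinogram, angles, size):
    """Accumulate each projection into a size x size image (nearest-neighbour sampling)."""
    n_samples = len(sinogram[0]) if sinogram else 0
    centre = (size - 1) / 2.0
    det_centre = (n_samples - 1) / 2.0
    image = [[0.0] * size for _ in range(size)]
    for projection, theta in zip(sinogram, angles):
        c, s = math.cos(theta), math.sin(theta)
        for row in range(size):
            y = row - centre
            for col in range(size):
                x = col - centre
                # Detector sample hit by the ray through pixel (row, col).
                t = round(x * c + y * s + det_centre)
                if 0 <= t < n_samples:
                    image[row][col] += projection[t]
    return image


def backproject_n_way(sinogram, angles, size, ways):
    """Split projections across `ways` independent units, then sum the partial images."""
    partials = [
        backproject(sinogram[k::ways], angles[k::ways], size) for k in range(ways)
    ]
    return [
        [sum(p[row][col] for p in partials) for col in range(size)]
        for row in range(size)
    ]
```

Because the partial images are simply added, the n-way result matches the serial one up to floating-point summation order, which is why this form of parallelism scales with the number of units available.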
21. Backprojection Speedup Due to Parallelism
- Expandable to n-way parallel
22. Quality of Results Is High
[Figure panels: Software reconstruction (floating point); Hardware reconstruction (fixed point); Relative error]
- Sinogram quantization: 9 bits
- Interpolation factor: 3 bits
- Relative error: 0.001295
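The effect of sinogram quantization can be explored with a small experiment: quantize floating-point samples to a fixed number of bits and compare against the originals. The sketch below assumes uniform unsigned quantization and defines relative error as a ratio of L2 norms. The slide does not say how its relative error was computed, so both choices are assumptions, and the synthetic data here will not reproduce the 0.001295 figure.

```python
import math


def quantize(values, bits):
    """Uniformly quantize non-negative values to `bits`-bit integers and map them back."""
    levels = (1 << bits) - 1
    scale = levels / max(values)  # assumes max(values) > 0
    return [round(v * scale) / scale for v in values]


def relative_error(reference, approx):
    """L2 norm of the difference divided by the L2 norm of the reference."""
    num = math.sqrt(sum((r - a) ** 2 for r, a in zip(reference, approx)))
    den = math.sqrt(sum(r * r for r in reference))
    return num / den


# Synthetic stand-in for one projection of 1024 samples.
samples = [1.0 + math.sin(0.1 * i) for i in range(1024)]
for bits in (4, 9):
    print(bits, relative_error(samples, quantize(samples, bits)))
```

Running the same comparison across several bit widths is one way to find the smallest width whose error stays acceptably small, which is the trade-off the fixed-point hardware has to make.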
23. FPGA Hardware Provides 100x Speedup Over Software on 1 GHz Pentium
- A: Software, floating point, 450 MHz Pentium: 240 s
- B: Software, floating point, 1 GHz Dual Pentium: 94 s
- C: Software, fixed point, 450 MHz Pentium: 50 s
- D: Software, fixed point, 1 GHz Dual Pentium: 28 s
- E: Hardware (Wildstar, simple), 50 MHz: 5.4 s
- F: Hardware (Wildstar, 4-way), 50 MHz: 1.3 s
- G: Hardware (Firebird, 8-way), 65 MHz: 0.5 s
- H: Hardware (Firebird, 16-way), 65 MHz: 0.25 s
Parameters: 1024 projections, 1024 samples per projection, 512 x 512 pixel image, 9-bit sinogram data, 3-bit interpolation factor
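The headline speedup can be checked directly from the timings listed above. The sketch below only divides the reported times. The dictionary and function names are illustrative.

```python
# Reconstruction times in seconds, keyed by the configuration letters on the slide.
TIMES = {
    "A": 240.0, "B": 94.0, "C": 50.0, "D": 28.0,
    "E": 5.4, "F": 1.3, "G": 0.5, "H": 0.25,
}


def speedup(baseline, accelerated):
    """Ratio of the baseline time to the accelerated time."""
    return TIMES[baseline] / TIMES[accelerated]


print("16-way Firebird vs 1 GHz fixed-point software:", speedup("D", "H"))
print("16-way Firebird vs 1 GHz floating-point software:", speedup("B", "H"))
print("16-way vs 8-way Firebird:", speedup("G", "H"))
```

Against the fixed-point software on the 1 GHz machine (D) the 16-way hardware (H) is 112x faster, and against the floating-point software (B) it is 376x faster, so the "100x" in the title is the more conservative comparison.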
24. FDTD Equations Discretize Maxwell's Equations
GPR Modeling
- Update each space cell's electric and magnetic field using the previous values of that cell and its neighboring cells
- Extremely computationally expensive
- Benefits from hardware acceleration
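The neighbor-based update described above is easiest to see in one dimension. The sketch below is a minimal 1-D FDTD loop in normalized units, given only as an illustration of the update pattern. The project's models are 2-D and 3-D, and the grid size, Courant number, Gaussian source and fixed-zero end cells are assumptions chosen for the example.

```python
import math


def fdtd_1d(cells=201, steps=100, source_cell=100, courant=0.5):
    """Leapfrog update of Ez and Hy on a 1-D grid; the end cells of Ez are held at zero."""
    ez = [0.0] * cells
    hy = [0.0] * cells
    for step in range(steps):
        # Magnetic field: each cell uses its own and its right neighbour's E value.
        for k in range(cells - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Electric field: each cell uses its own and its left neighbour's H value.
        for k in range(1, cells - 1):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Soft Gaussian pulse injected at the source cell.
        ez[source_cell] += math.exp(-((step - 30) / 10.0) ** 2)
    return ez, hy


ez, hy = fdtd_1d()
print(max(abs(v) for v in ez))
```

Every cell is touched at every time step and each update needs only local data, which is why the method is both expensive in software and a good fit for pipelined hardware.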
25. 3-D Buried Object Detection Forward Model
26. The Quantized Fixed-point Simulation
27. Detailed Architecture of 2-D FDTD Implementation (BlockRAM interface and pipelined updates for one time step)
28. Retinal Vascular Tracing: Register 2-D Image to 3-D in Real Time
- Feature extraction
- Registration: image pairs
- Registration: montages
- Registration: real-time / on-line
- Software is too slow
- Use FPGAs to accelerate to video frame rate
- Image guided surgery
29. Retinal Vascular Tracing: Register 2-D Image to 3-D in Real Time
[Figure labels: Direction of blood vessel; PCI bus; Smart Camera]
30. Developing Embedded Solutionware
- All three projects use the same reconfigurable hardware and the same design flow
- Result is considerable processing speedup, moving processing closer to sensors
[Figure: Firebird PCI board from Annapolis Microsystems]
31. Center for Subsurface Sensing and Imaging Systems
- Solutionware Development
- Subsurface Toolboxes
- Image and Sensor Data Databases
David Kaeli (NU), Chuck Stewart (RPI), Emmanuel Arzuaga (UPRM), Jennifer Black (NU), Kyle Guilbert (NU, undergrad), Matthew Kowalski (NU, undergrad), Chakib Ouarraoui (NU), Amitha Perera (RPI), Becky Norum (NU), Derek Uluski (NU, undergrad)
32. CenSSIS Solutionware: UPRM/NU/RPI
- Toolbox Development
  - Support the development of CenSSIS Solutionware that demonstrates our Diverse Problems, Similar Solutions model
  - Delivered a software-engineered Multi-View Tomography Toolbox, developed in OO-MATLAB
  - Developing three new CenSSIS Toolboxes
    - Registration: RPI/WHOI
    - Hyperspectral Imaging: UPRM
    - 3-D Modeling: NEU
  - Establish software development and testing standards for CenSSIS
- Image and Sensor Data Database
  - Develop a web-accessible image database for CenSSIS that enables efficient searching and querying of images, metadata and image content
  - Develop image feature tagging capabilities
[Figure label: Matlab 6]
33. Current/Future Toolbox Development
- Development of multi-language toolboxes: C, Fortran, C++, Java, MATLAB and OO-MATLAB
- Delivered the MVT Toolbox (open source)
- Presently working on three additional toolbox efforts
- Developing a parallelized version of the MVT Toolbox
- Adopted Software Engineering Institute Capability Maturity Model (CMM) Level 3 standards
- Software library and bug tracking being developed (CVS and Bugzilla)
- Software Engineers on staff at NU, UPRM and RPI
[Figure labels: Matlab 6, MSD, LPM, Modeling]
34. CenSSIS Image Database System
- Deliver a web-accessible database for CenSSIS that enables efficient searching and querying of images, sensor data, metadata and image content
- More than 200 metadata-rich images/datasets presently available online (> 1000 by Year 5)
- Database Characteristics
  - Relational complex queries (Oracle8i)
  - Data security, reliability and layered user privileges
  - Efficient search and query of image content and metadata
  - Content-based image tagging using XML
  - Indexing algorithms (2D, 3D and 4D)
  - Explore object relational technology to handle collections
[Figure: mouse embryo image with regions labeled 1-4]
35. CenSSIS Image Database System
36. [Slide shows an annotated XML metadata record for a database image]
Callouts: General Image Info; Image Source Info; Image File Location Info; Image Donor Info; Image Feature Info; Image Format Info
<Image>
  <!-- General Cell Information -->
  <CellInformation>
    <ID> 9 </ID>
    <ClinomicsID> 931175495 </ClinomicsID>
    <DOB> 2/7/30 </DOB>
    <SEX> F </SEX>
    <COLL_DATE> 11/2/1993 </COLL_DATE>
    <Primary_site> Breast </Primary_site>
    <INITIAL> II </INITIAL>
    <GRADE> POORLY DIFFERENTIATED </GRADE>
    <HISTOLOGY> UNKNOWN </HISTOLOGY>
    <PRIM_SITE2> NONE </PRIM_SITE2>
    <PRIM_DATE> 4/1/1992 </PRIM_DATE>
    <MET1_SITE> NONE </MET1_SITE>
    <MET1_DATE> NONE </MET1_DATE>
    <TUBE_TYPE> p </TUBE_TYPE>
  </CellInformation>
  <StillRegion id="IMG0001">
    <MediaProfile>
      <MediaFormat>
        <FileFormat>jpeg</FileFormat>
        <System>PAL</System>
        <Medium>CD</Medium>
        <Color>color</Color>
        <FileSize>332.228</FileSize>
      </MediaFormat>
      . . . . . . . .
      </MediaCoding>
      <MediaInstance>
        <Identifier IdOrganization='Clinomics' IdName='BreastCancerCell'>BreastCancerCell//image0001</Identifier>
        <Locator>
          <MediaURL>file//D/Breast/cells/imag0001.jpg</MediaURL>
        </Locator>
      </MediaInstance>
    </MediaProfile>
    <StructuredAnnotation>
      <Who>Patient239</Who>
      <whatObject>Human primary breast tumor cells</whatObject>
      <WhatAction> growing in a NASA Bioreactor </WhatAction>
      <where> St. Marys Hospital </where>
      <When> 09/25/2002 </When>
      <why> Investigate tumor cells behaviour on microcarrier beads </why>
      <TextAnnotation xml:lang='en-us'> Higher magnification of view illustrating breast cancer cells with intercellular boundaries on bead surface </TextAnnotation>
    </StructuredAnnotation>
  </StillRegion>
</Image>
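A record like this can be queried with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a small, well-formed excerpt that reuses element names and values from the slide. The excerpt itself, the `RECORD` constant and the function name are illustrative assumptions, and the actual CenSSIS database stores and queries this metadata through Oracle8i rather than through code like this.

```python
import xml.etree.ElementTree as ET

# Well-formed excerpt using element names and values shown on the slide.
RECORD = """
<Image>
  <StillRegion id="IMG0001">
    <StructuredAnnotation>
      <Who>Patient239</Who>
      <whatObject>Human primary breast tumor cells</whatObject>
      <When>09/25/2002</When>
    </StructuredAnnotation>
  </StillRegion>
</Image>
"""


def annotation_fields(xml_text):
    """Map each StillRegion id to a dict of its StructuredAnnotation fields."""
    root = ET.fromstring(xml_text)
    fields = {}
    for region in root.iter("StillRegion"):
        annotation = region.find("StructuredAnnotation")
        if annotation is not None:
            fields[region.get("id")] = {
                child.tag: (child.text or "").strip() for child in annotation
            }
    return fields


print(annotation_fields(RECORD)["IMG0001"]["whatObject"])
```

Keeping the who/what/where/when annotations as separate elements is what makes this kind of field-level search straightforward.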
37. Impact to Date and Future Plans
- Major Impact Items
  - Significant acceleration of many critical SSI applications using embedded (FPGA) computing and parallelization
  - Delivery of the Multi-View Tomography Toolbox
  - Delivery of a populated CenSSIS Image Database System
  - Development of DICOM interoperability
- Near term deliverables (Years 3-5)
  - Development of real-time 3-D vascular tracing smart camera
  - Migration of compute-bound modeling problems to the GRID
  - Application of out-of-core acceleration to critical I/O-bound applications
  - Completion of 3 new Solutionware Toolboxes
  - Interfacing to visualization toolkits (SCIRUN-Utah, VTK)
  - 2000 images online by Year 5
- Longer term deliverables (Years 6-8)
  - Development of a reconfigurable hardware library of SSI applications
  - Demonstration of the power of the GRID
  - Delivery of 3 additional CenSSIS Toolboxes
  - 5000 images online by Year 8
38. Quilt Chart Organization: Integration of Year Three CenSSIS Research Program
Important Outcomes: Multi-Institution Collaboration; Advances in Solving Real World Problems
System-level projects:
- S1: Cellular structure studied with 3D Fusion Microscopy
- S2: 4D Image Guided Radiotherapy
- S3: Multi-modal, non-invasive screening for incipient breast cancers
- S4: Remote and in-situ monitoring of submerged coral reefs encompassing shallow and deep water habitats
- S5: Multi-sensor quantitative assessment of underground contaminants and civil infrastructure
CenSSIS Research Areas:
- R1A: Nonlinear and Dual Wave Probes
- R1B: Effective Forward Models
- R2A: MVT Methods
- R2B: LPM Methods
- R2C: MSD Methods
- R2D: Image Understanding and Sensor Fusion Methods
- R3A: Parallel Hardware Implementation for Fast Subsurface Detection
- R3B: Solutionware Tools
[Other chart labels: Initial TestBED Facilities; I-PLUS Development (Real Problems); Relative Contribution to Outcomes; Engineered System Level]
39. Impact on System Level Projects
- FPGA: real-time registration
  - S1: 3D Fusion Microscope
- Parallel/GRID Processing and Toolboxes
  - S1 (3DFM): Impacting the design of the 3DFM inversion algorithms by accelerating FDTD on a Mercury cluster
  - S3 (breast cancer) and S4 (coral reef): Parallelization of the MVT and Hyperspectral toolboxes
40. Impact on System Level Projects
- Sensor and Image Database
  - S1 (3DFM): Test case for advanced submission and tagging capabilities
  - S2 (Radiation oncology): Facilitates sharing and indexing 4-D datasets, interfacing DICOM-based systems
<?xml version="1.0" encoding="UTF-8"?>
<embryo>
  <description> Embryo developmental stages</description>
  <feature label1 xPos129 yPos133 xPos248 yPos250> 1 cell embryo </feature>
  <feature label2 xPos150 yPos128 xPos870 yPos240> 2 cell embryo </feature>
  <feature label3 xPos1 5 yPos1 5 xPos225 yPos220> 4 cell embryo </feature>
</embryo>
41. CenSSIS R3 Research Thrust Summary
- Providing both SSI-related computing research expertise and supporting CenSSIS infrastructure needs
- Addressing key research barriers in computational efficiency, embedded computing and image/sensor data management
- Exploiting Grid resources to enable new discovery in SSI applications
- Producing an image/data repository and software-engineered SSI Toolsets
- Providing educational and research opportunities to undergraduates and Latin American faculty/students
- Developing enabling tools targeting system-level projects
  - Real-time registration
  - Accelerated modeling of new inversion algorithms
  - Indexing and cataloging DICOM and multi-dimensional images