Title: High Productivity Computing System Program
1Wavelet Spectral Dimension Reduction of Hyperspectral Imagery on a Reconfigurable Computer
Tarek El-Ghazawi1, Esam El-Araby1, Abhishek Agarwal1, Jacqueline Le Moigne2, and Kris Gaj3
1The George Washington University, 2NASA/Goddard Space Flight Center, 3George Mason University
tarek, esam, agarwala_at_gwu.edu, lemoigne_at_backserv.gsfc.nasa.gov, kgaj_at_gmu.edu
2Objectives and Introduction
- Investigate Use of Reconfigurable Computing for On-Board Automatic Processing of Remote Sensing Data
- Remote Sensing → Image Classification
- Applications
- Land Classification, Mining, Geology, Forestry, Agriculture, Environmental Management, Global Atmospheric Profiling (e.g. water vapor and temperature profiles), and Planetary Space missions
- Types of Carriers
3Types of Sensing
- Mono-Spectral Imagery → 1 band (SPOT panchromatic)
- Multi-Spectral Imagery → 10s of bands (MODIS 36 bands, SeaWiFS 8 bands, IKONOS 5 bands)
- Hyperspectral Imagery → 100s-1000s of bands (AVIRIS 224 bands, AIRS 2378 bands)
4Different Airborne Hyperspectral Systems
5Why On-Board Processing?
- Problems
- Complex Pre-processing Steps
- Image Registration / Fusion
- Large Data Volumes
- Large cost and complexity of the On-The-Ground / Earth processing systems
- Long latency for critical decisions
- Large data downlink bandwidth requirements
- Solutions
- Automatic On-Board Processing
- Reduces the cost and the complexity of the On-The-Ground / Earth processing system
- Larger utilization for broader community, including educational institutions
- Enables autonomous decisions to be taken on-board → faster critical decisions
- Applications
- Future reconfigurable web sensors missions
- Future Mars and planetary exploration missions
- Dimension Reduction
- Reduction of communication bandwidth
- Simpler and faster subsequent computations
Investigated Pre-Processing Step
6Why Reconfigurable Computers?
- On-Board Processing Problems
- High Computational Complexities
- Low performance for traditional processing platforms
- High form / wrap factors (size and weight) for parallel computing systems
- Low flexibility for traditional ASIC-Based solutions
- High costs and long design cycles for traditional ASIC-Based solutions
- Solutions
- Reconfigurable Computers (RCs)
- Higher performance (throughput and processing power) compared to conventional processors
- Lower form / wrap factors compared to parallel computers
- Higher flexibility (reconfigurability) compared to ASICs
- Lower costs and shorter time-to-solution compared to ASICs
7Introduction
8Data Arrangement
9Data Arrangement (cntd)
Pixels = Rows X Columns
10Examples of Hyperspectral Datasets
11Dimension Reduction Techniques
- Principal Component Analysis (PCA)
- Most Common Method for Dimension Reduction
- Does Not Preserve Spectral Signatures
- Complex and Global computations, difficult for parallel processing and hardware implementations
- Wavelet-Based Dimension Reduction
- Preserves Spectral Signatures
- High-Performance Implementation
- Simple and Local Operations
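To illustrate what "simple and local operations" can look like, here is a minimal sketch of a low-pass filter-and-decimate step applied along the spectral axis of a single pixel. It assumes the Haar filter purely for brevity; the slides do not say which wavelet filter is used, and the names `haar_lowpass` and `reduce_spectrum` are hypothetical.

```python
import math

def haar_lowpass(signal):
    """One decomposition level (approximation branch only).

    Each output sample depends on just two neighbouring inputs,
    which is what makes the operation local.
    Assumption: Haar filter; the deck does not name the filter used.
    """
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd lengths by repeating the last sample
    scale = 1.0 / math.sqrt(2.0)
    return [scale * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]

def reduce_spectrum(spectrum, levels):
    """Apply `levels` successive low-pass/decimate steps to one pixel's spectrum."""
    out = list(spectrum)
    for _ in range(levels):
        out = haar_lowpass(out)
    return out

# A made-up 192-band spectral signature for one pixel (illustrative data only).
pixel = [float(band) for band in range(192)]
reduced = reduce_spectrum(pixel, 3)
print(len(pixel), "->", len(reduced))  # 192 -> 24
```

Each level halves the number of spectral samples, so every pixel can be processed independently of all others, which suits parallel and hardware implementations.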
122-D DWT (1-level Decomposition)
132-D DWT (2-level Decomposition)
14Wavelet-Based vs. PCA (Execution Time, 500 MHz P3)
Complexity: Wavelet-Based O(MN); PCA O(MN^2 + N^3)
15Wavelet-Based vs. PCA (cntd) (Execution Time, 500 MHz P3)
Complexity: Wavelet-Based O(MN); PCA O(MN^2 + N^3)
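As a rough worked example, the sketch below evaluates the leading terms O(MN) for the wavelet-based method and O(MN^2 + N^3) for PCA, using the Salinas98 dimensions reported later in this deck (217 X 512 pixels, 192 bands). It assumes M is the number of pixels and N the number of bands, and it ignores constant factors, so the ratio is only an order-of-magnitude indication, not a predicted speedup.

```python
# Assumption: M = number of pixels, N = number of spectral bands.
M = 217 * 512   # Salinas98 pixels
N = 192         # Salinas98 bands

wavelet_ops = M * N              # O(MN)
pca_ops = M * N**2 + N**3        # O(MN^2 + N^3)

ratio = pca_ops / wavelet_ops

print(f"wavelet-based ~ {wavelet_ops:,} operations")
print(f"PCA           ~ {pca_ops:,} operations")
print(f"ratio         ~ {ratio:.1f}")
```

Because M is much larger than N here, the M*N^2 term dominates and the ratio is close to N itself.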
16Wavelet-Based vs. PCA (cntd) (Classification Accuracy)
- Implemented on the HIVE (8 Pentium Xeon / Beowulf-Type System): 6.5 times faster than the sequential implementation
- Classification Accuracy Similar or Better than PCA
- Faster than PCA
17The Algorithm
18Prototyping Wavelet-Based Dimension Reduction of Hyperspectral Imagery on a Reconfigurable Computer, the SRC-6E
19Hardware Architecture of SRC-6E
20SRC Compilation Process
21Top Hierarchy Module
22Decomposition and Reconstruction Levels of Dimension Reduction (DWT_IDWT)
23FIR Filters (L, L) Implementation
24Correlator Module
25Histogram Module
26Resource Utilization and Operating Frequency
27Measurements Scenarios
28SRC Experiment Setup and Results
- Salinas98
- 217 X 512 Pixels, 192 Bands: 162.75 MB
- Number of Streams: 41
- Stream Size: 2730 voxels (4 MB)
- Non-Overlapped Streams
- T_DMA-IN: 13.040 msec
- T_COMP: 0.62428 msec
- T_DMA-OUT: 22.712 msec
- T_Total: 1.49 sec
- Throughput: 109.23 MB/Sec
- Overlapped Streams
- T_DMA: 35.752 msec
- T_COMP: 0.62428 msec
- X_c: 0.0175
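The figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the DMA-in, compute, and DMA-out times are per-stream values, that the non-overlapped total is their sum over all 41 streams, and that Xc is the ratio of compute time to combined DMA time. The slide does not state these relationships, but the reported numbers are consistent with them.

```python
# Values copied from the slide.
DATA_MB = 162.75
NUM_STREAMS = 41
T_DMA_IN_MS = 13.040
T_COMP_MS = 0.62428
T_DMA_OUT_MS = 22.712

# Non-overlapped: each stream pays DMA-in + compute + DMA-out in sequence (assumed).
per_stream_ms = T_DMA_IN_MS + T_COMP_MS + T_DMA_OUT_MS
total_s = NUM_STREAMS * per_stream_ms / 1000.0

# The slide's throughput matches the total rounded to two decimals.
throughput_mb_s = DATA_MB / round(total_s, 2)

# Overlapped: combined DMA time and compute-to-DMA ratio (assumed meaning of Xc).
t_dma_ms = T_DMA_IN_MS + T_DMA_OUT_MS
x_c = T_COMP_MS / t_dma_ms

print(f"T_Total    = {total_s:.2f} s")
print(f"Throughput = {throughput_mb_s:.2f} MB/s")
print(f"T_DMA      = {t_dma_ms:.3f} ms")
print(f"X_c        = {x_c:.4f}")
```

Under these assumptions, computation accounts for under 2% of the data-movement time per stream, which is why I/O dominates the measured execution time.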
29Execution Time
30Distribution of Execution Times
31Speedup Results
32Concluding Remarks
- We prototyped the automatic wavelet-based dimension reduction algorithm on a reconfigurable architecture
- Both coarse-grain and fine-grain parallelism are exploited
- We observed a 10x speedup using the P3 version of the SRC-6E. From our previous experience, we expect this speedup to double using the P4 version of the SRC machine
- These speedup figures were obtained while I/O is still dominating. The speedup can be increased by improving the I/O bandwidth of the reconfigurable platforms
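To make the I/O-dominance remark concrete, the following sketch uses a deliberately simple, hypothetical model: with data transfer fully overlapped with computation, the per-stream time is taken to be the larger of the DMA time and the compute time. The DMA and compute times are the overlapped-stream values from the experiment slide. The bandwidth factors and the function `stream_time_ms` are illustrative assumptions, not measurements.

```python
# Overlapped-stream values from the experiment slide.
T_DMA_MS = 35.752
T_COMP_MS = 0.62428

def stream_time_ms(io_factor):
    """Hypothetical model: DMA and compute fully overlap, so the slower one sets the pace.

    io_factor is an assumed multiplier on I/O bandwidth (1 = measured platform).
    """
    return max(T_DMA_MS / io_factor, T_COMP_MS)

baseline_ms = stream_time_ms(1)
for factor in (1, 2, 4, 8):
    gain = baseline_ms / stream_time_ms(factor)
    print(f"I/O bandwidth x{factor}: per-stream time improves {gain:.1f}x")

# Bandwidth factor at which compute time would start to dominate under this model.
break_even = T_DMA_MS / T_COMP_MS
print(f"compute-bound beyond roughly x{break_even:.0f} I/O bandwidth")
```

In this model the per-stream time scales almost linearly with I/O bandwidth until the transfer time falls to the compute time, which matches the remark that faster I/O would raise the achievable speedup.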