Title: Algorithm Synthesis Tool
1. SLAAC Team Meeting 3-99
Multi-Dimensional Imaging: Jeff Bloch
Ultra-Wideband Coherent RF: Mark Dunham
LANL Challenge Problems: Kevin McCabe, kmccabe_at_lanl.gov, 505-667-0728
2. University Collaborations: BYU, UT,...
MultiDimensional Image Processing
Ultra-Wide Band Radio Frequency
Kurt Moore
Real Time Processing of Multi/Hyper Spectral or Time Domain Data Cubes
DAPS
Real Time Processing of Wide Band RF Data
ReConfigurable Computing Hardware: IP51/RCA-2/DARPA/(RCA-3)/Commercial
CALIOPE: Kurt Moore; HIRIS/MTI: John Szymanski, James Theiler; SHS; RULLI
Hyperspectral Demonstrations
Mike Caffrey, Phil Blain, Noor Khalsa, Tony Rose, Tony Nelson
RCC Architecture Development/Deployment
Hardware/Software Environment
ALDEBARAN: Mark Dunham; Bellatrix: Scott Robinson; Capella: R. Dingler; Cibola: Mike Caffrey
Mike Caffrey, Tony Salazar, John Layne, Jan Friego
Classification/Compression/Recognition Algorithms for fixed point RCC Hardware
John Szymanski, James Theiler, Jeff Bloch, Kurt Moore, Chris Brislawn, Steven Brumby, Reid Porter, Simon Perkins
FORTE V-SENSOR
DII Rapid Feature Identification Using RCC Technology and Genetic Algorithms: Jeff Bloch, John Szymanski
DARPA Collaboration, RCC HW/SW Tool Evaluation: Kevin McCabe
3. Collaboration Phase 1
- Represent Challenge Problems
- Ultra-Wide Band RF (UWBRF) Signal Processing
- Multi-Dimensional Image Processing (MDIP)
- Provide specific challenge problem descriptions to ACS investigators
- MDIP: http://nis-www.lanl.gov/nis-projects/daps/
- UWBRF: http://www.lanl.gov/rcc/
- Seek out and collaborate with ACS investigators whose work matches our need
- ISI - DRP and RRP
- Northwestern - Matlab compiler
- Ptolemy - System level analysis tool
- Validate Hardware or Software Strategies
4. Collaboration Phase 2
- Technology Insertion
- MDIP Rapid Feature Identification Project (RFIP)
- Multi-dimensional image processing via algorithms derived in real time for rapid searching of archival information and high bandwidth sensor data streams
- Bellatrix UWBRF Signal Compression
- Wideband Signal Compressor for ID-1 Compatible Tape Recorders
- Airborne environment
- Capella UWBRF Accelerated Analysis Tool
- Acceleration of government algorithms to make real time analysis feasible
- Additional Potential Challenge area
- Plume Detection
- Airborne LIDAR based sensor to detect and analyze plumes
5. RFIP
Jeff Bloch, 505-665-2568, jbloch_at_lanl.gov
John Szymanski, szymanski_at_lanl.gov, 505-665-9371
James Theiler, jtheiler_at_lanl.gov, 505-665-5682
6. RFIP
- Objective
- Manipulate image processing steps carried out on RCC hardware to develop remote sensing algorithms for classifying and identifying features of interest to an image analyst.
- Provide software suite and hardware to work in host workstation(s)
- Search speeds comparable to archive retrieval times
- Platform and RCC hardware independence
- Scalable within a platform and via networked platforms
- Approach
- Rapid evolution of feature recognition procedure via
- Hardware accelerator containing tunable image processing operators
- Software engine that parallelizes a search and manipulates accelerated operators to maximize performance against truth data
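The approach above (a software engine tuning operator parameters to maximize performance against truth data) can be sketched in a few lines. This is only an illustration under stated assumptions, not the RFIP implementation: the two-band pixel values, the weighted-sum-and-threshold operator, the mutation-only search with elitism, and every name below are hypothetical.

```python
import random

# Hypothetical truth data: (band0, band1) pixel values and analyst label (1 = feature).
TRUTH = [((10, 80), 0), ((12, 75), 0), ((15, 90), 0),
         ((40, 20), 1), ((45, 25), 1), ((50, 15), 1)]

def operator(genome, pixel):
    """Assumed tunable operator: weighted band sum compared with a threshold."""
    w0, w1, t = genome
    return 1 if w0 * pixel[0] + w1 * pixel[1] > t else 0

def fitness(genome):
    """Fraction of truth pixels the operator labels correctly."""
    hits = sum(operator(genome, px) == label for px, label in TRUTH)
    return hits / len(TRUTH)

def mutate(genome, rng):
    """Perturb every parameter by a small random integer step."""
    return tuple(g + rng.randint(-3, 3) for g in genome)

def evolve(generations=30, pop_size=12, seed=0):
    """Mutation-only evolutionary search that keeps the better half each generation."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(-5, 5) for _ in range(3)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 2]  # survivors are carried over unchanged
        pop = elite + [mutate(rng.choice(elite), rng) for _ in elite]
    best = max(pop, key=fitness)
    return best, history

best, history = evolve()
print(best, fitness(best))
```

Because the better half of the population survives unchanged, the best fitness never decreases from one generation to the next. In the scheme described on the slide, the `operator` call is the part that would run on the RCC accelerator while the search loop stays on the host.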
7. RFIP
- Current Plan
- Develop non-real time demonstration (proof of concept) of an algorithm to evolve image classification procedures for identifying features of interest (fully funded by RFIP)
- Select image operators amenable to RCC acceleration
- Select algorithm framework and software
- Select simulation environment (IDL, Perl, etc.)
- Select test datasets
- Develop, write, and execute an all software demonstration
- Demonstrate ability of RCC to dramatically speed up image processing steps (RFIP - SLAAC partnership)
- Select demonstration RCC hardware (SLAAC-1 insertion opportunity)
- Develop tunable operator architecture in VHDL
- Select some image operators and code them in VHDL
- Develop software engine to drive a single RCC
- Benchmark accelerated operators against software operators
8. RFIP
- Future Plan (2 DII proposals being submitted this month)
- Fully develop RCC accelerated workstation
- Add to the selection of image operators amenable to RCC acceleration and code them in VHDL
- Refine algorithm framework and software
- Define and procure RCC computer suitable for analyst's workstation
- Broaden test datasets
- Benchmark against all software solution
- Demonstrate parallelizability and scalability of approach on multiple workstations
- Implement prior all software solution across multiple workstations
- Develop a parallel execution scheme
- Accelerated algorithm evolution against one truth data set
- Accelerated processing of search data
- Develop advanced user interface
- Benchmark against single workstation all software solution
9. RFIP
- RFIP Project Status
- Non-real time demonstration functional
- Ability to demonstrate a limited number of operators
- Further refinement of tunable operators approach ongoing
- Rapid evolution of a Water Finding Procedure demonstrated
- Benchmark of all software solution to be done
- Current RFIP funding ends 12/99
- RFIP SLAAC collaboration effort
- Demonstration of RCC
- Candidate image operators selected
- Tunable operator architecture concept under development
- Targeting of RCC hardware to be done
- Develop software engine to drive a single RCC: to be done
- Benchmark against software operators: to be done
- Limited demonstration to prove principle by 12/99
10. The Tunable Operator Architecture
[Figure labels: Multi-Spectral Image Channel Inputs; Fitness Output]
13. RFIP - SLAAC collaboration
- SLAAC technology
- Hardware
- SLAAC-1 RRP with Linux driver
- LANL has had a VXI based RCC in use for just the last few months, BUT!
- VXI form factor not suitable for RFIP workstations
- Have only a primitive board support package
- SLAAC-1 advantages
- PCI form factor with Linux driver matches RFIP workstations
- Runtime library is key to research into scalability proposed by RFIP team
- Still under discussion, but the Xilinx architecture in SLAAC-1 may have advantages over the Altera architecture for the tunable operator concept under development
14. RFIP - SLAAC collaboration
- SLAAC technology
- Software
- Runtime library is key to research into scalability proposed by RFIP team
- Long-term vision
- Clusters of workstations employing accelerated hardware to allow
- 1) Rapid development of new tools, and constant refinement of existing tools, for analysts mining large databases for timely information.
- 2) Greater acceleration by distributing inherently parallelizable processing
15. RFIP - SLAAC collaboration
- Goals
- Long Term Goals (beyond 12/99)
- Determine best architecture for MDIP class of problems
- Single RRP in a workstation
- Operators accelerated at least 10x
- Demonstrate scalability
- Multiple RRPs in a single workstation
- Multiple workstations with 1 RRP each
- 9 month Insertion Plan
- Map tunable operator architecture into SLAAC-1
- Target VHDL operators to Xilinx 40150
- Develop interface to software engine on host
16. Bellatrix
Scott Robinson, 505-665-1954, shr_at_lanl.gov
John Layne, 505-667-5137, jpl_at_lanl.gov
Mark Dunham, 505-667-0045, mdunham_at_lanl.gov
17. Bellatrix
- Objective
- To demonstrate the ability to continuously record wideband data for the COMBAT SENT program.
- Apply lossy compression while still preserving the signal characteristics required by the analyst.
- Demonstrate a novel algorithm for lossy compression of wideband signals so that 40 MHz @ 12 bits can be recorded at 50 MB/s, with an upgrade path to 70 MHz.
- Devise hardware solution for higher resolution and up to 200 MHz bandwidth under light signal conditions.
- Provide a scalable platform that can be used for R&D on new WB processing tasks after delivery of the compression system, in anticipation of NextGen architecture.
18. Bellatrix
- Approach
- Develop three signal processing algorithms for RCC acceleration
- Sub-Band Coding Compression
- Homomorphic Compression
- Burst Digitization Compression
- Initially target a 100 Msps, 12-bit channel recorded onto an ID-1 tape.
- Apply lossy compression techniques to convert 150 Mbytes/second of incoming data to 50 Mbytes/second outgoing to tape.
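The 150 Mbytes/second figure follows directly from the digitizer parameters, and the tape rate fixes the compression ratio the algorithms must reach. A short worked check, assuming tightly packed 12-bit samples with no framing overhead and 1 Mbyte = 10^6 bytes (both are assumptions, and the function name is illustrative):

```python
def raw_rate_bytes_per_s(sample_rate_hz, bits_per_sample):
    """Raw data rate for tightly packed samples (assumes no framing overhead)."""
    return sample_rate_hz * bits_per_sample / 8

raw = raw_rate_bytes_per_s(100e6, 12)   # 100 Msps digitizer, 12 bits per sample
tape = 50e6                             # 50 Mbytes/second to the ID-1 recorder
ratio = raw / tape                      # compression ratio the algorithms must reach
bits_out = 12 / ratio                   # average bits per input sample left after compression

print(raw / 1e6, "MB/s in,", ratio, ": 1 compression,", bits_out, "bits/sample out")
```

Under these assumptions the compressor has to sustain 3:1, which is an average budget of 4 bits per input sample.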
19. BELLATRIX 1.0 WB Compression: Sub-Band Coding (V. 2)
[Block diagram labels: VXI Chassis; 40 MHz analog BW; 100 Msps, 12 bit Digitizer Mezzanine (under development); FPDP; ID-1; Ethernet 10baseT Control; Sony DIR-1000H Tape]
20. BELLATRIX 1.0 WB Compression: Sub-Band Coding with SLAAC-1
[Block diagram labels: VXI Chassis; 40 MHz analog BW; 100 Msps, 12 bit Digitizer Mezzanine (under development); FPDP; I/O 1; I/O 2; SLAAC-1 FPGA Computer; ID-1; Ethernet 10baseT Control; Sony DIR-1000H Tape]
21. BELLATRIX 1.0 WB Compression: Homomorphic or Burst Digitization (V. 2)
22. Bellatrix
- Plan and Status
- Software models of lossy compression techniques have been developed.
- Accomplishments: demonstration of experimental algorithms on Blackbeard and FORTE signals; analysis of rate-distortion characteristics and effects of data quantization on exploitability
- Validate lossy compression models against actual data (in process).
- Develop a 12-bit, 100 Msps A/D converter input card for the RCA-2 using the new Analog Devices AD9432.
- Implement Sub-Band Coding compression technique on RCC for initial flight demonstration, Sept. 1999.
- Follow-on demos dependent on success of initial demo
- Eventually add demodulation, cross-correlation delay estimation, parameterization, set-on, and SNOI removal.
23. Bellatrix - SLAAC collaboration
- Goals
- Evaluate suitability of RRP architecture for UWB problem
- High rate systolic streaming data
- Collaborate with developers of NextGen architecture
24. Wideband Compression via Burst Digitization
[Figure labels: Input Signal (segments Si, Sj, Sk, Sl over time t); Adaptive Thresholds; Activity Rules; SAVE?; Compressed Output (i, j, k, l)]
25. Homomorphic Compression Algorithm
[Figure labels: N/2 Discrete Components; Positive Frequencies only (Analytic Signal Format); frequency axis 0 to fnyq; Baseline Threshold T0; Dm]
26. Joint Time Frequency Compression of Wideband RF Signals
- Developers: Chris Brislawn (CIC-3), Shane Crockett (student, USNA).
- Example spectrogram of FORTE data (L) after 4:1 compression (R).
27. Lossy Compression of Wideband RF
Time Based Compression (Burst Digitization)
- Well suited for pulse-like signals with low duty factor
- Performance depends on detection of pulse presence
- Weak SNR cases need sophisticated triggering
Frequency Domain Compression (Homomorphic Thresholding)
- Well suited to long duration signals and complex signal mixtures
- Simple versions of algorithm can provide 5X compression with high fidelity
- Higher compression ratios tend to round fast rise/fall times on pulses
Compression Through Sub-Band Coding
- Joint localization of signal in time and frequency
- Adaptive bit allocation and scalar quantizer design
- Compression generally removes signal noise
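To make the time based option concrete, here is a minimal sketch of burst digitization on a low duty factor signal. It is not the Bellatrix algorithm: a single fixed amplitude threshold with a guard interval stands in for the adaptive thresholds and activity rules named in the slides, and the function names and test signal are made up for illustration.

```python
def burst_digitize(samples, threshold, guard=2):
    """Keep only runs of samples within `guard` samples of a threshold crossing.

    Returns a list of (start_index, samples_in_burst) tuples.
    """
    n = len(samples)
    keep = [False] * n
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            for j in range(max(0, i - guard), min(n, i + guard + 1)):
                keep[j] = True
    bursts = []
    i = 0
    while i < n:
        if keep[i]:
            start = i
            while i < n and keep[i]:
                i += 1
            bursts.append((start, samples[start:i]))
        else:
            i += 1
    return bursts

def reconstruct(bursts, length):
    """Rebuild a full-length record, filling discarded samples with zeros."""
    out = [0] * length
    for start, chunk in bursts:
        out[start:start + len(chunk)] = chunk
    return out

# Hypothetical record: 100 samples of +/-1 background with two short pulses.
signal = [1, -1] * 50
signal[20:24] = [50, 60, 55, 40]
signal[70:73] = [45, 52, 48]

bursts = burst_digitize(signal, threshold=30, guard=2)
kept = sum(len(chunk) for _, chunk in bursts)
print(len(bursts), "bursts,", kept, "of", len(signal), "samples kept")
```

The sketch also shows why performance depends on detecting pulse presence: any pulse that never crosses the threshold is discarded entirely, which is the weak SNR case the slide flags as needing more sophisticated triggering.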
28. Capella-2
Scott Robinson, 505-665-1954, shr_at_lanl.gov
Robert Dingler, 505-665-3483, rdingler_at_lanl.gov
Steve White, 505-667-4623, swhite_at_lanl.gov
Tony Salazar, 505-667-2508, aasalazar_at_lanl.gov
Mark Dunham, 505-667-0045, mdunham_at_lanl.gov
29. Capella-2
- Objective
- Provide 1000 lines/sec minimum, 10,000 lines/sec goal, of Government Spectrum A Raster displays for quick look data searches.
- Implement selected routines for 40 MHz real time analysis.
- Allow key concept demonstrations of a Modular Coherent UWB Processor, including SNOI removal, set-on, demod, and cross-correlation.
- Demonstrate that pre-D processing can yield superior Pd, de-interleave, and metrics in real time, with respect to PDW methods.
30. Dataflow Block Diagram
[Block diagram labels: FFT; Time-Frequency Filters; IFFT; ALPHA 4100; RCC Pulse Parameterizer; RCC Synchronous Video Integration]
31. Capella-2 Software Environment
[Block diagram labels: C MFC Routines; National Inst. Pentium PC Running NT 4.0; DLL HW Library Routines; MXI Control SW (x2); Dec Alpha 4100 4-Processor Host; PCI Bus 1, Ethernet, FPDP Out; PCI Bus 2, FPDP In, Calculex; 60 MB/s RAID; SW Model; CRI FFT Board; RCC Boards; Daughter Cards; VXI Crate Black Box; Tape/GigaFlash System]
Control and Status Control: text commands sent to socket via TCP/IP
Black Box Software Functions: Initialize, Load Flex File, Set Registers, Start Processing, Stop Processing, Report Status
32. Basic Workstation Accelerator System
[Block diagram labels: DEC Alpha 4100 with four Alpha CPUs; PCI A; PCI B; SVGA, Waterfall Video; Ether (x2), 10bT, To External Network; FPDP (x2); Alta (x2); U-SCSI (x2); VXI Crate; ID-1]
33. Review
34. Highest Priority MDIP Algorithm Needs
- Spectral and Spatial Classification in real time
- Spectral matched filter algorithms
- K-Means style classification algorithm
- Plume detection
- Rare signal or signature detection
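As a point of reference for the K-Means style item above, here is a minimal sketch of K-means clustering of pixel spectra. It is plain software with illustrative names and made-up three-band pixel values, not an RCC implementation; the initial centers are simply taken from two of the pixels.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(pixels, centers, iterations=10):
    """Alternate nearest-center assignment and center update for a fixed number of passes."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each pixel goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
                  for p in pixels]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical three-band pixel spectra forming two well separated groups.
pixels = [
    (10, 12, 80), (11, 13, 78), (9, 11, 82),
    (60, 55, 20), (62, 57, 22), (58, 54, 19),
]
labels, centers = kmeans(pixels, centers=[pixels[0], pixels[3]])
print(labels)
```

The assignment step is independent for every pixel and, for integer pixel data, needs only integer subtract, multiply, and accumulate, which is presumably why this style of classifier is a candidate for fixed point hardware; the center update is the part more naturally left to the host.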
35. Highest Priority UWB Algorithm Needs
- Find a coding algorithm/process to compress information bandwidth through an FFT
- Decompose a non-linear RF chirp into an efficient wavelet or multi-resolution expansion
- Apply image processing/recognition algorithms to streaming time-frequency images to find objects of interest.
- Identify fast methods of classification and correlation suitable for FPGA implementation
36. Myrinet-2560/SAN Compatible I/O Interface Development Status
- March 3, 1999
- Douglas E. Patrick
- NIS-4 Space Instrumentation and Systems Engineering
- Mail Stop D448
- Los Alamos National Laboratory
- (505)-665-1203
- patrick_at_lanl.gov
37. A Few Current Myrinet/SAN Interface Design Efforts by others
- Lockheed/Sanders: LANAI processor based Common Node Adapter (CNA)
- Lockheed/Martin Astronautics: FPGA driven I/O design using FI32 SAN/FIFO interface Version 1.3 (currently not using any Packet or Header info)
- Air Force Research Laboratory: FPGA driven I/O similar to LMCO but uses AFRL Packets
38. Myrinet Interface Design Goals
- Leverage off of LMCO and AFRL Design
- Maintain as much Myrinet-2560/SAN Compatibility as possible (within reason)
- Maintain protocol and packet compatibility with those that we will be interfacing with (AFRL, LMCO, etc.)
- At a minimum, be able to easily reconfigure (via FPGA reprogramming) for mission specific protocol(s)/packet(s).
40. SLAAC-3
41. Desirable architectural elements
- Overall Architecture
- Distinct from current COTS RCCs
- Careful trade study of PE-PE connectivity vs. PE-Memory
- Bus widths
- Data Broadcast between one input and all PEs
- Independent addressability
- Anticipate direction of reconfigurability features
42. Desirable architectural elements
- Input/Output
- High Speed IO of flexible type
- Mezzanine card with standard interface to RCC
- Directly connected to PEs
- Capable of wide 64 bit data
- Ability to split and combine data streams for parallelizability and scalability
- 3 IO Ports, 2 in and 1 out or vice versa
- 1 IO connected to all PEs
- IO dataflow decoupled from PE by FIFOs
43. Desirable architectural elements
- Memory
- Multiple parallel memory banks local to each PE
- Independently addressable
- Parallel access
- 18 bit data
- Ability to split and merge data streams for parallelizability and scalability
- 3 IO Ports, 2 in and 1 out or vice versa
- 1 IO connected to all PEs
- IO dataflow decoupled from PE by FIFOs
44. Desirable architectural elements
- Memory
- Shared memory banks between adjacent PEs
- Connected via crossbar switches
- Ability to split and merge data streams for parallelizability and scalability
- 3 IO Ports, 2 in and 1 out or vice versa
- 1 IO connected to all PEs
- IO dataflow decoupled from PE by FIFOs
- Datapaths
- Broadcast bus between one input and all PEs
45. Desirable architectural elements
- 6U VME64 Board (3.3V included in this standard)
- Mezzanine cards for flexibility
- Simple, fast interface from PE to back-plane
- Simple VME Interface Controller
- Configuration Manager and local configuration memory
- 2.5V or 1.8V: need these for future FPGAs
- Independent clock with skew control