Provided by: Shrutisaga
Learn more at: http://web.cecs.pdx.edu
1
High Speed Energy Efficient Architecture for
Finite Ridgelet Transform
Shrutisagar Chandrasekaran and Abbes Amira
Overview, September 2004
2
Outline
  • Research Objectives
  • Introduction
  • Discrete Ridgelet Transform
  • Finite Radon Transform
  • Discrete Wavelet Transform
  • FRIT Architecture
  • FPGA Implementations and Results
  • Conclusions
  • Future Work and Acknowledgements

3
Research Objectives
  • To evaluate and model the power consumption of
    FPGA-based designs at various levels of
    abstraction, and to develop and implement
    strategies for low-power, energy-efficient design
  • To develop a high-level framework for FPGA-based
    implementation of matrix algorithms such as the
    Ridgelet transform, matrix multiplication, SVD,
    DCT and DWT, used in image and signal processing
    applications
  • To efficiently implement the Finite Ridgelet
    Transform (FRIT) on FPGA using Handel-C, for
    satellite-based onboard image compression, within
    the ongoing framework development

4
Research Objectives
[Framework diagram; recoverable labels listed below]
Roles: Application User, System Architect
  • Estimating performance measures
    (power, area, maximum frequency, etc.)
  • Capturing platform features at a higher level

Algorithm blocks: MM, DCT (1D, 2D), FFT (1D, 2D), DWT (1D, 2D),
FRAT (1D, 2D), FRIT (1D, 2D), SVD, QR
Other blocks: RLC, CSC, DWT
Design formats: VHDL, Handel-C, Schematic, Hybrid, EDIF, Bitstream
Flow stages: Compilation, FPGA Configuration, Implementation,
Reconfiguration
5
Introduction
  • Discrete Wavelet Transforms (DWT) have become
    powerful tools in a wide range of applications,
    including:
  • Image/Video Compression (JPEG2000, MPEG-4)
  • Aerospace applications (Data denoising,
    Satellite/Astronomical image compression,
    analysis)
  • Image/Video Enhancement, Segmentation
  • Telecommunication
  • The advantage of DWT over existing transforms
    such as Discrete Fourier Transform (DFT) and
    Discrete Cosine Transform (DCT) is that the DWT
    performs a multiresolution analysis of a signal
    with localization in both time and frequency

6
Introduction
  • The wavelet transform has many limitations when
    it comes to representing straight lines and edges
    in images
  • To overcome the weakness of wavelets in higher
    dimensions, Candès recently proposed the Ridgelet
    transform, which deals effectively with line
    singularities in 2-D
  • However, the complexity of its implementation
    remains a heavy burden on standard
    microprocessors when large amounts of data have
    to be processed
  • Therefore, VLSI/FPGA implementations of the
    Ridgelet transform are needed for real-time
    applications

7
The Finite Ridgelet Transform
  • The FRIT provides a sparse representation for
    functions defined on the continuum plane
  • The transform allows representing edges and
    other singularities along curves in a more
    efficient way
  • The basic idea is to map a line singularity in
    the two-dimensional (2-D) domain into a point by
    means of the Radon transform.
  • Then, a one-dimensional (1-D) wavelet transform
    is applied to deal with the point singularity in
    the Radon domain

8
The Finite Ridgelet Transform
9
The Finite Ridgelet Transform
  • The two fundamental building blocks of the FRIT
    are the FRAT and the DWT
  • The FRAT pseudocode is mapped onto hardware
    after performing energy and speed optimisations,
    including parallelism and pipelining
  • Experimental results in Matlab have shown that
    simple lower-order wavelets yield better
    compression (lower entropy) when transforming
    from the FRAT domain to the FRIT domain
  • The Haar wavelet gives better results than the
    CDF2.2 and other higher-order wavelets in terms
    of minimising the entropy in the Ridgelet domain

10
The Finite Ridgelet Transform
  • The Radon transform is able to transform
    two-dimensional images with lines into a domain
    of possible line parameters, where each line in
    the image gives a peak positioned at the
    corresponding line parameters
  • Numerous discretisations of the Radon transform
    have been devised to approximate the continuous
    formulae
  • However, most of them were not designed to be
    invertible transforms for digital images
  • Alternatively, the Finite Radon Transform (FRAT)
    theory (which means transform for finite-length
    signals) originated ...

11
The Finite Radon Transform
  • The FRAT is defined as summations of image pixels
    over a certain set of lines
  • L_{k,l} denotes the set of points that make up a
    line on the lattice Z_p^2
  • To compute the kth Radon projection, i.e., the kth
    row of the output array, all pixels of the
    original image are passed once, using p
    histogrammers, one for every element in the row
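
The formula for the line sets is not reproduced in this transcript. Read off the FRAT pseudocode on the following slide, the sets would be as written below. This is an inference from that pseudocode (which applies no normalisation factor), not the authors' stated definition.

```latex
L_{k,l} = \{(i,j) : i \equiv kj + l \pmod{p},\ j \in Z_p\}, \qquad 0 \le k < p \\
L_{p,l} = \{(i,l) : i \in Z_p\} \\
\mathrm{FRAT}_f(k,l) = \sum_{(i,j) \in L_{k,l}} f(i,j)
```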

12
The Finite Radon Transform
  • The FRAT is defined as summations of image pixels
    over a certain set of lines.

FRAT Pseudocode

    for k = 0:(p-1)
        n = k
        for j = 0:(p-1)
            n = n - k
            if n < 0
                n = n + p
            end
            l = n - 1
            for i = 0:(p-1)
                l = l + 1
                if l >= p
                    l = l - p
                end
                FRAT(k,l) = FRAT(k,l) + f(i,j)
            end
        end
    end
    for j = 0:(p-1)
        for i = 0:(p-1)
            FRAT(p,j) = FRAT(p,j) + f(i,j)
        end
    end
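
For readers who want to run the pseudocode, here is a minimal Python sketch of it. The function name `frat` and the list-of-rows layout `f[i][j]` are assumptions made for illustration, and the wrap test is taken to be `l >= p`, which is what keeps the index inside the row.

```python
def frat(f, p):
    """Finite Radon transform of a p x p block f, indexed as f[i][j].

    Follows the slide pseudocode loop for loop; no normalisation is applied.
    Returns p + 1 projections, each of length p.
    """
    out = [[0] * p for _ in range(p + 1)]
    for k in range(p):
        n = k
        for j in range(p):
            n -= k                 # n tracks (-k * j) mod p
            if n < 0:
                n += p
            l = n - 1
            for i in range(p):
                l += 1             # l tracks (i - k * j) mod p
                if l >= p:
                    l -= p
                out[k][l] += f[i][j]
    for j in range(p):             # final projection: sums over i for each j
        for i in range(p):
            out[p][j] += f[i][j]
    return out


block = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
projections = frat(block, 3)
```

Every projection sums to the same total as the input block, which gives a quick sanity check on any implementation.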
13
Discrete Wavelet Transform
  • The work by Daubechies and Mallat led to the
    discrete filter based interpretation of wavelets
  • Wavelets can be implemented as a set of filter
    banks comprising a high-pass and a low-pass
    filter, each followed by down-sampling by two
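
The filter-bank view can be sketched for the Haar wavelet, where both filters have two taps. This is an illustrative sketch only: the function name `haar_analysis`, the orthonormal scaling and the sign of the high-pass branch are assumptions, and an even-length input is assumed.

```python
import math


def haar_analysis(x):
    """One level of a two-channel Haar analysis filter bank.

    Each branch applies a 2-tap filter and keeps every second output sample,
    which is the down-sampling by two. Assumes len(x) is even.
    """
    s = 1 / math.sqrt(2)
    low = [s * (x[n] + x[n + 1]) for n in range(0, len(x), 2)]    # low-pass
    high = [s * (x[n] - x[n + 1]) for n in range(0, len(x), 2)]   # high-pass
    return low, high


low, high = haar_analysis([4, 4, 2, 0])
```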

14
Discrete Wavelet Transform
  • Though it is the simplest wavelet, the Haar DWT
    gives the best performance in terms of entropy
    reduction
  • An integer-to-integer lifting version of the Haar
    DWT is used to ensure that it is fully invertible
  • The transform is performed in place to reduce the
    number and size of on-chip buffers
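
A common integer-to-integer Haar lifting scheme (often called the S-transform) is sketched below to show why such a transform is exactly invertible and can run in place. Whether the authors use this exact variant is an assumption; the function names are hypothetical, and a trailing unpaired sample is simply left unchanged.

```python
def haar_lift_forward(x):
    """In-place integer Haar lifting on a list of ints.

    After the call, even slots hold approximations and odd slots hold details.
    """
    for n in range(0, len(x) - 1, 2):
        x[n + 1] -= x[n]           # predict: detail = odd - even
        x[n] += x[n + 1] >> 1      # update: approx = even + floor(detail / 2)
    return x


def haar_lift_inverse(x):
    """Undo haar_lift_forward by reversing the two lifting steps."""
    for n in range(0, len(x) - 1, 2):
        x[n] -= x[n + 1] >> 1
        x[n + 1] += x[n]
    return x


data = [5, 2, 7, 7, 0, 255]
coeffs = haar_lift_forward(list(data))
restored = haar_lift_inverse(list(coeffs))
```

Because each step only adds or subtracts a value computed from the other slot, the inverse recovers the input exactly, and no buffer beyond the input array is needed.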

15
FRIT Architecture
  • Once the Radon and wavelet transforms have been
    implemented, the Ridgelet transform is
    straightforward
  • Each output of the Radon projection, i.e., each
    row of the Radon-transformed image, is simply
    passed through the wavelet transform
  • A dual output buffer configuration is used so that
    the FRAT and the DWT can be performed
    simultaneously on the chip
  • The in-place lifting DWT is performed in the
    second output buffer, which contains the FRAT
    vectors

16
FRIT Architecture
  • One input pixel is processed on each clock cycle
  • No clock edges are wasted in buffering the input
    tile
  • Fully pipelined input section
  • The controller has (p+1) counters which generate
    the address and read/write status of the output
    vectors
  • Double-buffered output section to perform the DWT
    in parallel
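
One way to picture the p+1 address counters is the software simulation below, in which each loop iteration stands in for one clock cycle. It is a sketch under stated assumptions: the scan order (i varying fastest), the counter update rule and the names `frat_streaming`, `start` and `addr` are illustrative and are not taken from the authors' controller.

```python
def frat_streaming(pixels, p):
    """Accumulate the FRAT of a p x p tile from a pixel stream.

    pixels yields f(i, j) with i varying fastest (assumed scan order).
    Returns the (p + 1) x p projections and the number of cycles used.
    """
    out = [[0] * p for _ in range(p + 1)]
    start = [0] * p              # start address per direction for the current j
    addr = [0] * (p + 1)         # the p + 1 address counters
    cycles = 0
    stream = iter(pixels)
    for j in range(p):
        addr[:p] = start         # reload counters at the start of each column
        addr[p] = j              # last counter addresses the final projection
        for _ in range(p):
            pixel = next(stream)
            for k in range(p + 1):       # would run in parallel in hardware
                out[k][addr[k]] += pixel
            for k in range(p):           # modulo-p increment of each counter
                addr[k] = addr[k] + 1 if addr[k] + 1 < p else 0
            cycles += 1
        for k in range(p):               # next column starts k addresses earlier
            start[k] = start[k] - k if start[k] >= k else start[k] - k + p
    return out, cycles


tile = [1, 4, 7, 2, 5, 8, 3, 6, 9]   # f[i][j] = 3*i + j + 1, streamed column by column
result, cycles = frat_streaming(tile, 3)
```

Under these assumptions the tile is consumed in p*p iterations, with no division or multiplication needed to form the addresses.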

17
FRIT Architecture
  • p+1 FRAT vectors are decomposed in parallel, where
    p is the block size
  • A lifting architecture is used to perform the 1D
    Haar wavelet transform
  • In-place decomposition is performed to reduce the
    internal buffer size

Core Latency   Total Latency including memory access   Input Buffer Size   Output Buffer Size   Can be Pipelined
p^2            p^2 + p + 1                             -                   2 x p^2              Yes
18
FRIT Architecture
19
FPGA Implementations and Results
  • In order to verify the performance of the
    proposed architectures, designs have been
    prototyped on the Celoxica RC1000 board
    containing the Xilinx XCV2000E FPGA
  • Available on-chip logic resources include:
    slices 19,200; CLB array 80 x 120; block RAM
    655,360 bits; distributed RAM 614,400 bits
  • The RC1000 has 4 memory banks which communicate
    with the host by means of DMA transfers

20
FPGA Implementations and Results
  • The design has also been synthesised on the
    Radiation Hardened QPro Virtex-II FPGA, as it is
    the preferred Xilinx FPGA for deployment onboard
    satellites
  • Industry-first radiation-hardened platform FPGA
    solution
  • Guaranteed total ionizing dose to 200 krad(Si)
    and latch-up immune to LET > 160 MeV-cm^2/mg; SEU
    upsets < 1.5E-6 per device-day achievable with
    recommended redundancy implementation
  • Certified to MIL-PRF-38535 standard
  • Guaranteed over the full military temperature
    range (-55 °C to +125 °C)

21
FPGA Implementations and Results
Design Flow
22
FPGA Implementations and Results
  • Handel-C adds constructs to ANSI-C to enable DK
    to directly implement hardware
  • Fully synthesizable HW programming language based
    on ANSI-C
  • Implements a C algorithm directly as an optimized
    FPGA design, or outputs RTL from C

Handel-C additions for hardware: Parallelism, Timing, Interfaces,
Clocks, Macro pre-processor, RAM/ROM, Shared expression,
Communications, Handel-C libraries, FP library, Bit manipulation
Majority of ANSI-C constructs supported by DK: Control statements
(if, switch, case, etc.), Integer arithmetic, Functions, Pointers,
Basic types (structures, arrays, etc.), #define, #include
Software-only ANSI-C constructs: Recursion, Side effects, Standard
libraries, malloc
23
FPGA Implementations and Results
FRAT Implementation
  • An empirical study has shown that the choice of a
    block size p = 7 gives the best balance of power
    and performance for the FRAT

24
FPGA Implementations and Results
  • Comparison of performance metrics of the FRAT
    sub-block with existing work

Architecture                 Maximum Throughput (FPS)   Maximum Energy Per Frame (mJ)   Maximum Frequency
Architecture 2 [1]           225                        1.84                            81.92 MHz
Proposed Architecture        317                        1.38                            96.18 MHz
Software Implementation *    0.037                      -                               -

[1] C.A. Rahman and W. Badawy, "Architectures for the Finite Radon
Transform", IEE Electronics Letters, Vol. 40, No. 15, July 2004
* Implemented using Matlab on a 1.8 GHz Pentium 4 workstation
equipped with 1 GB DDR RAM
25
FPGA Implementations and Results
  • Various performance metrics of the FRIT
    implemented on the Virtex-E and the QPro
    Virtex-II FPGAs

Performance Metrics      Virtex-E   QPro Virtex-II
Area Occupied (slices)   504        497
Max Frequency (MHz)      43.57      56.67
Max Power (mW)           301.65     196.91
Energy/Frame (mJ)        1.934      1.40
Max Throughput (FPS)     156        202
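
The energy-per-frame row can be related to the power and throughput rows with a one-line calculation. The relation used here (energy per frame = maximum power divided by maximum throughput) is an assumption about how the figure was derived; the function name is hypothetical.

```python
def energy_per_frame_mj(max_power_mw, max_throughput_fps):
    """Energy per frame in mJ, assuming energy = power / frame rate."""
    # mW divided by frames per second gives mJ per frame.
    return max_power_mw / max_throughput_fps


# Virtex-E figures from the table.
virtex_e = round(energy_per_frame_mj(301.65, 156), 3)
print(virtex_e)  # prints 1.934
```

This reproduces the Virtex-E entry. The same relation gives about 0.975 mJ for the QPro Virtex-II figures, which differs from the 1.40 mJ listed, so that entry may have been obtained under different operating conditions.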
26
FPGA Implementations and Results
  • FRIT achieves the best results in terms of
    reducing the entropy of the image
  • This means that better compression can be
    achieved

Entropy   Source 1   Source 2   Source 3
Source    4.3430     4.9129     3.3197
FRAT      3.6730     4.2748     2.1200

Entropy   Source 1            Source 2            Source 3
          Haar     cdf2.2     Haar     cdf2.2     Haar     cdf2.2
DWT       3.5472   3.6836     3.8725   3.9136     2.7530   3.0387
FRIT      3.1115   3.3753     3.4554   3.6639     2.0525   2.4051
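
The slides do not say how the entropy figures were computed. A minimal sketch of the usual first-order Shannon entropy (bits per symbol, over the histogram of values) is given below as one plausible measure; the function name `entropy_bits` is hypothetical.

```python
import math
from collections import Counter


def entropy_bits(values):
    """First-order Shannon entropy, in bits per symbol, of a flat sequence."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


flat = entropy_bits([0, 1, 2, 3])       # four equally likely symbols
skewed = entropy_bits([0, 0, 0, 1])     # one dominant symbol
```

A transform that concentrates values into fewer distinct levels lowers this measure, which is the sense in which lower entropy indicates better compressibility.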
27
FPGA Implementations and Results
  • Source 1 FRIT Domain

28
FPGA Implementations and Results
  • Source 2 FRIT Domain

29
FPGA Implementations and Results
  • Source 3 FRIT Domain

30
Conclusions
  • The Ridgelet transform was recently introduced to
    overcome the weakness of wavelet transforms
  • An architecture for the Finite Ridgelet transform
    and its efficient FPGA implementation have been
    proposed
  • The implementations have been carried out for
    different input image sources
  • The implementation results show that the proposed
    implementation outperforms existing work in terms
    of both area and system speed

31
Future work and Acknowledgments
  • Develop a complete on-chip compression engine for
    satellite images
  • Explore the effect of algorithmic, architectural
    and RTL-level optimisations to minimise power
    consumption

Acknowledgments
Celoxica (Mr. Roger Gook) and EPSRC for
supporting this work