Design and Implementation of FPGA-based systolic array for LZ Data Compression

1
Design and Implementation of FPGA-based systolic
array for LZ Data Compression
By Mohamed Ahmed Abd El Ghany Ahmed
2006
2
Overview
  • Introduction to Data Compression
  • Data Compression Methods
  • Systolic Array Operation in LZ
  • Proposed design (Design-P)
  • FPGA Implementation
  • Testing Application
  • Software simulation
  • Conclusions

3
Introduction to Data Compression
  • Data compression is the process of converting an
    input data stream into another data stream with a
    reduced size.
  • Benefits of data compression:
      • Reduction of data storage requirements
      • Reduction of data transfer cost

4
Data Compression Methods
  • Lossy Data Compression: the decompressed data are some
    approximation of the original data
      • Transform coding schemes
      • Vector Quantization schemes
      • Sub-band coding schemes
  • Lossless Data Compression: the decompressed data must always be
    identical to the original data
      • Run-Length Encoding
      • Statistical Methods
      • Dictionary Methods
5
Lempel Ziv Algorithms
  • LZ77 family: LZSS, LZH
  • LZ78 family: LZW, LZMW
6
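LZSS Idea
  • The window consists of the dictionary followed by the lookahead
    buffer.
  • Before encoding: dictionary c b b a c d e, lookahead buffer
    b b a d e, next input a a . . . . (a b already shifted out)
  • The lookahead prefix b b a matches the dictionary starting at
    position 2, with length 3.
  • Output codeword (1, Ip, Lmax) = (1, 2, 3), where Ip is the match
    position and Lmax is the match length
  • After shifting by Lmax (= 3): dictionary a c d e b b a, lookahead
    buffer d e a a e, next input f g . . . . (b b already shifted out)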
7
Codeword length Lc
  • Lc = ⌈log2(dictionary length)⌉ + ⌈log2(lookahead buffer length)⌉
    + 1 bits
  • In the example, Lc = ⌈log2(7)⌉ + ⌈log2(5)⌉ + 1 = 3 + 3 + 1 = 7 bits
  • The matched string b b a (3 bytes = 24 bits) is compressed to the
    codeword (1, 2, 3) (7 bits)
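The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch, assuming each logarithm is rounded up to a whole number of bits (which is what makes the example come out to 7); the function names `ceil_log2` and `codeword_bits` are illustrative and not from the presentation.

```python
def ceil_log2(x: int) -> int:
    """Smallest k with 2**k >= x, computed exactly on integers."""
    return (x - 1).bit_length()


def codeword_bits(dictionary_len: int, lookahead_len: int) -> int:
    """Pointer bits + length bits + 1 flag bit (assumed rounding up)."""
    return ceil_log2(dictionary_len) + ceil_log2(lookahead_len) + 1


lc = codeword_bits(7, 5)   # slide example: dictionary of 7, lookahead of 5
raw_bits = 3 * 8           # matched string "b b a" stored as 3 bytes
print(lc, raw_bits)        # 7 24
```

Using integer `bit_length` instead of floating-point `log2` avoids rounding surprises when a length is an exact power of two.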
8
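Non-Match Case
  • Before encoding: dictionary c b b a c d e, lookahead buffer
    f a c d e, next input a a . . . . (a b already shifted out)
  • Output codeword (0, S) = (0, f), where S is the first symbol of
    the lookahead buffer
  • After shifting by 1: dictionary b b a c d e f, lookahead buffer
    a c d e a, next input a . . . . (b c already shifted out)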
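The match and non-match cases can be sketched as a single software encoding step. This is an illustrative sequential model, not the hardware described in the presentation. It assumes that Ip is a 1-based position in the dictionary, that a match stays inside the dictionary, and that any match of at least `min_match` symbols is encoded; `min_match` is a hypothetical parameter that the slides do not specify.

```python
def longest_match(dictionary, lookahead):
    """Return (ip, lmax): 1-based start of the longest match and its length."""
    best_ip, best_len = 0, 0
    for start in range(len(dictionary)):
        length = 0
        while (length < len(lookahead)
               and start + length < len(dictionary)
               and dictionary[start + length] == lookahead[length]):
            length += 1
        if length > best_len:          # keep the earliest longest match
            best_ip, best_len = start + 1, length
    return best_ip, best_len


def lzss_step(dictionary, lookahead, min_match=1):
    """Return (codeword, shift) for one encoding step."""
    ip, lmax = longest_match(dictionary, lookahead)
    if lmax >= min_match:
        return (1, ip, lmax), lmax     # match: shift the window by Lmax
    return (0, lookahead[0]), 1        # non-match: emit the symbol, shift by 1


print(lzss_step("cbbacde", "bbade"))   # ((1, 2, 3), 3)
print(lzss_step("cbbacde", "facde"))   # ((0, 'f'), 1)
```

The two calls reproduce the codewords and shift amounts of the two slide examples.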
9
Systolic Array Operation in LZ
  • Window of n symbols: X0 X1 X2 X3 X4 X5 X6 X7 X8
  • Dictionary (length n-Ls): X0 X1 X2 X3 X4 X5
  • Lookahead buffer (length Ls): X6 X7 X8, labelled Y0 Y1 Y2
10
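Interleaved Design (Design-i)
[Figure: three processing elements PE0, PE1, PE2 holding Y0, Y1, Y2,
connected through delay elements D. The input sequence
X8 X7 X6 X5 X4 X3 X2 X1 X0 is fed to the array as the interleaved
stream X7 X4 X6 X3 X5 X2 X4 X1 X3 X0; the array output is Li.]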
11
The Match Results Block
12
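Proposed Design (Design-P)
[Figure: three processing elements PE0, PE1, PE2 holding Y0, Y1, Y2,
connected through delay elements D and fed with the stream
X7 ... X2 X1 X0. The PE outputs E0, E1, E2 go to the L-encoder, which
produces Li.]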
13
Design-P PE
Design-i PE
14
L-Encoder
[Figure: L-encoder circuit with signals E0, E1, E2 and Li0, Li1.]
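The roles of the PEs, the E signals and the L-encoder can be illustrated with a behavioural model. This is only a sketch under stated assumptions, not the presented circuit: it assumes that PE j holds Yj and raises Ej when Yj equals the dictionary symbol currently aligned with it, that the L-encoder reports the number of leading matches as Li, and that the match results block keeps the largest Li and its position. The names `l_encoder` and `match_window` are hypothetical.

```python
def l_encoder(e_flags):
    """Count leading 1s in E0, E1, ... (assumed behaviour of the L-encoder)."""
    length = 0
    for e in e_flags:
        if not e:
            break
        length += 1
    return length


def match_window(window, ls):
    """Return (Ip, Lmax) for a window whose last `ls` symbols are the lookahead."""
    n = len(window)
    y = window[n - ls:]                  # Y0 .. Y(Ls-1), one per PE
    best_ip, best_len = 0, 0
    for start in range(n - ls):          # one dictionary position per step
        # PE j compares its stored Yj with X(start + j)
        e_flags = [window[start + j] == y[j] for j in range(ls)]
        li = l_encoder(e_flags)
        if li > best_len:                # keep the maximum length seen so far
            best_ip, best_len = start + 1, li
    return best_ip, best_len


print(match_window(list("cbbacde" + "bbade"), 5))   # (2, 3)
```

In this model the comparisons touch only X0 to X(n-2), which for n = 9 agrees with the X7 ... X0 stream shown for Design-P. The model is sequential and ignores the pipelining through the delay elements.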
15
MRB of Design-P
MRB of Design-2i
16
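Parallel Compression
[Figure: two systolic arrays working in parallel on the input symbols
X0 ... X8. Each array has processing elements PE0, PE1, PE2 holding
Y0, Y1, Y2, delay elements D, match signals E0, E1, E2 and an
L-encoder. The first array outputs LI and the second outputs LII.]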
17
LZ Compression Chip
[Figure: block diagram of the chip. Blocks: FIFO, SALZC component,
Control, Control_FIFO, Host controller. Signals: Input sequence, Xi,
Yi, Li, Code word.]
18
First-in-First-out (FIFO)
[Figure: FIFO built around a Block RAM. Blocks and signals:
Input_sequence, controls, Write_counter, Write_address, Read_counter,
read_address.]
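A FIFO of this kind, a memory addressed by a write counter and a read counter, can be modelled in software as follows. This is a minimal behavioural sketch: the depth, the full and empty checks, and the class name `BlockRamFifo` are assumptions for illustration and are not taken from the presentation.

```python
class BlockRamFifo:
    """Behavioural FIFO: a fixed memory addressed by two wrapping counters."""

    def __init__(self, depth):
        self.ram = [None] * depth     # stands in for the Block RAM
        self.depth = depth
        self.write_counter = 0        # total symbols written so far
        self.read_counter = 0         # total symbols read so far

    def is_empty(self):
        return self.write_counter == self.read_counter

    def is_full(self):
        return self.write_counter - self.read_counter == self.depth

    def write(self, symbol):
        if self.is_full():
            raise OverflowError("FIFO full")
        self.ram[self.write_counter % self.depth] = symbol   # write address
        self.write_counter += 1

    def read(self):
        if self.is_empty():
            raise IndexError("FIFO empty")
        symbol = self.ram[self.read_counter % self.depth]    # read address
        self.read_counter += 1
        return symbol


fifo = BlockRamFifo(depth=4)
for s in "abc":
    fifo.write(s)
print(fifo.read(), fifo.read())   # a b
```

Keeping the counters unbounded and taking the address modulo the depth makes the full and empty conditions easy to tell apart. A hardware version would more likely use fixed-width counters with an extra wrap bit.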
19
The implementation results of Design-P and Design-i
(each resource column gives percentage of the device / count)

Design                         Max. frequency  BRAMs    4-input LUTs  Slice flip-flops  Slices
Available                      200 MHz         14       4704          4704              2352
Design-P (n = 512, Ls = 16)    113.766 MHz     7% / 1   8% / 398      8% / 401          12% / 302
Design-2i (n = 512, Ls = 16)   79.815 MHz      7% / 1   13% / 619     10% / 500         19% / 459
Design-P (n = 1024, Ls = 16)   104.308 MHz     14% / 2  8% / 419      8% / 408          13% / 310
Design-2i (n = 1024, Ls = 16)  79.700 MHz      14% / 2  13% / 650     10% / 511         20% / 471
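The relative differences between the two designs can be worked out directly from the table figures. This sketch compares only maximum clock frequency and slice count, using the numbers listed above. It does not model how many clock cycles each design needs per codeword, so the clock gain here is not the same thing as a compression-rate gain.

```python
# (maximum frequency in MHz, number of slices), copied from the results table
results = {
    512:  {"P": (113.766, 302), "2i": (79.815, 459)},
    1024: {"P": (104.308, 310), "2i": (79.700, 471)},
}


def compare(n):
    """Return (clock gain %, slice saving %) of Design-P over Design-2i."""
    (f_p, s_p), (f_i, s_i) = results[n]["P"], results[n]["2i"]
    freq_gain = (f_p / f_i - 1) * 100
    slice_saving = (1 - s_p / s_i) * 100
    return freq_gain, slice_saving


for n in results:
    gain, saving = compare(n)
    print(f"n = {n}: clock +{gain:.1f}%, slices -{saving:.1f}%")
```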
20
I/O Interface of LZ Compression Chip
[Figure: LZ compression chip with ports Data input (8 bits), codeword
(16 bits), Codeword ready, Control signals (6) and end.]
21
Testing Application
22
Data Flow of Testing Application
[Figure: data stream passing between the PC and the LZ compression
chip, with compressed data as the result.]
23
Decompression Architecture
24
The Compression Rate (Rc)
Rc = clk × (Ls × w) / (n - Ls + 1)
  • Example
  • The dictionary size (n) = 1k
  • Ls = 16
  • w = 8
  • clk = 104.308 MHz

Rc ≈ 13 Mbit per second
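The example figure can be reproduced numerically. This sketch reads the rate formula as Rc = clk × Ls × w / (n - Ls + 1), takes 1k to mean 1024, and assumes w is the symbol width in bits; the function name `compression_rate` is illustrative.

```python
def compression_rate(clk_hz, n, ls, w):
    """Input bits processed per second: Ls symbols of w bits every n-Ls+1 clocks."""
    return clk_hz * ls * w / (n - ls + 1)


rc = compression_rate(clk_hz=104.308e6, n=1024, ls=16, w=8)
print(f"{rc / 1e6:.2f} Mbit/s")   # 13.23 Mbit/s
```

The result rounds to the 13 Mbit per second quoted on the slide.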
25
Software Simulation
Data Sets
Calgary corpus
Silesia corpus
26
Experiments on the Calgary corpus
27
Experiments on the Silesia Corpus
28
Conclusions
  • The proposed implementation is area and speed
    efficient. The compression rate is increased by
    more than 40% and the design area is decreased by
    more than 30%.
  • The prototype is implemented on a Xilinx
    Spartan-II FPGA.
  • The chip can be incorporated into real-time
    systems so that data can be compressed and
    decompressed on-the-fly.

29
Future Work
  • Studying the effect of combining the proposed
    architecture for LZ data compression and elliptic
    curve cryptography in a single chip.
  • Studying the fast string-matching techniques
    required to accelerate the compression process.
  • By modifying the host controller and including,
    e.g., dictionaries, our chip can be used for
    other string-matching based LZ algorithms, such
    as LZ78 and LZW.

30
Thanks