Title: TASTE OF RESEARCH SUMMER SCHOLARSHIPS 2003/04
Implementation of a 1024-Point FFT Core
2003
Author: Bing Herng Chong (3022233)
Supervisor: Dr Saeid Nooshabadi
1. Abstract
The aim of this project is to produce
a hardware implementation of a 1024-point Fast
Fourier Transform (FFT) on a Xilinx Virtex-E
Field Programmable Gate Array (FPGA).
2. Fast Fourier Transform (FFT)
The FFT is a set of
efficient algorithms used in the evaluation of
the Discrete Fourier Transform (DFT). It is
widely used in areas where the frequency
make-up of continuous-time signals needs to be
obtained or analyzed. Applications that use the
FFT include filtering, speech analysis, radar
systems and digital video broadcasting. In most
cases a cheap, efficient and accurate means of
computing the FFT of a signal is desired.
3. FFT Core Block Diagram
The aim of this project is to implement the FFT
block diagram depicted in the figure below.
5. Why hardware?
Hardware implementations of non-sequential
algorithms run much faster than software-based
implementations because hardware offers true
parallelism. The FFT is a non-sequential
algorithm, so its efficiency can be greatly
improved by performing it in hardware.
6. Design: Software-Compiled-System Design
Instead of following the common
path of using a hardware description language
(HDL), a different design methodology is used in
this project. It is called the
software-compiled-system design methodology and
is offered by Celoxica's Handel-C. Handel-C is
essentially ANSI C with a few modifications that
add hardware design capabilities. Using this
language requires programming in software while
thinking in hardware.
7. Stages of Development
Using
Handel-C allows a sequential version to be
implemented first and debugged as if it were a
software program. The structure of the program is
then modified so that it works efficiently in
hardware, by adding parallelism, pipelines and
code optimization. Parallelism reveals a
trade-off between hardware utilization and
efficiency. A summary of the results of mapping
onto a Xilinx XCV300E FPGA is shown below.
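The parallelism trade-off described in this section can be illustrated with a back-of-the-envelope count. The sketch below is a hypothetical cost model, not the measured mapping results: the function names, the idea of a fixed number of parallel butterfly units, and the one-cycle-per-butterfly default are all assumptions, and memory bandwidth and pipeline fill time are ignored.

```python
import math


def butterfly_count(n_points):
    """Total radix-2 butterflies in an N-point FFT: (N/2) * log2(N)."""
    if n_points < 2 or n_points & (n_points - 1):
        raise ValueError("n_points must be a power of two >= 2")
    stages = n_points.bit_length() - 1  # log2(N) for a power of two
    return (n_points // 2) * stages


def estimated_cycles(n_points, parallel_units, cycles_per_butterfly=1):
    """Rough cycle estimate when `parallel_units` butterflies run at once.

    Hypothetical model: stages run one after another, and within a stage
    the N/2 butterflies are shared evenly between the parallel units.
    """
    if parallel_units < 1:
        raise ValueError("parallel_units must be at least 1")
    stages = butterfly_count(n_points) // (n_points // 2)
    per_stage = math.ceil((n_points // 2) / parallel_units)
    return stages * per_stage * cycles_per_butterfly


if __name__ == "__main__":
    # More butterfly units cost more hardware but need fewer cycles.
    for units in (1, 2, 4, 8):
        print(units, estimated_cycles(1024, units))
    # prints:
    # 1 5120
    # 2 2560
    # 4 1280
    # 8 640
```

Under this model, doubling the number of butterfly units halves the estimated cycle count while roughly doubling the arithmetic hardware, which is the utilization-versus-efficiency trade-off in its simplest form.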
In the block diagram of section 3, the FFT block
reads discrete data samples from a source. Double
buffering of data at the input and output ends
improves overall efficiency. The entire process
is supervised by a micro-controller.
4. FFT Decimation-in-Time
Decimation-in-time FFT (DIT FFT) simplifies DFT
computations by taking advantage of the symmetry
and periodicity of the DFT formula (shown below)
when the number of points is a power of 2
(N = 2^n).
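For reference, the standard DFT definition and the two twiddle-factor properties that decimation-in-time relies on can be written as follows. The notation (summation index m, twiddle factor W_N, half-length transforms E and O) is a choice made for this sketch and may differ from the notation used in the original figure.

```latex
\begin{align*}
% N-point DFT of the samples x(0), ..., x(N-1)
X(k) &= \sum_{m=0}^{N-1} x(m)\, W_N^{km},
  \qquad W_N = e^{-j 2\pi / N}, \quad k = 0, 1, \dots, N-1 \\
% Properties of the twiddle factor
W_N^{k + N/2} &= -W_N^{k} \quad \text{(symmetry)},
  \qquad W_N^{k + N} = W_N^{k} \quad \text{(periodicity)} \\
% Even/odd split for N = 2^n, valid for k = 0, ..., N/2 - 1
X(k) &= E(k) + W_N^{k}\, O(k),
  \qquad X(k + N/2) = E(k) - W_N^{k}\, O(k)
\end{align*}
```

Here E(k) and O(k) denote the N/2-point DFTs of the even-indexed and odd-indexed samples. Applying the split repeatedly until only 1-point transforms remain is what requires N to be a power of 2.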
8. Preliminary Testing: Output Verification
MATLAB generates very accurate FFT results. The
outputs of the written code are compared with
MATLAB's to check for accuracy and correctness.
9. Accuracy
There is a
need to perform accuracy tests to find the least
number of bits needed to represent numerical data
in the fixed-point arithmetic used, such that the
output of the design has a discrepancy of less
than 5% compared with MATLAB outputs. A summary
of the results is as follows:
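The kind of accuracy sweep described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the project's test procedure: a double-precision FFT stands in for the MATLAB reference, fixed-point arithmetic is simulated by rounding to a chosen number of fractional bits, the discrepancy is measured relative to the largest reference magnitude, and all function names and the test signal are hypothetical.

```python
import cmath
import math


def quantize(z, frac_bits):
    """Round real and imaginary parts to multiples of 2**-frac_bits."""
    scale = 1 << frac_bits
    return complex(round(z.real * scale) / scale,
                   round(z.imag * scale) / scale)


def fft_dit(x, frac_bits=None):
    """Recursive radix-2 DIT FFT (length must be a power of two).

    If frac_bits is given, twiddle factors and butterfly outputs are
    quantized to simulate fixed-point arithmetic (an assumed model).
    """
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_dit(x[0::2], frac_bits)
    odd = fft_dit(x[1::2], frac_bits)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)
        if frac_bits is not None:
            w = quantize(w, frac_bits)
        t = w * odd[k]
        top, bottom = even[k] + t, even[k] - t
        if frac_bits is not None:
            top, bottom = quantize(top, frac_bits), quantize(bottom, frac_bits)
        out[k], out[k + n // 2] = top, bottom
    return out


def max_relative_error(x, frac_bits):
    """Largest output error, relative to the largest reference magnitude."""
    ref = fft_dit(x)  # floating-point reference (stand-in for MATLAB)
    got = fft_dit([quantize(complex(v), frac_bits) for v in x], frac_bits)
    peak = max(abs(r) for r in ref)
    return max(abs(g - r) for g, r in zip(got, ref)) / peak


def least_fraction_bits(x, tolerance, max_bits=24):
    """Smallest number of fractional bits whose error is below tolerance."""
    for bits in range(1, max_bits + 1):
        if max_relative_error(x, bits) < tolerance:
            return bits
    return None  # tolerance not reached within max_bits


if __name__ == "__main__":
    # Hypothetical 64-point test signal made of two sinusoids.
    signal = [0.5 * math.sin(2 * math.pi * 3 * i / 64)
              + 0.25 * math.cos(2 * math.pi * 10 * i / 64)
              for i in range(64)]
    print(least_fraction_bits(signal, tolerance=0.05))
```

The tolerance of 0.05 in the usage example is only an illustrative value; the result of such a sweep depends on the test signal and on how the discrepancy is defined.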
By exploiting the symmetry and periodicity of the
DFT, the DIT FFT drastically reduces the total
number of complex multiplications and additions
in the entire computation. The figure below shows
the flow graph (butterfly computation) of an
8-point DIT FFT, on which this design is based.
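A software model of the butterfly flow graph can make the structure concrete. The sketch below is an illustrative in-place, iterative radix-2 DIT FFT in floating point, checked against a direct DFT; it is not the project's Handel-C code, and the function names are assumptions.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


def bit_reverse_permute(x):
    """Reorder the samples into bit-reversed index order."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        j = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        out[j] = x[i]
    return out


def fft_dit(x):
    """Iterative radix-2 decimation-in-time FFT.

    The input is bit-reversed first, then log2(N) stages of butterflies
    are applied, each stage doubling the transform size.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = bit_reverse_permute([complex(v) for v in x])
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                # One butterfly: combine an "even" and an "odd" value.
                top = a[start + k]
                bottom = a[start + k + half] * w
                a[start + k] = top + bottom
                a[start + k + half] = top - bottom
                w *= w_step
        size *= 2
    return a


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 0, 0, 0, 0]
    fast, slow = fft_dit(samples), dft(samples)
    print(max(abs(f - s) for f, s in zip(fast, slow)) < 1e-9)  # prints True
```

For 8 points the loop runs three stages of four butterflies each, which corresponds to the three columns of a standard 8-point butterfly flow graph.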
10. Achievements and Conclusion
The project is currently at the stage of
synthesis and transferring the design onto the
FPGA. A working Handel-C model of an N-point,
radix-2, 16-bit DIT FFT that can handle as many
as 1024 input points has already been achieved.
However, this working model is by no means the
most efficient, and there is room for further
improvement.
UNSW
ENGINEERING @ UNSW