Title: Accelerator Design Example
1. Accelerator Design Example
2. Video Accelerator
- Objective: reduce the number of pixels that must be transmitted or stored
3. Problem Statement
- Design a motion estimation engine (accelerator) for a PC to enable experimentation with video processing techniques
4. Algorithm
- A variety of algorithms exist; Wolf chose block motion estimation
- Basis: describe the current frame in terms of the differences between it and a previous frame
5. The current frame is divided into MACROBLOCKS, typically 16x16 pixels
6.
- For every MACROBLOCK in the current frame, the objective is to find the region in the previous frame that MOST CLOSELY matches the MACROBLOCK
- Limit the search to a SEARCH AREA
- Measure the sum of absolute differences in intensity over the MACROBLOCK
- MOTION VECTOR: location of the minimum
7. Measure
$$\text{Measure} = \sum_{i,j} \left| M(i,j) - S(i - o_x,\; j - o_y) \right|$$
- M(i,j): intensity of the MACROBLOCK at (i,j)
- S(i,j): intensity of the search region at (i,j)
- o_x, o_y: offset from the MACROBLOCK center to the search area
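The measure for a single candidate offset can be sketched in C as follows. This is only an illustration under stated assumptions: the function name `sad_at_offset`, the `AREASIZE` constant, and the choice to place the zero-offset position `SEARCHSIZE` pixels in from the corner of the search array are not from the slides.

```c
#include <stdlib.h>

#define MBSIZE 16      /* macroblock edge in pixels (value used in the slides) */
#define SEARCHSIZE 8   /* largest offset in each direction (value used in the slides) */
#define AREASIZE (MBSIZE + 2 * SEARCHSIZE)  /* assumed edge of the search array */

/* Sum of absolute intensity differences for one candidate offset (ox, oy).
   Assumption: offset (0, 0) maps macroblock pixel (0, 0) onto
   search[SEARCHSIZE][SEARCHSIZE], so every offset in
   [-SEARCHSIZE, SEARCHSIZE] stays inside the array. */
int sad_at_offset(unsigned char mb[MBSIZE][MBSIZE],
                  unsigned char search[AREASIZE][AREASIZE],
                  int ox, int oy)
{
    int sum = 0;
    for (int i = 0; i < MBSIZE; i++) {
        for (int j = 0; j < MBSIZE; j++) {
            int m = mb[i][j];
            int s = search[i - ox + SEARCHSIZE][j - oy + SEARCHSIZE];
            sum += abs(m - s);
        }
    }
    return sum;
}
```

With 8-bit intensities the result is at most 256 x 255 = 65,280, so a plain `int` accumulator is sufficient.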
8. Algorithm in C
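A minimal sketch of what a full-search block motion estimator might look like in C, offered as an illustration rather than as the slide's original listing. The `MotionVector` type, the `AREASIZE` constant, and the array layout are assumptions; `compute_mv` borrows the name of the compute-mv() operation mentioned later in the deck. Offsets run from -SEARCHSIZE to +SEARCHSIZE inclusive, which gives the 17x17 candidate positions used in the operation count.

```c
#include <limits.h>
#include <stdlib.h>

#define MBSIZE 16      /* macroblock edge in pixels */
#define SEARCHSIZE 8   /* largest offset in each direction */
#define AREASIZE (MBSIZE + 2 * SEARCHSIZE)  /* assumed edge of the search array */

/* Hypothetical result type: best offset and its sum of absolute differences. */
typedef struct {
    int ox;
    int oy;
    int sad;
} MotionVector;

/* Exhaustive search: evaluate every offset and keep the one with the
   smallest sum of absolute differences. Ties keep the first offset found. */
MotionVector compute_mv(unsigned char mb[MBSIZE][MBSIZE],
                        unsigned char search[AREASIZE][AREASIZE])
{
    MotionVector best = { 0, 0, INT_MAX };

    for (int ox = -SEARCHSIZE; ox <= SEARCHSIZE; ox++) {
        for (int oy = -SEARCHSIZE; oy <= SEARCHSIZE; oy++) {
            int sad = 0;
            for (int i = 0; i < MBSIZE; i++) {
                for (int j = 0; j < MBSIZE; j++) {
                    int m = mb[i][j];
                    int s = search[i - ox + SEARCHSIZE][j - oy + SEARCHSIZE];
                    sad += abs(m - s);
                }
            }
            if (sad < best.sad) {
                best.ox = ox;
                best.oy = oy;
                best.sad = sad;
            }
        }
    }
    return best;
}
```

One way to exercise such a routine is to cut the macroblock out of the search area at a known offset and check that the same offset is returned with a sum of zero.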
9.
- If MBSIZE = 16 and SEARCHSIZE = 8:
- nops = (16x16) x (17x17) = 73,984
- Must be performed on every MACROBLOCK of every frame
- For one video format (CIF):
- frame size 352x288
- -> 22x18 macroblocks
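The arithmetic above can be carried one step further to a per-frame total. The sketch below is illustrative only, and its function names are hypothetical. For CIF it gives 22 x 18 = 396 macroblocks and 396 x 73,984 = 29,297,664 absolute-difference operations per frame.

```c
#define MBSIZE 16      /* macroblock edge in pixels */
#define SEARCHSIZE 8   /* largest offset in each direction */

/* Absolute-difference operations needed to search one macroblock:
   (pixels per macroblock) x (candidate offsets). */
long ops_per_macroblock(void)
{
    long positions = (2L * SEARCHSIZE + 1) * (2L * SEARCHSIZE + 1);
    return (long)MBSIZE * MBSIZE * positions;
}

/* Whole macroblocks that fit in a frame of the given size. */
long macroblocks_per_frame(int width, int height)
{
    return (long)(width / MBSIZE) * (long)(height / MBSIZE);
}

/* Operations needed to search every macroblock of one frame. */
long ops_per_frame(int width, int height)
{
    return macroblocks_per_frame(width, height) * ops_per_macroblock();
}
```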
10. Requirements Table
11. Preliminary design decisions
- Purpose is experimentation -> one of a kind
- Hence, use an FPGA rather than an ASIC
- Lower performance is the tradeoff
12. Specifications
- The C program is essentially the spec
13. Specs - continued
14. Sequence diagram for the compute-mv() operation of class Motion-estimator
- Correction: reverse the second two arrows
15. Architecture
- Large memory
- MACROBLOCK: 16x16 = 256 pixels
- SEARCH AREA: (8+8+1+8+8)^2 = 1089 pixels
- Mistake in the text: should be (8+8+1)^2 = 289
- -> Text's conclusion: the memory must be external to the FPGA because it is too large for on-FPGA memory. This is not true today, particularly with the correction
16. Possible Architecture
- Two memories
- Sixteen PEs, each performing an absolute-difference calculation on a pair of pixels
- Comparator chooses the smallest result
17.
- In steady state, the schedule fetches one pixel from the macroblock and two pixels from the search area per clock cycle
- Computes 16 correlations between the macroblock and the search area simultaneously
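A behavioral sketch in C of sixteen PEs accumulating sixteen sums in step may help make the idea concrete. It is not a model of the actual schedule or memory fetch pattern. It assumes each macroblock pixel is broadcast to all PEs, that PE k handles offset oy = k - SEARCHSIZE for a fixed ox, and that the comparator runs once after all 256 pixels have been processed. The names `pe_array_pass`, `RowResult`, `NUM_PES`, and `AREASIZE` are hypothetical. With sixteen PEs one pass covers offsets -8 to +7; how the remaining offset is handled is left open here.

```c
#include <limits.h>
#include <stdlib.h>

#define MBSIZE 16      /* macroblock edge in pixels */
#define SEARCHSIZE 8   /* largest offset in each direction */
#define AREASIZE (MBSIZE + 2 * SEARCHSIZE)  /* assumed edge of the search array */
#define NUM_PES 16     /* number of processing elements */

/* Hypothetical result of one pass: best oy among the 16 PEs and its sum. */
typedef struct {
    int oy;
    int sad;
} RowResult;

/* One pass of the PE array for a fixed ox. Each iteration of the (i, j)
   loops stands for one broadcast of a macroblock pixel; the inner k loop
   stands for the 16 PEs working in parallel on that pixel. */
RowResult pe_array_pass(unsigned char mb[MBSIZE][MBSIZE],
                        unsigned char search[AREASIZE][AREASIZE],
                        int ox)
{
    int acc[NUM_PES] = { 0 };  /* one accumulator per PE */

    for (int i = 0; i < MBSIZE; i++) {
        for (int j = 0; j < MBSIZE; j++) {
            int m = mb[i][j];
            for (int k = 0; k < NUM_PES; k++) {
                int oy = k - SEARCHSIZE;
                int s = search[i - ox + SEARCHSIZE][j - oy + SEARCHSIZE];
                acc[k] += abs(m - s);
            }
        }
    }

    /* Comparator: choose the smallest of the 16 accumulated sums. */
    RowResult best = { -SEARCHSIZE, INT_MAX };
    for (int k = 0; k < NUM_PES; k++) {
        if (acc[k] < best.sad) {
            best.oy = k - SEARCHSIZE;
            best.sad = acc[k];
        }
    }
    return best;
}
```

In software the k loop runs sequentially; in hardware the sixteen accumulations would happen in the same clock cycle, which is where the speedup over the plain C loop nest comes from.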
18. Architecture refinement
19. System Testing
- Can use still images rather than video for testing, because the design is only a motion estimator, not a video compressor