Title: Parallel Beam Back Projection: Implementation
1 Parallel Beam Back Projection: Implementation
- Srdjan Coric
- Miriam Leeser
- Eric Miller
2 Outline
- Annapolis Wildstar
- Simple Architecture
- algorithm
- datapath
- Performance
- Results
- Parallelism extraction
- Advanced Architecture 4x
- datapath
- Performance
- Results
- Implementation issues
- Future directions
4 Data Flow
5 Interpolation factor error: Corner starting position
6 Simple Architecture Datapath
7 Performance Results: Software vs. FPGA Hardware
- Software - Floating point - 450 MHz Pentium: 240 s
- Software - Floating point - 1 GHz Dual Pentium: 94 s
- Software - Fixed point - 450 MHz Pentium: 50 s
- Software - Fixed point - 1 GHz Dual Pentium: 28 s
- Hardware - 50 MHz: 5.4 s
Parameters: 1024 projections, 1024 samples per projection, 512×512-pixel image, 9-bit sinogram data, 3-bit interpolation factor
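The parameters above (9-bit sinogram data, 3-bit interpolation factor) suggest an integer-only inner loop. The following is a minimal sketch of fixed-point parallel-beam back projection under those parameters, not the authors' datapath: the function names, the centring of image and detector, the use of linear interpolation, and the angle spacing of pi / (number of projections) are all assumptions made for illustration. Trigonometry is kept in floating point here for brevity.

```python
import math

FRAC_BITS = 3                 # 3-bit interpolation factor (from the slide)
FRAC_ONE = 1 << FRAC_BITS     # 8 sub-sample steps between adjacent samples


def quantize_sinogram(sino, bits=9):
    """Scale a non-negative float sinogram to unsigned `bits`-bit integers."""
    peak = max(max(row) for row in sino) or 1.0
    top = (1 << bits) - 1     # 511 for 9-bit data
    return [[round(v / peak * top) for v in row] for row in sino]


def backproject_fixed(sino_q, size):
    """Accumulate all projections into a size x size integer image.

    Each accumulated term is scaled by FRAC_ONE (8), because the
    interpolation weights (8 - frac) and frac are left unnormalised.
    """
    n_proj = len(sino_q)
    n_samp = len(sino_q[0])
    c_img = (size - 1) / 2.0      # assumed: rotation centre = image centre
    c_det = (n_samp - 1) / 2.0    # assumed: detector centred on that axis
    image = [[0] * size for _ in range(size)]
    for p, row in enumerate(sino_q):
        theta = math.pi * p / n_proj
        c, s = math.cos(theta), math.sin(theta)
        for y in range(size):
            for x in range(size):
                # Detector position in units of 1/8 sample.
                t = (x - c_img) * c + (y - c_img) * s + c_det
                t_fx = math.floor(t * FRAC_ONE)
                idx = t_fx >> FRAC_BITS           # integer sample index
                frac = t_fx & (FRAC_ONE - 1)      # 3-bit interpolation factor
                if 0 <= idx < n_samp - 1:
                    image[y][x] += row[idx] * (FRAC_ONE - frac) + row[idx + 1] * frac
    return image
```

With a constant sinogram every pixel receives the same contribution from every projection, which gives a quick sanity check: `backproject_fixed([[100] * 16 for _ in range(6)], 4)` fills the image with 6 × 8 × 100.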
8 Zoom 200, Grayscale range < Pixel value range (heart features in focus)
9 Zoom 200, Grayscale range < Pixel value range (lung features in focus)
10 Original image - Hardware output image
11 Parallelism Issues
Case 1: No parallelism extracted
Case 2: Pixel-level parallelism extracted
Case 3: Projection-level parallelism extracted
(Figure: axes Projections, Image columns, Image rows; regions labelled V1, V2, V3)
T = k1 × V1
T = k1 × V2
T = k2 × V3
k1 < k2, V2 = V3 = V1/4, T = Execution time
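Projection-level parallelism (Case 3) can be illustrated in software by letting each of four units handle its own subset of projections for every pixel and then summing the partial images. The sketch below is an assumption-laden illustration, not the hardware design: the function names are hypothetical, nearest-sample lookup replaces interpolation to keep it short, projections are dealt out in an interleaved fashion (the slides do not say how they are grouped), and the loop over `ways` runs sequentially where hardware would run the units concurrently.

```python
import math


def backproject(rows, angles, size):
    """Nearest-sample back projection of integer projections into a square image."""
    image = [[0] * size for _ in range(size)]
    if not rows:
        return image
    n_samp = len(rows[0])
    c_img, c_det = (size - 1) / 2.0, (n_samp - 1) / 2.0
    for row, theta in zip(rows, angles):
        c, s = math.cos(theta), math.sin(theta)
        for y in range(size):
            for x in range(size):
                idx = round((x - c_img) * c + (y - c_img) * s + c_det)
                if 0 <= idx < n_samp:
                    image[y][x] += row[idx]
    return image


def backproject_by_projection_groups(rows, angles, size, ways=4):
    """Split the projections into `ways` interleaved groups and sum the partial images."""
    total = [[0] * size for _ in range(size)]
    for w in range(ways):
        # Unit w sees projections w, w + ways, w + 2 * ways, ...
        partial = backproject(rows[w::ways], angles[w::ways], size)
        for y in range(size):
            for x in range(size):
                total[y][x] += partial[y][x]
    return total
```

Because the accumulation is integer addition, the grouped result is identical to the serial one regardless of how the projections are divided, which is what makes this form of parallelism safe to extract.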
12 Advanced Architecture - Data Path (projection parallelism extracted)
13 Performance Results: Software vs. FPGA Hardware
- Software - Floating point - 450 MHz Pentium: 240 s
- Software - Floating point - 1 GHz Dual Pentium: 94 s
- Software - Fixed point - 450 MHz Pentium: 50 s
- Software - Fixed point - 1 GHz Dual Pentium: 28 s
- Hardware - 50 MHz: 5.4 s
- Hardware (Advanced Architecture) - 50 MHz: 1.3 s
Parameters: 1024 projections, 1024 samples per projection, 512×512-pixel image, 9-bit sinogram data, 3-bit interpolation factor
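The timings above can be turned into speedup ratios, and the two hardware figures can be compared with a simple cycle count. The sketch below uses only the numbers listed on this slide; the assumption that each processing unit completes one pixel update per clock cycle is not stated on the slides and is introduced here only as a plausibility check. The helper names are hypothetical.

```python
CLOCK_HZ = 50e6      # hardware clock from the slide
N_PROJ = 1024        # projections
IMG_SIDE = 512       # image is IMG_SIDE x IMG_SIDE pixels

timings_s = {
    "Software, floating point, 450 MHz Pentium": 240.0,
    "Software, floating point, 1 GHz Dual Pentium": 94.0,
    "Software, fixed point, 450 MHz Pentium": 50.0,
    "Software, fixed point, 1 GHz Dual Pentium": 28.0,
    "Hardware, 50 MHz": 5.4,
    "Hardware (Advanced Architecture), 50 MHz": 1.3,
}


def speedup_over(timings, name, reference):
    """Return how many times faster `reference` is than `name`."""
    return timings[name] / timings[reference]


def estimated_seconds(units):
    """Run time if each of `units` units does one pixel update per clock (assumed)."""
    updates = N_PROJ * IMG_SIDE * IMG_SIDE   # one update per pixel per projection
    return updates / (units * CLOCK_HZ)


reference = "Hardware (Advanced Architecture), 50 MHz"
for name in timings_s:
    print(f"{name}: {speedup_over(timings_s, name, reference):.1f}x slower than the 4-way design")
print(f"estimated, 1 unit:  {estimated_seconds(1):.2f} s")
print(f"estimated, 4 units: {estimated_seconds(4):.2f} s")
```

Under that assumption the estimates come out at roughly 5.37 s and 1.34 s, close to the reported 5.4 s and 1.3 s, and the ratio between the two hardware designs is about 4.2.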
19 Implementation Issues - fanout
- prj_num(3): fanout 1565! Routing delay 7.913 ns (39.99)
20 Implementation Issues - fanout
- odd_2_A_44: fanout 144!
21 Memory Bridges Stuff
- 3 architectures implemented
  - A. Simple Architecture: non-parallel (on slide 6)
  - B. Advanced Architecture: 4-way parallel (slide 12)
  - C. Bridge Free Advanced Arch: as B, but contains no memory bridges from the PCI bus to the memory banks, which are required for Host-Memory communication (all design buffers are in BlockRAMs). The bridges are a separate design that is downloaded before (after) design C is downloaded, so that input data can be stored to (output data read from) the memories on the WildStar board.
- Virtex1000 resource utilization
  - A: 11 logic, 90 BlockRAMs (with bridges)
  - B: 39 logic, 100 BlockRAMs
  - C: 21 logic, 100 BlockRAMs
22 Floorplan of the Bridge Free Advanced Architecture (design C on the previous slide)
23 Future Directions