A Parallel Implementation of MSER detection

Slides: 21

Provided by: coll218

Learn more at: http://coachk.cs.ucf.edu

Transcript and Presenter's Notes

1
A Parallel Implementation of MSER detection

GPGPU Final Project
Lin Cao

2
Review
MSER (Maximally Stable Extremal Regions) is invariant to affine
transformations such as rotation, translation, and scale change
Denotes a set of stable connected components
detected in a grayscale image
3
Review

An MSER is a stable connected component of a
thresholded image
All pixels inside an MSER have either higher or lower
intensities than the pixels in the surrounding region
Regions are selected to be stable over a range of
intensity thresholds
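
As a rough illustration of the definition above, the following sketch thresholds a tiny grayscale image at several levels and records the area of the connected component that contains a seed pixel. A region whose area barely changes over a range of thresholds is "stable" in the MSER sense. The helper name `component_area`, the sample image, and the choice of 4-connectivity are assumptions made for illustration; they do not come from the slides.

```python
from collections import deque

def component_area(image, seed, threshold):
    """Area of the 4-connected component of pixels <= threshold containing seed."""
    rows, cols = len(image), len(image[0])
    sr, sc = seed
    if image[sr][sc] > threshold:
        return 0  # the seed itself is not part of the thresholded image
    seen = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and image[nr][nc] <= threshold:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)

# Hypothetical image: a dark 2x2 blob on a bright background.
image = [
    [200, 200, 200, 200, 200],
    [200,  10,  12, 200, 200],
    [200,  11,  10, 200, 200],
    [200, 200, 200, 200, 200],
]

# Area of the component around pixel (1, 1) at increasing thresholds.
areas = {t: component_area(image, (1, 1), t) for t in (5, 20, 100, 199, 200)}
```

In this toy image the area stays constant from threshold 20 up to 199 and only jumps when the background is admitted, which is the kind of plateau an MSER detector looks for.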

4
Sequential and Parallel Approach

Sequential                          Parallel
bucketSort()                        buildDirectedGraph()
Find()                              blockReduction()
Union()                             parentCompression()
Update()  // already gets regions   GetRegion()
computeVariation()                  computeVariation()
                                    findRoot()
leastVariation()                    leastVariation()
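
The sequential column follows the familiar sort-then-union-find pattern. The sketch below is one minimal way to write that pattern, not the presenter's code: pixels are bucket-sorted by intensity, added in increasing order, and merged with any 4-neighbour that has already been added. The function names, the 8-bit intensity range, and the `history` dictionary are assumptions for illustration. The sample image reuses the 4x4 values shown on the buildDirectedGraph slides.

```python
def find(parent, i):
    """Return the root of pixel i, compressing the path behind it."""
    root = i
    while parent[root] != root:
        root = parent[root]
    while parent[i] != root:
        nxt = parent[i]
        parent[i] = root
        i = nxt
    return root

def sequential_components(image):
    """Add pixels in increasing intensity; record the component count per level."""
    rows, cols = len(image), len(image[0])
    flat = [v for row in image for v in row]
    # Bucket sort: one bucket per 8-bit intensity value (assumed range 0..255).
    buckets = [[] for _ in range(256)]
    for idx, v in enumerate(flat):
        buckets[v].append(idx)
    parent = [-1] * len(flat)  # -1 marks a pixel that has not been added yet
    area = [0] * len(flat)
    count = 0
    history = {}  # intensity level -> number of components after that level
    for level, bucket in enumerate(buckets):
        for idx in bucket:
            parent[idx] = idx
            area[idx] = 1
            count += 1
            r, c = divmod(idx, cols)
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < rows and 0 <= nc < cols:
                    n = nr * cols + nc
                    if parent[n] != -1:
                        a, b = find(parent, idx), find(parent, n)
                        if a != b:  # Union: attach b's tree under a
                            parent[b] = a
                            area[a] += area[b]
                            count -= 1
        if bucket:
            history[level] = count
    return history

image = [
    [75, 78, 56, 62],
    [50, 58, 55, 53],
    [80, 65, 64, 60],
    [65, 55, 50, 55],
]
history = sequential_components(image)
```

Every union depends on the state left by all earlier pixels, which is the data dependency the parallel column has to work around.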

5
buildDirectedGraph
75 78 56 62
50 58 55 53
80 65 64 60
65 55 50 55
The parent of each pixel should have a value no less
than the pixel's own value.
Local memory (visited, members), shared memory
6
buildDirectedGraph
75 78 56 62
50 58 55 53
80 65 64 60
65 55 50 55
Memory usage: local memory (visited, members),
shared memory
Also process edges for the next step
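
The slides do not spell out how a parent is chosen, only that a parent's value must be no less than the pixel's own. The sketch below shows one rule that satisfies that constraint, under stated assumptions: each pixel points to the 4-neighbour with the smallest value that is still no less than its own, ties are broken by pixel index so that no cycle can form, and a pixel with no such neighbour is its own parent. The function name `build_directed_graph` and the tie-breaking rule are hypothetical. Each pixel is handled independently, which is what makes a one-thread-per-pixel mapping plausible.

```python
def build_directed_graph(image):
    """Return (flat values, parent index per pixel) for a 2D list of intensities."""
    rows, cols = len(image), len(image[0])
    flat = [v for row in image for v in row]
    parent = list(range(len(flat)))  # each pixel starts as its own parent
    for idx, v in enumerate(flat):
        r, c = divmod(idx, cols)
        best = None
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                n = nr * cols + nc
                key = (flat[n], n)
                # Keep only neighbours strictly above this pixel in
                # (value, index) order, then take the lowest of those.
                if key > (v, idx) and (best is None or key < best):
                    best = key
        if best is not None:
            parent[idx] = best[1]
    return flat, parent

# The 4x4 example values from the slide.
image = [
    [75, 78, 56, 62],
    [50, 58, 55, 53],
    [80, 65, 64, 60],
    [65, 55, 50, 55],
]
flat, parent = build_directed_graph(image)
```

With these values, the pixel holding 80 has no neighbour at or above its value and stays a local root, while the pixel holding 50 in the second row points to its neighbour holding 58.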
7
Block Reduction
16×16, 8×8
8
Block Reduction
16×16, 8×8
9
Block Reduction
16×16, 8×8
10
Block Reduction
log 24
log 22
totally 3 iterations are needed
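
One reading of this slide is that blocks are merged pairwise, so each pass halves the number of blocks along one axis; a grid of 4 by 2 blocks would then need 2 + 1 = 3 passes. The sketch below computes that count. The 4 by 2 grid and the function name `reduction_iterations` are assumptions inferred from the numbers on the slide, not stated in it.

```python
def ceil_log2(n):
    """Smallest k with 2**k >= n, for n >= 1, using integer arithmetic."""
    return (n - 1).bit_length()

def reduction_iterations(blocks_x, blocks_y):
    """Passes needed when each pass merges neighbouring blocks in pairs."""
    return ceil_log2(blocks_x) + ceil_log2(blocks_y)

iterations = reduction_iterations(4, 2)
```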
11
Block Reduction
Load edge information to each pixel

65 70 65 63 75
58 60 59 58 57
55 65 66 62
55 55 54 52

[Figure: edge values 58, 59, 62, 60, 80, 70, 55, 50, 57, 80, 60]

If (horizontal_pixelUpdate)
12
Block Reduction

History buffer

13
Parent Compression
75 78 56 62
50 58 56 58
80 58 54 58
65 55 58 55
Shared memory based on parent locality
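
Parent compression is commonly done by pointer jumping: every pixel repeatedly replaces its parent with its grandparent until nothing changes, so the depth of each tree roughly halves per pass. The sketch below shows that general technique on a small parent array; whether the presenter used exactly this scheme is an assumption, and the name `compress_parents` is hypothetical.

```python
def compress_parents(parent):
    """Pointer jumping: repeat parent[i] = parent[parent[i]] until stable."""
    parent = list(parent)
    passes = 0
    while True:
        # Read the old array and write a new one, mimicking a synchronous
        # step in which every thread updates its own pixel.
        jumped = [parent[parent[i]] for i in range(len(parent))]
        if jumped == parent:
            return parent, passes
        parent = jumped
        passes += 1

# Chain 0 -> 1 -> 2 -> 3 -> 4 (root), plus 6 -> 5 (root).
compressed, passes = compress_parents([1, 2, 3, 4, 4, 5, 5])
```

After compression every pixel points directly at its root, so later steps can find a pixel's region with a single lookup.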
14
FindRegion

FindRoot, so that each region tree can be processed
separately
Find each region's parent and child based on
delta, so that the variation can be computed
$\mathrm{var} = \dfrac{\mathrm{area}(\text{parent}) - \mathrm{area}(\text{child})}{\mathrm{area}(\text{current region})}$
Send the region information to the CPU
Scan every region tree and find the minimal
variation, which identifies the MSER regions
Filter the regions
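
The variation formula can be sketched on a single chain of nested regions. In the example below, `areas[i]` is the area of the region at threshold level `i`; the "parent" is taken to be the region `delta` levels above and the "child" the region `delta` levels below, which is an assumption about how the slide uses those terms. The area values are invented for illustration.

```python
def variations(areas, delta):
    """Variation per level: (area(parent) - area(child)) / area(current)."""
    out = {}
    for i in range(delta, len(areas) - delta):
        out[i] = (areas[i + delta] - areas[i - delta]) / areas[i]
    return out

def most_stable_level(areas, delta):
    """Level with the smallest variation along this chain."""
    var = variations(areas, delta)
    return min(var, key=var.get)

# Hypothetical areas of one nested region as the threshold rises.
areas = [4, 5, 20, 21, 22, 23, 60, 90]
level = most_stable_level(areas, delta=1)
```

Here the area grows slowly between levels 2 and 5, so the minimum variation falls inside that plateau.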

15
Performance Analysis

For a 256×256 image,

16
Performance Analysis

For a 1024×768 image,

17
Performance Analysis

Why is 8×8 better than 16×16?
local memory usage
recursion times
block execution
block reduction times
parent locality

18
Performance Analysis

GPU vs CPU timing
intermediate values
Synchronization
record information
memory transfer

19
Conclusion

The data dependency is very large, but the problem can still be solved.
Should be better suited to a multicore microprocessor,
whose individual cores are stronger than a
single GPU thread.
The bottleneck is still memory.