A Parallel Implementation of MSER detection

Slides: 21
Provided by: coll218
Learn more at: http://coachk.cs.ucf.edu
Transcript and Presenter's Notes

1
A Parallel Implementation of MSER detection
  • GPGPU Final Project
  • Lin Cao

2
Review
  • MSER (maximally stable extremal regions) denotes a set of stable
    connected components detected in a grayscale image
  • MSERs are invariant to affine transformations such as rotation,
    translation, and scale change
3
Review
  • An MSER is a stable connected component of a thresholded image
  • All pixels inside an MSER have either higher or lower
    intensities than the pixels surrounding it
  • Regions are selected to be stable over a range of intensity
    thresholds
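
The stability criterion above can be illustrated with a small sketch. It thresholds a toy image at several levels and tracks the area of the connected component that contains a seed pixel; an area that barely changes across thresholds suggests a stable region. The names `component_area` and `areas_over_thresholds`, the 4-connectivity, and the toy image are assumptions made for illustration and do not come from the presentation.

```python
from collections import deque

def component_area(image, seed, threshold):
    """Area of the 4-connected component of pixels <= threshold containing seed."""
    rows, cols = len(image), len(image[0])
    r0, c0 = seed
    if image[r0][c0] > threshold:
        return 0  # the seed itself is not part of the thresholded image
    seen = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen
                    and image[nr][nc] <= threshold):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)

# Toy image (an assumption): a dark 2x2 blob surrounded by bright pixels.
TOY = [
    [200, 200, 200, 200],
    [200,  10,  12, 200],
    [200,  11,  13, 200],
    [200, 200, 200, 200],
]

def areas_over_thresholds(image, seed, thresholds):
    """Component area at each threshold, in the order the thresholds are given."""
    return [component_area(image, seed, t) for t in thresholds]
```

For the toy image, the blob keeps an area of 4 for every threshold from 13 up to 199, which is the kind of plateau the stability criterion looks for.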

4
Sequential and Parallel Approach
  Sequential              Parallel
  bucketSort()            buildDirectedGraph()
  Find()                  blockReduction()
  Union()                 parentCompression()
  Update()                // regions already obtained
  GetRegion()             findRoot()
  computeVariation()      computeVariation()
  leastVariation()        leastVariation()
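
The first three sequential steps can be sketched as follows: pixels are bucket-sorted by intensity, then activated from dark to bright and merged with already-active neighbours through a union-find structure. This is a minimal sketch under assumptions (8-bit intensities, 4-connectivity, union by size); the helper `components_by_level` is hypothetical, and the Update() and GetRegion() steps of the slide are not modelled.

```python
def bucket_sort_pixels(image):
    """Return flat pixel indices ordered by intensity (counting sort over 0..255)."""
    buckets = [[] for _ in range(256)]
    cols = len(image[0])
    for r, row in enumerate(image):
        for c, v in enumerate(row):
            buckets[v].append(r * cols + c)
    return [p for bucket in buckets for p in bucket]

def find(parent, x):
    """Root of x, shortening the path on the way (path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(parent, size, a, b):
    """Merge the sets of a and b; the larger set keeps its root."""
    ra, rb = find(parent, a), find(parent, b)
    if ra == rb:
        return ra
    if size[ra] < size[rb]:
        ra, rb = rb, ra
    parent[rb] = ra
    size[ra] += size[rb]
    return ra

def components_by_level(image):
    """Map each intensity level to the sorted component sizes once it is processed."""
    rows, cols = len(image), len(image[0])
    n = rows * cols
    parent, size, active = list(range(n)), [1] * n, [False] * n
    order = bucket_sort_pixels(image)
    result, i = {}, 0
    while i < n:
        level = image[order[i] // cols][order[i] % cols]
        # Activate every pixel of this level and merge it with active neighbours.
        while i < n and image[order[i] // cols][order[i] % cols] == level:
            p = order[i]
            active[p] = True
            r, c = divmod(p, cols)
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < rows and 0 <= nc < cols and active[nr * cols + nc]:
                    union(parent, size, p, nr * cols + nc)
            i += 1
        roots = {find(parent, p) for p in range(n) if active[p]}
        result[level] = sorted(size[root] for root in roots)
    return result
```

The per-level component sizes are the raw material from which region areas, and later variations, would be derived.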

5
buildDirectedGraph
75 78 56 62
50 58 55 53
80 65 64 60
65 55 50 55
The value of each pixel's parent should be no less
than the pixel's own value.
Memory usage: local memory (visited, members); shared memory
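
One way to satisfy the parent rule above is sketched here. The selection rule is an assumption, not taken from the slides: each pixel points to the 4-neighbour with the smallest (value, index) key that is strictly greater than its own key, and to itself if no such neighbour exists. Breaking ties by index keeps the graph free of cycles. The name `build_directed_graph` mirrors the slide title but the body is illustrative only; `GRID` is the 4×4 example from the slide.

```python
GRID = [
    [75, 78, 56, 62],
    [50, 58, 55, 53],
    [80, 65, 64, 60],
    [65, 55, 50, 55],
]

def build_directed_graph(image):
    """Return a flat parent array where image[parent[p]] >= image[p] for every pixel p."""
    rows, cols = len(image), len(image[0])

    def key(p):
        # (value, index) gives a strict total order, so equal values cannot form a cycle.
        return (image[p // cols][p % cols], p)

    parent = []
    for p in range(rows * cols):
        r, c = divmod(p, cols)
        neighbours = [nr * cols + nc
                      for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                      if 0 <= nr < rows and 0 <= nc < cols]
        higher = [q for q in neighbours if key(q) > key(p)]
        # Local maxima have no higher neighbour and become their own parent.
        parent.append(min(higher, key=key) if higher else p)
    return parent
```

On `GRID`, the pixel with value 75 points to its neighbour 78, while 78 and 80 are local maxima and point to themselves.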
6
buildDirectedGraph
75 78 56 62
50 58 55 53
80 65 64 60
65 55 50 55
Memory usage: local memory (visited, members); shared memory
Edges are also processed for the next step
7
Block Reduction
16×16, 8×8
8
Block Reduction
16×16, 8×8
9
Block Reduction
16×16, 8×8
10
Block Reduction
log₂ 4 = 2
log₂ 2 = 1
In total, 3 iterations are needed
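
One possible reading of the count above, offered as an assumption rather than something the slide states, is that neighbouring blocks are merged pairwise along each axis, so a grid of 4 × 2 blocks needs log₂ 4 + log₂ 2 = 3 rounds. The function `reduction_rounds` and its parameters are hypothetical names used only for this sketch.

```python
def ceil_log2(n):
    """Smallest k with 2**k >= n, computed with integer arithmetic (n >= 1)."""
    return (n - 1).bit_length()

def reduction_rounds(blocks_x, blocks_y):
    """Rounds of pairwise merging needed to fold a blocks_x by blocks_y grid into one block."""
    return ceil_log2(blocks_x) + ceil_log2(blocks_y)
```

Under this reading, halving the block edge length doubles the block count per axis and adds two rounds, which may be one reason the number of block reduction rounds appears among the performance factors later in the deck.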
11
Block Reduction
Load edge information to each pixel
[Figure: example pixel-value grid with loaded edge values]
If (horizontal_pixelUpdate)
12
Block Reduction
History buffer
13
Parent Compression
75 78 56 62
50 58 56 58
80 58 54 58
65 55 58 55
Shared memory is used according to parent locality
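
Parent compression can be sketched as pointer jumping, where every element replaces its parent with its grandparent until nothing changes. This sketch assumes compression runs all the way to the root and works on a flat parent array; the presentation does not say whether its kernel stops earlier. The name `compress_parents` is illustrative.

```python
def compress_parents(parent):
    """Pointer jumping: replace each parent with its grandparent until stable."""
    parent = list(parent)  # work on a copy so the caller's array is untouched
    while True:
        # Every element reads the old array, as parallel threads would in one step.
        jumped = [parent[parent[p]] for p in range(len(parent))]
        if jumped == parent:
            return parent
        parent = jumped
```

Each pass halves the remaining distance to the root, so a chain of length n is flattened in about log₂ n passes.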
14
FindRegion
  • FindRoot, so that each region's tree can be processed
    separately
  • Find each region's parent and child based on delta,
    so that the variation can be computed.
  • var = (area(parent) - area(child)) / area(current
    region)
  • Send the region information to the CPU
  • Scan every region's tree and find the minimal
    variation; these are the MSER regions.
  • Filter the regions
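
The variation formula above can be sketched for a single chain of nested regions. The sketch assumes `areas[i]` holds the region's area at threshold level `i`, with the region at level `i + delta` acting as the parent and the one at `i - delta` as the child. The function names and the list layout are illustrative assumptions, and areas are assumed to be positive.

```python
def variations(areas, delta):
    """Variation per level: (area(parent) - area(child)) / area(current region)."""
    result = {}
    for i in range(delta, len(areas) - delta):
        result[i] = (areas[i + delta] - areas[i - delta]) / areas[i]
    return result

def most_stable_level(areas, delta):
    """Level with the minimal variation; raises ValueError if no level has both neighbours."""
    var = variations(areas, delta)
    return min(var, key=var.get)
```

For example, with areas `[1, 4, 4, 4, 16]` and `delta = 1`, the middle level has zero variation and would be reported as the most stable one.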

15
Performance Analysis
  • For a 256×256 image

16
Performance Analysis
  • For a 1024×768 image

17
Performance Analysis
  • Why is 8×8 better than 16×16?
  • local memory usage
  • recursion times
  • block execution
  • block reduction times
  • parent locality

18
Performance Analysis
  • GPU vs CPU timing
  • intermediate values
  • Synchronization
  • record information
  • memory transfer

19
Conclusion
  • The problem has very strong data dependencies, but it can
    still be solved in parallel.
  • It should be well suited to multicore microprocessors,
    whose individual cores are much stronger than a
    single GPU thread.
  • The bottleneck is still memory.

20
Future Work
[Figure: example pixel-value grid with loaded edge values]
  • More efficient block reduction (decoder and encoder)
  • Random memory access
  • GPU code efficiency