Human Detection

About This Presentation

Slides: 13

Provided by: phanind

Transcript and Presenter's Notes

1
Human Detection

Phanindra
Varma

2
Detection -- Overview

Human detection in static images is based on the
HOG (Histogram of Oriented Gradients) encoding of
images
The training set consists of positive windows
(containing humans) and negative images
For each window in the training set the HOG
feature vector is computed and a linear SVM is
used to learn the classifier
For any test image, the feature vector is
computed on densely spaced windows at all scales
and classified using the learned SVM
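The dense multi-scale scan described above can be sketched as follows. This is a minimal illustration, not the presenter's code: the function name `scan_windows`, the 8-pixel stride, and the scale step of 1.2 are assumptions, since the slides do not state how densely the windows or scales are spaced.

```python
def scan_windows(img_w, img_h, win_w=64, win_h=128, stride=8, scale_step=1.2):
    """Yield (x, y, scale) for every 64x128 window position at every scale.

    x and y are the top-left corner mapped back to original-image
    coordinates. stride and scale_step are assumed values.
    """
    scale = 1.0
    # Keep shrinking the image until the window no longer fits inside it.
    while img_w / scale >= win_w and img_h / scale >= win_h:
        scaled_w, scaled_h = int(img_w / scale), int(img_h / scale)
        for y in range(0, scaled_h - win_h + 1, stride):
            for x in range(0, scaled_w - win_w + 1, stride):
                yield x * scale, y * scale, scale
        scale *= scale_step
```

Each yielded window would then be encoded with HOG and scored by the learned SVM.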

3
HOG encoding

Preprocessing -
Gamma normalize each channel using the square root
transformation in the given window
For each channel compute gradients using the
filters [-1 0 1] and [-1 0 1]^T and find the channel
with the largest gradient magnitude for each pixel
Compute the gradient orientation (0 to 180 degrees)
for each pixel in this dominant channel
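A minimal NumPy sketch of these preprocessing steps, under stated assumptions: the function name `dominant_gradient` is hypothetical, the input is taken to be a float array with values in [0, 1], and border pixels are simply given a zero gradient (the slides do not say how borders are handled).

```python
import numpy as np

def dominant_gradient(window):
    """window: H x W x C float array with values in [0, 1].

    Returns per-pixel gradient magnitude and unsigned orientation
    (degrees in [0, 180)) taken from the channel with the largest gradient.
    """
    img = np.sqrt(window.astype(np.float64))   # square-root gamma normalization
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Centered [-1 0 1] filter along x and its transpose along y.
    # Border pixels keep a zero gradient (an assumption).
    gx[:, 1:-1, :] = img[:, 2:, :] - img[:, :-2, :]
    gy[1:-1, :, :] = img[2:, :, :] - img[:-2, :, :]
    mag = np.hypot(gx, gy)
    best = mag.argmax(axis=2)                  # dominant channel for each pixel
    rows, cols = np.indices(best.shape)
    gx_d = gx[rows, cols, best]
    gy_d = gy[rows, cols, best]
    magnitude = mag[rows, cols, best]
    # Fold signed angles into the unsigned range used by the descriptor.
    orientation = np.degrees(np.arctan2(gy_d, gx_d)) % 180.0
    return magnitude, orientation
```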
Descriptor computation -
Divide the window (64x128) into dense grid of
points with horizontal and vertical spacing equal
to 8 pixels
Divide the 16x16 region (block) centered on each
point on the grid into cells of size 8x8 (i.e., 4
cells for each grid point)
For each pixel in the current block use trilinear
interpolation, weighted by gradient strength, to
vote into a 2x2x9 histogram
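The voting step for a single block might look like the sketch below. It is one plausible reading of the slide, with several assumptions: the function name `block_histogram` is hypothetical, orientation bins are 20 degrees wide with centers at 10, 30, ..., 170 and wrap around, cell centers sit at pixel offsets 3.5 and 11.5, and the share of a vote that would fall in a cell outside the block is dropped.

```python
import numpy as np

def block_histogram(magnitude, orientation, cell=8, n_bins=9):
    """Trilinear voting of one 16x16 block into a 2x2x9 histogram.

    magnitude, orientation: 16x16 arrays, orientation in degrees in [0, 180).
    """
    hist = np.zeros((2, 2, n_bins))
    bin_width = 180.0 / n_bins
    for y in range(magnitude.shape[0]):
        for x in range(magnitude.shape[1]):
            # Position measured in cells (and bins) from the first center.
            cy = (y - (cell - 1) / 2.0) / cell
            cx = (x - (cell - 1) / 2.0) / cell
            cb = orientation[y, x] / bin_width - 0.5
            y0, x0, b0 = int(np.floor(cy)), int(np.floor(cx)), int(np.floor(cb))
            for dy in (0, 1):
                wy = 1.0 - abs(cy - (y0 + dy))
                for dx in (0, 1):
                    wx = 1.0 - abs(cx - (x0 + dx))
                    iy, ix = y0 + dy, x0 + dx
                    if not (0 <= iy < 2 and 0 <= ix < 2):
                        continue   # neighbouring cell lies outside the block
                    for db in (0, 1):
                        wb = 1.0 - abs(cb - (b0 + db))
                        # Orientation is circular, so bin indices wrap around.
                        hist[iy, ix, (b0 + db) % n_bins] += magnitude[y, x] * wy * wx * wb
    return hist
```

Under this reading, pixels near a block corner vote into one cell, pixels near an edge into two, and central pixels into all four.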

4
HOG encoding (Contd..)

Different voting schemes were used for each of
the colored regions
Block normalization for illumination invariance
is done on each block independently using the
norm of the 2x2x9 vector
The final feature vector is the collection of all
the 2x2x9 feature vectors from all the grid
points
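As a rough sketch of the normalization and of the resulting descriptor size: the slide does not say which norm is used, so the L2 norm with a small constant `eps` is an assumption here, as is counting only blocks that lie fully inside the window. The function names are hypothetical.

```python
import numpy as np

def normalize_block(hist, eps=1e-6):
    """Flatten a 2x2x9 block histogram and divide it by its L2 norm.

    eps avoids division by zero for blocks with no gradient (assumed value).
    """
    v = hist.ravel()
    return v / np.sqrt(np.sum(v ** 2) + eps ** 2)

def descriptor_length(win_w=64, win_h=128, block=16, stride=8, cells=4, bins=9):
    """Number of features when every block must fit inside the window."""
    blocks_x = (win_w - block) // stride + 1
    blocks_y = (win_h - block) // stride + 1
    return blocks_x * blocks_y * cells * bins
```

With these assumptions a 64x128 window holds 7 x 15 = 105 blocks of 36 values each, so `descriptor_length()` gives 3780 features.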

[Figure: a block of 16x16 pixels, showing its grid point and the cell centers]

5
Training

The training set has been obtained from
http://pascal.inrialpes.fr/data/human/INRIAPerson.tar
The training set consists of positive 64x128
windows (2416) containing humans and negative
images
Negative windows are sampled from the negative
images at random locations (12000)
Initial Phase learning - Learn the SVM
classifier on the original training set
Generate hard examples - Run the learned SVM on
the negative images at all scales and window
locations and save all the false positives
(approx. 6000)
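The hard-example step amounts to keeping every negative window that the first classifier scores as positive. A minimal sketch, assuming the HOG vectors of the scanned negative windows are already stacked in an array and the linear SVM is given by a weight vector and bias; the function name and the zero threshold are assumptions.

```python
import numpy as np

def collect_false_positives(weights, bias, negative_features, threshold=0.0):
    """Return the rows of negative_features that the linear classifier
    scores above threshold, i.e. the false positives (hard examples).

    negative_features: N x D array of HOG vectors from negative images.
    """
    scores = negative_features @ weights + bias
    return negative_features[scores > threshold]
```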

6
Training (Contd..)

Second phase learning - Retrain the linear SVM
with the newly generated hard negative examples
added to the training set (total positive windows
approx. 2400, negative windows approx. 17000)
Following this procedure, 375 of the 19400
windows were misclassified (using SVMLight)
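A small sketch of how the second-phase training set could be assembled and how the reported error translates into a rate. The helper names are hypothetical and the +1 / -1 label convention is an assumption; the training call itself is left to the SVM package.

```python
import numpy as np

def second_phase_set(positives, negatives, hard_negatives):
    """Stack the original windows with the hard negatives and label them.

    Each argument is an N x D array of HOG vectors.
    Labels: +1 for humans, -1 for everything else (assumed convention).
    """
    features = np.vstack([positives, negatives, hard_negatives])
    labels = np.concatenate([
        np.ones(len(positives)),
        -np.ones(len(negatives) + len(hard_negatives)),
    ])
    return features, labels

def error_rate(n_misclassified, n_total):
    """Fraction of windows that were misclassified."""
    return n_misclassified / n_total
```

With the figures on the slide, `error_rate(375, 19400)` is about 0.019, i.e. roughly 1.9% of the windows.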

7
Testing

Given an image, the HOG feature vector is computed
across all scales and window locations and the
locations and scales of all positive windows are
saved (window size 64x128)
This procedure gives multiple detections (at many
scales and locations)
To fuse overlapping detections the mean shift
mode detection algorithm is used
Represent each detection as a point in a 3D space
(x, y, log(s)) and iteratively compute the mean
shift vector at each point
The resulting modes give the final detections and
the bounding boxes are drawn using the scale of
each mode
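The fusion step can be sketched with a plain Gaussian-kernel mean shift in (x, y, log(s)) space. This is a simplified illustration rather than the presenter's implementation: the function name, the fixed per-dimension bandwidths, the equal weighting of all detections, and the rule for merging converged points are all assumptions, and at least one detection is expected.

```python
import numpy as np

def mean_shift_modes(detections, bandwidth=(8.0, 16.0, 0.3), n_iter=100, tol=1e-3):
    """detections: N x 3 array of (x, y, s) with N >= 1.

    Returns an M x 3 array of fused detections (x, y, s), one per mode.
    bandwidth holds assumed kernel widths for x, y and log(s).
    """
    det = np.asarray(detections, dtype=float)
    pts = np.column_stack([det[:, 0], det[:, 1], np.log(det[:, 2])])
    h = np.asarray(bandwidth, dtype=float)
    modes = pts.copy()
    for _ in range(n_iter):
        # Gaussian weight between every current estimate and every detection.
        diff = (modes[:, None, :] - pts[None, :, :]) / h
        w = np.exp(-0.5 * np.sum(diff ** 2, axis=2))
        shifted = (w @ pts) / w.sum(axis=1, keepdims=True)
        step = np.abs(shifted - modes).max()   # size of the mean shift vector
        modes = shifted
        if step < tol:
            break
    # Estimates that ended up within one bandwidth of each other share a mode.
    merged = []
    for m in modes:
        if not any(np.all(np.abs(m - k) / h < 1.0) for k in merged):
            merged.append(m)
    merged = np.array(merged)
    merged[:, 2] = np.exp(merged[:, 2])        # back from log(s) to scale
    return merged
```

Each returned row gives the position and scale of one final bounding box.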

8
Results - Detection
An example image
Detections when threshold is zero
9
Results - Detection (Contd..)
Previous image
Detections when threshold is equal to one
10
Results - Detection
Detections when threshold is zero
An example image
11
Results - Detection (Contd..)
Result of Mean Shift mode detection
12
Comparison
Detection Video
