1
Face Recognition: A Literature Survey
  • By
  • W. Zhao, R. Chellappa, P.J. Phillips,
  • and A. Rosenfeld
  • Presented By: Shane Brennan
  • 5/02/2005

2
Early Methods of Recognition
  • Early methods treated face recognition as a 2D
    pattern-recognition problem.
  • Methods included distance-measuring algorithms.
    These measured the distances between important
    facial features and compared them to the
    corresponding distances on known faces (see the
    sketch below).
  • These methods were fairly inaccurate and performed
    poorly under variations in orientation and size,
    though they handled variations in intensity well.
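As a rough illustration of this early approach, here is a minimal Python sketch; the feature coordinates, the gallery, and the choice of Euclidean comparison are invented for the example, not taken from the survey.

```python
import numpy as np

def feature_distances(points):
    """Pairwise Euclidean distances between facial feature coordinates."""
    pts = np.asarray(points, dtype=float)
    diffs = pts[:, None, :] - pts[None, :, :]
    d = np.sqrt((diffs ** 2).sum(axis=-1))
    return d[np.triu_indices(len(pts), k=1)]   # upper triangle as a vector

# Invented (x, y) positions: left eye, right eye, nose tip, mouth center.
probe = feature_distances([(30, 40), (70, 40), (50, 60), (50, 80)])
gallery = {
    "alice": feature_distances([(31, 41), (69, 39), (50, 61), (50, 79)]),
    "bob":   feature_distances([(25, 45), (75, 45), (50, 70), (50, 95)]),
}

# The known face with the most similar distance vector is the match.
print(min(gallery, key=lambda name: np.linalg.norm(gallery[name] - probe)))
```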

3
More Modern Approaches
  • Among appearance-based methods, eigenfaces and
    fisherfaces have proved effective in experiments
    involving large databases.
  • Feature-based graph matching approaches have been
    successful as well, and are less sensitive to
    variations in illumination and viewpoint, as well
    as inaccuracy in face localization.
  • Feature extraction techniques in graph matching
    approaches are currently inadequate. For example,
    they cannot detect an eye if the eyelid is
    closed.

4
Lessons That Have Been Learned
  • The upper half of the face aids more in
    recognition than the bottom half. Bottom lighting
    may actually make it more difficult to recognize
    a face.
  • The nose is not as significant as the eyes, ears,
    and mouth in recognition, although in profile
    views a distinctive nose can help greatly.
  • Low-frequency components play a dominant role,
    enough to identify the gender of the face, but
    higher-frequency bands are necessary for
    recognition.

5
Three Aspects of Recognition
  • Face detection: locating the faces in an image or
    video sequence.
  • Feature extraction: finding the location of eyes,
    nose, mouth, etc.
  • Face recognition: identifying the face(s) in the
    input image or video.
  • Face detection and feature extraction may be
    performed simultaneously.

6
Face Detection
  • Considered successful if the presence and rough
    location of a face is correctly identified.
  • Two statistics are important: true positives
    (correct detections) and false positives
    (incorrect detections), as illustrated below.
  • Multi-view based methods do much better than
    invariant feature methods when head rotation is
    large.
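A minimal illustration of how the two statistics are tallied; all counts below are made up for the example.

```python
faces_present = 10         # ground-truth faces in the test set (invented)
correct_detections = 8     # detections that match a real face (true positives)
incorrect_detections = 3   # detections that match nothing (false positives)

true_positive_rate = correct_detections / faces_present
print(true_positive_rate, incorrect_detections)   # -> 0.8 3
```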

7
Face Detection, continued
  • By treating detection as a two-class (face vs.
    non-face) problem, false positives can be reduced
    while maintaining a high true-positive rate. This
    is done by retraining the system on examples it
    previously mislabeled as faces, a technique known
    as bootstrapping (sketched below).
  • Appearance-based methods have achieved the best
    results in face detection, compared to
    feature-based and template-matching methods.
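A schematic of the bootstrapping loop described above; `train`, `classifier.detect`, and the data sets are hypothetical placeholders standing in for whatever learner and window scanner a real system would use.

```python
def bootstrap(train, positives, negatives, non_face_images, rounds=3):
    """Grow the negative set with the detector's own false positives."""
    classifier = train(positives, negatives)   # hypothetical trainer
    for _ in range(rounds):
        # Windows the detector wrongly labels "face" on images known to
        # contain no faces are, by construction, false positives.
        hard_negatives = [w for img in non_face_images
                          for w in classifier.detect(img)]
        if not hard_negatives:
            break
        negatives = negatives + hard_negatives
        classifier = train(positives, negatives)   # retrain on the harder set
    return classifier
```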

8
Feature Extraction
  • Feature extraction is the most important part of
    face recognition. Even holistic methods need
    accurate feature locations for normalization.
  • Some methods use feature restoration, which fills
    in occluded parts of the face using symmetry.
  • Three basic approaches: edge detection methods,
    feature-template methods, and structural matching
    methods that take into consideration geometric
    constraints on features.

9
Feature Extraction, continued
  • Early methods used template approaches that
    focused on individual features.
  • These methods fail when those important features
    are occluded or obscured.
  • More recent methods use structural matching
    methods like Active Shape Modeling (ASM). These
    methods are more robust in terms of handling
    variations in image intensity and feature shape.

10
ASM
  • Create a model of the features that you wish to
    find. This model is defined by a series of model
    points as well as the connections between them.
  • Overlay the model onto the image. Examine the
    region around each model point to find the best
    match in the image that fits that point. Move the
    model point to that image point and update the
    model.
  • The matching is usually done using image edges.
  • Repeat this process for several iterations until
    convergence (model points no longer move far); the
    loop is sketched below.
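The loop might look like the following sketch, where `best_edge_match` (strongest nearby image match for a point) and `fit_shape_model` (constraining points to the learned shape model) are hypothetical helpers passed in by the caller.

```python
import numpy as np

def asm_search(image, model_points, best_edge_match, fit_shape_model,
               max_iters=20, tol=0.5):
    """Iterate ASM: move points to nearby image matches, then re-fit the model.

    best_edge_match(image, p) -> best matching image point near p (hypothetical)
    fit_shape_model(points)   -> points constrained to the shape model (hypothetical)
    """
    points = np.asarray(model_points, dtype=float)   # initial overlay
    for _ in range(max_iters):
        targets = np.array([best_edge_match(image, p) for p in points])
        new_points = fit_shape_model(targets)        # keep a plausible shape
        if np.abs(new_points - points).max() < tol:  # converged: little movement
            break
        points = new_points
    return points
```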

11
Examples of ASM Implementations: a successful
match (top) and a semi-successful match
(bottom).
12
ASM, continued
  • Suppose you take k samples on either side of a
    model point; this provides a vector of 2k+1
    sample points. Call this vector g_i.
  • Normalize the sample by dividing by the sum of
    absolute element values:
    g_i ← g_i / ( Σ_j |g_ij| )
  • Repeat this for each training image to obtain a
    set of normalized samples {g_i} for each model
    point.
  • Assume the normalized samples are distributed as
    a multivariate Gaussian, and find their mean
    g_mean and covariance S_g (see the sketch below).
  • Repeat this process for each model point.
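A minimal sketch of building this profile model for a single model point, with random numbers standing in for the grey-level samples:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5
profiles = rng.normal(size=(40, 2 * k + 1))   # 2k+1 samples from 40 training images

# Normalize each sample by the sum of absolute element values:
# g_i <- g_i / sum_j |g_ij|
profiles /= np.abs(profiles).sum(axis=1, keepdims=True)

# Model the normalized samples as a multivariate Gaussian.
g_mean = profiles.mean(axis=0)
S_g = np.cov(profiles, rowvar=False)          # (2k+1) x (2k+1) covariance
```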

13
ASM, continued
  • The quality of fit (measure of accuracy) of a new
    sample g_s to the model is given by
    f(g_s) = (g_s − g_mean)^T S_g^{-1} (g_s − g_mean)
  • This is the Mahalanobis distance. Minimizing
    f(g_s) is equivalent to maximizing the
    probability that g_s comes from the distribution
    (see the sketch below).
  • This iterative process can be sped up by the use
    of multi-resolution (coarse to fine) feature
    matching.
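A small sketch of the fit measure, assuming `g_mean` and `S_g` were estimated as on the previous slide; lower values indicate a better fit.

```python
import numpy as np

def fit_quality(g_s, g_mean, S_g):
    """f(g_s) = (g_s - g_mean)^T S_g^{-1} (g_s - g_mean), the Mahalanobis distance."""
    diff = np.asarray(g_s, dtype=float) - np.asarray(g_mean, dtype=float)
    # Solve S_g v = diff rather than forming the inverse explicitly.
    return float(diff @ np.linalg.solve(S_g, diff))
```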

14
Facial Recognition
  • One successful facial recognition system has been
    the use of eigenfaces.
  • This involves projecting an input image into a
    lower-dimensional "face space" and then computing
    the distance between the projected input image
    and known faces (sketched below).
  • More detail on eigenfaces will be provided in my
    next presentation.
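A schematic of the projection-and-compare step; the mean face, the orthonormal basis `U`, and the gallery here are random stand-ins rather than trained eigenfaces.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 10304, 50                                # e.g. 112x92-pixel images, 50 eigenfaces
mean_face = rng.normal(size=d)                  # stand-in for the training mean
U = np.linalg.qr(rng.normal(size=(d, m)))[0]    # stand-in orthonormal basis

def project(img_vec):
    """Coordinates of an image in the lower-dimensional face space."""
    return U.T @ (img_vec - mean_face)

known = {"alice": project(rng.normal(size=d))}  # pre-projected known faces
probe = project(rng.normal(size=d))
print(min(known, key=lambda n: np.linalg.norm(known[n] - probe)))
```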

15
Linear Discriminant Analysis
  • Face systems using Linear Discriminant Analysis
    (LDA) have also been successful.
  • Training of LDA systems is carried out via
    scatter matrix analysis.
  • For an M-class problem, the within- and
    between-class scatter matrices S_w and S_b are
    computed as follows:
    S_w = Σ_{i=1..M} Pr(w_i) C_i
    S_b = Σ_{i=1..M} Pr(w_i) (m_i − m_0)(m_i − m_0)^T
  • Here Pr(w_i) is the prior class probability, and
    is typically assigned the value 1/M.

16
Linear Discriminant Analysis, cont.
  • C_i is the average scatter matrix (conditional
    covariance matrix) and is defined as
    C_i = E[ (x(w) − m_i)(x(w) − m_i)^T | w = w_i ]
  • S_w shows the average scatter (C_i) of the sample
    vectors x of different classes w_i around their
    respective means (m_i).
  • S_b shows the scatter of the conditional mean
    vectors (m_i) around the overall mean vector
    (m_0).
  • A measure for quantifying discriminatory power is
    G(T) = |T^T S_b T| / |T^T S_w T|
  • The projection matrix W that optimizes this
    function can be found by solving the generalized
    eigenvalue problem S_b W = S_w W Λ
    (see the sketch below).
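A sketch of these computations on synthetic data; the equal priors, the toy classes, and the reduction of the generalized problem to an ordinary eigenproblem on S_w^{-1} S_b are illustrative choices, not the survey's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(2)
classes = [rng.normal(loc=i, size=(20, 5)) for i in range(3)]  # M = 3 toy classes
M = len(classes)
Pr = 1.0 / M                                  # equal prior for each class

means = [c.mean(axis=0) for c in classes]
m0 = sum(Pr * m for m in means)               # overall mean vector

# S_w = sum_i Pr(w_i) C_i, with C_i the class-conditional covariance
S_w = sum(Pr * np.cov(c, rowvar=False) for c in classes)
# S_b = sum_i Pr(w_i) (m_i - m0)(m_i - m0)^T
S_b = sum(Pr * np.outer(m - m0, m - m0) for m in means)

# S_b W = S_w W Lambda, solved as a standard eigenproblem on S_w^{-1} S_b.
eigvals, W = np.linalg.eig(np.linalg.solve(S_w, S_b))
order = np.argsort(eigvals.real)[::-1]        # most discriminative directions first
W = W[:, order].real
```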

17
Linear Discriminant Analysis, cont.
  • The basic idea of this algorithm is that
    classification is performed by projecting the
    input x into a subspace via a projection/basis
    matrix Proj (W from the previous slide):
    Z = Proj · x
  • By comparing the projection coefficient vector Z
    of the input to all pre-stored projection vectors
    of known, labeled classes, you can identify and
    label the input vector (see the sketch below).
  • The vector comparison varies between systems.
    PCA algorithms tend to use either the angle or
    the Euclidean distance.
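A minimal sketch of this classification step; `Proj` and the stored vectors are assumed given, and the angle measure is one of the two comparison options mentioned above.

```python
import numpy as np

def classify(x, Proj, stored):
    """stored maps each class label to its pre-stored projection vector."""
    Z = Proj @ x                                  # Z = Proj . x

    def angle(a, b):                              # angle-based comparison
        cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        return np.arccos(np.clip(cos, -1.0, 1.0))

    # Euclidean alternative: np.linalg.norm(Z - stored[lbl])
    return min(stored, key=lambda lbl: angle(Z, stored[lbl]))
```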

18
PDBNN
  • A fully automatic face detection and recognition
    system based on Probabilistic Decision-Based
    Neural Networks (PDBNN) has been proposed.
  • It consists of three modules: a face detector, an
    eye localizer, and a face recognizer.
  • The PDBNN does not use the lower face. This
    excludes the influence of facial expressions
    (smiling, frowning, etc).

19
PDBNN, continued
  • Breaks the input into two features at a
    resolution of 14x10 pixels.
  • The features are normalized intensity and edges.
  • Each feature is fed into a separate PDBNN, and
    the final recognition result combines the outputs
    of the two networks (one simple fusion scheme is
    sketched below).
  • Advantages of this implementation are that it
    converges quickly and is easily implemented on
    distributed computing platforms.
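One simple way such a combination could look; the additive fusion and the score values are invented for illustration, and the actual PDBNN fusion rule may differ.

```python
import numpy as np

def recognize(intensity_scores, edge_scores, labels):
    """Sum the per-person scores from the two channels, pick the best."""
    combined = np.asarray(intensity_scores) + np.asarray(edge_scores)
    return labels[int(np.argmax(combined))]

print(recognize([0.2, 0.9], [0.4, 0.7], ["alice", "bob"]))  # -> bob
```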

20
A key feature is that each individual to be
recognized has a subnet in the PDBNN devoted to
them.
21
EBGM
  • The most successful feature-based structural
    matching approach has been the use of Elastic
    Bunch Graph Matching (EBGM) systems.
  • Local features are represented by wavelet
    coefficients for different rotations and scales
    (see the jet sketch below).
  • Wavelet bases are referred to as jets.
  • Based on Dynamic Link Architecture (DLA).
  • DLAs use synaptic plasticity to form sets of
    neurons grouped into structured graphs in a
    neural network.
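A rough sketch of computing a jet at one image point, using hand-built real-valued Gabor-like filters; actual EBGM jets use complex Gabor responses, and the filter size, scales, and orientation count here are arbitrary choices.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real-valued Gabor-like filter: cosine carrier under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)   # rotate the carrier
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * x_rot / wavelength)

def jet_at(image, cy, cx, size=15, scales=(4.0, 8.0), n_orient=4):
    """Filter responses at (cy, cx) for each scale/orientation pair."""
    half = size // 2
    patch = image[cy - half:cy + half + 1, cx - half:cx + half + 1]
    return np.array([(patch * gabor_kernel(size, s, o * np.pi / n_orient, s)).sum()
                     for s in scales for o in range(n_orient)])

img = np.random.default_rng(3).random((64, 64))
print(jet_at(img, 32, 32))   # 8 coefficients: 2 scales x 4 orientations
```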

22
EBGM, continued
  • The basic mechanisms are T_ij, the connection
    between two neurons (i and j), and J_ij, a
    dynamic variable.
  • These J-variables are the synaptic weights for
    signal transmission among neurons.
  • The T-parameters act as constraints on the
    J-variables. Small changes in T over time from
    synaptic plasticity cause the J-variables to
    change as well.
  • A new image is recognized by transforming the
    image into a grid of jets and comparing this grid
    to those of known images.

23
EBGM, continued
  • This basic DLA architecture is extended to EBGM
    by attaching a set of jets to each grid node,
    instead of just one jet.
  • Each jet in the set is derived from a different
    stored (known) face image.
  • This EBGM method has been applied to face
    detection and extraction, pose estimation, gender
    classification, sketch-image-based recognition,
    and general object recognition.

24
On the left is an image graph. The graph is
positioned over the input image. At each node,
the local jet around the corresponding image
point is computed and stored. This pattern of
jets is used to represent the pattern classes. A
new image is recognized by transforming it into a
grid of jets and comparing it to known
models. EBGM (represented by the image on the
right) works the same way, but at each node is a
set of jets, each derived from a different face
image. Pose variation is handled by determining
the pose of the face using prior class
information. The jet transformations under
variations in pose are then learned.
25
Results and Conclusions
  • The Subspace LDA system, EBGM system, and
    probabilistic eigenface system are judged to be
    the top three methods of face recognition based
    on the accuracy of the results.
  • Each method has different levels of performance
    on different subsets of images.
  • When the number of training samples per class is
    large, LDA performs best. When only one or two
    samples are available per face class, PCA
    (eigenface) is a better choice.

26
Other Interesting Results
  • It has been demonstrated that the image size can
    be very small and recognition methods will still
    perform well:
    For the LDA system: 12x11 pixels
    For the PDBNN: 12x11 pixels
    For human perception: 24x18 pixels
  • It is interesting to note that the algorithms can
    recognize faces at a lower resolution than the
    human brain can.