ERROR ENTROPY, CORRENTROPY AND M-ESTIMATION

Transcript and Presenter's Notes
1
ERROR ENTROPY, CORRENTROPY AND M-ESTIMATION
  • Weifeng Liu, P. P. Pokharel, J. C. Principe
  • CNEL, University of Florida
  • weifeng@cnel.ufl.edu
  • Acknowledgment: This work was partially supported
    by NSF grants ECS-0300340 and ECS-0601271.

2
Outline
  • Maximization of correntropy criterion (MCC)
  • Minimization of error entropy (MEE)
  • Relation between MEE and MCC
  • Minimization of error entropy with fiducial
    points
  • Experiments

3
Supervised learning
  • Desired signal D
  • System output Y
  • Error signal E = D - Y

4
Supervised learning
  • The goal in supervised training is to bring the
    system output close to the desired signal.
  • The concept of "close" implicitly or explicitly
    employs a distance function or similarity
    measure.
  • Equivalently, the goal is to minimize the error in
    some sense.
  • For instance, MSE:
    J_{MSE} = E[(D - Y)^2], estimated by (1/N) \sum_{i=1}^{N} e_i^2

5
Maximization of Correntropy Criterion
  • Correntropy of the desired signal and the system
    output, V(D,Y), is estimated by
    \hat{V}(D,Y) = (1/N) \sum_{i=1}^{N} \kappa_\sigma(d_i - y_i)
  • where \kappa_\sigma is the Gaussian kernel
    \kappa_\sigma(x) = (1 / (\sqrt{2\pi} \sigma)) \exp(-x^2 / (2\sigma^2))
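
A minimal NumPy sketch of this estimator (the function names and the default kernel size are illustrative, not from the slides):

    import numpy as np

    def gaussian_kernel(x, sigma=1.0):
        # kappa_sigma(x) = exp(-x^2 / (2 sigma^2)) / (sqrt(2 pi) sigma)
        return np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

    def correntropy(d, y, sigma=1.0):
        # sample estimator: V(D,Y) ~ (1/N) sum_i kappa_sigma(d_i - y_i)
        return np.mean(gaussian_kernel(d - y, sigma))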

6
Correntropy induced metric
  • Define CIM(X,Y) = (\kappa_\sigma(0) - \hat{V}(X,Y))^{1/2}
  • CIM satisfies the following metric properties:
  • Non-negativity
  • Identity of indiscernibles
  • Symmetry
  • Triangle inequality
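
Building on the correntropy sketch above, the induced metric can be written as a small helper (a sketch consistent with the definition on this slide):

    def cim(x, y, sigma=1.0):
        # CIM(X,Y) = sqrt(kappa(0) - V(X,Y)); kappa(0) is the kernel maximum,
        # so the radicand is non-negative, and zero iff x == y elementwise
        return np.sqrt(gaussian_kernel(0.0, sigma) - correntropy(x, y, sigma))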

7
CIM contours
  • Contours of CIM(E,0) in 2D sample space:
  • errors close to the origin: behaves like the L2 norm
  • intermediate errors: behaves like the L1 norm
  • errors far apart: saturates for large-value elements
    (L0-like behavior)
  • (direction sensitive)

8
MCC is minimization of CIM
  • MCC \iff \max \hat{V}(D,Y)
    \iff \min [\kappa_\sigma(0) - \hat{V}(D,Y)]
    \iff \min CIM^2(D,Y)
9
MCC is M-estimation
MCC \iff \min \sum_{i=1}^{N} \rho(e_i)
where \rho(e) = \kappa_\sigma(0) - \kappa_\sigma(e)
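
To illustrate why this M-estimation view implies robustness, here is a sketch of the loss \rho and its influence function \psi = \rho' under the Gaussian kernel defined earlier; \psi redescends to zero, so large errors contribute vanishing gradient (unlike MSE, where \psi(e) = e):

    def rho(e, sigma=1.0):
        # M-estimation loss: rho(e) = kappa(0) - kappa_sigma(e)
        return gaussian_kernel(0.0, sigma) - gaussian_kernel(e, sigma)

    def psi(e, sigma=1.0):
        # influence function: psi(e) = rho'(e) = (e / sigma^2) * kappa_sigma(e)
        # psi -> 0 as |e| -> infinity, so outliers are effectively ignored
        return (e / sigma**2) * gaussian_kernel(e, sigma)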
10
Minimization of Error Entropy
  • Renyi's quadratic error entropy is estimated by
    \hat{H}_2(E) = -\log \hat{V}(E)
  • where \hat{V}(E) is the Information Potential (IP):
    \hat{V}(E) = (1/N^2) \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_\sigma(e_i - e_j)
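
A direct O(N^2) sketch of the IP and entropy estimators, forming the pairwise differences by broadcasting (names are illustrative):

    def information_potential(e, sigma=1.0):
        # IP: V(E) = (1/N^2) sum_i sum_j kappa_sigma(e_i - e_j)
        diffs = e[:, None] - e[None, :]   # N x N matrix of e_i - e_j
        return np.mean(gaussian_kernel(diffs, sigma))

    def renyi_quadratic_entropy(e, sigma=1.0):
        # H2(E) = -log V(E); minimizing H2 is the same as maximizing the IP
        return -np.log(information_potential(e, sigma))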

11
Relation between MEE and MCC
  • Define two auxiliary signals D' and Y' whose N^2
    paired samples are (e_i, e_j), i, j = 1, ..., N
  • Construct their correntropy estimate:
    \hat{V}(D', Y') = (1/N^2) \sum_i \sum_j \kappa_\sigma(e_i - e_j) = \hat{V}(E)
    so MEE on E is MCC on the constructed pair

12
Relation between MEE and MCC
13
IP induced metric
  • Define IPM(X,Y) = (\kappa_\sigma(0) - \hat{V}(X - Y))^{1/2}
  • IPM is a pseudo-metric:
  • NO identity of indiscernibles. IPM(E,0) = 0 for any
    constant error vector, not only for E = 0.
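
A quick numerical check of the pseudo-metric behavior, using the IP sketch above: shifting every error by a constant leaves IPM(E,0) unchanged, so a nonzero constant error vector sits at distance zero from the origin.

    def ipm(e, sigma=1.0):
        # IPM(E,0) = sqrt(kappa(0) - V(E)); depends only on differences e_i - e_j
        return np.sqrt(gaussian_kernel(0.0, sigma) - information_potential(e, sigma))

    e = np.array([0.5, -0.2, 0.1])
    print(ipm(e), ipm(e + 10.0))   # identical: IPM cannot "see" the error mean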

14
IPM contours
  • Contours of IPM(E,0) in 2D sample space:
  • a valley along e_1 = e_2: the metric is not
    sensitive to the error mean
  • saturates for points far from the valley

15
MEE and its equivalences
  • MEE \iff \min \hat{H}_2(E)
    \iff \max \hat{V}(E)
    \iff \min [\kappa_\sigma(0) - \hat{V}(E)]
    \iff \min IPM^2(E, 0)
    \iff \min IPM(E, 0)
16
MEE is M-estimation
Assume the error PDF is estimated by the Parzen window
\hat{p}(e) = (1/N) \sum_{j=1}^{N} \kappa_\sigma(e - e_j); then
\hat{V}(E) = (1/N) \sum_{i=1}^{N} \hat{p}(e_i), so MEE \iff
\min \sum_{i=1}^{N} \rho(e_i) with \rho(e) = \kappa_\sigma(0) - \hat{p}(e)
17
Nuisance of conventional MEE
  • How do we determine the location of the error PDF,
    given that the entropy cost is shift-invariant?
  • Conventionally, by making the error mean equal to
    zero.
  • When the error PDF is non-symmetric or has heavy
    tails, the estimation of the error mean is
    problematic.
  • Fixing the error peak at the origin is therefore
    preferable to the conventional method of shifting
    the error to have zero mean.

18
ERROR ENTROPY WITH FIDUCIAL POINTS
  • Supervised training aims to make most of the errors
    equal to zero
  • so we minimize the error entropy with respect to 0
  • Denote e_0 = 0 and augment the error sample with it
  • E is the error vector and e_0 serves as a point of
    reference (a fiducial point at the origin)

19
ERROR ENTROPY WITH FIDUCIAL POINTS
  • In general, we have the cost
    J(E) = \lambda \hat{V}(E, 0) + (1 - \lambda) \hat{V}(E)
         = \lambda (1/N) \sum_i \kappa_\sigma(e_i)
           + (1 - \lambda) (1/N^2) \sum_i \sum_j \kappa_\sigma(e_i - e_j)

20
ERROR ENTROPY WITH FIDUCIAL POINTS
  • λ is a weighting constant between 0 and 1
  • it reflects how many fiducial points are placed at
    the origin
  • λ = 0 ⇒ MEE
  • λ = 1 ⇒ MCC
  • 0 < λ < 1 ⇒ Minimization of Error Entropy with
    Fiducial points (MEEF).
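
A sketch of the combined cost just described, reusing the kernel and IP helpers from the earlier sketches (lam stands for the λ on this slide; training maximizes J):

    def meef_cost(e, lam=0.5, sigma=1.0):
        # J(E) = lam * V(E, 0) + (1 - lam) * V(E)
        # lam = 0 recovers the IP (MEE); lam = 1 recovers correntropy of E
        # with the zero vector (MCC)
        mcc_term = np.mean(gaussian_kernel(e, sigma))   # V(E, 0)
        mee_term = information_potential(e, sigma)      # V(E)
        return lam * mcc_term + (1 - lam) * mee_term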

21
ERROR ENTROPY WITH FIDUCIAL POINTS
  • The MCC term locates the main peak of the error PDF
    and fixes it at the origin, even in cases where
    the estimation of the error mean is not robust
  • Unifying the two cost functions retains the merits
    of both: outlier resistance and resilience to the
    choice of kernel size.

22
Metric induced by MEEF
  • MEEF induces a well-defined metric (identity of
    indiscernibles is restored)
  • the metric is direction sensitive:
  • it favors errors with the same sign
  • and penalizes errors with different signs

23
Experiment 1 Robust regression
  • X: input variable
  • f: unknown function
  • N: noise
  • Y: observation, Y = f(X) + N
  • Noise PDF (plot not preserved in the transcript)
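
A compact, self-contained sketch of this kind of experiment for a linear-in-parameters model trained by gradient ascent on the MCC term; the model, noise mixture, step size, and kernel size below are stand-ins, not the settings used in the slides:

    rng = np.random.default_rng(0)
    N = 200
    x = rng.uniform(-1.0, 1.0, N)
    w_true = 2.0
    # impulsive noise: mostly small Gaussian, occasionally large outliers
    noise = np.where(rng.random(N) < 0.9,
                     0.1 * rng.standard_normal(N),
                     5.0 + rng.standard_normal(N))
    y = w_true * x + noise

    w, sigma, lr = 0.0, 1.0, 1.0
    for _ in range(1000):
        e = y - w * x
        # gradient of (1/N) sum_i kappa_sigma(e_i) with respect to w
        grad = np.mean(gaussian_kernel(e, sigma) * (e / sigma**2) * x)
        w += lr * grad   # ascent: maximize the correntropy of the error
    print(w)   # stays near w_true despite the outliers; MSE would be biased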

24
Regression results
25
Experiment 2 Chaotic signal prediction
  • Mackey-Glass chaotic time series with parameter
    τ = 30
  • time-delay neural network (TDNN)
  • 7 inputs,
  • 14 hidden PEs
  • tanh nonlinearity
  • 1 linear output

26
Training error PDF
27
Conclusions
  • Established connections between MEE, distance
    functions and M-estimation
  • Theoretically explained the robustness of this
    family of cost functions
  • Unified MEE and MCC in the framework of information
    theoretic models
  • Proposed a new cost function, minimization of error
    entropy with fiducial points (MEEF), which solves
    the problem of MEE being shift-invariant in an
    elegant and robust way.