1
ECE 471/571 - Lecture 5
  • Dimensionality Reduction
  • 01/29/09

2
The Curse of Dimensionality 1st Aspect
  • The number of training samples
  • What would the probability density function look
    like if the dimensionality is very high?
  • In a 7-dimensional space where each variable can
    take 20 possible values, the 7-d histogram
    contains 20^7 (about 1.28 billion) cells. Distributing
    a training set of any reasonable size (say, 1000
    samples) among this many cells leaves virtually
    all of the cells empty, as illustrated below
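
A quick back-of-the-envelope check of this claim (a sketch added here, not part of the original slides; the bin count and sample size are the illustrative numbers above):

    # Sketch: how sparsely 1000 samples can populate a d-dimensional
    # histogram with 20 bins per axis.
    bins_per_axis = 20
    n_samples = 1000
    for d in range(1, 8):
        cells = bins_per_axis ** d
        # Each sample occupies at most one cell, so this is an upper
        # bound on the fraction of non-empty cells.
        occupied = min(1.0, n_samples / cells)
        print(f"d={d}: {cells:,} cells, at most {occupied:.4%} occupied")

For d = 7, at most about one cell in a million is non-empty.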

3
Curse of Dimensionality 2nd Aspect
  • Accuracy and overfitting
  • In theory, the higher the dimensionality, the
    lower the error and the better the performance.
    However, in realistic pattern recognition problems,
    the opposite is often true. Why?
  • The assumption that the pdf behaves like a
    Gaussian is only approximately true
  • When increasing the dimensionality, we may be
    overfitting the training set
  • Problem: excellent performance on the training
    set, but poor performance on new data points that
    are in fact very close to the data in the
    training set

4
Curse of Dimensionality - 3rd Aspect
  • Computational complexity

5
Dimensionality Reduction
  • Fisher's linear discriminant
    • Best at discriminating the data
  • Principal component analysis (PCA)
    • Best at representing the data

6
Fisher's Linear Discriminant
  • For the two-class case, projection of the data
    from d dimensions onto a line
  • Principle: we'd like to find a vector w (the
    direction of the line) such that the projected
    data set can be best separated

Projected mean: $\tilde{m}_i = w^T m_i$, where the sample mean of class i is $m_i = \frac{1}{n_i}\sum_{x \in D_i} x$
7
Other Approaches?
  • Solution 1: make the projected means as far apart
    as possible
  • Solution 2: also account for the scatter of the
    projected samples within each class (Fisher's
    criterion, developed on the following slides)

Scatter matrix of class i: $S_i = \sum_{x \in D_i} (x - m_i)(x - m_i)^T$
Between-class scatter matrix: $S_B = (m_1 - m_2)(m_1 - m_2)^T$
Within-class scatter matrix: $S_W = S_1 + S_2$
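
As a concrete illustration (a minimal sketch in NumPy, not from the slides; X1 and X2 are assumed to be the two classes' sample matrices, one row per sample):

    import numpy as np

    def fisher_direction(X1, X2):
        # Sample means and per-class scatter matrices
        m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
        S1 = (X1 - m1).T @ (X1 - m1)
        S2 = (X2 - m2).T @ (X2 - m2)
        Sw = S1 + S2                      # within-class scatter
        # Fisher direction: w is proportional to Sw^{-1} (m1 - m2)
        w = np.linalg.solve(Sw, m1 - m2)
        return w / np.linalg.norm(w)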
8
The Generalized Rayleigh Quotient
Canonical variate
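
The criterion on this slide is the standard generalized Rayleigh quotient (reconstructed here because the slide's equation images did not survive extraction):

    J(w) = \frac{w^T S_B w}{w^T S_W w},
    \qquad
    w^* = \arg\max_w J(w) \;\propto\; S_W^{-1}(m_1 - m_2)

The projection $y = w^T x$ along the optimal direction is the canonical variate referred to above.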
9
Some Math Preliminaries
  • Positive definite
    • A matrix S is positive definite if $x^T S x > 0$
      for all $x \in \mathbb{R}^d$ except $x = 0$
    • $x^T S x$ is called a quadratic form
    • The derivative of a quadratic form is
      particularly useful (see the identity after this
      list)
  • Eigenvalue and eigenvector
    • x is called an eigenvector of A iff x is not
      zero and $Ax = \lambda x$
    • $\lambda$ is the eigenvalue associated with x
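
The identity referred to above (a standard matrix-calculus result, written out here because the slide's equation did not extract):

    \frac{\partial}{\partial x}\left(x^T S x\right) = (S + S^T)\,x = 2 S x \quad \text{when } S = S^T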

10
Multiple Discriminant Analysis
  • For a c-class problem, the projection is from the
    d-dimensional space to a (c-1)-dimensional space
    (assume d > c)
  • Sec. 3.8.3
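
For reference, the multi-class criterion of Sec. 3.8.3 generalizes the two-class quotient (reconstructed; not present in the extracted text): the projection matrix W maximizes

    J(W) = \frac{\left| W^T S_B W \right|}{\left| W^T S_W W \right|}

and its columns are the generalized eigenvectors satisfying $S_B w_i = \lambda_i S_W w_i$ for the $c-1$ largest eigenvalues.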

11
Principal Component Analysis or K-L Transform
  • How do we find a new, m-dimensional feature space
    that is adequate to describe the original
    d-dimensional feature space? Suppose m < d

[Figure: original axes x1, x2 and rotated principal axes y1, y2]
12
K-L Transform (1)
  • Describe the vector x in terms of a set of basis
    vectors $b_i$
  • The basis vectors $b_i$ should be linearly
    independent and orthonormal, that is, they satisfy
    the conditions below
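
In symbols (reconstructed from the standard K-L expansion, since the slide's equations did not extract):

    x = \sum_{i=1}^{d} y_i\, b_i,
    \qquad y_i = b_i^T x,
    \qquad b_i^T b_j = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}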

13
K-L Transform (2)
  • Suppose we wish to ignore all but m (m < d)
    components of y and still represent x, although
    with some error. We will thus calculate the first
    m elements of y and replace the others with
    constants

Error
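
The approximation and its error take the standard K-L form (reconstructed; the original equation image is not in the transcript):

    \hat{x} = \sum_{i=1}^{m} y_i\, b_i + \sum_{i=m+1}^{d} a_i\, b_i,
    \qquad
    \Delta x = x - \hat{x} = \sum_{i=m+1}^{d} (y_i - a_i)\, b_i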
14
K-L Transform (3)
  • Use mean-square error to quantify the error

15
K-L Transform (4)
  • Find the optimal $a_i$ to minimize $\bar{\varepsilon}^2$
  • Therefore, the error is now equal to the
    expression below
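
Setting the derivative with respect to each $a_i$ to zero gives $a_i = E[y_i]$, and the minimum error becomes (reconstructed from the standard derivation):

    \bar{\varepsilon}^2(m) = \sum_{i=m+1}^{d} b_i^T\, \Sigma_x\, b_i,
    \qquad
    \Sigma_x = E\!\left[ (x - E[x])(x - E[x])^T \right]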

16
K-L Transform (5)
  • The optimal choice of basis vectors is the
    eigenvectors of $\Sigma_x$
  • The expansion of a random vector in terms of the
    eigenvectors of the covariance matrix is referred
    to as the Karhunen-Loève expansion, or the K-L
    expansion
  • Without loss of generality, we sort the
    eigenvectors $b_i$ by their eigenvalues, that is,
    $\lambda_1 > \lambda_2 > \cdots > \lambda_d$. Then we refer to $b_1$,
    corresponding to $\lambda_1$, as the major eigenvector,
    or principal component

17
Summary
  • Raw data → covariance matrix → eigenvalues →
    eigenvectors → principal components
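
A compact sketch of that pipeline in NumPy (illustrative only, not part of the lecture; X is an n-by-d data matrix and m is the number of components kept):

    import numpy as np

    def pca(X, m):
        mu = X.mean(axis=0)
        cov = np.cov(X - mu, rowvar=False)       # raw data -> covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)   # -> eigenvalues / eigenvectors
        order = np.argsort(eigvals)[::-1]        # sort by decreasing eigenvalue
        B = eigvecs[:, order[:m]]                # principal components b_1..b_m
        return (X - mu) @ B                      # m-dimensional features y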