1
Adaptive Algorithms for PCA
  • PART II

2
  • Oja's rule is the basic learning rule for PCA and
    extracts the first principal component
  • A deflation procedure can be used to estimate the
    minor eigencomponents
  • Sanger's rule does an on-line deflation and uses
    Oja's rule to estimate the eigencomponents
  • Problems with Sanger's rule:
  • Strictly speaking, Sanger's rule is non-local,
    which makes it harder to implement in VLSI.
  • Non-local rules are termed biologically
    non-plausible. (As engineers, we don't care very
    much about this.)
  • Sanger's rule converges slowly. We will see later
    that many PCA algorithms converge slowly. (A
    minimal sketch of both rules follows below.)
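A minimal numpy sketch of the two rules discussed above, assuming
single-sample (on-line) updates; the function names and the learning
rate eta are illustrative, not taken from the slides.

import numpy as np

def oja_update(w, x, eta=0.01):
    # Oja's rule: dw = eta * y * (x - y * w), with y = w^T x.
    # Extracts the first principal component of the input.
    y = w @ x
    return w + eta * y * (x - y * w)

def sanger_update(W, x, eta=0.01):
    # Sanger's rule (generalized Hebbian algorithm): an on-line
    # deflation that applies Oja's rule to the residual left over
    # by the earlier (higher-variance) components.
    y = W @ x
    W_new = W.copy()
    for i in range(W.shape[0]):
        residual = x - y[: i + 1] @ W[: i + 1]  # deflate components 0..i
        W_new[i] += eta * y[i] * residual
    return W_new

Note that the residual in sanger_update uses the weight vectors of all
earlier units, which is exactly the non-locality mentioned above.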

3
Other Adaptive Structures for PCA
The first step is to change the architecture of the
network so that the update rules become local.
[Figure: network with input X(n), feedforward weights W,
and lateral weights C]
4
This is the Rubner-Tavan model. The output vector y is
computed by the following network.
[Figure: inputs x1, x2 feed outputs y1, y2 through
feedforward weights w1, w2, with a lateral connection c
between the outputs]
  • C is a lower triangular matrix, usually called the
    lateral weight matrix or the lateral inhibitor
    matrix.
  • Feedforward weights W are trained using Oja's
    rule.
  • Lateral weights are trained using the anti-Hebbian
    rule. (A sketch of one update step follows below.)
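A minimal sketch of one Rubner-Tavan style update, assuming the usual
formulation y_i = w_i^T x + sum_{j<i} c_ij y_j with a strictly
lower-triangular C; the learning rates and the function name are
illustrative assumptions, since the slide's own equations are not in
this transcript.

import numpy as np

def rubner_tavan_step(W, C, x, eta_w=0.01, eta_c=0.01):
    # Forward pass: outputs can be computed in order because C is
    # assumed strictly lower triangular.
    m = W.shape[0]
    y = np.zeros(m)
    for i in range(m):
        y[i] = W[i] @ x + C[i, :i] @ y[:i]
    # Feedforward weights: Oja's rule applied to each output unit.
    W = W + eta_w * (np.outer(y, x) - (y ** 2)[:, None] * W)
    # Lateral weights: plain anti-Hebbian rule, dc_ij = -eta * y_i * y_j,
    # kept strictly lower triangular.
    C = np.tril(C - eta_c * np.outer(y, y), k=-1)
    return W, C, y

Both updates are local: each weight change uses only the activities at
the two ends of that connection.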

5
  • Why does this work?
  • Fact: the eigenvectors of the input covariance are
    orthonormal, so the projections of the input onto
    them (the network outputs) are uncorrelated. Since
    anti-Hebbian learning decorrelates signals, we can
    use it to train the lateral network. (A one-line
    derivation follows below.)
  • Most important contributions of the Rubner-Tavan
    model:
  • Local update rules, and hence biologically
    plausible
  • Introduction of the lateral network for
    estimating minor components instead of using
    deflation
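The decorrelation claim can be written out in one line. Assuming a
zero-mean input x with covariance R and converged outputs
y_i = e_i^\top x, where the e_i are orthonormal eigenvectors of R with
eigenvalues \lambda_i:

E[y_i y_j] = e_i^\top R\, e_j = \lambda_j\, e_i^\top e_j = 0 \quad (i \neq j)

so at the solution the outputs are uncorrelated, which is exactly the
fixed point that anti-Hebbian learning drives the lateral network
toward.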

6
  • However, the Rubner-Tavan model is slow to
    converge.
  • The APEX (Adaptive Principal Component Extraction)
    network slightly improves the speed of convergence
    over the Rubner-Tavan method.
  • APEX uses exactly the same network architecture
    as Rubner-Tavan.
  • Feedforward weights are trained using Oja's rule,
    as before.
  • Lateral weights are trained using a normalized
    anti-Hebbian rule instead of the plain anti-Hebbian
    rule.

This is very similar to the normalization we applied to
Hebbian learning; you can think of it as Oja's rule for
anti-Hebbian learning. (A sketch follows below.)
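A minimal sketch of one APEX-style step, assuming the standard
formulation in which the lateral update carries an extra decay term
analogous to Oja's normalization; as before, the function name and
learning rate are illustrative, not the presenter's notation.

import numpy as np

def apex_step(W, C, x, eta=0.01):
    # Same architecture as Rubner-Tavan: feedforward W, strictly
    # lower-triangular lateral C.
    W, C = W.copy(), C.copy()
    m = W.shape[0]
    y = np.zeros(m)
    for i in range(m):
        y[i] = W[i] @ x + C[i, :i] @ y[:i]
    for i in range(m):
        # Feedforward weights: Oja's rule, as before.
        W[i] += eta * (y[i] * x - y[i] ** 2 * W[i])
        # Lateral weights: normalized anti-Hebbian rule.  The extra
        # y_i^2 * c term ("Oja's rule for anti-Hebbian") drives the
        # lateral weights toward zero once the outputs decorrelate.
        C[i, :i] -= eta * (y[i] * y[:i] + y[i] ** 2 * C[i, :i])
    return W, C, y
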
7
  • APEX is faster because a normalized anti-Hebbian
    rule is used to train the lateral net.
  • Note that when convergence is reached, all the
    lateral weights must go to zero: at convergence
    the outputs are uncorrelated, so there should be
    no connections between them.
  • Faster methods for PCA:
  • All the adaptive models we have discussed so far
    are based on gradient formulations. Simple gradient
    methods are usually slow, and their convergence
    depends heavily on selecting the right step-sizes.
  • Usually, the appropriate step-sizes depend
    directly on the eigenvalues of the input data.

8
  • Researchers have used different optimization
    criteria instead of simple steepest descent. These
    optimizations do increase the speed of convergence,
    but they increase the computational cost as well.
  • There is always a trade-off between speed of
    convergence and complexity.
  • There are subspace techniques, such as the
    natural-power method and Projection Approximation
    Subspace Tracking (PAST), which are faster than
    the traditional Sanger's or APEX rules but are
    computationally intensive. Most of them involve
    direct matrix multiplications.

9
CNEL rule for PCA:
  • Any value of T can be chosen, but T < 1
  • There is no step-size in the algorithm!
  • The algorithm is O(N) and on-line.
  • Most importantly, it is faster than many other
    PCA algorithms.

10
Performance With Violin Time Series (CNEL rule)
11
Sanger's rule