1
Fast Learning in Networks of Locally-Tuned
Processing Units
  • John Moody and Christian J. Darken
  • Yale Computer Science
  • Neural Computation 1, 281-294 (1989)

2
Network Architecture
  • Responses of neurons are locally-tuned or
    selective for some part of the input space.
  • The network contains a single hidden layer of these
    locally-tuned neurons.
  • Hidden-layer outputs feed into a layer of linear
    neurons, giving the network output.
  • For mathematical simplicity, we'll assume only one
    neuron in the linear output layer.

3
Network Architecture (2)
4
Biological Plausibility
  • Cochlear stereocilia cells in the human ear exhibit a
    locally-tuned response to frequency.
  • Cells in the visual cortex respond selectively to
    stimulation that is both local in retinal position
    and local in angle of orientation.
  • Prof. Wang showed locally-tuned responses to motion
    at particular speeds and orientations.

5
Mathematical Definitions
  • A network of M locally-tuned units has the overall
    response function

        f(x) = Σ_α A_α R_α(x),   α = 1, …, M

  • Here, x is a real-valued vector in the input space,
    R_α is the response function of the αth locally-tuned
    unit, and R is a radially symmetric function with a
    single maximum at its center that drops to zero at
    large radii.

6
Mathematical Definitions (2)
  • and are the center and width in the input
    space of the unit, and is the
    weight or amplitude of the unit.
  • A simple R is the unit normalized Gaussian
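
The definitions above translate almost directly into code. Below is a
minimal C sketch assuming Gaussian units and a flattened, row-major
layout for the centers; the names rbf_output, centers, widths, and amps
are illustrative, not taken from the paper.

    #include <math.h>

    /* Response of one Gaussian unit: R_a(x) = exp(-||x - x_a||^2 / sigma_a^2). */
    static double unit_response(const double *x, const double *center,
                                double sigma, int dim)
    {
        double sq = 0.0;
        for (int i = 0; i < dim; i++) {
            double d = x[i] - center[i];
            sq += d * d;
        }
        return exp(-sq / (sigma * sigma));
    }

    /* Overall network output f(x) = sum over the M units of A_a * R_a(x).
     * centers[a*dim .. a*dim+dim-1] holds the center of unit a. */
    double rbf_output(const double *x, const double *centers,
                      const double *widths, const double *amps, int M, int dim)
    {
        double f = 0.0;
        for (int a = 0; a < M; a++)
            f += amps[a] * unit_response(x, &centers[a * dim], widths[a], dim);
        return f;
    }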

7
Possible Training Methods
  • Fully supervised training to find neuron centers,
    widths, and amplitudes.
  • Uses the error gradient obtained by varying all
    parameters (no restrictions on the parameters); a
    sketch follows below.
  • In particular, widths can grow large, so the neurons
    lose their local nature.
  • Compared with backpropagation, this achieves lower
    error, but like backprop it is very slow to train.
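
As a rough illustration of the fully supervised approach, one stochastic
gradient step on all parameters could look like the sketch below, assuming
Gaussian units and squared error; the function name and the learning rate
eta are assumptions for illustration, not the authors' code.

    #include <math.h>

    /* One gradient-descent step on ALL parameters (amplitudes, centers,
     * widths) for a single training pair (x, target), with Gaussian units
     * and squared error E = 0.5 * (target - f(x))^2. */
    void full_gradient_step(const double *x, double *centers, double *widths,
                            double *amps, int M, int dim,
                            double target, double eta)
    {
        /* Forward pass: f(x) = sum_a A_a * exp(-||x - x_a||^2 / sigma_a^2). */
        double f = 0.0;
        for (int a = 0; a < M; a++) {
            double sq = 0.0;
            for (int i = 0; i < dim; i++) {
                double d = x[i] - centers[a * dim + i];
                sq += d * d;
            }
            f += amps[a] * exp(-sq / (widths[a] * widths[a]));
        }
        double err = target - f;

        /* Move every parameter along its error gradient. */
        for (int a = 0; a < M; a++) {
            double sq = 0.0;
            for (int i = 0; i < dim; i++) {
                double d = x[i] - centers[a * dim + i];
                sq += d * d;
            }
            double s2 = widths[a] * widths[a];
            double r  = exp(-sq / s2);
            double A  = amps[a];                       /* old amplitude      */

            amps[a] += eta * err * r;                  /* dE/dA_a            */
            for (int i = 0; i < dim; i++)              /* dE/dx_a,i          */
                centers[a * dim + i] += eta * err * A * r
                                        * 2.0 * (x[i] - centers[a * dim + i]) / s2;
            widths[a] += eta * err * A * r             /* dE/dsigma_a        */
                         * 2.0 * sq / (s2 * widths[a]);
        }
    }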

8
Possible Training Methods (2)
  • Is a combination of supervised and unsupervised
    learning a better choice?
  • Neuron centers and widths are determined through
    unsupervised learning.
  • Weights or amplitudes for hidden layer outputs
    are determined through supervised training.

9
Unsupervised Learning
  • How are the neuron centers determined?
  • k-means clustering (sketch below)
  • Find a set of k neuron centers that represents a
    local minimum of the total squared Euclidean distance
    between the training vectors and their nearest
    centers.
  • Learning Vector Quantization (LVQ)
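
A sketch of one batch k-means (Lloyd) iteration for placing the centers;
the centers are assumed to be initialized already (for example, to randomly
chosen training vectors), and the array names are illustrative.

    #include <float.h>
    #include <stdlib.h>

    static double sqdist(const double *a, const double *b, int dim)
    {
        double s = 0.0;
        for (int i = 0; i < dim; i++) { double d = a[i] - b[i]; s += d * d; }
        return s;
    }

    /* One Lloyd iteration: assign each training vector to its nearest center
     * (squared Euclidean distance), then move each center to the mean of the
     * vectors assigned to it. */
    void kmeans_step(const double *data, int n, int dim,
                     double *centers, int k, int *assign)
    {
        int    *count = calloc((size_t)k, sizeof *count);
        double *sum   = calloc((size_t)k * dim, sizeof *sum);

        for (int p = 0; p < n; p++) {
            double best = DBL_MAX;
            int    arg  = 0;
            for (int c = 0; c < k; c++) {
                double d = sqdist(&data[p * dim], &centers[c * dim], dim);
                if (d < best) { best = d; arg = c; }
            }
            assign[p] = arg;
            count[arg]++;
            for (int i = 0; i < dim; i++)
                sum[arg * dim + i] += data[p * dim + i];
        }
        for (int c = 0; c < k; c++)
            if (count[c] > 0)
                for (int i = 0; i < dim; i++)
                    centers[c * dim + i] = sum[c * dim + i] / count[c];

        free(sum);
        free(count);
    }

Repeating kmeans_step until the assignments stop changing yields the local
minimum described above.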

10
Unsupervised Learning (2)
  • How are the neuron widths determined?
  • P nearest-neighbor heuristic
  • Vary the widths to achieve a certain amount of
    response overlap between each neuron and its P
    nearest neighbors.
  • Global first nearest-neighbor heuristic, P = 1
  • Uses the global average distance between each neuron
    and its nearest neighbor as the net's uniform width
    (sketch below).
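
A sketch of the global first nearest-neighbor heuristic (P = 1), assuming
the centers have already been placed (and that there are at least two of
them); the returned value would serve as the single width shared by every
unit. Names are illustrative.

    #include <float.h>
    #include <math.h>

    /* Global first nearest-neighbor heuristic (P = 1): return the average
     * Euclidean distance from each center to its nearest neighboring center. */
    double global_nn_width(const double *centers, int M, int dim)
    {
        double total = 0.0;
        for (int a = 0; a < M; a++) {
            double nearest = DBL_MAX;
            for (int b = 0; b < M; b++) {
                if (b == a)
                    continue;
                double sq = 0.0;
                for (int i = 0; i < dim; i++) {
                    double d = centers[a * dim + i] - centers[b * dim + i];
                    sq += d * d;
                }
                if (sq < nearest)
                    nearest = sq;
            }
            total += sqrt(nearest);
        }
        return total / M;
    }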

11
Supervised Learning
  • How are the output weights determined?
  • Simple case: a single linear output.
  • Use the Widrow-Hoff learning rule (sketch below).
  • For a layer of linear outputs?
  • Simply use the gradient-descent learning rule.
  • Either way, this reduces to a linear optimization
    problem.
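
For the single-linear-output case, the Widrow-Hoff (LMS) rule reduces to
the sketch below; resp[] is assumed to hold the hidden-layer responses for
one training example, and eta is a hypothetical learning rate.

    /* Widrow-Hoff (LMS) update of the output weights A_a for one training
     * example.  resp[a] holds the hidden-layer response R_a(x); because those
     * responses are fixed once centers and widths are chosen, this is a
     * purely linear least-squares problem. */
    void widrow_hoff_step(double *amps, const double *resp, int M,
                          double target, double eta)
    {
        double out = 0.0;
        for (int a = 0; a < M; a++)
            out += amps[a] * resp[a];            /* current output f(x)      */

        double err = target - out;               /* prediction error         */
        for (int a = 0; a < M; a++)
            amps[a] += eta * err * resp[a];      /* gradient step on sq. error */
    }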

12
Advantages Over Backprop
  • Training via a combination of linear supervised
    and linear self-organizing techniques is much
    faster than backprop.
  • For a given input, only a small fraction of the
    neurons (those with nearby centers) give non-zero
    responses, so we don't need to fire all the neurons
    to get the overall output. This improves performance,
    as the sketch below illustrates.
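
A sketch of exploiting this locality at evaluation time: Gaussian units
several widths away from the input contribute essentially nothing and can
be skipped. The 3-sigma cutoff is an illustrative choice, not a value from
the paper.

    #include <math.h>

    /* Approximate output that evaluates only "nearby" units: a Gaussian unit
     * several widths away from x contributes essentially nothing. */
    double rbf_output_local(const double *x, const double *centers,
                            const double *widths, const double *amps,
                            int M, int dim)
    {
        double f = 0.0;
        for (int a = 0; a < M; a++) {
            double sq = 0.0;
            for (int i = 0; i < dim; i++) {
                double d = x[i] - centers[a * dim + i];
                sq += d * d;
            }
            double cutoff = 3.0 * widths[a];
            if (sq > cutoff * cutoff)
                continue;                        /* negligible response; skip */
            f += amps[a] * exp(-sq / (widths[a] * widths[a]));
        }
        return f;
    }

A fuller implementation would also limit the distance checks themselves to
nearby units, for example with a spatial index over the centers.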

13
Advantages Over Backprop (2)
  • Based on well-developed mathematical theory
    (kernel theory) yielding statistical robustness.
  • Computational simplicity since only one layer is
    involved in supervised training.
  • Provides a guaranteed, globally optimal solution for
    the output weights via simple linear optimization.

14
Project Proposal
  • Currently debugging a C RBF network with
    n-dimensional input and one linear output neuron.
  • Uses k-means clustering, the global first
    nearest-neighbor heuristic, and gradient descent.
  • Experiment with different training algorithms.
  • Try to reproduce results for RBF neural nets
    performing face recognition.