The delta rule - PowerPoint PPT Presentation

About This Presentation
Title:

The delta rule

Slides: 20
Provided by: sebasti67
Transcript and Presenter's Notes

1
The delta rule
2
Learn from your mistakes
3
If it ain't broke, don't fix it.
4
Outline
  • Supervised learning problem
  • Delta rule
  • Delta rule as gradient descent
  • Hebb rule

5
Supervised learning
  • Given examples
  • Find perceptron such that
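
The formulas on this slide did not survive in the transcript. One common way to state the problem is sketched below; the notation (inputs x^a, binary labels y^a, weight vector w, threshold θ, Heaviside step H) is an assumption, not necessarily the notation of the original slide.

```latex
\[
\text{Given examples } (\mathbf{x}^{a}, y^{a}),\quad a = 1,\dots,m,\quad y^{a} \in \{0, 1\},
\]
\[
\text{find } (\mathbf{w}, \theta) \text{ such that }
y^{a} = H\!\left(\mathbf{w}\cdot\mathbf{x}^{a} - \theta\right) \text{ for all } a .
\]
```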

6
Example handwritten digits
  • Find a perceptron that detects twos.

7
Delta rule
  • Learning from mistakes.
  • delta = difference between desired and actual
    output.
  • Also called the perceptron learning rule.

8
Two types of mistakes
  • False positive: make w less like x.
  • False negative: make w more like x.
  • The update is always proportional to x.
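
The two cases can be sketched in a few lines of Python. This is a minimal illustration, assuming a threshold unit that outputs 1 when w·x > 0 and a learning rate `eta`; the function names are hypothetical and not taken from the slides.

```python
def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def predict(w, x):
    """Threshold perceptron: output 1 if w.x > 0, else 0."""
    return 1 if dot(w, x) > 0 else 0

def delta_rule_update(w, x, y, eta=1.0):
    """One delta-rule step; eta is an assumed learning rate."""
    # delta is +1 for a false negative, -1 for a false positive, 0 if correct
    delta = y - predict(w, x)
    # the change in w is always a multiple of x
    return [wi + eta * delta * xi for wi, xi in zip(w, x)]

x = [1.0, 0.0]
w_false_pos = delta_rule_update([1.0, -1.0], x, y=0)   # w moves away from x
w_false_neg = delta_rule_update([-1.0, 0.0], x, y=1)   # w moves toward x
w_correct = delta_rule_update([1.0, -1.0], x, y=1)     # no mistake, no change
```

When the output is already correct, delta is zero and the weights are left alone, which is the "if it ain't broke, don't fix it" behaviour.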

9
Objective function
  • Gradient update
  • Stochastic gradient descent on
  • E = 0 means no mistakes.
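
The equations for this slide are missing from the transcript. A standard objective that reproduces the delta rule for a threshold unit is sketched below; the symbols (learning rate η, Heaviside step H, per-example cost e) are assumptions and may differ from the original slide.

```latex
\[
E(\mathbf{w}) = \sum_{a} e(\mathbf{w}; \mathbf{x}^{a}, y^{a}),
\qquad
e(\mathbf{w}; \mathbf{x}, y) = \bigl(H(\mathbf{w}\cdot\mathbf{x}) - y\bigr)\,\mathbf{w}\cdot\mathbf{x}
\]
\[
\frac{\partial e}{\partial \mathbf{w}} = \bigl(H(\mathbf{w}\cdot\mathbf{x}) - y\bigr)\,\mathbf{x}
\quad (\mathbf{w}\cdot\mathbf{x} \neq 0),
\qquad
\Delta\mathbf{w} = -\eta\,\frac{\partial e}{\partial \mathbf{w}}
= \eta\,\bigl(y - H(\mathbf{w}\cdot\mathbf{x})\bigr)\,\mathbf{x}
\]
```

Each term e is zero for a correctly classified example and nonnegative for a mistake, so E = 0 when there are no mistakes.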

10
Perceptron convergence theorem
  • Cycle through a set of examples.
  • Suppose a solution with zero error exists.
  • The perceptron learning rule finds a solution in
    finite time.

11
If examples are nonseparable
  • The delta rule does not converge.
  • Objective function is not equal to the number of
    mistakes.
  • No reason to believe that the delta rule
    minimizes the number of mistakes.
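
A quick way to see the non-convergence is to run the rule on XOR, which no single perceptron can classify. The sketch below is an illustration under assumed settings (bias input, learning rate 1, 100 passes); it records the number of mistakes made in each pass.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def run_epochs(examples, n_epochs=100, eta=1.0):
    """Apply the delta rule for a fixed number of passes.

    Returns the final weights and the number of mistakes in each pass.
    """
    w = [0.0] * len(examples[0][0])
    history = []
    for _ in range(n_epochs):
        mistakes = 0
        for x, y in examples:
            out = 1 if dot(w, x) > 0 else 0
            if out != y:
                mistakes += 1
                w = [wi + eta * (y - out) * xi for wi, xi in zip(w, x)]
        history.append(mistakes)
    return w, history

# XOR with a constant bias input; not linearly separable.
xor_examples = [
    ([0.0, 0.0, 1.0], 0),
    ([0.0, 1.0, 1.0], 1),
    ([1.0, 0.0, 1.0], 1),
    ([1.0, 1.0, 1.0], 0),
]
w, history = run_epochs(xor_examples)
```

Every pass makes at least one mistake, so the weights keep changing and never settle.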

12
Memorization and generalization
  • Prescription: minimize error on the training set
    of examples
  • What is the error on a test set of examples?
  • Vapnik-Chervonenkis theory
  • assumption: examples are drawn from a probability
    distribution
  • conditions for generalization

13
Contrast with Hebb rule
  • Assume that the teacher can drive the perceptron
    to produce the desired output.
  • What are the objective functions?

14
Is the delta rule biological?
  • Actual output: anti-Hebbian
  • Desired output: Hebbian
  • Contrastive
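
One way to read these bullets is to split the delta-rule update into two terms. The notation below (learning rate η, desired output y, actual output ŷ) is assumed for illustration.

```latex
\[
\Delta\mathbf{w} = \eta\,(y - \hat{y})\,\mathbf{x}
= \underbrace{\eta\, y\, \mathbf{x}}_{\text{Hebbian (desired output)}}
\;-\;
\underbrace{\eta\, \hat{y}\, \mathbf{x}}_{\text{anti-Hebbian (actual output)}}
\]
```

The update is contrastive in the sense that it is the difference between a Hebbian term driven by the desired output and the same term driven by the actual output.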

15
Objective function
  • Hebb rule: distance from inputs
  • Delta rule: error in reproducing the output

16
Supervised vs. unsupervised
  • Classification vs. generation
  • "I shall not today attempt further to define the
    kinds of material [pornography] ... but I know it
    when I see it." (Justice Potter Stewart)

17
Smooth activation function
  • same except for slope of f
  • update is small when the argument of f has large
    magnitude.
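
The effect of the slope factor can be checked numerically. The sketch below assumes a logistic activation f(u) = 1 / (1 + exp(-u)) and the update η (y − f(u)) f'(u) x; the choice of f, the learning rate, and the function names are assumptions, since the slide does not specify them.

```python
import math

def sigmoid(u):
    """Logistic activation, an assumed choice of smooth f."""
    return 1.0 / (1.0 + math.exp(-u))

def update_size(u, y):
    """Magnitude of (y - f(u)) * f'(u), the scalar multiplying eta * x."""
    out = sigmoid(u)
    slope = out * (1.0 - out)   # derivative of the logistic function
    return abs((y - out) * slope)

def smooth_delta_update(w, x, y, eta=0.1):
    """Delta rule for a smooth unit: same as before, times the slope of f."""
    u = sum(wi * xi for wi, xi in zip(w, x))
    out = sigmoid(u)
    step = eta * (y - out) * out * (1.0 - out)
    return [wi + step * xi for wi, xi in zip(w, x)]

small_argument = update_size(0.5, y=0)    # f is steep here
large_argument = update_size(10.0, y=0)   # f is saturated here
w_new = smooth_delta_update([0.0, 0.0], [1.0, 2.0], y=1)
```

Even though the output is badly wrong at u = 10 with y = 0, the update there is much smaller than at u = 0.5, because the slope of f is nearly zero where the unit is saturated.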

18
Objective function
  • Gradient update
  • Stochastic gradient descent on
  • E = 0 means zero error.
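
The equations for this slide are also missing from the transcript. For a smooth activation f, the usual choice is the squared error, sketched below with assumed notation (learning rate η, per-example cost e, u = w·x).

```latex
\[
E(\mathbf{w}) = \sum_{a} e(\mathbf{w}; \mathbf{x}^{a}, y^{a}),
\qquad
e(\mathbf{w}; \mathbf{x}, y) = \tfrac{1}{2}\bigl(y - f(\mathbf{w}\cdot\mathbf{x})\bigr)^{2}
\]
\[
\frac{\partial e}{\partial \mathbf{w}}
= -\bigl(y - f(u)\bigr)\, f'(u)\,\mathbf{x},
\qquad
\Delta\mathbf{w} = -\eta\,\frac{\partial e}{\partial \mathbf{w}}
= \eta\,\bigl(y - f(u)\bigr)\, f'(u)\,\mathbf{x},
\qquad u = \mathbf{w}\cdot\mathbf{x}
\]
```

This is the threshold-unit update multiplied by the slope f'(u), and E = 0 only when every output matches its target exactly.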

19
Smooth activation functions are important for
generalizing the delta rule to multilayer
perceptrons.