Connectionist Networks - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Connectionist Networks


1
Connectionist Networks
  • Chapter 11

2
Perceptron Learning
3
Perceptron Learning
4
Perceptron Learning
5
Perceptron Learning
[Figure: a point (x1, x2) at a distance d from the decision boundary w0 + w1x1 + w2x2 = 0]
6
Perceptron Learning
7
Perceptron Learning
8
Perceptron Learning
9
Perceptron Learning
10
Perceptron Learning
Heaviside function
11
Perceptron Learning
12
Perceptron Learning
13
Perceptron Learning
14
Perceptron Learning
w0 = -14.8, w1 = 2.7, w2 = -2.9
(7, -5)
15
Perceptron Learning
16
Perceptron Learning
w0 = -14.8, w1 = 2.7, w2 = -2.9
(5, 4)
17
Perceptron Learning
18
Perceptron Learning
w0 = -14.8, w1 = 2.7, w2 = -2.9
(5, 4) -> 0
(7, -5) -> 1
19
Perceptron Learning
  • Assume we have a set of data inputs x(n) with
    binary labels t(n), and a neuron whose output f
    is bounded between 0 and 1.
  • Under our model (sigmoid) distribution, data
    points with a binary label of 1 can be viewed as
    having a probability of occurrence of f, and data
    points with a binary label of 0 can be viewed as
    having a probability of occurrence of 1 - f.

20
Perceptron Learning
  • Since our binary labels are all zeros or ones, a
    clever way to express the error in our
    classifications is the cross-entropy
    G(w) = -Σn [ t(n) ln f(n) + (1 - t(n)) ln(1 - f(n)) ]
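
A minimal sketch of this error measure in Python/NumPy (the helper names, and the convention that the first column of X is all ones so that w0 acts as the bias, are illustrative assumptions):

    import numpy as np

    def sigmoid(a):
        # Logistic squashing function: output bounded between 0 and 1.
        return 1.0 / (1.0 + np.exp(-a))

    def cross_entropy_error(w, X, t):
        # X: N x D inputs (first column all ones for the bias w0),
        # t: length-N vector of binary labels in {0, 1}, w: D weights.
        f = sigmoid(X @ w)                 # model probability that t(n) = 1
        return -np.sum(t * np.log(f) + (1 - t) * np.log(1 - f))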

21
Perceptron Learning
22
Perceptron Learning
23
Perceptron Learning
24
Perceptron Learning
25
Perceptron Learning
26
Perceptron Learning
  • The Algorithm so far
  • For each input/target pair (x(n), t(n)) (n = 1,
    ..., N), compute f(n) = f(x(n); w), where f is the
    sigmoid of the weighted sum w · x(n).
  • Define e(n) = t(n) - f(n), and compute for each
    weight wi the quantity Δwi = Σn e(n) xi(n).
  • Then let wi ← wi + Δwi (a sketch of this step
    follows below).
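
A sketch of these three steps, reusing the sigmoid helper and the data conventions from the error sketch above (the function name is an illustrative assumption):

    def perceptron_step(w, X, t):
        # One pass of the rule above over the whole data set.
        f = sigmoid(X @ w)        # f(n) = f(x(n); w) for every n
        e = t - f                 # e(n) = t(n) - f(n)
        delta_w = X.T @ e         # Δwi = Σn e(n) xi(n)
        return w + delta_w        # new weights (no learning rate yet)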

27
Perceptron Learning
28
Perceptron Learning
  • The learning rate
  • Recall that we previously said the weight
    updates proceed using Δwi = Σn e(n) xi(n).
  • The new weights, then, are wi ← wi + Δwi.
  • Sometimes, however, this update step can produce
    weight changes that jump around too much and jump
    right past the optimal decision boundary. To
    avoid this problem we specify a learning rate η
    that makes each weight change a little smaller:
    wi ← wi + η Δwi.
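
The same step with the learning rate folded in (again reusing the sigmoid helper from the earlier sketch; eta = 0.1 is an arbitrary illustrative value):

    def perceptron_step_with_rate(w, X, t, eta=0.1):
        # Each weight change is scaled by the learning rate eta so the
        # updates do not jump past the decision boundary.
        f = sigmoid(X @ w)
        e = t - f
        return w + eta * (X.T @ e)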

29
Perceptron Learning
[Figure: the neuron output f(net)]
30
Perceptron Learning
[Figure: a single neuron with inputs X1, X2, ..., XD and a bias]
31
Backpropagation Learning
[Figure: a two-layer network with inputs X1, X2, ..., Xk, ..., XD, hidden units H1, H2, ..., Hi, ..., HI, and output units V1, V2, ..., Vj, ..., VJ; wki connects input Xk to hidden unit Hi, and wij connects hidden unit Hi to output Vj]
32
Backpropagation Learning
[Figure: a worked example network with inputs X1 and X2, a hidden layer, and a single output V1, annotated with its numeric weight values]
33
Backpropagation Learning
  • Again assume we have a set of data inputs x(n)
    with binary labels t(n).
  • Denote the inputs X1, ..., Xk, ..., XD, the hidden
    units H1, ..., Hi, ..., HI, and the output units
    V1, ..., Vj, ..., VJ.
  • Denote the weight between input Xk and hidden
    unit Hi by wki and the weight between hidden unit
    Hi and visible output unit Vj by wij (a
    forward-pass sketch in this notation follows
    below).
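
A forward-pass sketch in this notation (the weight-array names, the use of a sigmoid at both layers, and the omission of bias terms are assumptions made for brevity):

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def forward_pass(x, w_ki, w_ij):
        # x: the D inputs X1..XD; w_ki: D x I weights from inputs to hidden
        # units; w_ij: I x J weights from hidden units to output units.
        h = sigmoid(x @ w_ki)     # hidden-unit activities H1..HI
        v = sigmoid(h @ w_ij)     # output-unit activities V1..VJ
        return h, v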

34
Backpropagation Learning
V1
V2
Vj
VJ
wij
HI
H1
H2
Hi
wki
X1
X2
XD
Xk
35
Backpropagation Learning
V1
V2
Vj
VJ
wij
HI
H1
H2
Hi
wki
X1
X2
XD
Xk
36
Competitive Learning
37
Competitive Learning
38
Competitive Learning
39
Competitive Learning
40
Competitive Learning
41
Competitive Learning
42
Competitive Learning
43
Competitive Learning
  • The Algorithm
  • Randomly pick K means.
  • Calculate the distance from each point to each
    mean. The mean that is closest to a given point
    wins that point.
  • Re-calculate each mean as the mean of the points
    it has just won.
  • Iterate until the total distance moved by all of
    the means is a very small number (like zero). (A
    sketch follows below.)
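
A sketch of this procedure in Python/NumPy (initializing the means as K randomly chosen data points and using a small tolerance in place of "exactly zero" are assumptions):

    import numpy as np

    def k_means(points, K, rng=np.random.default_rng(0), tol=1e-9):
        # points: N x D array of data points.
        means = points[rng.choice(len(points), size=K, replace=False)]
        while True:
            # Distance from each point to each mean; the closest mean wins.
            dists = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
            winners = dists.argmin(axis=1)
            # Re-calculate each mean as the mean of the points it has won
            # (an empty cluster simply keeps its old mean).
            new_means = np.array([points[winners == k].mean(axis=0)
                                  if np.any(winners == k) else means[k]
                                  for k in range(K)])
            # Stop when the total distance moved by all of the means is tiny.
            if np.linalg.norm(new_means - means) < tol:
                return new_means, winners
            means = new_means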

44
Hebbian Coincidence Learning
[Figures: a conditioning example pairing visual and auditory stimuli, and a hard-limiting threshold function f that outputs 1 above 0 and -1 below it]
45
Hebbian Coincidence Learning
46
Hebbian Coincidence Learning
w1 = 1, w2 = 1, w3 = 0
x1 = 1, x2 = 1, x3 = -1
net = (1)(1) + (1)(1) + (0)(-1) = 2, so the output is 1
wnew = [1 1 0] + 0.2 × (1) × [1 1 -1]
wnew = [1.2 1.2 -0.2]
47
Hebbian Coincidence Learning
w1 = 1.2, w2 = 1.2, w3 = -0.2
x1 = 1, x2 = 1, x3 = -1
net = (1.2)(1) + (1.2)(1) + (-0.2)(-1) = 2.6, so the output is 1
wnew = [1.2 1.2 -0.2] + 0.2 × (1) × [1 1 -1]
wnew = [1.4 1.4 -0.4]
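
A sketch that reproduces the two updates above (the hard-limiting output and the choice of +1 at net = 0 are assumptions; the learning rate 0.2 comes from the slides):

    import numpy as np

    def hebbian_update(w, x, eta=0.2):
        # Hebbian coincidence learning: w_new = w + eta * sign(net) * x,
        # where net is the weighted sum of the inputs.
        net = np.dot(w, x)
        out = 1.0 if net >= 0 else -1.0   # hard-limiting (signum) output
        return w + eta * out * x

    w = np.array([1.0, 1.0, 0.0])
    x = np.array([1.0, 1.0, -1.0])
    w = hebbian_update(w, x)    # net = 2.0  -> w = [1.2, 1.2, -0.2]
    w = hebbian_update(w, x)    # net = 2.6  -> w = [1.4, 1.4, -0.4]
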
48
Attractor Networks or Memories
49
Attractor Networks or Memories
  • For the weights, wij denotes the weight from
    neuron j to neuron i.
  • A Hopfield network consists of I neurons. They
    are fully connected through symmetric,
    bidirectional connections with weights wij = wji.
    There are no self-connections, so wii = 0 for
    all i. Biases wi0 may be included as weights
    coming from an input x0 which is permanently set
    to x0 = 1.
  • The output of neuron i is denoted by xi.

50
Attractor Networks or Memories
  • The activity at neuron i is the weighted sum of
    the inputs from all the other neurons,
    ai = Σj wij xj.
  • The threshold function used is the hyperbolic
    tangent function, xi = tanh(ai) (a sketch follows
    below).
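
A one-line sketch of this activity rule (W is the matrix of weights wij, x the vector of neuron outputs xi):

    import numpy as np

    def hopfield_activity(W, x):
        # ai = Σj wij xj  (wii = 0, so a neuron gets no input from itself),
        # followed by the hyperbolic tangent threshold function.
        a = W @ x
        return np.tanh(a)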

51
Attractor Networks or Memories
  • The learning rule is intended to make a set of
    desired memories x(n) be stable states of the
    Hopfield network's activity rule. Each memory is
    a binary pattern, with xi ∈ {-1, 1}.
  • The weights are set using the Hebb rule,
    wij = η Σn xi(n) xj(n), with wii = 0 (a sketch
    follows below).
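
A sketch of this Hebb-rule weight setting (memories stored as rows of an array; the scaling constant eta and the 25-unit size are taken from the surrounding slides):

    import numpy as np

    def hebb_weights(memories, eta=1.0):
        # memories: n x 25 array of +1/-1 patterns.
        # wij = eta * Σn xi(n) xj(n); forcing the diagonal to zero removes
        # the self-connections while keeping W symmetric.
        W = eta * memories.T @ memories
        np.fill_diagonal(W, 0.0)
        return W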

52
Attractor Networks or Memories
53
Attractor Networks or Memories
[Figure: the 25 x 25 weight matrix, rows and columns indexed w1 ... w25, with zeros on the diagonal (no self-connections)]
54
Attractor Networks or Memories
[Figure: the same 25 x 25 weight matrix, diagonal entries zero]
55
Attractor Networks or Memories
[Figure: n = 21, n = 25]
56
Attractor Networks or Memories
[Figure: the 25 x 25 weight matrix after Hebbian learning, with non-zero off-diagonal entries (e.g. 2) and zeros on the diagonal]
57
Attractor Networks or Memories
58
Attractor Networks or Memories
An example 25-unit memory pattern:
1 1 -1 1 1 -1 -1 1 -1 -1 1 1 -1 1 -1 -1 1 1 1 -1 1 -1 -1 -1 1

59
Attractor Networks or Memories
60
Attractor Networks or Memories
61
Attractor Networks or Memories
[Figure: dimensions of the quantities involved: the state vector is 25 x 1; the weight and gradient matrices are 25 x 25]
62
Attractor Networks or Memories
63
Attractor Networks or Memories
  • The Algorithm
  • For a given set of n memories X(n) (say each
    memory is 5 x 5, or has 25 units), compute
    weights between all nodes of X using Hebbian
    learning, setting all diagonal
    weights to zero.

64
Attractor Networks or Memories
  • For a new presentation of a corrupted memory X,
    initialize Xold to be a vector of ones (25 x 1
    in this case).
  • Compute the activations using activation = W X.
  • Compute the threshold outputs using
    Xnew = tanh(activation).

65
Attractor Networks or Memories
  • Compute the change in X from one iteration to the
    next: change = Xnew - Xold.
  • Compute the gradient on the weights
  • gw = Xnew change^T
  • gw = gw + gw^T
  • Update the weights
  • Wnew = Wold + η (gw - α Wold), where η is a
    learning rate and α a small weight-decay constant.
  • Stop if Xold = Xnew; otherwise set Xold = Xnew
    and iterate again, computing the new activations
    with Xnew, etc. (A sketch of the whole loop
    follows below.)
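
A minimal end-to-end sketch of the recall procedure on these last three slides (eta, alpha, and the iteration cap are illustrative assumptions; W can come from the Hebb-rule sketch earlier, and X is a 25 x 1 column vector):

    import numpy as np

    def hopfield_recall(W, X, eta=0.1, alpha=0.0, max_iters=100):
        # X: 25 x 1 corrupted memory; W: 25 x 25 weight matrix.
        X_old = np.ones_like(X)            # initialize Xold to a vector of ones
        for _ in range(max_iters):
            activation = W @ X             # activation = W X
            X_new = np.tanh(activation)    # Xnew = tanh(activation)
            change = X_new - X_old         # change = Xnew - Xold
            gw = X_new @ change.T          # gw = Xnew change^T
            gw = gw + gw.T                 # symmetrize the weight gradient
            W = W + eta * (gw - alpha * W) # Wnew = Wold + eta (gw - alpha Wold)
            if np.allclose(X_new, X_old):  # stop if Xold = Xnew
                break
            X_old = X_new                  # otherwise set Xold = Xnew ...
            X = X_new                      # ... and recompute activations with Xnew
        return X_old, W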