Title: Connectionist Networks
Perceptron Learning
- For an input point (x1, x2), the perceptron's decision boundary is the line w0 + w1x1 + w2x2 = 0.
- The unit's output is given by the Heaviside step function: output 1 if the weighted sum w0 + w1x1 + w2x2 is greater than 0, and 0 otherwise.
- Example: let w0 = -14.8, w1 = 2.7, w2 = -2.9.
- For the point (7, -5): -14.8 + 2.7(7) - 2.9(-5) = 18.6 > 0, so the output is 1.
- For the point (5, 4): -14.8 + 2.7(5) - 2.9(4) = -12.9 < 0, so the output is 0.
- In summary: (5, 4) -> 0 and (7, -5) -> 1 (see the sketch below).
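A minimal sketch of this check (the weights and test points are the ones above; numpy and the helper name heaviside_output are my own choices):

    import numpy as np

    def heaviside_output(w, x):
        # Heaviside threshold: 1 if w0 + w1*x1 + w2*x2 > 0, else 0.
        net = w[0] + np.dot(w[1:], x)
        return 1 if net > 0 else 0

    w = np.array([-14.8, 2.7, -2.9])               # w0, w1, w2
    print(heaviside_output(w, np.array([5, 4])))   # 0
    print(heaviside_output(w, np.array([7, -5])))  # 1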
- Assume we have a set of data inputs x(n) with binary labels t(n), and a neuron whose output f(x; w) is bounded between 0 and 1.
- Data points with a binary label of 1 can be viewed as having a probability of occurrence of f(x; w), and data points with a binary label of 0 can be viewed as having a probability of occurrence of 1 - f(x; w), under our model sigmoid distribution.
- Since our binary labels are all zeros or ones, a clever way to express the error in our classifications is the cross-entropy G(w) = -Σn [ t(n) ln f(n) + (1 - t(n)) ln(1 - f(n)) ], i.e., minus the log-probability the model assigns to the observed labels.
- The Algorithm so far:
- For each input/target pair (x(n), t(n)) (n = 1, ..., N), compute f(n) = f(x(n); w), where f is the sigmoid of the weighted sum of the inputs.
- Define e(n) = t(n) - f(n), and compute for each weight wi the change Δwi = Σn e(n) xi(n).
- Then let wi_new = wi_old + Δwi.
- The learning rate:
- Recall that we previously said the weight updates proceed using Δwi = Σn e(n) xi(n). The new weights, then, are wi_new = wi_old + Δwi.
- Sometimes, however, this update step can produce weight changes that jump around too much and jump right past the optimal decision boundary. To avoid this problem we specify a learning rate η that makes each weight change a little smaller: wi_new = wi_old + η Δwi (see the sketch below).
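A minimal sketch of this batch training loop, assuming a sigmoid unit and the cross-entropy error above; the toy data set, the number of iterations, and the learning rate eta = 0.01 are illustrative choices of mine:

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    # Toy data: each row is (1, x1, x2) -- the leading 1 multiplies the bias weight w0.
    X = np.array([[1.0, 7.0, -5.0],
                  [1.0, 5.0,  4.0],
                  [1.0, 6.0, -4.0],
                  [1.0, 4.0,  5.0]])
    t = np.array([1.0, 0.0, 1.0, 0.0])   # binary targets t(n)

    w = np.zeros(3)   # w0, w1, w2
    eta = 0.01        # learning rate: keeps each weight change small

    for _ in range(1000):
        f = sigmoid(X @ w)    # f(n) = f(x(n); w)
        e = t - f             # e(n) = t(n) - f(n)
        dw = X.T @ e          # for each wi: sum over n of e(n) * xi(n)
        w = w + eta * dw      # scaled update
    print(w)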
[Figure: a single neuron with inputs X1, X2, ..., XD plus a bias input, producing output f(net).]
Backpropagation Learning
[Figure: a two-layer network with inputs X1, ..., Xk, ..., XD, hidden units H1, ..., Hi, ..., HI, and outputs V1, ..., Vj, ..., VJ; wki labels an input-to-hidden weight and wij a hidden-to-output weight.]
[Figure: a small worked example network with inputs X1 and X2, a hidden layer, and a single output V1, with numeric weights (including 1, 2, and -2) labeled on the connections.]
- Again assume we have a set of data inputs with binary labels.
- Denote the inputs X1, ..., Xk, ..., XD; the hidden units H1, ..., Hi, ..., HI; and the output units V1, ..., Vj, ..., VJ.
- Denote the weight between input Xk and hidden unit Hi by wki, and the weight between hidden unit Hi and visible output unit Vj by wij (a sketch of one backpropagation step follows).
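The update equations on the following diagrams did not survive extraction; the sketch below shows one backpropagation step for this architecture, assuming sigmoid hidden and output units and the same error measure as in the perceptron section (sizes, data, and the learning rate are made up):

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    rng = np.random.default_rng(0)
    D, I, J = 4, 3, 2                  # inputs X1..XD, hidden units H1..HI, outputs V1..VJ
    w_ki = rng.normal(size=(I, D))     # weights from input Xk to hidden unit Hi
    w_ij = rng.normal(size=(J, I))     # weights from hidden unit Hi to output Vj
    eta = 0.1

    x = rng.normal(size=D)             # one input pattern
    t = np.array([1.0, 0.0])           # its binary target

    # Forward pass
    h = sigmoid(w_ki @ x)              # hidden activities
    v = sigmoid(w_ij @ h)              # output activities

    # Backward pass: output error, then error propagated back through w_ij
    delta_v = t - v
    delta_h = h * (1.0 - h) * (w_ij.T @ delta_v)

    # Weight updates (same sign convention as the single-neuron rule above)
    w_ij += eta * np.outer(delta_v, h)
    w_ki += eta * np.outer(delta_h, x)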
Competitive Learning
- The Algorithm (K-means):
- Randomly pick K means.
- Calculate the distance from each point to each mean. The mean that is closest to a given point wins that point.
- Re-calculate each mean as the mean of the points it has just won.
- Iterate until the total distance moved by all of the means is a very small number (like zero). A sketch follows.
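A minimal sketch of this procedure, assuming 2-D points and Euclidean distance (the data and all names are illustrative):

    import numpy as np

    def kmeans(points, K, tol=1e-9, max_iters=100):
        rng = np.random.default_rng(0)
        # Randomly pick K means from the data points.
        means = points[rng.choice(len(points), size=K, replace=False)]
        for _ in range(max_iters):
            # The mean closest to a given point wins that point.
            dists = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
            owner = np.argmin(dists, axis=1)
            # Re-calculate each mean as the mean of the points it has just won.
            new_means = np.array([points[owner == k].mean(axis=0) if np.any(owner == k)
                                  else means[k] for k in range(K)])
            # Stop once the total distance moved by all of the means is (nearly) zero.
            if np.linalg.norm(new_means - means, axis=1).sum() < tol:
                return new_means, owner
            means = new_means
        return means, owner

    pts = np.vstack([np.random.randn(20, 2), np.random.randn(20, 2) + 5.0])
    centers, labels = kmeans(pts, K=2)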
Hebbian Coincidence Learning
[Figure: a neuron with visual and auditory inputs whose bipolar threshold function f outputs +1 when the net input exceeds 0 and -1 otherwise.]
- Example (learning constant 0.2): start with weights w1 = 1, w2 = 1, w3 = 0 and present the input x1 = 1, x2 = 1, x3 = -1.
- The net input is (1)(1) + (1)(1) + (0)(-1) = 2, so the thresholded output is +1.
- The Hebbian update is wnew = (1, 1, 0) + 0.2 × (+1) × (1, 1, -1) = (1.2, 1.2, -0.2).
- Presenting the same input with the new weights w1 = 1.2, w2 = 1.2, w3 = -0.2 and x1 = 1, x2 = 1, x3 = -1:
- The net input is (1.2)(1) + (1.2)(1) + (-0.2)(-1) = 2.6, so the output is again +1.
- The update is wnew = (1.2, 1.2, -0.2) + 0.2 × (+1) × (1, 1, -1) = (1.4, 1.4, -0.4), as in the sketch below.
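A short sketch reproducing the two updates above (the 0.2 learning constant and the sign threshold come from the example; the function name is mine):

    import numpy as np

    def hebbian_step(w, x, c=0.2):
        # Output of the bipolar threshold unit, then a Hebbian weight update.
        out = 1 if np.dot(w, x) > 0 else -1
        return w + c * out * x

    w = np.array([1.0, 1.0, 0.0])
    x = np.array([1.0, 1.0, -1.0])
    w = hebbian_step(w, x)   # (1.2, 1.2, -0.2)
    w = hebbian_step(w, x)   # (1.4, 1.4, -0.4)
    print(w)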
Attractor Networks or Memories
- For the weights, wij denotes the weight from neuron j to neuron i.
- A Hopfield network consists of I neurons. They are fully connected through symmetric, bidirectional connections with weights wij = wji. There are no self-connections, so wii = 0 for all i. Biases wi0 may be included as weights coming from an input x0 which is permanently set to x0 = 1.
- The output of neuron i is denoted by xi.
- The activity at neuron i is the weighted sum of the inputs from all the other neurons: ai = Σj wij xj.
- The threshold function used is the hyperbolic tangent: xi = tanh(ai).
- The learning rule is intended to make a set of desired memories x(n) be stable states of the Hopfield network's activity rule. Each memory is a binary pattern, with xi ∈ {-1, 1}.
- The weights are set using the Hebb rule: wij = Σn xi(n) xj(n).
[Figure: the weight matrix among the network's 25 units (weight vectors w1 through w25), with every diagonal entry fixed at 0.]
- An example memory, a 5 x 5 binary pattern written as 25 values:
   1  1 -1  1  1
  -1 -1  1 -1 -1
   1  1 -1  1 -1
  -1  1  1  1 -1
   1 -1 -1 -1  1
- In this example each memory is a 25 x 1 vector, so each outer product X(n)X(n)T and the weight matrix W are 25 x 25.
- The Algorithm:
- For a given set of n memories X(n) (say each memory is 5 x 5, so it has 25 units), compute the weights between all nodes using Hebbian learning, W = Σn X(n) X(n)T, setting all diagonal weights to zero.
- For a new presentation of a corrupted memory X, initialize Xold to be a vector of ones (25 x 1 in this case).
- Compute the activations using activation = WX.
- Compute the thresholded outputs using Xnew = tanh(activation).
- Compute the change in X from one iteration to the next: change = Xnew - Xold.
- Compute the gradient on the weights:
  - gw = Xnew changeT
  - gw = gw + gwT
- Update the weights: Wnew = Wold + η(gw - αWold), where η is the learning rate and α a small decay constant.
- Stop if Xold = Xnew; otherwise set Xold = Xnew and iterate again, computing the new activations with Xnew, etc. A sketch of the whole procedure follows.
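A minimal sketch of the procedure (Hebbian storage followed by the recall loop above), on a small made-up pattern; eta and alpha stand in for the learning-rate and decay symbols, and all other names are mine:

    import numpy as np

    def store(memories):
        # Hebbian weights: sum of outer products X(n) X(n)^T, with the diagonal set to zero.
        W = sum(np.outer(m, m) for m in memories).astype(float)
        np.fill_diagonal(W, 0.0)
        return W

    def recall(W, x, eta=0.01, alpha=0.0, max_iters=100):
        x_old = np.ones_like(x, dtype=float)   # initialize Xold to a vector of ones
        x_cur = x.astype(float)                # the corrupted memory being presented
        for _ in range(max_iters):
            activation = W @ x_cur             # activation = W X
            x_new = np.tanh(activation)        # threshold with tanh
            change = x_new - x_old             # change in X from one iteration to the next
            gw = np.outer(x_new, change)       # gradient on the weights
            gw = gw + gw.T
            W = W + eta * (gw - alpha * W)     # weight update
            if np.allclose(x_new, x_old):      # stop once X no longer changes
                break
            x_old = x_cur = x_new
        return np.sign(x_new), W

    memory = np.array([1, -1, 1, -1, 1])
    W = store([memory])
    recovered, W = recall(W, np.array([-1, -1, 1, -1, 1]))   # first unit corrupted
    print(recovered)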