1
Introduction to Neural Networks
  • John Paxton
  • Montana State University
  • Summer 2003

2
Chapter 3 Pattern Association
  • Aristotle observed that human memory associates
  • similar items
  • contrary items
  • items close in proximity
  • items close in succession (as in a song)

3
Terminology and Issues
  • Autoassociative Networks
  • Heteroassociative Networks
  • Feedforward Networks
  • Recurrent Networks
  • How many patterns can be stored?

4
Hebb Rule for Pattern Association
  • Architecture

[Architecture diagram: input units x1 ... xn fully connected to output units y1 ... ym, with weight wij on the link from xi to yj (w11 ... wnm).]
5
Algorithm
  • 1. set wij = 0, 1 <= i <= n, 1 <= j <= m
  • 2. for each training pair s:t
  • 3. xi = si
  • 4. yj = tj
  • 5. wij(new) = wij(old) + xi yj
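In Python with numpy, steps 1-5 reduce to a sum of outer products. A minimal sketch, assuming bipolar training pairs; the name hebb_train and the pair format are illustrative, not from the slides:

    import numpy as np

    def hebb_train(pairs):
        """Hebb rule: start at zero, add the outer product of each (s, t) pair."""
        n, m = len(pairs[0][0]), len(pairs[0][1])
        W = np.zeros((n, m))              # step 1: wij = 0
        for s, t in pairs:                # step 2: each training pair
            W += np.outer(s, t)           # step 5: wij += xi * yj
        return W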

6
Example
  • s1 = (1 -1 -1), s2 = (-1 1 1)
  • t1 = (1 -1), t2 = (-1 1)
  • w11 = (1)(1) + (-1)(-1) = 2
  • w12 = (1)(-1) + (-1)(1) = -2
  • w21 = (-1)(1) + (1)(-1) = -2
  • w22 = (-1)(-1) + (1)(1) = 2
  • w31 = (-1)(1) + (1)(-1) = -2
  • w32 = (-1)(-1) + (1)(1) = 2

7
Matrix Alternative
  • s1 = (1 -1 -1), s2 = (-1 1 1)
  • t1 = (1 -1), t2 = (-1 1)
  • W = S^T T:

        [ 1 -1]   [ 1 -1]   [ 2 -2]
        [-1  1] x [-1  1] = [-2  2]
        [-1  1]             [-2  2]
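A quick numpy check of the slide's numbers as a single matrix product (a sketch, not part of the original deck):

    import numpy as np

    S = np.array([[ 1, -1, -1],    # s1
                  [-1,  1,  1]])   # s2
    T = np.array([[ 1, -1],        # t1
                  [-1,  1]])       # t2
    print(S.T @ T)                 # [[ 2 -2] [-2  2] [-2  2]]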

8
Final Network
  • f(yin) = 1 if yin > 0, 0 if yin = 0, else -1

[Network diagram: x1, x2, x3 connected to y1, y2 with the learned weights w11 = 2, w21 = -2, w31 = -2, w12 = -2, w22 = 2, w32 = 2.]
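Recall with this network, sketched in numpy using the activation defined above (the helper names are illustrative):

    import numpy as np

    def f(y_in):
        """1 if y_in > 0, 0 if y_in == 0, else -1."""
        return np.where(y_in > 0, 1, np.where(y_in == 0, 0, -1))

    W = np.array([[ 2, -2],
                  [-2,  2],
                  [-2,  2]])
    print(f(np.array([1, -1, -1]) @ W))   # [ 1 -1], i.e. t1 is recalled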
9
Properties
  • Weights exist if input vectors are linearly
    independent
  • Orthogonal vectors can be learned perfectly
  • High weights imply strong correlations

10
Exercises
  • What happens if (-1 -1 -1) is tested? This
    vector has one mistake.
  • What happens if (0 -1 -1) is tested? This vector
    has one piece of missing data.
  • Show an example of training data that is not
    learnable. Show the learned network.

11
Delta Rule for Pattern Association
  • Works when patterns are linearly independent but
    not orthogonal
  • Introduced in the 1960s for ADALINE
  • Produces a least squares solution

12
Activation Functions
  • Delta Rule (identity activation, derivative = 1): wij(new) = wij(old) + a(tj - yj) xi
  • Extended Delta Rule (general activation f): wij(new) = wij(old) + a(tj - yj) xi f'(yin.j)
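A sketch of one delta-rule update under the identity activation, so the derivative factor is 1 (the name delta_step and the default learning rate are illustrative assumptions):

    import numpy as np

    def delta_step(W, s, t, alpha=0.1):
        """wij(new) = wij(old) + a * (tj - yj) * xi."""
        x, t = np.asarray(s, float), np.asarray(t, float)
        y = x @ W                       # identity activation: y = y_in
        return W + alpha * np.outer(x, t - y)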

13
Heteroassociative Memory Net
  • Application: associate characters, e.g. A <-> a, B <-> b

14
Autoassociative Net
  • Architecture

[Architecture diagram: input units x1 ... xn fully connected to output units y1 ... yn, with weights w11 ... wnn.]
15
Training Algorithm
  • Assuming that the training vectors are
    orthogonal, we can use the Hebb rule algorithm
    mentioned earlier.
  • Application: find out whether an input vector is
    familiar or unfamiliar. For example, voice input
    as part of a security system.
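A minimal sketch of this training step in numpy (auto_hebb is an illustrative name; zeroing the diagonal anticipates the question two slides ahead):

    import numpy as np

    def auto_hebb(vectors):
        """Autoassociative Hebb rule: W = sum of s^T s, diagonal zeroed."""
        vs = np.asarray(vectors, dtype=float)
        W = vs.T @ vs                   # outer product of each stored vector
        np.fill_diagonal(W, 0)          # wii = 0
        return W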

16
Autoassociate Example
  • s = (1 1 1)
  • W = s^T s with the diagonal zeroed:

        [1]             [1 1 1]    [0 1 1]
        [1] [1 1 1]  =  [1 1 1] -> [1 0 1]
        [1]             [1 1 1]    [1 1 0]

17
Evaluation
  • What happens if (1 1 1) is presented?
  • What happens if (0 1 1) is presented?
  • What happens if (0 0 1) is presented?
  • What happens if (-1 1 1) is presented?
  • What happens if (-1 -1 1) is presented?
  • Why are the diagonals set to 0?

18
Storage Capacity
  • 2 vectors: (1 1 1), (-1 -1 -1)
  • Recall is perfect
  • W = S^T S with the diagonal zeroed:

        [1 -1]   [ 1  1  1]   [0 2 2]
        [1 -1] x [-1 -1 -1] = [2 0 2]
        [1 -1]                [2 2 0]

19
Storage Capacity
  • 3 vectors: (1 1 1), (-1 -1 -1), (1 -1 1)
  • Recall is no longer perfect
  • W = S^T S with the diagonal zeroed:

        [1 -1  1]   [ 1  1  1]   [0 1 3]
        [1 -1 -1] x [-1 -1 -1] = [1 0 1]
        [1 -1  1]   [ 1 -1  1]   [3 1 0]
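A quick numpy check of this case; recalling the stored vector (1 -1 1) now returns (1 1 1), so recall really is imperfect (a sketch, not from the original deck):

    import numpy as np

    vs = np.array([[1, 1, 1], [-1, -1, -1], [1, -1, 1]], dtype=float)
    W = vs.T @ vs
    np.fill_diagonal(W, 0)              # [[0 1 3] [1 0 1] [3 1 0]]
    print(np.sign(vs[2] @ W))           # [1. 1. 1.], not (1 -1 1)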

20
Theorem
  • Up to n-1 bipolar vectors of n dimensions can be
    stored in an autoassociative net.

21
Iterative Autoassociative Net
  • 1 vector: s = (1 1 -1)
  • W = s^T s with the diagonal zeroed:

        [ 0  1 -1]
        [ 1  0 -1]
        [-1 -1  0]

  • (1 0 0) -> (0 1 -1)
  • (0 1 -1) -> (2 1 -1) -> (1 1 -1)
  • (1 1 -1) -> (2 2 -2) -> (1 1 -1)

22
Testing Procedure
  • 1. initialize weights using Hebb learning
  • 2. for each test vector do
  • 3. set xi = si
  • 4. calculate ti
  • 5. set si = ti
  • 6. go to step 4 if the s vector is new
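Steps 2-6 sketched in numpy; np.sign matches the activation used here in that a net input of 0 stays 0 (iterative_recall is an illustrative name):

    import numpy as np

    def iterative_recall(W, s, max_iters=100):
        """Feed the output back as input until the vector stops changing."""
        s = np.asarray(s, dtype=float)
        for _ in range(max_iters):
            t = np.sign(s @ W)          # steps 4-5: compute t, set s = t
            if np.array_equal(t, s):    # step 6: stop when s is not new
                return t
            s = t
        return s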

23
Exercises
  • 1 piece of missing data (0 1 -1)
  • 2 pieces of missing data (0 0 -1)
  • 3 pieces of missing data (0 0 0)
  • 1 mistake (-1 1 -1)
  • 2 mistakes (-1 -1 -1)

24
Discrete Hopfield Net
  • content addressable problems
  • pattern association problems
  • constrained optimization problems
  • wij = wji
  • wii = 0

25
Characteristics
  • Only 1 unit updates its activation at a time
  • Each unit continues to receive the external
    signal
  • An energy (Lyapunov) function can be found that
    allows the net to converge, unlike the previous
    system
  • Autoassociative

26
Architecture
[Architecture diagram: units y1, y2, y3 fully interconnected, each also receiving an external input x1, x2, x3.]
27
Algorithm
  • 1. initialize weights using the Hebb rule
  • 2. for each input vector do
  • 3. yi = xi
  • 4. do steps 5-6 in random order for each yi
  • 5. yin.i = xi + Σj yj wji
  • 6. yi = f(yin.i)
  • 7. go to step 2 if the net hasn't converged
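One pass of steps 4-6 in numpy. Treating yin.i = 0 as "leave the unit unchanged" is an assumption here, and hopfield_step is an illustrative name:

    import numpy as np

    def hopfield_step(W, x, y):
        """Update each unit once, in random order, holding the external input x."""
        y = y.copy()
        for i in np.random.permutation(len(y)):
            y_in = x[i] + y @ W[:, i]   # step 5: yin.i = xi + sum_j yj*wji (wii = 0)
            if y_in > 0:
                y[i] = 1
            elif y_in < 0:
                y[i] = -1               # y_in == 0: y[i] is left as is
        return y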

28
Example
  • training vector (1 -1)

[Network diagram: units y1 and y2 joined by weight -1, with external inputs x1 and x2.]
29
Example
  • input (0 -1): update y1 = 0 + (-1)(-1) = 1; update y2 = -1 + (1)(-1) = -2 -> -1
  • input (1 -1): update y2 = -1 + (1)(-1) = -2 -> -1; update y1 = 1 + (-1)(-1) = 2 -> 1

30
Hopfield Theorems
  • Convergence is guaranteed.
  • The number of storable patterns is approximately
    n / (2 log n), where n is the dimension of a
    vector.

31
Bidirectional Associative Memory (BAM)
  • Heteroassociative Recurrent Net
  • Kosko, 1988
  • Architecture

[Architecture diagram: x layer (x1 ... xn) bidirectionally connected to y layer (y1 ... ym).]
32
Activation Function
  • f(yin) = 1, if yin > 0
  • f(yin) = 0, if yin = 0
  • f(yin) = -1, otherwise

33
Algorithm
  • 1. initialize weights using Hebb rule
  • 2. for each test vector do
  • 3. present s to x layer
  • 4. present t to y layer
  • 5. while equilibrium is not reached
  • 6. compute f(yin.j)
  • 7. compute f(xin.i)
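This loop sketched in numpy; equilibrium is detected when neither layer changes between passes (bam_recall is an illustrative name):

    import numpy as np

    def bam_recall(W, x, max_iters=100):
        """Bounce activations between the x and y layers until both settle."""
        x = np.sign(np.asarray(x, dtype=float))
        y = np.sign(x @ W)              # step 6: y layer from x layer
        for _ in range(max_iters):
            x_new = np.sign(y @ W.T)    # step 7: x layer from y layer
            y_new = np.sign(x_new @ W)
            if np.array_equal(x_new, x) and np.array_equal(y_new, y):
                break                   # equilibrium reached
            x, y = x_new, y_new
        return x, y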

34
Example
  • s1 = (1 1), t1 = (1 -1)
  • s2 = (-1 -1), t2 = (-1 1)
  • W = S^T T:

        [1 -1]   [ 1 -1]   [2 -2]
        [1 -1] x [-1  1] = [2 -2]

35
Example
  • Architecture

[Network diagram: x1 and x2 connected to y1 and y2 with weights w11 = 2, w12 = -2, w21 = 2, w22 = -2.]
present (1 1) to x -> y = (1 -1); present (1 -1) to y -> x = (1 1)
36
Hamming Distance
  • Definition: the number of corresponding bits
    that differ between two vectors
  • For example, H((1 -1), (1 1)) = 1
  • The average Hamming distance between two random
    vectors is ½ of the bits.
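A one-function numpy sketch of this definition (hamming is an illustrative name):

    import numpy as np

    def hamming(u, v):
        """Count the corresponding positions where two vectors differ."""
        return int(np.sum(np.asarray(u) != np.asarray(v)))

    print(hamming((1, -1), (1, 1)))     # 1, matching the slide's example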

37
About BAMs
  • Observation: encoding is better when the average
    Hamming distance of the inputs is similar to the
    average Hamming distance of the outputs.
  • The memory capacity of a BAM is min(n-1, m-1).