Title: Artificial Neural Network Paradigms
1. Artificial Neural Network Paradigms
Marc Pomplun
Department of Computer Science
University of Massachusetts at Boston
E-mail: marc_at_cs.umb.edu
Homepage: http://www.cs.umb.edu/marc/
2. Artificial Neural Network Paradigms
- Overview
- The Backpropagation Network (BPN)
- Supervised Learning in the BPN
- The Self-Organizing Map (SOM)
- Unsupervised Learning in the SOM
- Instantaneous Learning: The Hopfield Network
3. The Backpropagation Network
- The backpropagation network (BPN) is the most popular type of ANN for applications such as classification or function approximation.
- Like other networks using supervised learning, the BPN is not biologically plausible.
- The structure of the network is identical to the one we discussed before:
  - Three (sometimes more) layers of neurons,
  - Only feedforward processing: input layer → hidden layer → output layer,
  - Sigmoid activation functions.
4. The Backpropagation Network
- BPN units and activation functions
5. Supervised Learning in the BPN
- Before the learning process starts, all weights (synapses) in the network are initialized with pseudorandom numbers.
- We also have to provide a set of training patterns (exemplars). They can be described as a set of ordered vector pairs (x1, y1), (x2, y2), ..., (xP, yP).
- Then we can start the backpropagation learning algorithm.
- This algorithm iteratively minimizes the network's error by finding the gradient of the error surface in weight-space and adjusting the weights in the opposite direction (gradient-descent technique).
6. Supervised Learning in the BPN
- Gradient-descent example: finding the absolute minimum of a one-dimensional error function f(x). Starting from some value x0, apply the update shown below and repeat it iteratively until, for some xi, f(xi) is sufficiently close to 0.
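A standard form of this gradient-descent step, assuming a learning rate η > 0:

    x_{i+1} = x_i - \eta \, f'(x_i)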
7. Supervised Learning in the BPN
- Gradients of two-dimensional functions: The two-dimensional function in the left diagram is represented by contour lines in the right diagram, where arrows indicate the gradient of the function at different locations. Obviously, the gradient always points in the direction of the steepest increase of the function. In order to find the function's minimum, we should always move against the gradient.
8. Supervised Learning in the BPN
- In the BPN, learning is performed as follows:
1. Randomly select a vector pair (xp, yp) from the training set and call it (x, y).
2. Use x as input to the BPN and successively compute the outputs of all neurons in the network (bottom-up) until you get the network output o.
3. Compute the error δ^o_pk for the pattern p across all K output-layer units by using the formula below.
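A standard form of this error term, where y_pk is the desired and o_pk the actual output of output unit k for pattern p, and f' is the derivative of the sigmoid activation function:

    \delta^{o}_{pk} = (y_{pk} - o_{pk})\, f'(net^{o}_{pk})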
9. Supervised Learning in the BPN
4. Compute the error δ^h_pj for all J hidden-layer units by using the formula below.
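A standard form of this term, assuming w_kj denotes the weight from hidden unit j to output unit k:

    \delta^{h}_{pj} = f'(net^{h}_{pj}) \sum_{k=1}^{K} \delta^{o}_{pk}\, w_{kj}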
5. Update the connection-weight values to the hidden layer by using the equation below.
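A standard form of this update, with learning rate η and x_pi the i-th component of the current input pattern:

    w^{h}_{ji}(t+1) = w^{h}_{ji}(t) + \eta\, \delta^{h}_{pj}\, x_{pi}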
10. Supervised Learning in the BPN
6. Update the connection-weight values to the output layer by using the equation below.
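A standard form of this update, where i_pj denotes the output of hidden unit j for pattern p:

    w^{o}_{kj}(t+1) = w^{o}_{kj}(t) + \eta\, \delta^{o}_{pk}\, i_{pj}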
- Repeat steps 1 to 6 for all vector pairs in the training set; this is called a training epoch.
- Run as many epochs as required to reduce the network error E below a threshold ε.
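One common definition of E is the summed squared error over all P training patterns and K output units:

    E = \frac{1}{2} \sum_{p=1}^{P} \sum_{k=1}^{K} (y_{pk} - o_{pk})^2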
11. Supervised Learning in the BPN
The only thing that we need to know before we can start training our network is the derivative of our sigmoid function, for example, f'(net_k) for the output neurons:
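For the logistic sigmoid f(net) = 1 / (1 + e^{-net}), this derivative takes the convenient form

    f'(net_k) = f(net_k)\,(1 - f(net_k)) = o_k\,(1 - o_k)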
12. Supervised Learning in the BPN
- Now our BPN is ready to go!
- If we choose the type and number of neurons in our network appropriately, after training the network should show the following behavior:
  - If we input any of the training vectors, the network should yield the expected output vector (with some margin of error).
  - If we input a vector that the network has never seen before, it should be able to generalize and yield a plausible output vector based on its knowledge about similar input vectors.
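A minimal NumPy sketch of the training procedure described in steps 1 to 6, assuming one hidden layer of logistic sigmoid units and bias weights (an addition not shown on the slides, but commonly used); the function name, layer sizes, learning rate, and the XOR toy data are illustrative choices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_bpn(X, Y, n_hidden=4, eta=0.5, epochs=5000, eps=1e-3, seed=0):
    """Train a three-layer BPN on input patterns X (P x N) and targets Y (P x K)."""
    rng = np.random.default_rng(seed)
    P, n_in = X.shape
    n_out = Y.shape[1]
    # Initialize all weights with small pseudorandom numbers; the extra column is a bias weight.
    W_h = rng.uniform(-0.5, 0.5, (n_hidden, n_in + 1))    # input -> hidden
    W_o = rng.uniform(-0.5, 0.5, (n_out, n_hidden + 1))   # hidden -> output
    for epoch in range(epochs):
        E = 0.0
        for p in rng.permutation(P):                       # step 1: pick a training pair
            x = np.append(X[p], 1.0)                       # append constant bias input
            y = Y[p]
            h = np.append(sigmoid(W_h @ x), 1.0)           # step 2: forward pass (hidden outputs + bias)
            o = sigmoid(W_o @ h)
            delta_o = (y - o) * o * (1 - o)                # step 3: output-layer error
            delta_h = (h * (1 - h) * (W_o.T @ delta_o))[:-1]  # step 4: hidden-layer error (drop bias term)
            W_h += eta * np.outer(delta_h, x)              # step 5: update hidden-layer weights
            W_o += eta * np.outer(delta_o, h)              # step 6: update output-layer weights
            E += 0.5 * np.sum((y - o) ** 2)
        if E < eps:                                        # stop once the epoch error is small enough
            break
    return W_h, W_o

# Toy usage: learn XOR, then check the network's outputs on the training vectors.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)
W_h, W_o = train_bpn(X, Y, eta=1.0, epochs=20000)
for x, y in zip(X, Y):
    xb = np.append(x, 1.0)
    o = sigmoid(W_o @ np.append(sigmoid(W_h @ xb), 1.0))
    print(x, "->", o.round(2), "target", y)
```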
13. Self-Organizing Maps (Kohonen Maps)
In the BPN, we used supervised learning. This is not biologically plausible: in a biological system, there is no external teacher who manipulates the network's weights from outside the network. Unsupervised learning is biologically more adequate. We will study Self-Organizing Maps (SOMs) as examples of unsupervised learning (Kohonen, 1980).
14. Self-Organizing Maps (Kohonen Maps)
In the human cortex, multi-dimensional sensory
input spaces (e.g., visual input, tactile input)
are represented by two-dimensional maps. The
projection from sensory inputs onto such maps is
topology conserving. This means that neighboring
areas in these maps represent neighboring areas
in the sensory input space. For example,
neighboring areas in the sensory cortex are
responsible for the arm and hand regions.
15. Self-Organizing Maps (Kohonen Maps)
- Such a topology-conserving mapping can be achieved by SOMs:
  - Two layers: input layer and output (map) layer
  - Input and output layers are completely connected.
  - Output neurons are interconnected within a defined neighborhood.
  - A topology (neighborhood relation) is defined on the output layer.
16. Self-Organizing Maps (Kohonen Maps)
Common output-layer structures:
- One-dimensional (completely interconnected)
- Two-dimensional (connections omitted, only neighborhood relations shown in green)
17. Self-Organizing Maps (Kohonen Maps)
A neighborhood function φ(i, k) indicates how closely neurons i and k in the output layer are connected to each other. Usually, a Gaussian function of the distance between the two neurons in the layer is used:
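One common choice, where d(i, k) is the distance between neurons i and k in the output layer and σ controls the width of the neighborhood:

    \varphi(i, k) = \exp\!\left(-\frac{d(i, k)^2}{2\sigma^2}\right)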
18. Unsupervised Learning in SOMs
For an n-dimensional input space and m output neurons:
(1) Choose a random weight vector wi for each neuron i, i = 1, ..., m.
(2) Choose a random input x.
(3) Determine the winner neuron k: ||wk − x|| = min_i ||wi − x|| (Euclidean distance).
(4) Update the weight vectors of all neurons i in the neighborhood of neuron k: wi := wi + η·φ(i, k)·(x − wi) (wi is shifted towards x).
(5) If the convergence criterion is met, STOP. Otherwise, narrow the neighborhood function φ and the learning parameter η and go to (2).
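A minimal NumPy sketch of steps (1) to (5), assuming a one-dimensional chain of m output neurons, a Gaussian neighborhood, and a simple linear schedule for narrowing η and σ; the function name, parameters, and the triangular toy data are illustrative choices:

```python
import numpy as np

def train_som(data, m=20, eta0=0.5, sigma0=5.0, steps=5000, seed=0):
    """Self-organize m map neurons (1-D chain) onto n-dimensional input data."""
    rng = np.random.default_rng(seed)
    n = data.shape[1]
    # (1) choose random weight vectors w_i for all neurons
    W = rng.uniform(data.min(0), data.max(0), (m, n))
    positions = np.arange(m)                           # neuron positions in the output layer
    for t in range(steps):
        frac = t / steps
        eta = eta0 * (1.0 - frac)                      # shrink the learning parameter over time
        sigma = sigma0 * (1.0 - frac) + 0.5            # narrow the neighborhood over time
        x = data[rng.integers(len(data))]              # (2) choose a random input
        k = np.argmin(np.linalg.norm(W - x, axis=1))   # (3) winner neuron (Euclidean distance)
        phi = np.exp(-((positions - k) ** 2) / (2 * sigma ** 2))  # Gaussian neighborhood
        W += eta * phi[:, None] * (x - W)              # (4) shift neighbors towards x
    return W                                           # (5) here we simply stop after a fixed number of steps

# Toy usage: a 1-D chain learning a 2-D triangular input space.
rng = np.random.default_rng(1)
pts = rng.uniform(0, 1, (10000, 2))
tri = pts[pts[:, 1] <= pts[:, 0]]                      # keep points below the diagonal (a triangle)
print(train_som(tri).round(2))
```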
19. Unsupervised Learning in SOMs
Example I: Learning a one-dimensional representation of a two-dimensional (triangular) input space
20. Unsupervised Learning in SOMs
Example II: Learning a two-dimensional representation of a two-dimensional (square) input space
21. Unsupervised Learning in SOMs
Example III: Learning a two-dimensional mapping of texture images
22. The Hopfield Network
- The Hopfield model is a single-layered recurrent network.
- It is usually initialized with appropriate weights instead of being trained.
- The network structure looks as follows: [figure: a single layer of units X1, X2, ..., XN with recurrent connections]
23. The Hopfield Network
- We will focus on the discrete Hopfield model, because its mathematical description is more straightforward.
- In the discrete model, the output of each neuron is either 1 or −1.
- In its simplest form, the output function is the sign function, which yields 1 for arguments ≥ 0 and −1 otherwise.
24. The Hopfield Network
- For input-output pairs (x1, y1), (x2, y2), ..., (xP, yP), we can initialize the weights in the following way (like an associative memory):
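One standard (Hebbian) way to write such an initialization in matrix form:

    W = \sum_{p=1}^{P} y_p\, x_p^{T}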
This is identical to a component-wise formula, where xp(j) is the j-th component of vector xp, and yp(i) is the i-th component of vector yp.
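A standard component-wise form of the same initialization:

    w_{ij} = \sum_{p=1}^{P} y_p(i)\, x_p(j)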
25. The Hopfield Network
- In the discrete version of the model, each component of an input or output vector can only assume the values 1 or −1.
- The output of a neuron i at time t is then computed according to the following formula:
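In the usual notation, with sgn the sign function defined earlier:

    x_i(t+1) = \mathrm{sgn}\!\left(\sum_{j=1}^{N} w_{ij}\, x_j(t)\right)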
This recursion can be performed over and over
again. In some network variants, external input
is added to the internal, recurrent one.
26. The Hopfield Network
- Usually, the vectors xp are not orthonormal, so it is not guaranteed that whenever we input some pattern xp, the output will be exactly yp; however, the output will be a pattern similar to yp.
- Since the Hopfield network is recurrent, its behavior depends on its previous state and in the general case is difficult to predict.
- However, what happens if we initialize the weights with a set of patterns so that each pattern is associated with itself, (x1, x1), (x2, x2), ..., (xP, xP)?
27. The Hopfield Network
- This initialization is performed according to the following equation:
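A standard way to write this auto-associative case (the rule above with yp = xp):

    w_{ij} = \sum_{p=1}^{P} x_p(i)\, x_p(j)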
You see that the weight matrix is symmetrical, i.e., wij = wji. We also demand that wii = 0, in which case the network shows an interesting behavior: it can be mathematically proven that under these conditions the network will reach a stable activation state within a finite number of iterations.
28. The Hopfield Network
- And what does such a stable state look like?
- The network associates input patterns with themselves, which means that in each iteration, the activation pattern will be drawn towards one of those patterns.
- After converging, the network will most likely present one of the patterns that it was initialized with.
- Therefore, Hopfield networks can be used to restore incomplete or noisy input patterns.
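A minimal NumPy sketch of this pattern-restoration use of the network, assuming bipolar (1/−1) patterns, the auto-associative initialization above with wii = 0, and asynchronous (one-neuron-at-a-time) sign-function updates; names and sizes are illustrative:

```python
import numpy as np

def hopfield_weights(patterns):
    """Initialize weights so that each bipolar pattern is associated with itself."""
    W = sum(np.outer(x, x) for x in patterns).astype(float)
    np.fill_diagonal(W, 0.0)                 # demand w_ii = 0
    return W

def recall(W, x, sweeps=20):
    """Iterate the sign-function update, one neuron at a time, until the state is stable."""
    x = x.copy()
    for _ in range(sweeps):
        changed = False
        for i in range(len(x)):
            new = 1 if W[i] @ x >= 0 else -1  # sign function with sgn(0) = 1
            if new != x[i]:
                x[i] = new
                changed = True
        if not changed:                       # stable activation state reached
            break
    return x

# Toy usage: store two random 100-unit patterns and restore a noisy version of the first.
rng = np.random.default_rng(0)
patterns = np.where(rng.random((2, 100)) < 0.5, 1, -1)
W = hopfield_weights(patterns)
noisy = patterns[0] * np.where(rng.random(100) < 0.2, -1, 1)   # flip roughly 20% of the units
restored = recall(W, noisy)
print("bits recovered:", int(np.sum(restored == patterns[0])), "/ 100")
```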
29. The Hopfield Network
- Example: Image reconstruction (Ritter, Schulten & Martinetz, 1990)
- A 20×20 discrete Hopfield network was trained with 20 input patterns, including the one shown in the left figure and 19 random patterns like the one on the right.
30. The Hopfield Network
- After providing only one fourth of the face
image as initial input, the network is able to
perfectly reconstruct that image within only two
iterations.
31. The Hopfield Network
- Adding noise by changing each pixel with a probability p = 0.3 does not impair the network's performance.
- After two steps the image is perfectly reconstructed.
32. The Hopfield Network
- However, for noise created with p = 0.4, the network is unable to restore the original image.
- Instead, it converges to one of the 19 random patterns.
33. The Hopfield Network
- The Hopfield model constitutes an interesting neural approach to identifying partially occluded objects and objects in noisy images.
- These are among the toughest problems in computer vision.