Artificial Neural Network Paradigms - PowerPoint PPT Presentation

1
Artificial Neural Network Paradigms
Marc Pomplun
Department of Computer Science
University of Massachusetts at Boston
E-mail: marc@cs.umb.edu
Homepage: http://www.cs.umb.edu/marc/
2
Artificial Neural Network Paradigms
  • Overview
  • The Backpropagation Network (BPN)
  • Supervised Learning in the BPN
  • The Self-Organizing Map (SOM)
  • Unsupervised Learning in the SOM
  • Instantaneous Learning: The Hopfield Network

3
The Backpropagation Network
  • The backpropagation network (BPN) is the most
    popular type of ANN for applications such as
    classification or function approximation.
  • Like other networks using supervised learning,
    the BPN is not biologically plausible.
  • The structure of the network is identical to the
    one we discussed before:
  • Three (sometimes more) layers of neurons,
  • Only feedforward processing: input layer →
    hidden layer → output layer,
  • Sigmoid activation functions.

4
The Backpropagation Network
  • BPN units and activation functions

5
Supervised Learning in the BPN
  • Before the learning process starts, all weights
    (synapses) in the network are initialized with
    pseudorandom numbers.
  • We also have to provide a set of training
    patterns (exemplars). They can be described as a
    set of ordered vector pairs (x1, y1), (x2, y2),
    …, (xP, yP).
  • Then we can start the backpropagation learning
    algorithm.
  • This algorithm iteratively minimizes the
    network's error by finding the gradient of the
    error surface in weight-space and adjusting the
    weights in the opposite direction
    (gradient-descent technique).

6
Supervised Learning in the BPN
  • Gradient-descent example: Finding the absolute
    minimum of a one-dimensional error function f(x):

Repeat this iteratively until for some xi, f'(xi)
is sufficiently close to 0.
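The update formula on this slide is not reproduced in the transcript; it is presumably the standard rule x_{i+1} = x_i − η·f'(x_i). A minimal Python sketch of this one-dimensional gradient descent follows; the example function, learning rate, and tolerance are illustrative assumptions, not values from the slides.

```python
# One-dimensional gradient descent sketch; eta and tol are assumed values.
def gradient_descent_1d(f_prime, x, eta=0.1, tol=1e-6, max_iter=10000):
    for _ in range(max_iter):
        g = f_prime(x)       # slope of the error function at the current x
        if abs(g) < tol:     # stop once f'(x) is sufficiently close to 0
            break
        x -= eta * g         # move against the gradient
    return x

# Example: f(x) = (x - 3)^2 has f'(x) = 2(x - 3) and its minimum at x = 3.
print(gradient_descent_1d(lambda x: 2 * (x - 3), x=0.0))  # approximately 3.0
```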
7
Supervised Learning in the BPN
  • Gradients of two-dimensional functions

The two-dimensional function in the left diagram
is represented by contour lines in the right
diagram, where arrows indicate the gradient of
the function at different locations. Obviously,
the gradient is always pointing in the direction
of the steepest increase of the function. In
order to find the function's minimum, we should
always move against the gradient.
8
Supervised Learning in the BPN
  • In the BPN, learning is performed as follows:
  • Randomly select a vector pair (xp, yp) from the
    training set and call it (x, y).
  • Use x as input to the BPN and successively
    compute the outputs of all neurons in the network
    (bottom-up) until you get the network output o.
  • Compute the error δopk for pattern p across
    all K output-layer units by using the formula:

9
Supervised Learning in the BPN
  • Compute the error δhpj for all J hidden-layer
    units by using the formula:
  • Update the connection-weight values to the hidden
    layer by using the following equation:

10
Supervised Learning in the BPN
  • Update the connection-weight values to the output
    layer by using the following equation:
  • Repeat steps 1 to 6 for all vector pairs in the
    training set; this is called a training epoch.
  • Run as many epochs as required until the network
    error E falls below a threshold ε (see the sketch
    below).
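The formulas referenced in steps 3 to 6 are not included in this transcript. The following Python sketch shows one plausible implementation of the loop just described for a three-layer network with logistic sigmoid units and no bias terms; the layer sizes, learning rate, and weight-initialization range are assumptions chosen for illustration.

```python
# Minimal backpropagation sketch for a 3-layer BPN with sigmoid units.
# Layer sizes, eta, and the initialization range are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def train_bpn(X, Y, n_hidden=5, eta=0.5, epochs=1000):
    n_in, n_out = X.shape[1], Y.shape[1]
    W_h = rng.uniform(-0.5, 0.5, (n_hidden, n_in))     # input -> hidden weights
    W_o = rng.uniform(-0.5, 0.5, (n_out, n_hidden))    # hidden -> output weights
    for _ in range(epochs):
        for p in rng.permutation(len(X)):              # 1. pick a training pair (x, y)
            x, y = X[p], Y[p]
            h = sigmoid(W_h @ x)                       # 2. forward pass, bottom-up
            o = sigmoid(W_o @ h)
            delta_o = (y - o) * o * (1 - o)            # 3. output-layer error terms
            delta_h = h * (1 - h) * (W_o.T @ delta_o)  # 4. hidden-layer error terms
            W_h += eta * np.outer(delta_h, x)          # 5. update weights into hidden layer
            W_o += eta * np.outer(delta_o, h)          # 6. update weights into output layer
    return W_h, W_o
```

With enough hidden units and epochs, calling train_bpn on a small training set (e.g., the XOR patterns) yields weights that reproduce the training outputs within some margin of error.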

11
Supervised Learning in the BPN
The only thing that we need to know before we can
start our network is the derivative of our
sigmoid function, for example, f'(netk) for the
output neurons:
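The derivative itself is not reproduced in the transcript; for the standard logistic sigmoid it is the well-known identity:

```latex
f(net_k) = \frac{1}{1 + e^{-net_k}}
\quad\Rightarrow\quad
f'(net_k) = f(net_k)\,\bigl(1 - f(net_k)\bigr)
```

This identity is why the backpropagation sketch above can write the error terms using o(1 − o) and h(1 − h) instead of evaluating the exponential a second time.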
12
Supervised Learning in the BPN
  • Now our BPN is ready to go!
  • If we choose the type and number of neurons in
    our network appropriately, after training the
    network should show the following behavior:
  • If we input any of the training vectors, the
    network should yield the expected output
    vector (with some margin of error).
  • If we input a vector that the network has never
    seen before, it should be able to
    generalize and yield a plausible output
    vector based on its knowledge about similar
    input vectors.

13
Self-Organizing Maps (Kohonen Maps)
In the BPN, we used supervised learning. This is
not biologically plausible: In a biological
system, there is no external teacher who
manipulates the network's weights from outside
the network. Biologically more adequate is
unsupervised learning. We will study
Self-Organizing Maps (SOMs) as an example of
unsupervised learning (Kohonen, 1980).
14
Self-Organizing Maps (Kohonen Maps)
In the human cortex, multi-dimensional sensory
input spaces (e.g., visual input, tactile input)
are represented by two-dimensional maps. The
projection from sensory inputs onto such maps is
topology conserving. This means that neighboring
areas in these maps represent neighboring areas
in the sensory input space. For example,
neighboring areas in the sensory cortex are
responsible for neighboring body parts such as
the arm and the hand.
15
Self-Organizing Maps (Kohonen Maps)
  • Such topology-conserving mapping can be achieved
    by SOMs:
  • Two layers: input layer and output (map) layer,
  • Input and output layers are completely
    connected.
  • Output neurons are interconnected within a
    defined neighborhood.
  • A topology (neighborhood relation) is defined
    on the output layer.

16
Self-Organizing Maps (Kohonen Maps)
Common output-layer structures:
One-dimensional (completely interconnected)
Two-dimensional (connections omitted, only
neighborhood relations shown in green)
17
Self-Organizing Maps (Kohonen Maps)
A neighborhood function φ(i, k) indicates how
closely neurons i and k in the output layer are
connected to each other. Usually, a Gaussian
function of the distance between the two neurons
in the layer is used:
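The Gaussian itself does not appear in the transcript; a common choice, where r_i and r_k denote the neurons' positions in the output layer and σ the (shrinking) neighborhood width, would be:

```latex
\varphi(i, k) = \exp\!\left( -\frac{\lVert r_i - r_k \rVert^2}{2\sigma^2} \right)
```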
18
Unsupervised Learning in SOMs
For n-dimensional input space and m output
neurons
(1) Choose random weight vector wi for neuron i,
i 1, ..., m
(2) Choose random input x
(3) Determine winner neuron k wk
x mini wi x (Euclidean distance)
(4) Update all weight vectors of all neurons i in
the neighborhood of neuron k wi wi
??(i, k)(x wi) (wi is shifted towards x)
(5) If convergence criterion met, STOP.
Otherwise, narrow neighborhood function ? and
learning parameter ? and go to (2).
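A minimal Python sketch of steps (1) to (5), using the Gaussian neighborhood from the previous slide; the grid size, initial η and σ, and the simple exponential decay used to narrow them are assumptions made for illustration.

```python
# Minimal SOM training sketch following steps (1)-(5) above.
# Grid shape, eta, sigma, and their decay schedule are assumed values.
import numpy as np

rng = np.random.default_rng(0)

def train_som(X, grid=(10, 10), eta=0.5, sigma=3.0, iterations=2000):
    rows, cols = grid
    pos = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    W = rng.random((rows * cols, X.shape[1]))         # (1) random weight vectors
    for _ in range(iterations):
        x = X[rng.integers(len(X))]                   # (2) random input
        k = np.argmin(np.linalg.norm(W - x, axis=1))  # (3) winner: minimal Euclidean distance
        d2 = np.sum((pos - pos[k]) ** 2, axis=1)
        phi = np.exp(-d2 / (2 * sigma ** 2))          # Gaussian neighborhood around the winner
        W += eta * phi[:, None] * (x - W)             # (4) shift weights towards x
        eta *= 0.999                                  # (5) narrow eta and sigma, then repeat
        sigma *= 0.999
    return W.reshape(rows, cols, -1)

# Usage: learn a 2D map of a square 2D input space, as in Example II below.
som = train_som(rng.random((1000, 2)))
```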
19
Unsupervised Learning in SOMs
Example I: Learning a one-dimensional
representation of a two-dimensional (triangular)
input space
20
Unsupervised Learning in SOMs
Example II: Learning a two-dimensional
representation of a two-dimensional (square)
input space
21
Unsupervised Learning in SOMs
Example III: Learning a two-dimensional mapping
of texture images
22
The Hopfield Network
  • The Hopfield model is a single-layered recurrent
    network.
  • It is usually initialized with appropriate
    weights instead of being trained.
  • The network structure looks as follows:

(Figure: a single recurrent layer of units X1, X2, ..., XN)

23
The Hopfield Network
  • We will focus on the discrete Hopfield model,
    because its mathematical description is more
    straightforward.
  • In the discrete model, the output of each neuron
    is either 1 or −1.
  • In its simplest form, the output function is the
    sign function, which yields 1 for arguments ≥ 0
    and −1 otherwise.

24
The Hopfield Network
  • For input-output pairs (x1, y1), (x2, y2), …,
    (xP, yP), we can initialize the weights in the
    following way (like associative memory):

This is identical to the following formula:
where xp(j) is the j-th component of vector xp,
and yp(i) is the i-th component of vector yp.
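Both formulas referenced on this slide are missing from the transcript. Based on the component description just given, the rule is presumably the standard Hebbian prescription (possibly with a normalization factor):

```latex
w_{ij} = \sum_{p=1}^{P} y_p(i)\, x_p(j)
\qquad\text{i.e.}\qquad
W = \sum_{p=1}^{P} y_p\, x_p^{T}
```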
25
The Hopfield Network
  • In the discrete version of the model, each
    component of an input or output vector can only
    assume the values 1 or −1.
  • The output of a neuron i at time t is then
    computed according to the following formula:

This recursion can be performed over and over
again. In some network variants, external input
is added to the internal, recurrent one.
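The formula itself is not reproduced here; for the discrete model with the sign output function it is typically written as follows (using the slide's notation, with N components):

```latex
x_i(t+1) = \operatorname{sgn}\!\left( \sum_{j=1}^{N} w_{ij}\, x_j(t) \right)
```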
26
The Hopfield Network
  • Usually, the vectors xp are not orthonormal, so
    it is not guaranteed that whenever we input some
    pattern xp, the output will be yp, but it will be
    a pattern similar to yp.
  • Since the Hopfield network is recurrent, its
    behavior depends on its previous state and in the
    general case is difficult to predict.
  • However, what happens if we initialize the
    weights with a set of patterns so that each
    pattern is being associated with itself: (x1,
    x1), (x2, x2), …, (xP, xP)?

27
The Hopfield Network
  • This initialization is performed according to the
    following equation:

You see that the weight matrix is symmetrical,
i.e., wij = wji. We also demand that wii = 0, in
which case the network shows an interesting
behavior. It can be mathematically proven that
under these conditions the network will reach a
stable activation state within a finite number
of iterations.
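A minimal Python sketch of this auto-associative case: the weights are built from the stored ±1 patterns with a symmetric matrix and zero diagonal, and recall simply iterates the sign update until the state stops changing. The example patterns and the synchronous update order are illustrative assumptions.

```python
# Minimal auto-associative Hopfield sketch; patterns are +/-1 vectors.
import numpy as np

def hopfield_weights(patterns):
    # w_ij = sum_p x_p(i) * x_p(j), with w_ii = 0 (symmetric by construction)
    X = np.asarray(patterns, dtype=float)
    W = X.T @ X
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, x, max_iter=100):
    x = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        x_new = np.where(W @ x >= 0, 1.0, -1.0)  # sign update, +1 for net >= 0
        if np.array_equal(x_new, x):             # stable state reached
            break
        x = x_new
    return x

# Usage: store two 8-component patterns, then restore a noisy version of the first.
p1 = np.array([1, -1, 1, -1, 1, -1, 1, -1])
p2 = np.array([1, 1, 1, 1, -1, -1, -1, -1])
W = hopfield_weights([p1, p2])
noisy = p1.copy()
noisy[0] = -1                                    # flip one component
print(recall(W, noisy))                          # converges back towards p1
```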
28
The Hopfield Network
  • And what does such a stable state look like?
  • The network associates input patterns with
    themselves, which means that in each iteration,
    the activation pattern will be drawn towards one
    of those patterns.
  • After converging, the network will most likely
    present one of the patterns that it was
    initialized with.
  • Therefore, Hopfield networks can be used to
    restore incomplete or noisy input patterns.

29
The Hopfield Network
  • Example: Image reconstruction (Ritter, Schulten &
    Martinetz, 1990)
  • A 20×20 discrete Hopfield network was trained
    with 20 input patterns, including the one shown
    in the left figure and 19 random patterns like
    the one on the right.

30
The Hopfield Network
  • After providing only one fourth of the face
    image as initial input, the network is able to
    perfectly reconstruct that image within only two
    iterations.

31
The Hopfield Network
  • Adding noise by changing each pixel with a
    probability p = 0.3 does not impair the network's
    performance.
  • After two steps the image is perfectly
    reconstructed.

32
The Hopfield Network
  • However, for noise created with p = 0.4, the
    network is unable to restore the original image.
  • Instead, it converges to one of the 19
    random patterns.

33
The Hopfield Network
  • The Hopfield model constitutes an interesting
    neural approach to identifying partially occluded
    objects and objects in noisy images.
  • These are among the toughest problems in computer
    vision.