Title: RPROP: Resilient Propagation
1. RPROP: Resilient Propagation
- Students: Novica Zarvic, Roxana Grigoras
- Term: Winter semester 2003/2004
- Lecture: Machine Learning / Neural Networks
- Course: Information Engineering
- Date: 2004-01-06
2. Content
- Part I
  - General remarks
  - Foundations (MLP, supervised learning, backpropagation and its problems)
- Part II
  - Description of the RProp algorithm
  - Example cases
- Part III
  - Visualization with SNNS
  - Discussion
3. General remarks
(Part I)
- Basis for this talk:
- "Rprop: Description and Implementation Details" (technical report by Martin Riedmiller, January 1994)
- URL: http://lrb.cs.uni-dortmund.de/riedmill/publications/rprop.details.ps.Z
4. MLP: Multi-Layer Perceptron
(Part I)
[Figure: feed-forward network with an input layer, hidden layer(s), and an output layer]
Topology of a typical feed-forward network with two hidden layers. The external input is presented to the input layer, propagated forward through the hidden layers, and yields an output activation vector in the output layer.
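A minimal sketch of such a forward pass, assuming sigmoid activations and omitting bias terms for brevity (layer sizes and weight values are illustrative only):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, weights):
    """Propagate the input activation x layer by layer to the output."""
    activation = x
    for W in weights:                       # one weight matrix per layer
        activation = sigmoid(W @ activation)
    return activation                       # output activation vector

# Illustrative topology: 3 inputs, two hidden layers of 4 units, 2 outputs.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 3)),
           rng.standard_normal((4, 4)),
           rng.standard_normal((2, 4))]
print(forward(np.array([0.1, 0.5, -0.3]), weights))
```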
5. Supervised Learning
(Part I)
- Objective: tune the weights in the network such that the network performs a desired mapping of input to output activations.
6. Principle of supervised learning (like BP or one of its derivatives)
(Part I)
- Presentation of the input pattern through activation of the input units. The pattern set P consists of pairs of an input activation vector x_p and a target vector t_p.
- Feedforward computation to obtain the resulting output vector s_p.
- Comparison of s_p with t_p. The distance between the two vectors is measured by the error function E = \frac{1}{2} \sum_{p \in P} \sum_{n} (t_{pn} - s_{pn})^2, where n runs over the units of the output layer and p over the pattern pairs of the pattern set P (see the sketch after this list).
- Backpropagation of the errors from the output layer towards the input layer, yielding the partial derivatives of E with respect to the connection weights.
- Changing the weights of all connections with the previously calculated values, which reduces the error.
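A minimal sketch of this error computation, assuming the outputs and targets of the pattern set are stacked into NumPy arrays (the values below are illustrative only):

```python
import numpy as np

def sum_of_squares_error(outputs, targets):
    """E = 1/2 * sum over patterns p and output units n of (t_pn - s_pn)^2."""
    return 0.5 * np.sum((targets - outputs) ** 2)

# Two patterns with three output units each (illustrative values).
s = np.array([[0.9, 0.1, 0.2],
              [0.3, 0.8, 0.4]])
t = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
print(sum_of_squares_error(s, t))
```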
7. Problems of Backpropagation
(Part I)
- No information about the complete error function is available, so it is difficult to choose a good learning rate. Typical consequences are:
  - a. local minima of E
  - b. plateaus
  - c. oscillation
  - d. leaving good minima
- It uses only weight-specific information (the partial derivative) to adapt weight-specific parameters.
8. RPROP: Resilient Propagation
(Part II)
- What is the traditional backpropagation algorithm doing?
- It modifies the weights in proportion to the partial derivatives ∂E/∂w_ij.
- Problem: the size of this derivative does not really represent the size of the weight change that is actually needed.
- Solution: RProp does not rely on the value of the partial derivative. It considers only the sign of the derivative to indicate the direction of the weight update.
9. RPROP: Description
(Part II)
- Effective learning scheme.
- It performs a direct adaptation of the weight step based on local gradient information.
- The basic principle of RProp is to eliminate the harmful influence of the size of the partial derivative on the weight step.
- It considers only the sign of the derivative to indicate the direction of the weight update.
10. RPROP: Resilient Propagation
(Part II)
11. RPROP: What is Δ_ij?
(Part II)
- Δ_ij is an update value.
- The size of the weight change is exclusively determined by this weight-specific update value.
- Δ_ij evolves during the learning process based on its local view of the error function E, according to the following learning rule.
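The rule reads (with η⁻ and η⁺ the decrease and increase factors listed on the settings slide, 0 < η⁻ < 1 < η⁺):

```latex
\Delta_{ij}^{(t)} =
\begin{cases}
  \eta^{+} \cdot \Delta_{ij}^{(t-1)} & \text{if } \frac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \frac{\partial E}{\partial w_{ij}}^{(t)} > 0 \\[4pt]
  \eta^{-} \cdot \Delta_{ij}^{(t-1)} & \text{if } \frac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \frac{\partial E}{\partial w_{ij}}^{(t)} < 0 \\[4pt]
  \Delta_{ij}^{(t-1)} & \text{otherwise}
\end{cases}
```

Whenever the partial derivative keeps its sign, the update value grows to accelerate convergence; whenever it changes sign, the update value shrinks because the last step was too large.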
12. RPROP
(Part II)
- The weight update Δw_ij follows a simple rule:
- If the derivative is positive (increasing error), the weight is decreased by its update value.
- If the derivative is negative, the update value is added.
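Written out, with w_ij^(t+1) = w_ij^(t) + Δw_ij^(t):

```latex
\Delta w_{ij}^{(t)} =
\begin{cases}
  -\Delta_{ij}^{(t)} & \text{if } \frac{\partial E}{\partial w_{ij}}^{(t)} > 0 \\[4pt]
  +\Delta_{ij}^{(t)} & \text{if } \frac{\partial E}{\partial w_{ij}}^{(t)} < 0 \\[4pt]
  0 & \text{otherwise}
\end{cases}
```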
13. RPROP: One exception (take bad steps back!)
(Part II)
- If the partial derivative changes sign, i.e. the previous step was too large and the minimum was missed, the previous weight update is reverted.
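Formally, this weight backtracking is:

```latex
\Delta w_{ij}^{(t)} = -\Delta w_{ij}^{(t-1)},
\quad \text{if } \frac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \frac{\partial E}{\partial w_{ij}}^{(t)} < 0
```

After such a reversal, the stored derivative is set to zero so that the update value is not decreased a second time in the following step.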
14. RPROP: The pseudo code
(Part II)
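A runnable sketch of one RPROP iteration, combining the learning rule, the weight update, and the backtracking exception from the previous slides. The names (rprop_update, grads, deltas, prev_steps) and the vectorized NumPy formulation are illustrative, not the notation of Riedmiller's report:

```python
import numpy as np

ETA_PLUS, ETA_MINUS = 1.2, 0.5       # increase / decrease factors
DELTA_MAX, DELTA_MIN = 50.0, 1e-6    # upper / lower limits on the update values
DELTA_ZERO = 0.1                     # initial update value

def rprop_update(weights, grads, prev_grads, deltas, prev_steps):
    """Apply one RPROP step to `weights` in place.

    Returns the gradients and steps to remember for the next iteration.
    """
    sign_change = grads * prev_grads
    grow = sign_change > 0           # same sign: accelerate
    shrink = sign_change < 0         # sign flipped: the minimum was overshot
    deltas[grow] = np.minimum(deltas[grow] * ETA_PLUS, DELTA_MAX)
    deltas[shrink] = np.maximum(deltas[shrink] * ETA_MINUS, DELTA_MIN)
    # Step against the sign of the current derivative (slide 12).
    steps = -np.sign(grads) * deltas
    # Exception (slide 13): where the sign flipped, revert the previous step.
    steps[shrink] = -prev_steps[shrink]
    weights += steps
    # Forget the derivative on reverted weights so the update value is not
    # decreased again in the next iteration.
    remembered = grads.copy()
    remembered[shrink] = 0.0
    return remembered, steps

# Usage on a toy error E(w) = sum(w**2), whose gradient is 2*w.
w = np.array([2.0, -3.0])
deltas = np.full_like(w, DELTA_ZERO)
prev_grads = np.zeros_like(w)
prev_steps = np.zeros_like(w)
for _ in range(50):
    grads = 2.0 * w
    prev_grads, prev_steps = rprop_update(w, grads, prev_grads, deltas, prev_steps)
print(w)                              # converges towards the minimum at [0, 0]
```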
15. RPROP: Settings
(Part II)
- Increasing and decreasing factors:
- η⁻ = 0.5 (decrease factor)
- η⁺ = 1.2 (increase factor)
- Limits:
- Δ_max = 50.0 (upper limit)
- Δ_min = 1e-6 (lower limit)
- Initial value:
- Δ₀ = 0.1 (default setting)
16. RPROP: Backprop vs. RProp
(Part III)
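The essential difference between the two update rules, sketched for a single weight (names are illustrative; the adaptation of the update value Δ_ij is omitted here):

```python
import numpy as np

def backprop_step(w, grad, lr=0.05):
    # Plain backpropagation: the step size scales with the magnitude of
    # the partial derivative and with a global learning rate.
    return w - lr * grad

def rprop_step(w, grad, delta):
    # RProp: only the sign of the derivative is used; the step size is
    # the weight-specific update value delta.
    return w - np.sign(grad) * delta
```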
17. RPROP: Discussion
(Part III)
- Compared to all other algorithms, only the sign of the derivative is used to perform learning and adaptation.
- In backpropagation, the size of the derivative decreases exponentially with the distance between the weight and the output layer.
- With RProp, the size of the weight step depends only on the sequence of signs, so learning is spread equally over the entire network.
18. RPROP: Further material
(Part III)
- Advanced Supervised Learning in Multi-layer Perceptrons: From Backpropagation to Adaptive Learning Algorithms (Martin Riedmiller)
- A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm (Martin Riedmiller)
- Rprop: Description and Implementation Details (Martin Riedmiller)
19. RPROP: Resilient Propagation
(Part III)
- Thank you for listening!