Additional NN Models - PowerPoint PPT Presentation

1 / 17
About This Presentation
Title:

Additional NN Models

Description:

Additional NN Models Reinforcement learning (RL) Basic ideas: Supervised learning: (delta rule, BP) Samples (x, f(x)) to learn function f(.) precise error can be ... – PowerPoint PPT presentation

Number of Views:32
Avg rating:3.0/5.0
Slides: 18
Provided by: YunP150
Category:

less

Transcript and Presenter's Notes

Title: Additional NN Models


1
Additional NN Models
  • Reinforcement learning (RL)
  • Basic ideas
  • Supervised learning (delta rule, BP)
  • Samples (x, f(x)) to learn function f(.)
  • precise error can be determined and is used to
    drive the learning.
  • Unsupervised learning (competitive, BM)
  • no target/desired output provided to help
    learning,
  • learning is self-organized/clustering
  • reinforcement learning in between the two
  • no target output for input vectors in training
    samples
  • a judge/critic will evaluate the output
  • good reward signal (1)
  • bad penalty signal (-1)

2
  • RL exists in many places
  • Originated from psychology( training animal)
  • Machine learning community, different theories
    and algorithms
  • major difficulty credit/blame distribution
  • chess playing W/L (multi-step)
  • soccer playing W/L(multi-player)
  • In many applications, it is much easier to
    determine good/bad, right/wrong,
    acceptable/unacceptable than to provide precise
    correct answer/error.
  • It is up to the learning process to improve the
    systems performance based on the critics signal.

3
  • Principle of RL
  • Let r 1 reword (good output)
  • r -1 penalty (bad output)
  • If r 1, the system is encouraged to continue
    what it is doing
  • If r -1, the system is encouraged not to do
    what it is doing.
  • Need to search for better output
  • because r -1 does not indicate what the good
    output should be.
  • common method is random search

4
  • ARP the associative reword-and-penalty algorithm
    for NN (Barton and Anandan, 1985)
  • Architecture

input x(k) output y(k) stochastic units z(k)
for random search
5
  • Random search by stochastic units zi
  • or let zi obey a continuous probability
    distribution
  • function.
  • or let is
    a random noise, obeys
  • certain distribution.
  • Key z is not a deterministic function of x,
    this gives z a chance to be a good
    output.
  • Prepare desired output (temporary)

6
  • Compute the errors at z layer
  • where E(z(k)) is the expected value of z(k)
    because z is a random variable
  • How to compute E(z(k))
  • take average of z over a period of time
  • compute from the distribution, if possible
  • if logistic sigmoid function is used,
  • Training BP or other method to minimize the
    error

7
  • Recurrent BP
  • Recurrent networks network with feedback links
  • State (output) of the network evolves along the
    time.
  • may or may not have hidden nodes.
  • may or may not stabilize when
  • how to learn w so that an initial state (input)
    will lead to
  • a stable state with the desired output.
  • 2. Unfolding
  • for any recurrent network with finite evolution
    time, there is an equivalent feedforward network.
  • problems
  • too many repetitions
  • too many layers when the network need a long
    time to

8
  • reach stable state.
  • standard BP needs to be relized to hard
    duplicate weights.
  • 3. Recurrent BP (1987)
  • system
  • assume at least one fixed point exists for the
    system with the given initial state
  • when a fixed point is reduced
  • can be obtained.
  • error

9
  • take the gradient descent approach to minimize E
    by update W
  • direct derivation will have

10
  • Computing is very time consuming.
  • Pineda and Almeida/s proposal
  • can be computed by another recurrent net
  • with identical structure of the original RN
  • direction of each are is reversed( transposed
    network)
  • in the original network weight for node j to i
    Wij
  • in the transposed network, weight for node j to
    i

11
(No Transcript)
12
  • Weight-update procedure for RBP
  • with a given input and its desired output
  • Relax the original network to a fixed point
  • Compute error
  • Relax the transposed network to a fixed point
  • Update the weight of the original network

13
  • The complete learning algorithm
  • incremental/sequential
  • W is updated by the preseting of each learning
    pair using the weight-update procedure.
  • to ensure the dearned network is stable,
    learning rate must be small(much smaller than the
    rate for standard BP learning)
  • time consuming two relaxation processes are
    involved for each step of weight update
  • better performance than BP in some applications

14
  • (III) Probabilistic Neural Networks
  • 1. Purpose classify a given input pattern x
    into one of the
  • pre-defined classes by Bayesian decision
    rule.
  • Suppose there are k predefined classes s1,
    sk
  • P(si) prior probability of class si
  • P(xsi) conditional probability of class si
  • P(x) probability of x
  • P(six) posterior probability of si, given x
  • example
  • Ss1Us2Us3.Usk, the set of all patients
  • si the set of all patients having disease I
  • x a description of a patient(manifestation)

15
  • P(xsi) prob. One with disease I will have
  • description x
  • P(six) prob. one with description x will have
  • disease i.
  • by Bayes theorem

16
  • 2. PNN architecture feed forward with 2 hidden
    layers
  • learning is not used to minimize error but to
    obtain P(xsi)
  • 3. Learning
  • assumption P(si) are known, P(xsi) obey
    Gaussian distr.
  • estimate

17
  • 4.Comments
  • (1) Bayesian classification by
  • (2) fast classification( especially if
    implemented in parallel
  • machine).
  • (3) fast learning
  • (4) trade nodes for time( not good with large
    training
  • samaples/clusters).
Write a Comment
User Comments (0)
About PowerShow.com