Pattern recognition and classification II
1
  • Lecture 8
  • Pattern recognition and classification II
  • Contents
  • Neural networks


2
Classification using neural networks

Basic principles
The network transforms an input vector to an output vector. No logical/mathematical rules are known about how the output is related to the input; only examples are available.

Training
The parameters of the network are adjusted so that the network reproduces the examples as well as possible.

Recognition
The network receives input which may or may not be among the examples. The output is the recognition result.

3
Performance
  • Generalization capability: correct classification of examples not included in the training
  • Speed of training
  • Speed of recognition
  • Adaptivity: adjustment of internal parameters according to new examples presented

4
[No transcript for this slide.]
5
Neural networks in image analysis
Pattern classification in image analysis can be based on neural networks. The preprocessed image defines in some way the input of the net, and the output is designed to define the resulting class of the image.

The simplest scheme is to assign each binary pixel to an input neuron, and to let the output consist of K output neurons. If the input is an image of class k, the net is adjusted to give output neuron k the value 1 and the remaining output neurons the value 0 *). Adjusted means that the weights of the network are chosen to give approximately the correct output when the training examples are applied.

A good, well-trained network gives a high value of the kth output neuron and low values of the remaining output neurons if an image similar to a class k image is input.

_______________________________________________
*) Other mappings between the output vector and the class index can also be used, for example the binary representation of the index.
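To make the two output conventions concrete, here is a minimal sketch in Python/NumPy (my own illustration, not from the slides) of the one-hot mapping and the binary-index alternative from the footnote:

import numpy as np

def one_hot(k, K):
    # Target vector: output neuron k gets value 1, the rest get 0.
    t = np.zeros(K)
    t[k] = 1.0
    return t

def binary_code(k, n_bits):
    # Alternative mapping: the binary representation of the class index.
    return np.array([(k >> b) & 1 for b in reversed(range(n_bits))], dtype=float)

print(one_hot(3, 10))     # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
print(binary_code(3, 4))  # [0. 0. 1. 1.]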
6
OCR (optical character recognition)
Example: the 10 characters 0, 1, ..., 9, normalized to a binary bitmap of 10 x 10 pixels.
7
. . . X X X X . . .
. . X . . . . X . .
. . X . . . . . . .
. . X . . . . . . .
. . X . X X X . . .
. . X X . . . X . .
. . X . . . . X . .
. . X . . . . X . .
. . X . . . . X . .
. . . X X X X . . .
6

. X X X X X X X . .
. . . . . . X . . .
. . . . . . X . . .
. . . . . X . . . .
. . . . . X . . . .
. . . . X . . . . .
. . . . X . . . . .
. . . X . . . . . .
. . . X . . . . . .
. . . X . . . . . .
7
Typical example
Input: 10 x 10 binary numbers
Output: class index
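A sketch (helper names are my own, using the digit 7 bitmap above) of how such a bitmap becomes the 100-element input vector of the net:

import numpy as np

# '.' = background pixel (0), 'X' = foreground pixel (1), as in the bitmaps above.
rows_of_seven = [
    ".XXXXXXX..",
    "......X...",
    "......X...",
    ".....X....",
    ".....X....",
    "....X.....",
    "....X.....",
    "...X......",
    "...X......",
    "...X......",
]

def bitmap_to_input(rows):
    # Flatten a 10 x 10 character bitmap into a 100-element 0/1 vector.
    return np.array([1.0 if c == "X" else 0.0 for row in rows for c in row])

y0 = bitmap_to_input(rows_of_seven)  # input vector y_0 with 100 components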
8
Notation concerning the net
L: total number of layers minus 1; l is the layer index; the input has l = 0, the output has l = L
n_l: the number of neurons in layer l
h_l,i: the summed input to the ith neuron of layer l
y_l,i: the output from the ith neuron of layer l
w_l,i,j: the weight between neuron j in layer l-1 and neuron i in layer l
g_l(h): the activation function of layer l

h_l,i = Σ_j w_l,i,j · y_l-1,j
y_l,i = g_l(h_l,i)
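As an illustration of this notation, a minimal forward pass in Python/NumPy (a sketch of mine, assuming a fully connected net and, as one possible choice of g_l, the logistic function):

import numpy as np

def g(h):
    # Logistic activation; one possible choice for g_l.
    return 1.0 / (1.0 + np.exp(-h))

def forward(weights, y0):
    # weights[l-1] is the matrix (w_l,i,j) with shape (n_l, n_l-1); y0 is the input y_0.
    y = y0
    for W in weights:   # layers l = 1 .. L
        h = W @ y       # h_l,i = Σ_j w_l,i,j · y_l-1,j
        y = g(h)        # y_l,i = g_l(h_l,i)
    return y            # the output signals y_L,i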
9
Notation concerning the example set
m: the example index
y^m_0,j: the signal on the jth input neuron for example m
t^m_j: the target (i.e. the desired) signal on the jth output neuron for example m

Combined net and example quantity
y^m_L,j: the signal on the jth output neuron when the input of example m is applied

Cost function
E = ½ Σ_m Σ_i (y^m_L,i - t^m_i)²
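The cost function in code, continuing the forward-pass sketch above (names are my own):

def cost(weights, examples):
    # E = 1/2 * Σ_m Σ_i (y^m_L,i - t^m_i)^2,
    # where examples is a list of (input vector y^m_0, target vector t^m) pairs.
    E = 0.0
    for y0_m, t_m in examples:
        yL_m = forward(weights, y0_m)
        E += 0.5 * np.sum((yL_m - t_m) ** 2)
    return E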
10
Minimize E = ½ Σ_m Σ_i (y^m_L,i - t^m_i)², or try solving the many equations
y^m_L,i - t^m_i = 0 for m = 1..M and i = 1..n_L
The equations are still NONLINEAR.
Number of equations = number of examples · n_L = M·n_L
Number of unknowns = number of weights = N_w

Two extremes
M·n_L >> N_w: Overdetermined system of equations. Many local minima; the global minimum has E > 0. Good generalization capability.
M·n_L < N_w: Underdetermined system of equations. Few local minima; the global minimum has E = 0. Poor generalization capability.
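To put numbers on the two extremes (my own illustrative figures, based on the OCR example): with 100 input neurons, one hidden layer of 20 neurons and n_L = 10 output neurons, N_w = 100·20 + 20·10 = 2200 weights. Training on M = 1000 examples gives M·n_L = 10000 >> 2200 (overdetermined); training on M = 100 examples gives M·n_L = 1000 < 2200 (underdetermined).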
11
  • Partial derivatives
  • ∂E/∂w_l,i,j can be calculated
  • Gradient descent (see the sketch after this list)
  • w_l,i,j,new = w_l,i,j,old - η · ∂E/∂w_l,i,j
  • If η is not too large and ∂E/∂w_l,i,j is not zero, the cost function will decrease in such a step.
  • Aim
  • To reach the global minimum for E in reasonable time
  • Traps
  • Training ends in a poor local minimum
  • Training takes too long
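A sketch of the gradient-descent update in code (assuming the gradients ∂E/∂w have already been computed, e.g. by backpropagation; the value of η is illustrative):

def gradient_descent_step(weights, grads, eta=0.01):
    # w_new = w_old - η · ∂E/∂w, applied to every weight matrix in the list.
    return [W - eta * dW for W, dW in zip(weights, grads)]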

12
  • A modern neural net can have more than 1000 weights
  • => Search for a minimum in a 1000-dimensional space
  • One cannot be sure that a minimum found is global
  • Tricks in the network design
  • 1. Choosing good input representations
  • 2. Choosing good numbers of hidden layers and of neurons in each hidden layer
  • Tricks in the training procedure
  • 1. Choosing good start guesses for the weights
  • 2. Choosing good values of η (possibly dynamically adjusted)
  • 3. Choosing a good stop criterion
  • 4. Applying selective brain damage (pruning)

13
Test
Performance test of classification/recognition on examples which have not been included in the training, for the purpose of redesign/retraining.

Validation
Performance test of classification/recognition on examples which have not been included in the training, for the purpose of benchmarking.

Adaptive adjustment
Changes of weights by dynamically including new training examples.
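To make the distinction concrete, a sketch (helper names and split fractions are my own choices) of keeping test and validation examples out of the training set:

import numpy as np

def split_examples(examples, f_train=0.7, f_test=0.15, seed=0):
    # Shuffle, then split into a training set, a test set (for redesign/
    # retraining) and a validation set (for benchmarking).
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(examples))
    n_train = int(f_train * len(examples))
    n_test = int(f_test * len(examples))
    train = [examples[i] for i in idx[:n_train]]
    test = [examples[i] for i in idx[n_train:n_train + n_test]]
    validation = [examples[i] for i in idx[n_train + n_test:]]
    return train, test, validation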
14
Overtraining
[Figure: error curve over the course of training, with the annotation "Stop training here".]
15
  • Efficient way of calculating ∂E/∂w_l,i,j: backpropagation
  • Here the determination of the partial derivatives has the same complexity as a forward calculation
  • Modern research in neural networks: other updating schemes than backpropagation; automatic removal of outliers among the examples
  • Two different principles of gradient descent (sketched in code after this list)
  • Batch training: all examples produce an average value of ∂E/∂w_l,i,j, which is then used for updating the weights
  • Online training: the value of ∂E/∂w_l,i,j is calculated for each example and used for updating the weights before going on to the next example
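The two principles as a sketch, assuming a hypothetical function grad(weights, example) that returns the list of gradient matrices ∂E/∂w for one example (e.g. computed by backpropagation):

import numpy as np

def batch_epoch(weights, examples, eta):
    # Batch training: average ∂E/∂w over all examples, then one weight update.
    grads = [np.zeros_like(W) for W in weights]
    for ex in examples:
        for G, dW in zip(grads, grad(weights, ex)):  # grad() is assumed given
            G += dW
    return [W - eta * G / len(examples) for W, G in zip(weights, grads)]

def online_epoch(weights, examples, eta):
    # Online training: update the weights after every single example.
    for ex in examples:
        weights = [W - eta * dW for W, dW in zip(weights, grad(weights, ex))]
    return weights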

16
Advantage of neural networks over expert systems
No model, no logical/mathematical rules involved.

Disadvantage of neural networks
When a network makes a mistake, it is often against all expectations, i.e. the mistaken case seems to lie inside the generality of the training examples. When the network is redesigned or retrained with the mistaken case included as an example, new surprising mistakes may appear. In other words, there is no such thing as a perfect neural network (unless all possible inputs are trained and the network is sufficiently large).
17
  • Applications of neural networks
  • Time series (say, stock prices)
  • Hyphenation
  • OCR of postal zip codes (written by hand)
  • Assessment of bank customers (by the bank, before lending money)

18
The following is a provocative example: the input vector in case 4, assessment of bank customers (by the bank, before lending money).

Input vector: age, married or not, number of divorces, number of children, age at death of already dead parents, sisters and children, did or did not do military service, hair color, zip code of residence, color blind or not, left-handed or not, smoker or not, diabetic or not, etc., etc.

Output: risk of lending money.

Training: using input vectors and outputs of previous customers.