One-layer neural networks. Classification problems
Neural Networks - lecture 3

1
One-layer neural networks. Classification problems
  • Architecture and functioning
  • Applicability
  • Classification problems. Linear and nonlinear
    separability
  • The Perceptron. The learning algorithm

2
Architecture
  • One-layer NN: one layer of input units and one
    layer of functional units

[Figure: architecture diagram - N input units (input
vector X) fully connected ("total connectivity") through
the weight matrix W to M functional (output) units
(output vector Y); a fictive unit with constant input -1
is added to the input layer]
3
Functioning
  • Notations
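
A plausible sketch of the standard notations, assuming the
fictive-unit setup shown on the architecture slide:

  X = (x_1, ..., x_N)              - input vector
  bar X = (-1, x_1, ..., x_N)      - extended input vector
                                     (the fictive unit contributes x_0 = -1)
  W = (w_{ij}), i = 1..M, j = 0..N - weight matrix
                                     (w_{i0} plays the role of a threshold)
  Y = (y_1, ..., y_M)              - output vector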

4
Functioning
  • Computing the output signal
  • Remarks
  • In the following we shall use X instead of bar X
    to denote the extended vector
  • The output units usually have the same activation
    function
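
A sketch of the output computation, under the same
assumptions as the notations above:

  y_i = f( sum_{j=0}^{N} w_{ij} x_j ), i = 1, ..., M

i.e. Y = f(W X) with f applied componentwise (X denoting
the extended vector, as agreed above).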

5
Applicability
  • Classification
  • Problem: find the class corresponding to an
    object described by a set of features
  • The training set contains examples of correctly
    classified objects

6
Applicability
  • Approximation (regression)
  • Problem: estimate a functional dependence
    between two variables
  • The training set contains pairs of corresponding
    values

7
Classification problems
  • Input: feature vectors
  • Output: class labels
  • Basic notions:
  • Feature space (S): the set of all feature vectors
    (patterns)
  • Example: all representations of letters
  • Class: a subset of S containing objects with
    similar features
  • Example: the class of representations of the letter A
  • Classifier: a system which decides to what class a
    given object belongs
  • Example: a method based on the nearest neighbor
    with respect to a standard pattern
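
To illustrate the last example, a minimal Python sketch of a
nearest-neighbor classifier with one standard pattern
(prototype) per class; the prototype vectors here are
hypothetical:

import numpy as np

# Hypothetical standard patterns (one prototype per class),
# using 5-element property vectors as on the next slide
prototypes = {
    "A": np.array([0.0, 1.0, 1.0, 0.0, 0.0]),
    "B": np.array([1.0, 1.0, 0.0, 0.0, 1.0]),
}

def classify(x):
    """Assign x to the class whose standard pattern is
    nearest (Euclidean distance)."""
    return min(prototypes, key=lambda c: np.linalg.norm(x - prototypes[c]))

print(classify(np.array([0.0, 1.0, 1.0, 0.0, 1.0])))  # prints "A" here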

8
Classification problems
  • Feature vectors

[Figure: a 5x5 matrix of active/inactive pixels
representing a letter]

The pixel matrix, flattened, gives the feature vector
(0,0,1,0,0,0,1,1,0,0,1,0,1,0,0,0,0,1,0,0,0,0,1,0,0)

Alternatively, a vector recording the presence of some
properties - horizontal line, vertical line, oblique
(right), oblique (left), curves - e.g.
(0,1,1,0,0)
9
Classification problems
  • A more formal approach: a classifier is
    equivalent to a partitioning of S into subsets
    based on some decision functions
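
A standard formalization, assumed here: attach to each
class C_k a decision function g_k : S -> R and define

  R_k = { X in S : g_k(X) > g_j(X) for all j <> k }

The classifier assigns X to class k when X lies in R_k;
the sets R_k partition S (up to ties on the boundaries).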

10
Classification problems
  • Example: N = 2, M = 2 (two-dimensional data and
    2 classes)

[Figure: a line given by the decision function separates
two linearly separable classes; nonlinearly separable
classes cannot be split by a single line]
11
Classification problems
  • In the N-dimensional case, two classes are
    considered to be linearly separable if there
    exists a hyperplane which separates them

The concept of linear separability can be
extended to the case of more than two classes
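
In formulas (a standard definition, assumed here): C_1 and
C_2 are linearly separable if there exist w in R^N and
w_0 in R such that

  w . X + w_0 < 0 for all X in C_1   and
  w . X + w_0 > 0 for all X in C_2;

they are strongly linearly separable if the inequalities
hold with a margin, i.e. w . X + w_0 <= -eps on C_1 and
>= eps on C_2, for some eps > 0.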
12
Classification problems
  • The case of three classes

[Figure: panels with three classes separated by lines -
strongly linearly separable classes; linearly separable
classes; undefined regions between the separating lines]
13
Classification problems
  • Remarks
  • A classification problem is considered to be
    linearly separable if there exist hyperplanes
    which separate the classes
  • The strongly linearly separable problems
    correspond to situations where the classes are
    more clearly separated than in the case of just
    linearly separable ones
  • From the applications' point of view, linearly
    separable means that the classes are clearly
    separated

14
One unit perceptron
  • It is the simplest neural network for
    classification
  • It allows classification into two linearly
    separable classes

Interpretation of the output: if y = -1 then X
belongs to Class 1; if y = 1 then X belongs
to Class 2

Architecture and functioning:

[Figure: input vector X connected through the weight
vector W to a single output unit]
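
A minimal Python sketch of this functioning, assuming the
signum activation function (the slide does not name it
explicitly):

import numpy as np

def perceptron_output(w, x):
    """One-unit perceptron with N+1 weights w and N features x.
    The extended input prepends the fictive unit -1."""
    x_ext = np.concatenate(([-1.0], x))
    return 1 if np.dot(w, x_ext) >= 0 else -1   # y = sign(w . x_ext)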
15
One unit perceptron
  • Perceptron's training algorithm (Rosenblatt)
  • Training set: {(X^1,d^1), (X^2,d^2), ..., (X^L,d^L)},
    where X^l is a feature vector and d^l is -1 if X^l
    should belong to Class 1 and 1 if it should
    belong to Class 2
  • The training is an iterative process based on
    scanning the training set several times until
    the desired behavior is obtained
  • At each iteration, for each example l from the
    training set, the weights are adjusted based on:
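
The rule itself (the classical Rosenblatt correction,
assumed here to be what the slide shows):

  W := W + eta (d^l - y^l) X^l

where y^l is the network's answer for X^l and eta > 0 is
the learning rate.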

16
One unit perceptron
  • Perceptron's training algorithm (Rosenblatt)

This means that the weights are adjusted only
when the network gives a wrong answer.
In such a situation the adjustment is just:
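
Under the rule above: since d^l and y^l are in {-1, 1},
the factor (d^l - y^l) is 0 for a right answer and +/-2
for a wrong one, so the nonzero adjustment reduces
(absorbing the constant 2 into eta) to

  W := W + eta d^l X^l   (applied only when y^l <> d^l)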
17
One unit perceptron
The parameter eta (correction step or learning
rate) is a positive value which can be chosen
such that, after the correction, the network gives
the right answer for the l-th example.
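
Concretely, under the adjustment above: if the network
answers y^l = -d^l, the corrected weights
W' = W + eta d^l X^l satisfy

  d^l (W' . X^l) = d^l (W . X^l) + eta ||X^l||^2,

so any eta > |W . X^l| / ||X^l||^2 makes the answer for
the l-th example correct immediately after the correction.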
18
One unit perceptron
  • Convergence of the perceptron's learning
    algorithm
  • For a linearly separable problem the algorithm
    will converge in a finite number of steps to the
    coefficients of a boundary hyperplane
  • Remarks:
  • The hypothesis that the classes are linearly
    separable is essential
  • This property of convergence in a finite number
    of steps is rare among learning algorithms
  • The initial values of the weights can influence
    the number of steps until convergence, but they
    cannot prevent convergence
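
A self-contained Python sketch of the whole training
procedure (the variable names and the epoch cap
max_epochs are my own additions; the slides give no code):

import numpy as np

def train_perceptron(X, d, eta=0.1, max_epochs=100):
    """X: L x N array of feature vectors; d: L labels in {-1, 1}."""
    L, N = X.shape
    Xe = np.hstack([-np.ones((L, 1)), X])    # prepend the fictive unit -1
    w = np.random.uniform(-1.0, 1.0, N + 1)  # random initial weights
    for _ in range(max_epochs):
        correct = True
        for xl, dl in zip(Xe, d):
            y = 1 if np.dot(w, xl) >= 0 else -1
            if y != dl:                      # adjust only on a wrong answer
                w += eta * dl * xl
                correct = False
        if correct:                          # the training set is learned
            break
    return w

For a linearly separable training set the loop stops after
finitely many passes, in line with the convergence result
above.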

19
Perceptrons with multiple output units
  • Classification into more than two linearly
    separable classes (e.g. M classes)
  • If the classes are strongly linearly separable
    then one can use M simple perceptrons which can
    be trained independently
  • If the classes are just linearly separable then
    the perceptrons cannot be trained separately. In
    this case we should use the so-called multiple
    perceptron

20
Multiple perceptron
  • Architecture, functioning and output
    interpretation
  • N input units
  • M output units with linear activation function
    (each output unit is associated to a class)
  • Interpretation of the result
  • The index of the maximal value in Y is the class
    label to which X belongs
  • If Y is normalized (all elements are in [0, 1] and
    their sum is 1) then its elements can be
    interpreted as probabilities (y_i is the probability
    that the input X belongs to class C_i)

[Figure: N input units (vector X), M output units;
the output is computed as Y = W X]
21
Multiple perceptron
  • Learning algorithm
  • Training set: {(X^1,d^1), ..., (X^L,d^L)}, where d^l
    is from {1, ..., M} and is the index of the class to
    which X^l belongs
  • Step 1: initialization
  • Initialize the elements of W (M lines and N+1
    columns) with randomly selected values from
    [-1, 1]
  • Initialize the iteration counter: k = 0
  • Step 2: iterative adjustment
  • REPEAT
  •   scan the training set and adjust the weights
  •   k = k + 1
  • UNTIL k = kmax OR correct = 1 (the network learned
    the training set)

22
Multiple perceptron
  • Scan the training set and adjust the weights:
  • correct = 1
  • FOR l = 1, L DO
  •   Compute Y = W X^l
  •   Find i, the index of the maximum in Y
  •   IF i <> d^l THEN
  •     W_i = W_i - eta X^l
  •     W_{d^l} = W_{d^l} + eta X^l
  •     correct = 0
  •   ENDIF
  • ENDFOR
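
The pseudocode translates into Python roughly as follows
(a sketch; classes are indexed from 0 and ties in the
maximum are broken by argmax's first-index rule):

import numpy as np

def train_multiple_perceptron(X, d, M, eta=0.1, k_max=1000):
    """X: L x N feature vectors; d: L class indices in {0, ..., M-1}."""
    L, N = X.shape
    Xe = np.hstack([-np.ones((L, 1)), X])         # extended inputs (fictive unit -1)
    W = np.random.uniform(-1.0, 1.0, (M, N + 1))  # Step 1: random init in [-1, 1]
    for k in range(k_max):                        # Step 2: iterative adjustment
        correct = True                            # set once per scan, not per example
        for xl, dl in zip(Xe, d):
            y = W @ xl                            # Y = W X^l
            i = int(np.argmax(y))                 # index of the maximum in Y
            if i != dl:
                W[i] -= eta * xl                  # W_i     = W_i     - eta X^l
                W[dl] += eta * xl                 # W_{d^l} = W_{d^l} + eta X^l
                correct = False
        if correct:                               # the network learned the set
            break
    return W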

23
Multiple perceptron
  • Remarks
  • W_i denotes the row i of matrix W (X is also
    considered to be a row vector)
  • If there is more than one maximal value in Y,
    then i can be any of them
  • The learning rate eta has a similar role as in
    the case of the simple perceptron
  • If the classes are linearly separable then the
    learning algorithm is convergent

24
Nonlinearly separable problems
  • Classification is similar to representing boolean
    functions
  • f : {0, 1}^N -> {0, 1}

For linearly separable problems one layer is enough.
For nonlinearly separable problems introducing a
hidden layer is necessary.
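
The standard example (not on the slide) is XOR: for
f(x_1, x_2) = x_1 XOR x_2 the points (0,0) and (1,1) must
receive output 0 while (0,1) and (1,0) must receive output
1, and no single line in the plane separates the two pairs,
so no one-layer perceptron can represent XOR, while a
network with one hidden layer can.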