Title: Multi-Valued Neurons and Multilayer Neural Network based on Multi-Valued Neurons
1. Multi-Valued Neurons and Multilayer Neural Network Based on Multi-Valued Neurons
2. A threshold function is a linearly separable function
Linear separability means that it is possible to separate the 1s and -1s by a hyperplane.
Example: f(x1, x2) is the OR function.
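As a quick illustration, a minimal Python sketch of a single threshold neuron computing OR, assuming the +1/-1 encoding (true = +1) and an illustrative weight choice (w0, w1, w2) = (1, 1, 1) that is not from the slide:

```python
# One threshold neuron computing OR in the +1/-1 encoding (true = +1).
# The weights (1, 1, 1) define a separating hyperplane x1 + x2 + 1 = 0.

def threshold_neuron(x1, x2, w=(1, 1, 1)):
    z = w[0] + w[1] * x1 + w[2] * x2   # weighted sum with bias w0
    return 1 if z >= 0 else -1         # sign (threshold) activation

for x1 in (1, -1):
    for x2 in (1, -1):
        expected = 1 if (x1 == 1 or x2 == 1) else -1  # OR in +/-1 encoding
        assert threshold_neuron(x1, x2) == expected
```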
3. Threshold Boolean Functions
- A threshold (linearly separable) function can be learned by a single neuron.
- The number of threshold functions is very small in comparison to the number of all Boolean functions (104 of 256 for n = 3, about 2,000 of 65,536 for n = 4, etc.).
- Non-threshold (nonlinearly separable) functions cannot be learned by a single neuron (Minsky and Papert, 1969); they can be learned only by a neural network.
4. XOR: a classical non-threshold (non-linearly separable) function
Non-linear separability means that it is impossible to separate the 1s and -1s by a hyperplane.
5. Multi-valued mappings
- The first artificial neurons could learn only Boolean functions.
- However, Boolean functions can describe only a very limited class of problems.
- Thus, the ability to learn and implement not only Boolean, but also multiple-valued and continuous functions is very important for solving pattern recognition, classification, and approximation problems.
- This determines the importance of neurons that can learn and implement multiple-valued and continuous mappings.
6. The traditional approach to learning multiple-valued mappings with a neuron
- Sigmoid activation function (the most popular)
7. Limitations of sigmoidal neurons
- The sigmoid activation function has limited plasticity and limited flexibility.
- Thus, to learn functions whose behavior differs substantially from that of the sigmoid function, a network has to be created, because a single sigmoidal neuron is not able to learn such functions.
8. Is it possible to overcome the Minsky-Papert limitation for the classical perceptron?
Yes!
9. We can overcome the Minsky-Papert limitation using complex-valued weights and a complex activation function.
10. Is it possible to learn the XOR and Parity n functions using a single neuron?
- Any classical monograph/textbook on neural networks states that learning the XOR function requires a network of at least three neurons.
- This is true for real-valued neurons and real-valued neural networks.
- However, it is not true for complex-valued neurons!
- A jump to the complex domain is the right way to overcome the Minsky-Papert limitation and to learn multiple-valued and Boolean nonlinearly separable functions using a single neuron.
12. Complex numbers
- Unlike a real number, which is geometrically a point on a line, a complex number is a point on a plane.
- Its coordinates are called the real (Re, horizontal) and the imaginary (Im, vertical) parts of the number.
- i is the imaginary unit (i^2 = -1).
- r is the modulus (absolute value) of the number: r = |z| = sqrt(x^2 + y^2).
Algebraic form of a complex number: z = x + iy.
13. Complex numbers
- φ is the argument (the phase, in terms of physics) of a complex number; the numbers with r = 1 form the unit circle.
- Trigonometric and exponential (Euler's) forms of a complex number: z = r(cos φ + i sin φ) = r·e^(iφ).
14. Complex numbers
Complex-conjugated numbers: the conjugate of z = x + iy is z̄ = x - iy, and z·z̄ = |z|^2 = r^2.
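A short Python sketch (standard library only) tying together the algebraic, trigonometric, and exponential forms and the conjugate from slides 12-14:

```python
# The three forms of a complex number and the conjugate, via cmath.
import cmath, math

z = 3 + 4j                      # algebraic form: z = x + iy
r = abs(z)                      # modulus r = |z| (here 5.0)
phi = cmath.phase(z)            # argument (phase), in (-pi, pi]

# trigonometric and exponential (Euler's) forms recover the same number
z_trig = r * (math.cos(phi) + 1j * math.sin(phi))
z_exp = r * cmath.exp(1j * phi)
assert abs(z - z_trig) < 1e-12 and abs(z - z_exp) < 1e-12

# conjugate: z times its conjugate equals |z|^2
assert abs(z * z.conjugate() - r ** 2) < 1e-12
```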
15. The XOR problem
n = 2, m = 4 (four sectors); W = (0, 1, i) is the weighting vector.
[Figure: the complex plane divided into four sectors; the 4th roots of unity 1, i, -1, -i mark the sector borders.]
16. The Parity 3 problem
n = 3, m = 6 (six sectors); W = (0, ε, 1, 1) is the weighting vector, where ε = e^(iπ/3) is a primitive 6th root of unity.
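Both weighting vectors can be verified directly. A minimal Python sketch, assuming the +1/-1 encoding of Boolean values (so the target function is just the product of the inputs) and the convention that even sectors output +1 and odd sectors output -1:

```python
# A single MVN with a k-sector activation implements XOR (m = 4 sectors,
# W = (0, 1, i)) and Parity 3 (m = 6 sectors, W = (0, eps, 1, 1)).
import cmath, math
from itertools import product

def mvn_boolean(weights, xs, m):
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], xs))
    phase = cmath.phase(z) % (2 * math.pi)
    sector = int((phase + 1e-9) // (2 * math.pi / m)) % m  # tolerance at borders
    return 1 if sector % 2 == 0 else -1   # even sector -> +1, odd sector -> -1

# XOR: n = 2, m = 4, W = (0, 1, i); the target in +/-1 encoding is x1*x2
for x1, x2 in product((1, -1), repeat=2):
    assert mvn_boolean((0, 1, 1j), (x1, x2), m=4) == x1 * x2

# Parity 3: n = 3, m = 6, W = (0, eps, 1, 1) with eps = e^{i*pi/3}
eps = cmath.exp(1j * math.pi / 3)
for xs in product((1, -1), repeat=3):
    assert mvn_boolean((0, eps, 1, 1), xs, m=6) == xs[0] * xs[1] * xs[2]
```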
17. Multi-Valued Neuron (MVN)
- A multi-valued neuron is a neural element with n inputs and one output lying on the unit circle, and with complex-valued weights.
- The theoretical background behind the MVN is multiple-valued (k-valued) threshold logic over the field of complex numbers.
18. Multi-valued mappings and multiple-valued logic
- We traditionally use Boolean functions and Boolean (two-valued) logic to represent two-valued mappings.
- To represent multi-valued mappings, we should use multiple-valued logic.
19. Multiple-valued logic: the classical view
- The values of multiple-valued (k-valued) logic are traditionally encoded by the integers 0, 1, ..., k-1.
- On the one hand, this approach looks natural.
- On the other hand, it presents only quantitative properties; it cannot present qualitative properties.
20. Multiple-valued logic: the classical view
- For example, suppose we need to present different colors in terms of multiple-valued logic. Let Red = 0, Orange = 1, Yellow = 2, Green = 3, etc.
- What does this mean?
- Is it true that Red < Orange < Yellow < Green ?!
21. Multiple-valued (k-valued) logic over the field of complex numbers
- To represent and handle both the quantitative and the qualitative properties, it is possible to move to the field of complex numbers.
- In this case, the argument (phase) may be used to represent the quality, and the amplitude may be used to represent the quantity.
22. Multiple-valued (k-valued) logic over the field of complex numbers
The regular values 0, 1, ..., k-1 of k-valued logic are in one-to-one correspondence with the kth roots of unity: j <-> ε^j, where ε = e^(i2π/k) is a primitive kth root of unity. The kth roots of unity are the values of k-valued logic over the field of complex numbers.
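A small Python sketch of this correspondence; the decoding via the phase is an illustrative choice, not from the slide:

```python
# k-valued logic <-> kth roots of unity: j maps to eps**j on the unit circle.
import cmath, math

k = 4
eps = cmath.exp(2j * math.pi / k)          # primitive kth root of unity

encoded = [eps ** j for j in range(k)]     # 0, 1, ..., k-1 -> unit circle
decoded = [round(cmath.phase(v) % (2 * math.pi) / (2 * math.pi / k)) % k
           for v in encoded]               # phase -> back to 0, 1, ..., k-1
assert decoded == list(range(k))
```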
23. An important advantage
- In multiple-valued logic over the field of complex numbers, all values of this logic are algebraically (arithmetically) equivalent: they are normalized, and their absolute values are all equal to 1.
- In the example with the colors, in terms of multiple-valued logic over the field of complex numbers the colors are coded by different phases. Hence, their quality is presented by the phase.
- Since the phase determines the corresponding frequency, this representation meets the physical nature of color.
24. Discrete-valued (k-valued) activation function
The function P maps the complex plane onto the set of the kth roots of unity:
P(z) = ε^j, if 2πj/k ≤ arg z < 2π(j+1)/k.
25. Discrete-valued (k-valued) activation function
[Figure: the complex plane divided into k = 16 equal sectors.]
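A minimal Python sketch of this activation, assuming the sector definition given above:

```python
# Discrete k-valued activation: P(z) = eps**j when arg z lies in sector j,
# i.e. 2*pi*j/k <= arg z < 2*pi*(j+1)/k.
import cmath, math

def P_discrete(z, k):
    eps = cmath.exp(2j * math.pi / k)
    phase = cmath.phase(z) % (2 * math.pi)
    j = int((phase + 1e-9) // (2 * math.pi / k)) % k  # tolerance at sector borders
    return eps ** j

print(P_discrete(1 + 1j, k=16))   # arg z = pi/4 lies in sector 2 of 16 -> eps**2
```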
26. Multi-Valued Neuron (MVN)
f(x1, ..., xn) = P(w0 + w1·x1 + ... + wn·xn), where f is a function of k-valued logic (a k-valued threshold function).
27. MVN: main properties
The key properties of the MVN:
- Complex-valued weights
- The activation function is a function of the argument of the weighted sum
- Complex-valued inputs and output lying on the unit circle (kth roots of unity)
- Higher functionality than that of traditional neurons (e.g., sigmoidal ones)
- Simplicity of learning
28. MVN learning
- Learning is reduced to movement along the unit circle.
- No derivative is needed; learning is based on the error-correction rule.
- D is the desired output; Y is the actual output.
- δ = D - Y is the error, which completely determines the weight adjustment.
29. Learning algorithm for the discrete MVN with the error-correction learning rule
W_{r+1} = W_r + (a_r / (n+1)) (ε^q - ε^s) X̄
W is the weighting vector; X is the input vector; X̄ is the complex conjugate of X; a_r is the learning rate (it should always be equal to 1); r is the current iteration and r+1 the next one; ε^q is the desired output (sector q); ε^s is the actual output (sector s).
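A runnable Python sketch of this algorithm, trained on the XOR assignment of slide 15 (the desired sectors and the epoch limit are illustrative choices):

```python
# Error-correction learning of a discrete MVN (k = 4), with a_r = 1.
import cmath, math

k, n = 4, 2
eps = cmath.exp(2j * math.pi / k)

def sector(z):
    phase = cmath.phase(z) % (2 * math.pi)
    return int((phase + 1e-9) // (2 * math.pi / k)) % k

# (inputs, desired sector q), matching the weighting vector W = (0, 1, i)
samples = [((1, 1), 0), ((1, -1), 3), ((-1, 1), 1), ((-1, -1), 2)]

W = [0j] * (n + 1)                           # start from zero weights
for _ in range(100):                         # epochs; stop once all are correct
    errors = 0
    for xs, q in samples:
        X = (1,) + xs                        # bias input x0 = 1
        z = sum(w * x for w, x in zip(W, X))
        s = sector(z)
        if s != q:
            errors += 1
            delta = eps ** q - eps ** s      # error between roots of unity
            W = [w + delta * complex(x).conjugate() / (n + 1)
                 for w, x in zip(W, X)]
    if errors == 0:
        break

assert all(sector(sum(w * x for w, x in zip(W, (1,) + xs))) == q
           for xs, q in samples)             # converges in a few epochs
```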
30. Continuous-valued activation function
In the continuous-valued case (k → ∞), the function P maps the complex plane onto the unit circle:
P(z) = e^(i arg z) = z/|z|.
31-32. Continuous-valued activation function
[Figures: the weighted sum z is projected onto the unit circle.]
33. Learning algorithm for the continuous MVN with the error-correction learning rule
W_{r+1} = W_r + (a_r / (n+1)) (D - z/|z|) X̄
W is the weighting vector; X is the input vector; X̄ is the complex conjugate of X; a_r is the learning rate (it should always be equal to 1); r is the current iteration and r+1 the next one; z is the weighted sum; D is the desired output.
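A Python sketch of this rule with a_r = 1; repeatedly applying the step to one sample drives the output onto the desired value (the initial weights and the desired phase are arbitrary illustrative values):

```python
# One error-correction step of the continuous MVN:
# W <- W + a/(n+1) * (D - z/|z|) * conj(X), with x0 = 1 included in X.
import cmath

def step(W, Xb, D, a=1.0):
    z = sum(w * x for w, x in zip(W, Xb))
    delta = D - z / abs(z)                 # error between unit-circle points
    return [w + a * delta * complex(x).conjugate() / len(Xb)
            for w, x in zip(W, Xb)]        # len(Xb) = n + 1

W = [0.5 + 0.5j, 1 + 0j, 0 + 1j]           # initial weights (w0 is the bias)
Xb = (1, 1, -1)                            # input vector with x0 = 1
D = cmath.exp(2j)                          # desired output: phase 2 radians

for _ in range(50):
    W = step(W, Xb, D)

z = sum(w * x for w, x in zip(W, Xb))
assert abs(z / abs(z) - D) < 1e-6          # actual output has converged to D
```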
35. The role of the factor 1/(n+1) in the learning rule
The weights after the correction: w̃_i = w_i + (δ/(n+1)) x̄_i, i = 0, 1, ..., n (with x_0 = 1).
The weighted sum after the correction:
z̃ = Σ w̃_i x_i = z + (δ/(n+1)) Σ x̄_i x_i = z + (δ/(n+1)) (n+1) = z + δ
(since |x_i| = 1, each x̄_i x_i = 1), which is exactly what we are looking for.
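A quick numeric check of this derivation; the weights and the error value are arbitrary, chosen only so that all inputs lie on the unit circle:

```python
# Correcting each weight by delta/(n+1) * conj(x_i) shifts the weighted
# sum by exactly delta, since x_i * conj(x_i) = |x_i|^2 = 1.
import cmath

Xb = (1 + 0j, cmath.exp(0.7j), cmath.exp(-2.1j))  # x0 = 1, inputs on unit circle
W = [0.3 - 1j, 2 + 0.5j, -1 + 1j]
delta = 0.25 + 0.6j                               # an arbitrary error

z = sum(w * x for w, x in zip(W, Xb))
W_new = [w + delta * x.conjugate() / len(Xb) for w, x in zip(W, Xb)]
z_new = sum(w * x for w, x in zip(W_new, Xb))
assert abs(z_new - (z + delta)) < 1e-12           # exactly z + delta
```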
36. Self-adaptation of the learning rate
1/|z_r| is a self-adaptive part of the learning rate (z_r is the weighted sum at iteration r).
37. Modified learning rules with the self-adaptive learning rate
Discrete MVN: W_{r+1} = W_r + (a_r / ((n+1)|z_r|)) (ε^q - ε^s) X̄
Continuous MVN: W_{r+1} = W_r + (a_r / ((n+1)|z_r|)) (D - z_r/|z_r|) X̄
1/|z_r| is a self-adaptive part of the learning rate.
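A sketch of the self-adaptive step for the continuous MVN, assuming the rule above (the discrete case differs only in the error term):

```python
# Self-adaptive error-correction step: the correction is divided by |z_r|,
# damping the update when the weighted sum is far from the unit circle.
import cmath

def step_self_adaptive(W, Xb, D, a=1.0):
    z = sum(w * x for w, x in zip(W, Xb))
    delta = D - z / abs(z)
    scale = a / (len(Xb) * abs(z))         # a_r / ((n + 1) * |z_r|)
    return [w + scale * delta * complex(x).conjugate() for w, x in zip(W, Xb)]
```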
38. Convergence of the learning algorithm
- It is proven that the MVN learning algorithm converges after not more than k! iterations for the k-valued activation function.
- For the continuous MVN, the learning algorithm converges with precision λ after not more than (π/λ)! iterations, because in this case learning is reduced to learning in (π/λ)-valued logic.
39. MVN as a model of a biological neuron
- The state of a biological neuron is determined by the frequency of the generated impulses.
- The amplitude of the impulses is always constant.
40-42. MVN as a model of a biological neuron
[Figures: the neuron's state is mapped to the phase on the interval from 0 to 2π; maximal inhibition, intermediate states, and maximal excitation correspond to different phases.]
43. MVN
- Learns faster
- Adapts better
- Learns even highly nonlinear functions
- Opens new, very promising opportunities for network design
- Is much closer to the biological neuron
- Makes it possible to use the Fourier phase spectrum as a source of features for solving various recognition/classification problems
- Makes it possible to use hybrid (discrete/continuous) inputs and outputs