1
ICT619 Intelligent Systems
Topic 4: Artificial Neural Networks
2
Artificial Neural Networks
  • PART A
  • Introduction
  • An overview of the biological neuron
  • The synthetic neuron
  • Structure and operation of an ANN
  • Problem solving by an ANN
  • Learning in ANNs
  • ANN models
  • Applications
  • PART B
  • Developing neural network applications
  • Design of the network
  • Training issues
  • A comparison of ANN and ES
  • Hybrid ANN systems
  • Case Studies

3
Introduction
  • Artificial Neural Networks (ANN)
  • Also known as
  • Neural networks
  • Neural computing (or neuro-computing) systems
  • Connectionist models
  • ANNs simulate the biological brain for problem
    solving
  • This represents a totally different approach to
    machine intelligence from the symbolic logic
    approach
  • The biological brain is a massively parallel
    system of interconnected processing elements
  • ANNs simulate a similar network of simple
    processing elements at a greatly reduced scale

4
Introduction
  • ANNs adapt themselves using data to learn problem
    solutions
  • ANNs can be particularly effective for problems
    that are hard to solve using conventional
    computing methods
  • First developed in the 1950s; interest slumped
    in the 1970s
  • Great upsurge in interest in the mid 1980s
  • Both ANNs and expert systems are non-algorithmic
    tools for problem solving
  • ES rely on the solution being expressed as a set
    of heuristics by an expert
  • ANNs learn solely from data.

6
An overview of the biological neuron
  • An estimated 100 billion (10^11) neurons in the
    human brain, each connected to up to 10,000 others
  • Electrical impulses produced by a neuron travel
    along the axon
  • The axon connects to dendrites through synaptic
    junctions

7
An overview of the biological neuron
[Figure: the biological neuron. Photo: Osaka University]
8
An overview of the biological neuron
  • A neuron collects the excitation of its inputs
    and "fires" (produces a burst of activity) when
    the sum of its inputs exceeds a certain threshold
  • The strengths of a neuron's inputs are modified
    (enhanced or inhibited) by the synaptic junctions
  • Learning in our brains occurs through a
    continuous process of new interconnections
    forming between neurons, and adjustments at the
    synaptic junctions

9
The synthetic neuron
  • A simple model of the biological neuron, first
    proposed in 1943 by McCulloch and Pitts, consists
    of a summing function with an internal threshold
    and "weighted" inputs, as shown below.

10
The synthetic neuron (contd)
  • For a neuron receiving n inputs, each input xi (
    i ranging from 1 to n) is weighted by multiplying
    it with a weight wi
  • The sum of the products wixi gives the net
    activation value of the neuron
  • The activation value is subjected to a transfer
    function to produce the neuron's output
  • The weight value of the connection carrying
    signals from a neuron i to a neuron j is termed
    wij.
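
As a rough sketch, this weighted-sum-and-threshold behaviour can be written in a few lines of Python (the function and variable names here are illustrative, not from the course materials):

    # Minimal McCulloch-Pitts style neuron: weighted sum plus step transfer.
    def step(activation, threshold=0.0):
        # Fire (output 1) only when activation reaches the threshold.
        return 1 if activation >= threshold else 0

    def neuron_output(inputs, weights, threshold=0.5):
        # Net activation: the sum of the products wi * xi.
        activation = sum(w * x for w, x in zip(weights, inputs))
        return step(activation, threshold)

    # Example: activation = 0.4 + 0.3 = 0.7 >= 0.5, so the neuron fires.
    print(neuron_output([1, 1], [0.4, 0.3]))  # prints 1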

11
Transfer functions
  • These compute the output of a node from its net
    activation. Among the popular transfer functions
    are
  • Step function
  • Signum (or sign) function
  • Sigmoid function
  • Hyperbolic tangent function
  • In the step function, the neuron produces an
    output only when its net activation reaches a
    minimum value known as the threshold
  • For a binary neuron i, whose output is a 0 or 1
    value, the step function can be summarised as
    outputi = 1 if activationi >= thresholdi, and
    outputi = 0 otherwise

12
Transfer functions (contd)
  • The sign function returns either -1 or +1,
    depending on the sign of the activation. To avoid
    confusion with 'sine' it is often called signum.

[Figure: the signum transfer function, with outputi jumping from -1 to +1 at activationi = 0]
13
Transfer functions (contd)
  • The sigmoid
  • The sigmoid transfer function produces a
    continuous value in the range 0 to 1:
    outputi = 1 / (1 + e^(-gain · activationi))
  • The parameter gain affects the slope of the
    function around zero

14
Transfer functions (contd)
  • The hyperbolic tangent
  • A variant of the sigmoid transfer function
  • Has a shape similar to the sigmoid (like an S),
    with the difference being that the value of
    outputi ranges between -1 and +1.
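
As a sketch, the four transfer functions above can be written in Python as follows (the gain default and the names are illustrative):

    import math

    def step(a, threshold=0.0):   # output 0 or 1
        return 1 if a >= threshold else 0

    def signum(a):                # output -1 or +1
        return 1 if a >= 0 else -1

    def sigmoid(a, gain=1.0):     # continuous output in (0, 1)
        return 1.0 / (1.0 + math.exp(-gain * a))

    def tanh(a):                  # continuous output in (-1, +1)
        return math.tanh(a)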

15
Structure and operation of an ANN
  • The building block of an ANN is the artificial
    neuron. It is characterised by
  • weighted inputs
  • summing and transfer function
  • The most common architecture of an ANN consists
    of two or more layers of artificial neurons or
    nodes, with each node in a layer connected to
    every node in the following layer
  • Signals usually flow from the input layer, which
    is directly subjected to an input pattern, across
    one or more hidden layers towards the output
    layer.

16
Structure and operation of an ANN
  • The most popular ANN architecture, known as the
    multilayer perceptron (shown in the diagram
    above), follows this model.
  • In some models of the ANN, such as the
    self-organising map (SOM) or Kohonen net, nodes
    in the same layer may have interconnections among
    them
  • In recurrent networks, connections can even go
    backwards to nodes closer to the input

17
Problem solving by an ANN
  • The inputs of an ANN are data values grouped
    together to form a pattern
  • Each data value (component of the pattern vector)
    is applied to one neuron in the input layer
  • The output value(s) of node(s) in the output
    layer represent some function of the input
    pattern

18
Problem solving by an ANN (contd)
  • In the example above, the ANN maps the input
    pattern to either one of two classes
  • The ANN produces an accurate output only if it
    has learned the functional relationship between
    the relevant variables, namely the components of
    the input pattern, and the corresponding output
  • Any three-layer ANN can (at least in theory)
    represent the functional relationship between an
    input pattern and its class
  • It may be difficult in practice for the ANN to
    learn a given relationship

19
Learning in ANN
  • Common human learning behaviour: repeatedly going
    through the same material, making mistakes and
    learning, until able to carry out a given task
    successfully
  • Learning by most ANNs is modelled after this type
    of human learning
  • Learned knowledge to solve a given problem is
    stored in the interconnection weights of an ANN
  • The process by which an ANN arrives at the right
    values of these weights is known as learning or
    training

20
Learning in ANN (contd)
  • Learning in ANNs takes place through an iterative
    training process during which node
    interconnection weight values are adjusted
  • Initial weights, usually small random values, are
    assigned to the interconnections between the ANN
    nodes.
  • Like knowledge acquisition in ES, learning in
    ANNs can be the most time-consuming phase of
    development

21
Learning in ANNs (contd)
  • ANN learning (or training) can be supervised or
    unsupervised
  • In supervised training,
  • data sets consisting of pairs, each an input
    pattern and its expected correct output value,
    are used
  • The weight adjustments during each iteration aim
    to reduce the error (the difference between the
    ANN's actual output and the expected correct
    output)
  • E.g., a node producing a small negative output
    when it is expected to produce a large positive
    one has its positive weight values increased and
    its negative weight values decreased

22
Learning in ANNs
  • In supervised training,
  • Pairs of sample input values and corresponding
    output values are used to train the net repeatedly
    until the output becomes satisfactorily accurate
  • In unsupervised training,
  • there is no known expected output used for
    guiding the weight adjustments
  • The function to be optimised can be any function
    of the inputs and outputs, usually set by the
    application
  • the net adapts itself to align its weight values
    with training patterns
  • This results in groups of nodes responding
    strongly to specific groups of similar input
    patterns

23
The two states of an ANN
  • A neural network can be in one of two states:
    training mode or operation mode
  • Most ANNs learn off-line and do not change their
    weights once training is finished and they are in
    operation
  • In an ANN capable of on-line learning, training
    and operation continue together
  • ANN training can be time-consuming, but once
    trained, the resulting network can be made to run
    very efficiently, providing fast responses

24
ANN models
  • ANNs are supposed to model the structure and
    operation of the biological brain
  • But there are different types of neural networks
    depending on the architecture, learning strategy
    and operation
  • Three of the most well known models are
  • The multilayer perceptron
  • The Kohonen network (the Self-Organising Map)
  • The Hopfield net
  • The Multilayer Perceptron (MLP) is the most
    popular ANN architecture

25
The Multilayer Perceptron
  • Nodes are arranged into an input layer, an output
    layer and one or more hidden layers
  • Also known as the backpropagation network,
    because error values from the output layer are
    used in the layers before it to calculate weight
    adjustments during training.
  • Another name for the MLP is the feedforward
    network.

26
MLP learning algorithm
  • The learning rule for the multilayer perceptron
    is known as "the generalised delta rule" or the
    "backpropagation rule"
  • The generalised delta rule repeatedly calculates
    an error value for each input, which is a
    function of the squared difference between the
    expected correct output and the actual output
  • The calculated error is backpropagated from one
    layer to the previous one, and is used to adjust
    the weights between connecting layers

27
MLP learning algorithm (contd)
  • New weight = old weight + change calculated from
    the square of the error, where error = difference
    between desired output and actual output
  • Training stops when error becomes acceptable, or
    after a predetermined number of iterations
  • After training, the modified interconnection
    weights form a sort of internal representation
    that enables the ANN to generate desired outputs
    when given the training inputs, or even new
    inputs similar to the training inputs
  • This generalisation is a very important property
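
A compact sketch of one training step under the generalised delta rule, assuming a single hidden layer of sigmoid nodes, one sigmoid output node and a learning rate eta (names are illustrative and bias terms are omitted for brevity):

    import math

    def sigmoid(a):
        return 1.0 / (1.0 + math.exp(-a))

    def train_step(x, target, w_hidden, w_out, eta=0.5):
        # Forward pass: hidden outputs, then the single network output.
        h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
        y = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))
        # Output delta: error times the sigmoid derivative y(1 - y).
        delta_out = (target - y) * y * (1 - y)
        # Backpropagate: hidden deltas weight the output delta by w_out.
        delta_h = [hi * (1 - hi) * w_out[j] * delta_out
                   for j, hi in enumerate(h)]
        # Adjust weights in the direction that reduces the squared error.
        for j in range(len(w_out)):
            w_out[j] += eta * delta_out * h[j]
        for j, ws in enumerate(w_hidden):
            for i in range(len(ws)):
                ws[i] += eta * delta_h[j] * x[i]
        return (target - y) ** 2  # squared error before the update

Such steps are repeated over the whole training set until the error becomes acceptable or the iteration limit is reached, as described above.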

28
The error landscape in a multilayer perceptron
  • For a given pattern p, the error Ep can be
    plotted against the weights to give the so-called
    error surface
  • The error surface is a landscape of hills and
    valleys, with points of minimum error
    corresponding to wells and maximum error found on
    peaks.
  • The generalised delta rule aims to minimise Ep by
    adjusting weights so that they correspond to
    points of lowest error
  • It follows the method of gradient descent where
    the changes are made in the steepest downward
    direction
  • All possible solutions are depressions in the
    error surface, known as basins of attraction
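
In symbols, gradient descent changes each weight in proportion to the negative slope of the error surface (a standard formulation, with learning rate η, not written out on the slide):

    Δwij = -η · ∂Ep/∂wij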

29
The error landscape in a multilayer perceptron
[Figure: the error surface, with Ep plotted against two weights wi and wj]
30
Learning difficulties in multilayer perceptrons -
local minima
  • The MLP may fail to settle into the global
    minimum of the error surface and instead find
    itself in one of the local minima
  • This is due to the gradient descent strategy
    followed
  • A number of alternative approaches can be taken
    to reduce this possibility
  • Lowering the gain term progressively
  • Used to influence the rate at which weight
    changes are made during training
  • Its value by default is 1, but it may be gradually
    reduced to slow the rate of change as training
    progresses

31
Learning difficulties in multilayer
perceptrons (contd)
  • Addition of more nodes for better representation
    of patterns
  • Too few nodes (and consequently not enough
    weights) can cause failure of the ANN to learn a
    pattern
  • Introduction of a momentum term
  • Determines the effect of past weight changes on
    the current direction of movement in weight space
  • The momentum term is also a small numerical value
    in the range 0 to 1 (see the update rule after
    this list)
  • Addition of random noise to perturb the ANN out
    of local minima
  • Usually done by adding small random values to
    the weights
  • This takes the net to a different point in the
    error space, hopefully out of a local minimum
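
With a momentum term α (and learning rate η), the standard weight update blends the current gradient step with the previous weight change; this textbook formulation is not spelled out on the slide:

    Δwij(t) = -η · ∂Ep/∂wij + α · Δwij(t-1)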

32
The Kohonen network (the self-organising map)
  • Biological systems display both supervised and
    unsupervised learning behaviour
  • A neural network with unsupervised learning
    capability is said to be self-organising
  • During training, the Kohonen net changes its
    weights to learn appropriate associations,
    without any right answers being provided

33
The Kohonen network (contd)
  • The Kohonen net consists of an input layer that
    distributes the inputs to every node in a second
    layer, known as the competitive layer.
  • The competitive (output) layer is usually
    organised into some 2-D or 3-D surface (feature
    map)

34
Operation of the Kohonen Net
  • Each neuron in the competitive layer is connected
    to other neurons in its neighbourhood
  • Neurons in the competitive layer have excitatory
    (positively weighted) connections to immediate
    neighbours and inhibitory (negatively weighted)
    connections to more distant neurons.
  • As an input pattern is presented, some of the
    neurons in the competitive layer are sufficiently
    activated to produce outputs, which are fed to
    other neurons in their neighbourhoods
  • The node with the set of input weights closest to
    the input pattern component values produces the
    largest output. This node is termed the best
    matching (or winning) node

35
Operation of the Kohonen Net (contd)
  • During training, input weights of the best
    matching node and its neighbours are adjusted to
    make them resemble the input pattern even more
    closely
  • At the completion of training, the best matching
    node ends up with its input weight values aligned
    with the input pattern and produces the strongest
    output whenever that particular pattern is
    presented
  • The nodes in the winning node's neighbourhood
    also have their weights modified to settle down
    to an average representation of that pattern
    class
  • As a result, the net is able to represent
    clusters of similar input patterns - a feature
    found useful for data mining applications, for
    example.
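
A minimal sketch of this competitive training step in Python, assuming Euclidean distance for matching and a learning rate eta (all names, and the dict-based neighbourhood, are illustrative):

    def best_matching_node(pattern, weight_vectors):
        # Winner: the node whose input weights are closest to the pattern.
        def dist_sq(ws):
            return sum((w - p) ** 2 for w, p in zip(ws, pattern))
        return min(range(len(weight_vectors)),
                   key=lambda i: dist_sq(weight_vectors[i]))

    def kohonen_update(pattern, weight_vectors, neighbours, eta=0.1):
        # Move the winner and its neighbours towards the input pattern.
        winner = best_matching_node(pattern, weight_vectors)
        for i in [winner] + neighbours.get(winner, []):
            weight_vectors[i] = [w + eta * (p - w)
                                 for w, p in zip(weight_vectors[i], pattern)]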

36
The Hopfield Model
  • The Hopfield net is the most widely known of all
    the autoassociative (pattern-completing) ANNs
  • In autoassociation, a noisy or incomplete input
    pattern causes the network to stabilise to a
    state corresponding to the original pattern
  • It is also useful for optimisation tasks.
  • The Hopfield net is a recurrent ANN in which the
    output produced by each neuron is fed back as
    input to all other neurons
  • Neurons compute a weighted sum with a step
    transfer function.

37
The Hopfield Model (contd)
  • The Hopfield net has no iterative learning
    algorithm as such. Patterns (or facts) are simply
    stored by adjusting the weights to lower a term
    called network energy
  • During operation, an input pattern is applied to
    all neurons simultaneously and the network is
    left to stabilise
  • Outputs from the neurons in the stable state form
    the output of the network.
  • When presented with an input pattern, the net
    outputs a stored pattern nearest to the presented
    pattern.
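
A sketch of storage and recall for bipolar (-1/+1) patterns, assuming the usual Hebbian outer-product storage rule and synchronous updates (details the slides leave open):

    def store(patterns):
        # Hebbian storage: w[i][j] accumulates p[i] * p[j]; no self-connections.
        n = len(patterns[0])
        w = [[0.0] * n for _ in range(n)]
        for p in patterns:
            for i in range(n):
                for j in range(n):
                    if i != j:
                        w[i][j] += p[i] * p[j]
        return w

    def recall(w, state, max_steps=20):
        # Apply the step (signum) transfer repeatedly until the state is stable.
        for _ in range(max_steps):
            new = [1 if sum(wij * s for wij, s in zip(row, state)) >= 0 else -1
                   for row in w]
            if new == state:
                break
            state = new
        return state

Started from a noisy version of a stored pattern, recall typically settles to the nearest stored pattern, matching the autoassociative behaviour described above.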

38
When ANNs should be applied
  • Difficulties with some real-life problems
  • Solutions are difficult, if not impossible, to
    define algorithmically, due mainly to the
    unstructured nature of the problem
  • There are too many variables, and/or the
    interactions of the relevant variables are not
    well understood
  • Input data may be partially corrupt or missing,
    making it difficult for a logical sequence of
    solution steps to function effectively

39
When ANNs should be applied (contd)
  • The typical ANN attempts to arrive at an answer
    by learning to identify the right answer through
    an iterative process of self-adaptation or
    training
  • If there are many factors, with complex
    interactions among them, the usual "linear"
    statistical techniques may be inappropriate
  • If sufficient data is available, an ANN can learn
    the relevant functional relationship from the
    data by means of an adaptive learning procedure

40
Current applications of ANNs
  • ANNs are good at recognition and classification
    tasks
  • Due to their ability to recognise complex
    patterns, ANNs have been widely applied in
    character, handwritten text and signature
    recognition, as well as more complex images such
    as faces
  • They have also been used successfully for speech
    recognition and synthesis
  • ANNs are being used in an increasing number of
    applications where high-speed computation of
    functions is important, e.g., in industrial
    robotics

41
Current applications of ANNs (contd)
  • One of the more successful applications of ANNs
    has been as a decision support tool in the area
    of finance and banking
  • Some examples of commercial applications of ANN
    are
  • Financial market analysis for investment decision
    making
  • Sales support - targeting customers for
    telemarketing
  • Bankruptcy prediction
  • Intelligent flexible manufacturing systems
  • Stock market prediction
  • Resource allocation: scheduling and management
    of personnel and equipment

42
ANN applications - broad categories
  • According to a survey (Quaddus & Khan, 2002)
    covering the period 1988 up to mid-1998, the main
    business application areas of ANNs are
  • Production (36%)
  • Information systems (20%)
  • Finance (18%)
  • Marketing & distribution (14.5%)
  • Accounting/Auditing (5%)
  • Others (6.5%)

43
ANN applications - broad categories (contd)
  • The levelling off of publications on ANN
    applications may be attributed to the ANN moving
    from the research to the commercial application
    domain
  • The emergence of other intelligent system tools
    may be another factor

44
Some advantages of ANNs
  • Able to take incomplete or corrupt data and
    provide approximate results.
  • Good at generalisation, that is, recognising
    patterns similar to those learned during training
  • Inherent parallelism makes them fault-tolerant:
    loss of a few interconnections or nodes leaves
    the system relatively unaffected
  • Parallelism also makes ANNs fast and efficient
    for handling large amounts of data.

45
ANN State-of-the-art overview
  • Currently, neural network systems are available as
  • Software simulation on conventional computers
    (the most prevalent)
  • Special-purpose hardware that models the
    parallelism of neurons.
  • ANN-based systems not likely to replace
    conventional computing systems, but they are an
    established alternative to the symbolic logic
    approach to information processing
  • A new computing paradigm in the form of hybrid
    intelligent systems has emerged - often involving
    ANNs with other intelligent system tools

46
REFERENCES
  • AI Expert (special issue on ANN), June 1990.
  • BYTE (special issue on ANN), Aug. 1989.
  • Caudill, M., "The View from Now", AI Expert, June
    1992, pp. 27-31.
  • Dhar, V., and Stein, R., Seven Methods for
    Transforming Corporate Data into Business
    Intelligence, Prentice Hall, 1997.
  • Kirrmann, H., "Neural Computing: The new gold rush
    in informatics", IEEE Micro, June 1989, pp. 7-9.
  • Lippmann, R.P., "An Introduction to Computing with
    Neural Nets", IEEE ASSP Magazine, April 1987,
    pp. 4-21.
  • Lisboa, P. (Ed.), Neural Networks: Current
    Applications, Chapman & Hall, 1992.
  • Negnevitsky, M., Artificial Intelligence: A Guide
    to Intelligent Systems, Addison-Wesley, 2005.

47
REFERENCES (contd)
  • Quaddus, M.A., and Khan, M.S., "Evolution of
    Artificial Neural Networks in Business
    Applications: An Empirical Investigation Using a
    Growth Model", International Journal of
    Management and Decision Making, Vol. 3, No. 1,
    March 2002, pp. 19-34. (See also the ANN
    application publications EndNote library files,
    ICT619 ftp site.)
  • Wasserman, P.D., Neural Computing: Theory and
    Practice, Van Nostrand Reinhold, New York, 1989.
  • Wong, B.K., Bodnovich, T.A., and Selvi, Y.,
    "Neural Networks Applications in Business: A
    Review and Analysis of the Literature (1988-95)",
    Decision Support Systems, 19, 1997, pp. 301-320.
  • Zahedi, F., Intelligent Systems for Business,
    Wadsworth Publishing, Belmont, California, 1993.
  • http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html