Title: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks
1 ICT619 Intelligent Systems, Topic 4: Artificial Neural Networks
2 Artificial Neural Networks
- PART A
  - Introduction
  - An overview of the biological neuron
  - The synthetic neuron
  - Structure and operation of an ANN
  - Problem solving by an ANN
  - Learning in ANNs
  - ANN models
  - Applications
- PART B
  - Developing neural network applications
  - Design of the network
  - Training issues
  - A comparison of ANN and ES
  - Hybrid ANN systems
  - Case Studies
3 Developing neural network applications
- Neural Network Implementations
- Three possible practical implementations of ANNs are:
  - A software simulation program running on a digital computer
  - A hardware emulator connected to a host computer, called a neurocomputer
  - True electronic circuits
4 Software Simulations of ANN
- Currently the cheapest and simplest implementation method for ANNs, at least for general-purpose use.
- Simulates parallel processing on a conventional sequential digital computer.
- Replicates the temporal behaviour of the network by updating the activation level and output of each node for successive time steps.
- These steps are represented by iterations or loops.
- Within each loop, the updates for all nodes in a layer are performed.
5 Software simulations of ANN (contd)
- In multilayer ANNs, processing for a layer is completed and its output is used to calculate the states of the nodes in the following layer.
- Typical additional features of ANN simulators:
  - Configuring the net according to a chosen architecture and node operational characteristics
  - Implementation of the training phase using a chosen training algorithm
  - Tools for visualising and analysing the behaviour of nets
- ANN simulators are written in high-level languages such as C, C++ and Java.
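The layer-by-layer update described on the two slides above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular simulator: the sigmoid transfer function, the function names and the hand-picked weights of the 2-2-1 network are all assumptions made for the example.

```python
import math

def sigmoid(x):
    """Logistic transfer function (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + math.exp(-x))

def update_layer(inputs, weights, biases):
    """Update every node in one layer: weighted sum, then transfer function."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        activation = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(activation))
    return outputs

def simulate(network, input_vector):
    """Process the layers in sequence; each layer's output feeds the next."""
    signal = input_vector
    for weights, biases in network:
        signal = update_layer(signal, weights, biases)
    return signal

# Hypothetical 2-2-1 network with hand-picked weights (illustration only).
network = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, -0.1]),  # hidden layer: 2 nodes
    ([[1.0, -1.0]], [0.0]),                    # output layer: 1 node
]
output = simulate(network, [1.0, 0.0])
```

The inner loop in `update_layer` is where a sequential machine stands in for the parallel node updates of a real network.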
6 Advantages and possible problems with software simulators
- The main attraction of ANN simulators is the relatively low cost and wide availability of ready-made commercial packages.
- They are also compact, flexible and highly portable.
- Writing your own simulator requires programming skills and would be time consuming (except that you don't have to now!).
- Training of ANNs using software simulators can be slow for larger networks (greater than a few hundred).
7 Commercially available neural net packages
- Prewritten shells with convenient user interfaces
- Cost a few hundred to tens of thousands of dollars
- Allow users to specify the ANN design and training parameters
- Usually provide graphic interfaces to enable monitoring of the net's training and operation
- Likely to provide interfacing with other software systems such as spreadsheets and databases.
8 Neurocomputers
- Dedicated special-purpose digital computers (aka accelerator boards)
- Optimised to perform operations common in neural network simulation
- Act as a coprocessor to a host computer and are controlled by a program running on the host.
- Can be tens to thousands of times faster than simulators
- Systems are available with approx. 1000 million connection updates per second (IPS) for networks with 8,192 neurons, e.g. the ACC Neural Network Processor
9 Neurocomputers
- Genobyte's CAM-Brain Machine was developed
between 1997 and 2000
10 True Networks in Hardware
- Closer to biological neural networks than simulations
- Consist of synthetic neurons actually fabricated on silicon chips
- Commercially available hardwired ANNs are limited to a few thousand neurons per chip [1].
- Chips are connected in parallel to achieve larger networks.
- Problems: interconnection and interference, fixed-valued weights; work is progressing on modifiable synapses.
- [1] Figures more than five years old.
11 Neural Network Development Methodology
- Aims to add structure and organisation to ANN applications development in order to reduce cost and increase accuracy, consistency, user confidence and user-friendliness
- Development is split into the following phases:
  - The Concept Phase
  - The Design Phase
  - The Implementation Phase
  - The Maintenance Phase
12 Neural Network Development Methodology - the Concept Phase
- Involves:
  - Validating the proposed application
  - Selecting an appropriate neural paradigm
- Application validation
- Problem characteristics suitable for neural network application are:
  - Data intensive
  - Multiple interacting parameters
  - Incomplete, erroneous, noisy data
  - Solution function unknown or expensive
  - Requires flexibility, generalisation, fault-tolerance, speed
13 ANN Development Methodology - the Concept Phase (contd)
- Common examples of applications with the above attributes are:
  - pattern recognition (eg, printed or handwritten characters, consumer behaviour, risk patterns)
  - forecasting (eg, stock market)
  - signal (audio, video, ultrasound) processing
- Problems not suitable for ANN-based solutions include those where:
  - A mathematically accurate and precise solution is available
  - A solution involving deduction and step-wise logic is appropriate
  - The application involves explanation or reporting
- One application area that is unsuitable for ANNs is resource management, eg, inventory, accounts, sales data analysis
14 Selecting an ANN paradigm
- Decision based on comparison of application requirements to capabilities of different paradigms
  - eg, the multilayer perceptron is well known for its pattern recognition capabilities
  - the Kohonen net is more suited to applications involving data clustering
- Choice of paradigm is also influenced by the training method that can be employed
  - eg, supervised training must have an adequate number of input-correct output pairs available, and training may take a relatively long time
- Technical and economic feasibility assessments should be carried out to complete the concept phase
15 The Design Phase
- The design phase specifies initial values and conditions at the node, network and training levels
- Decisions to be made at the node level include:
  - Types of input: binary (0,1), bipolar (-1,1), trivalent (-1, 0, 1), discrete, continuous-valued
  - Transfer function: step or threshold, hyperbolic tangent, sigmoid; consider possible use of lookup tables for speeding up calculations
- Decisions to be made at the network architecture level:
  - The number and size of layers and their connectivity (fully interconnected or sparsely interconnected, feedforward or recurrent, other?)
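The node-level transfer functions listed above, and the lookup-table speed-up, might look like the following sketch. The table range [-8, 8], the table size and the helper name `make_lookup` are assumptions chosen for illustration; a real simulator would tune them to its accuracy needs.

```python
import math

def step(x, threshold=0.0):
    """Step (threshold) transfer function with binary output."""
    return 1.0 if x >= threshold else 0.0

def sigmoid(x):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# The hyperbolic tangent is available directly as math.tanh, output in (-1, 1).

def make_lookup(fn, lo=-8.0, hi=8.0, size=1601):
    """Precompute fn on an evenly spaced grid and return a table-based version.

    lo, hi and size are hypothetical defaults; inputs outside [lo, hi]
    are clamped to the nearest end of the table.
    """
    spacing = (hi - lo) / (size - 1)
    table = [fn(lo + i * spacing) for i in range(size)]

    def lookup(x):
        clamped = min(max(x, lo), hi)
        index = int(round((clamped - lo) / spacing))
        return table[index]

    return lookup

fast_sigmoid = make_lookup(sigmoid)
```

The table trades a small approximation error (bounded by the grid spacing times the steepest slope of the function) for avoiding an exponential on every node update.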
16 The Design Phase (contd)
- 'Size' of a layer is the number of nodes in the layer
- For the input layer, size is determined by the number of data sources (input vector components) and possibly the mathematical transformations done
- The number of nodes in the output layer is determined by the number of classes or decision values to be output
- Finding the optimal size of the hidden layer needs some experimentation
  - Too few nodes will produce inadequate mapping, while too many may result in inadequate generalisation
17 The Design Phase (contd)
- Connectivity
  - Connectivity determines the flow of signals between neurons in the same or different layers
  - Some ANN models, such as the multilayer perceptron, have only interlayer connections; there is no intralayer connection
  - The Hopfield net is an example of a model with intralayer connections
18 The Design Phase (contd)
- Feedback
  - There may be no feedback of output values, eg, the multilayer perceptron
  - or
  - There may be feedback as in a recurrent network, eg, the Hopfield net
- Other design questions include:
  - Setting of parameters for the learning phase, eg, stopping criterion, learning rate
  - Possible addition of noise to speed up training
19 The Implementation phase
- Typical steps
  - Gathering the training set
  - Selecting the development environment
  - Implementing the neural network
  - Testing and debugging the network
- Gathering the training set
  - Aims to get the right type of data in adequate amount and in the right format
20 Gathering training data (contd)
- How much data to gather?
  - Increasing data amount increases training time but may help earlier convergence
  - Quality more important than quantity
- Collection of data
  - Potential sources: historical records, instrument readings, simulation results
- Preparation of data
  - Involves preprocessing including scaling, normalisation, binarisation, mapping to logarithmic scale, etc.
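The preprocessing operations named above can each be written as a short function. The sketch below uses only the Python standard library; the function names, the default target range [0, 1] and the zero-mean, unit-variance reading of "normalisation" are assumptions for the example.

```python
import math

def min_max_scale(values, lo=0.0, hi=1.0):
    """Scale values linearly into the range [lo, hi]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [lo for _ in values]
    return [lo + (v - v_min) * (hi - lo) / (v_max - v_min) for v in values]

def standardise(values):
    """Normalise to zero mean and unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def binarise(values, threshold):
    """Map each value to 1 if it reaches the threshold, otherwise 0."""
    return [1 if v >= threshold else 0 for v in values]

def log_scale(values):
    """Map strictly positive values to a base-10 logarithmic scale."""
    return [math.log10(v) for v in values]
```

For example, `min_max_scale([10.0, 20.0, 30.0])` yields `[0.0, 0.5, 1.0]`, a form suited to nodes that expect inputs in a fixed range.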
21 Gathering training data (contd)
- Type of data to collect should be representative of the given problem, including routine, unusual and boundary-condition cases
- Mix of good as well as imperfect data, but not ambiguous or too erroneous
- Amount of data to gather
  - Increasing data amount increases training time but may help earlier convergence
  - Quality more important than quantity
23 Selecting the development environment
- Hardware and software aspects
- Hardware requirements based on:
  - speed of operation
  - memory and storage capacity
  - software availability
  - cost
  - compatibility
- The most popular platforms are workstations and high-end PCs (with accelerator board option)
24 Selecting the development environment
- Two options in choosing software:
  - Custom-coded simulators, which require more expertise on the part of the user but provide maximum flexibility
  - Commercial development packages, which are usually easy to use because of a more sophisticated interface
25 Selecting the development environment (contd)
- Selection of hardware and software environment is usually based on the following considerations:
  - ANN paradigm to be implemented
  - Speed in training and recall
  - Transportability
  - Vendor support
  - Extensibility
  - Price
26 Implementing the neural network
- Common steps involved are
- Selection of appropriate neural paradigm
- Setting network size
- Deciding on the learning algorithm
- Creation of screen displays
- Determining the halting criteria
- Collecting data for training and testing
- Data preparation including preprocessing
- Organising data into training and test sets
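The last step in the list above, organising data into training and test sets, could be done with a simple shuffled split. This is a sketch under assumptions: the 80/20 ratio, the fixed seed and the function name `split_train_test` are illustrative choices, not part of the methodology.

```python
import random

def split_train_test(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

examples = list(range(10))  # stand-ins for (input, correct output) pairs
train_set, test_set = split_train_test(examples)
```

Keeping the test cases out of training is what lets later testing measure generalisation rather than recall.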
27 Implementation - Training
- Training the net, which consists of:
  - Loading the training set
  - Initialisation of network weights, usually to small random values
  - Starting the training process
  - Monitoring the training process until training is completed
  - Saving of weight values in a file for use during operation mode
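The training steps above can be traced in a small runnable sketch. It assumes a single threshold node trained with the perceptron learning rule on the logical AND function, which stands in for whatever paradigm and data a real project would use; the file name, learning rate and epoch limit are likewise hypothetical.

```python
import json
import os
import random
import tempfile

def train_perceptron(training_set, learning_rate=0.1, max_epochs=200, seed=0):
    """Train one threshold node; returns (weights, errors per epoch)."""
    rng = random.Random(seed)
    n_inputs = len(training_set[0][0])
    # Initialise weights (last entry is the bias) to small random values.
    weights = [rng.uniform(-0.05, 0.05) for _ in range(n_inputs + 1)]
    history = []
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in training_set:
            output = predict(weights, inputs)
            delta = target - output
            if delta != 0:
                errors += 1
                for i, x in enumerate(inputs):
                    weights[i] += learning_rate * delta * x
                weights[-1] += learning_rate * delta
        history.append(errors)  # monitoring: misclassifications this epoch
        if errors == 0:         # halting criterion: a clean pass over the set
            break
    return weights, history

def predict(weights, inputs):
    """Threshold node: 1 if the weighted sum plus bias is non-negative."""
    total = sum(w * x for w, x in zip(weights, inputs)) + weights[-1]
    return 1 if total >= 0 else 0

def save_weights(weights, path):
    with open(path, "w") as handle:
        json.dump(weights, handle)

def load_weights(path):
    with open(path) as handle:
        return json.load(handle)

# Loading the training set (logical AND, chosen because it is linearly separable).
and_set = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, history = train_perceptron(and_set)

# Saving the weights for use in operation mode (hypothetical file name).
weights_path = os.path.join(tempfile.gettempdir(), "ann_weights.json")
save_weights(weights, weights_path)
restored = load_weights(weights_path)
```

The `history` list plays the role of the monitoring display: training is complete when the per-epoch error count reaches zero.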
28 Implementation - Training (contd)
- Possible problems arising during training
- Failure to converge to a set of optimal weight values
  - Further weight adjustments fail to reduce output error; the net is stuck in a local minimum
  - Remedied by resetting the learning parameters and reinitialising the weights
- Overtraining
  - Net fails to generalise, i.e., fails to classify less than perfect patterns
  - Mix of good and imperfect patterns for training helps
29 Implementation - Training (contd)
- Training results may be affected by the method of presenting the data set to the network.
- Adjustments may be made by varying the layer sizes and fine-tuning the learning parameters.
- To ensure optimal results, several variations of a neural network may be trained and each tested for accuracy
30 Implementation - Testing and Debugging
- Testing can be done by:
  - 1. Observing operational behaviour of the net
  - 2. Analysing actual weights
  - 3. Study of network behaviour under specific conditions
- Observing operational behaviour
  - Network treated as a black box and its response to a series of test cases is evaluated
- Test data
  - Should contain training cases as well as new cases
  - Routine, unusual as well as boundary condition cases should be tried
31 Implementation - Testing and Debugging (contd)
- Testing by weight analysis
  - Weights entering and exiting nodes are analysed for relatively small and large values
- In case of significant errors detected in testing, debugging would involve examining:
  - the training cases for representativeness, accuracy and adequacy of number
  - learning algorithm parameters such as the rate at which weights are adjusted
  - neural network architecture, node characteristics, and connectivity
  - training set-network interface, user-network interface
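Testing by weight analysis, as described at the top of this slide, amounts to scanning the weight matrix for outliers. The sketch below is one possible reading: the magnitude thresholds `small` and `large` and the example weights are hypothetical, and what counts as "relatively" small or large would depend on the network.

```python
def analyse_weights(layer_weights, small=0.01, large=10.0):
    """Flag connections whose absolute weight is unusually small or large.

    layer_weights[node][source] is the weight from a source node into a node.
    Returns (source, node, weight) triples grouped by category.
    """
    report = {"small": [], "large": []}
    for node, incoming in enumerate(layer_weights):
        for source, weight in enumerate(incoming):
            magnitude = abs(weight)
            if magnitude < small:
                report["small"].append((source, node, weight))
            elif magnitude > large:
                report["large"].append((source, node, weight))
    return report

# Hypothetical weights for a layer of 2 nodes with 3 inputs each.
layer = [[0.4, -0.002, 1.3], [25.0, 0.7, -0.9]]
report = analyse_weights(layer)
```

A near-zero weight suggests an input that the node ignores, while a very large one may point to a saturated node; both are starting points for the debugging checks listed above.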
32 The Maintenance Phase
- Consists of:
  - placing the neural network in an operational environment with possible integration
  - periodic performance evaluation, and maintenance
- Although often designed as stand-alone systems, some neural network systems are integrated with other information systems using:
  - Loose coupling: preprocessor, postprocessor, distributed component
  - Tight coupling or full integration as an embedded component
33 The Maintenance Phase
- Possible ANN operational environments
34 System evaluation
- Continual evaluation is necessary to:
  - ensure satisfactory performance in solving dynamic problems
  - check for damaged or retrained networks
- Evaluation can be carried out by reusing original test procedures with current data.
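Reusing the original test procedure on current data can be reduced to an accuracy check against the figure recorded at deployment. In this sketch the stand-in network `threshold_net`, the baseline accuracy and the tolerance are all hypothetical values chosen for illustration.

```python
def accuracy(predict, test_cases):
    """Fraction of (inputs, expected) cases the network classifies correctly."""
    correct = sum(1 for inputs, expected in test_cases if predict(inputs) == expected)
    return correct / len(test_cases)

def needs_maintenance(predict, current_cases, baseline_accuracy, tolerance=0.05):
    """True if accuracy on current data has dropped below the baseline band."""
    return accuracy(predict, current_cases) < baseline_accuracy - tolerance

def threshold_net(inputs):
    """Stand-in for a trained network (hypothetical decision rule)."""
    return 1 if sum(inputs) >= 1.0 else 0

# Current operational data; the last case reflects a shift in the problem.
current_cases = [([0.9, 0.3], 1), ([0.1, 0.2], 0), ([0.6, 0.6], 1), ([0.4, 0.5], 1)]
current_accuracy = accuracy(threshold_net, current_cases)
flag = needs_maintenance(threshold_net, current_cases, baseline_accuracy=0.95)
```

A raised flag would trigger the maintenance activities described on the following slides.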
35 ANN Maintenance
- Involves modification necessitated by:
  - Decreasing accuracy
  - Enhancements
- System modification falls into two categories, involving either data or software.
- Data modification steps:
  - Training data is modified or replaced
  - Network retrained and re-evaluated
36 ANN Maintenance (contd)
- Software changes include changes in:
  - interfaces
  - cooperating programs
  - the structure of the network
- If the network is changed, part of the design and most of the implementation phase may have to be repeated.
- Backup copies should be used for maintenance and research.
37 A comparison of ANN and ES
- Similarities between ES and ANN
  - Both aim to create intelligent computer systems by mimicking human intelligence, although at different levels
  - Design process of neither ES nor ANN is automatic
    - Knowledge extraction in ES is a time and labour intensive process
    - ANNs are capable of learning, but selection and preprocessing of data have to be done carefully
38 A comparison of ANN and ES (contd)
- Differences between ANN and ES
  - Differ in aspects of design, operation and use
- Logic vs. brain
  - ES simulate the human reasoning process based on formal logic
  - ANNs are based on modelling the brain, both in structure and operation
- Sequential vs. parallel
  - The nature of processing in ES is sequential
  - ANNs are inherently parallel
39 A comparison of ANN and ES (contd)
- External and static vs. internal and dynamic
  - Learning is performed external to the ES
  - The ANN itself is responsible for its knowledge acquisition during the training phase
  - Learning is always off-line in ES; knowledge remains static during operation
  - Learning in ANNs, although mostly off-line, can be on-line
- Deductive vs. inductive inferencing
  - Knowledge in an ES is always used in a deductive reasoning process
  - An ANN constructs its knowledge base inductively from examples, and uses it to produce decisions through generalisation
40 A comparison of ANN and ES (contd)
- Knowledge representation: explicit vs. implicit
  - ES store knowledge in explicit form; it is possible to inspect and modify individual rules
  - ANN knowledge is stored implicitly in the interconnection weight values
- Design issues: simple vs. complex
  - Technical side of ES development is relatively simple, without difficult design choices
  - ANN design process is often one of trial and error
41 A comparison of ANN and ES (contd)
- User interface: white box vs. black box
  - ES have explanation capability
  - Difficulty in interpreting an ANN's knowledge base effectively makes it a black box to the user
- State of maturity and recognition: well-established vs. early
  - ES already well established as a methodology in commercial applications
  - ANN recognition and development tools at a relatively early stage
42 Hybrid systems
- Neuro-symbolic computing utilises the complementary nature of computing in neural networks (numerical) and expert systems (symbolic).
- Neuro-fuzzy systems combine neural networks with fuzzy logic
- ANNs can also be combined with genetic algorithm methodology
- Hybrid ES-ANN systems
  - The strengths of the ES can be utilised to overcome the weaknesses of an ANN-based system and vice versa.
  - For example, an ANN's extraction of knowledge from data and an ES's explanation capability
43 Hybrid ES-ANN systems
- Rule extraction by inference justification in an ANN
- MACIE, an ANN-based decision support system described in (Gallant 1993)
  - Extracts a single rule that justifies an inference in an ANN
  - Inference in an ANN is represented by the output of a single node
  - This output is based upon incomplete input values fed from a number of nodes, as shown in the diagram below.
44 Hybrid ES-ANN systems (contd)
- A node ui is defined to be a contributing node to node uj if wij ui ≥ 0.
45 Hybrid ES-ANN systems (contd)
- In this example, the contributing variables are u2, u3, u5, u6.
- The rule produced in this example is:
  - IF u6 is Unknown
  - AND u2 is TRUE
  - AND u3 is FALSE
  - AND u5 is TRUE
  - THEN conclude u7 is TRUE.
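The contributing-node test behind this rule can be tried out in code. The sketch below makes several assumptions that are not in the slides: the comparison in the definition is read as "greater than or equal to zero", node values are encoded as TRUE = +1, FALSE = -1 and Unknown = 0, and the weights into u7 are invented so that the result matches the example, since the original diagram is not available. It covers only the contributing-node test, not the rest of MACIE.

```python
TRUE, FALSE, UNKNOWN = 1, -1, 0  # assumed encoding of node values

def contributing_nodes(weights, values):
    """Names of nodes ui whose product wij * ui is non-negative."""
    return sorted(name for name, w in weights.items() if w * values[name] >= 0)

def justify(weights, values, conclusion):
    """Build an IF ... THEN rule from the contributing nodes."""
    labels = {TRUE: "TRUE", FALSE: "FALSE", UNKNOWN: "Unknown"}
    conditions = [
        f"{name} {labels[values[name]]}"
        for name in contributing_nodes(weights, values)
    ]
    return "IF " + " AND ".join(conditions) + f" THEN conclude {conclusion}"

# Hypothetical weights into u7 and node states (illustration only).
weights_to_u7 = {"u1": -2, "u2": 3, "u3": -2, "u4": 1, "u5": 1, "u6": 1}
values = {"u1": TRUE, "u2": TRUE, "u3": FALSE, "u4": FALSE, "u5": TRUE, "u6": UNKNOWN}

contributors = contributing_nodes(weights_to_u7, values)
rule = justify(weights_to_u7, values, "u7 TRUE")
```

With these invented weights, u1 and u4 pull against the conclusion and are left out, while the unknown u6 has a product of zero and so still counts as contributing.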
46 Hybrid ES-ANN systems (contd)
- One approach to hybrid systems divides a problem into tasks suitable for either ES or ANN
- These tasks are then performed by the appropriate methodology
- One example of such a system (Caudill 1991) is an intelligent system for delivering packages
  - ES performs the task of producing the best loading strategy for packages into trucks
  - ANN works out the best route for delivering the packages efficiently
47 Hybrid ES-ANN systems (contd)
- Hybrid ES-ANN systems with ANNs embedded within expert systems
  - ANN used to determine which rule to fire, given the current state of facts
- Another approach to hybrid ES-ANN uses an ANN as a preprocessor
  - One or more ANNs produce classifications
  - Numerical outputs produced by the ANN are interpreted symbolically by an ES as facts
  - ES applies the facts for deductive reasoning
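The preprocessor arrangement above can be sketched as two small stages: numeric ANN outputs are thresholded into symbolic facts, and a tiny forward-chaining rule engine then reasons over them. Everything specific here is hypothetical: the 0.5 threshold, the fact names, the rules and the ANN output values are invented for illustration.

```python
def outputs_to_facts(class_scores, threshold=0.5):
    """Interpret numeric ANN outputs symbolically: keep labels at or above threshold."""
    return {label for label, score in class_scores.items() if score >= threshold}

def forward_chain(facts, rules):
    """Fire rules (premises, conclusion) repeatedly until no new fact is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

# Hypothetical ANN classification outputs and ES rule base.
ann_outputs = {"vibration_high": 0.91, "temperature_high": 0.12, "noise_high": 0.77}
rules = [
    (frozenset({"vibration_high", "noise_high"}), "bearing_wear_suspected"),
    (frozenset({"bearing_wear_suspected"}), "schedule_inspection"),
    (frozenset({"temperature_high"}), "check_coolant"),
]

facts = outputs_to_facts(ann_outputs)
conclusions = forward_chain(facts, rules)
```

The ANN handles the noisy numeric classification, and the ES part stays inspectable: each derived conclusion can be traced back to the rule that produced it.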
48 Case Study
- Case: Application of ANNs in bankruptcy prediction (Coleman et al, AI Review, Summer 1991, in Zahedi 1993)
- Predicts banks that were certain to fail within a year
- The prediction certainty is given to bank examiners dealing with the bank in question.
- ANN has 11 inputs, each of which is a ratio developed by Peat Marwick.
- Developed by NeuralWare's Application Development Services and Support Group (ADSS)
- Software used: the NeuralWorks Professional neural network development system.
- Uses the standard backpropagation (multilayer perceptron) network.
49 Case Study (contd)
- Inputs are connected to a single hidden layer, which in turn is connected to a single node in the output layer.
- Network outputs a single value denoting whether the bank would or would not fail within that calendar year
- Employed the hyperbolic-tangent transfer function and a proprietary error function created by the ADSS staff.
- Trained on a set of 1,000 examples, 900 of which were viable banks and 100 of which were banks that had actually gone bankrupt
- Training consisted of about 50,000 iterations of the training set.
- Predicted 50% of banks that are viable, and 99% of banks that actually failed.
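The shape of the network in this case study (11 ratio inputs, one hidden layer, a single output node, hyperbolic-tangent transfer function) can be sketched as below. The hidden layer size is not stated in the source, so `N_HIDDEN = 5` is purely a placeholder, and the weights are random rather than trained; the sketch shows the architecture only, not the published model or its proprietary error function.

```python
import math
import random

N_INPUTS = 11  # from the case study: 11 financial ratios
N_HIDDEN = 5   # hypothetical: the hidden layer size is not given in the source

def build_network(seed=0):
    """Create untrained weights; the last entry of each weight list is the bias."""
    rng = random.Random(seed)
    hidden = [
        [rng.uniform(-0.1, 0.1) for _ in range(N_INPUTS + 1)]
        for _ in range(N_HIDDEN)
    ]
    output = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN + 1)]
    return hidden, output

def forward(network, ratios):
    """Forward pass: tanh hidden layer, then a single tanh output node."""
    hidden, output = network
    hidden_out = [
        math.tanh(sum(w * x for w, x in zip(node[:-1], ratios)) + node[-1])
        for node in hidden
    ]
    return math.tanh(sum(w * h for w, h in zip(output[:-1], hidden_out)) + output[-1])

network = build_network()
score = forward(network, [0.5] * N_INPUTS)  # placeholder ratio values
```

Because the output node uses tanh, the single value lies between -1 and 1, so a sign or threshold convention would be needed to read it as "would fail" or "would not fail".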
50 REFERENCES
- AI Expert (special issue on ANN), June 1990.
- BYTE (special issue on ANN), Aug. 1989.
- Caudill, M., "The View from Now", AI Expert, June 1992, pp. 27-31.
- Dhar, V., Stein, R., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997.
- Kirrmann, H., "Neural Computing: The new gold rush in informatics", IEEE Micro, June 1989, pp. 7-9.
- Lippmann, R.P., "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, April 1987, pp. 4-21.
- Lisboa, P. (Ed.), Neural Networks: Current Applications, Chapman & Hall, 1992.
- Negnevitsky, M., Artificial Intelligence: A Guide to Intelligent Systems, Addison-Wesley, 2005.
51 REFERENCES (contd)
- Bailey, D., Thompson, D., "How to Develop Neural Network Applications", AI Expert, June 1990, pp. 38-47.
- Caudill, M., Butler, C., Naturally Intelligent Systems, MIT Press, 1989, pp. 227-240.
- Caudill, M., "Expert networks", BYTE, October 1991, pp. 109-116.
- Gallant, S., Neural Network Learning and Expert Systems, MIT Press, 1993.
- Medsker, L., Hybrid Intelligent Systems, Kluwer Academic Press, Boston, 1995.
- Zahedi, F., Intelligent Systems for Business, Wadsworth Publishing, Belmont, California, 1993.