ICT619 Intelligent Systems Topic 4: Artificial Neural Networks
1
ICT619 Intelligent Systems Topic 4: Artificial Neural Networks
2
Artificial Neural Networks
  • PART A
  • Introduction
  • An overview of the biological neuron
  • The synthetic neuron
  • Structure and operation of an ANN
  • Problem solving by an ANN
  • Learning in ANNs
  • ANN models
  • Applications
  • PART B
  • Developing neural network applications
  • Design of the network
  • Training issues
  • A comparison of ANN and ES
  • Hybrid ANN systems
  • Case Studies

3
Developing neural network applications
  • Neural Network Implementations
  • Three possible practical implementations of ANNs
    are
  • A software simulation program running on a
    digital computer
  • A hardware emulator connected to a host computer
    - called a neurocomputer
  • True electronic circuits

4
Software Simulations of ANN
  • Currently the cheapest and simplest
    implementation method for ANNs - at least for
    general-purpose use.
  • Simulates parallel processing on a conventional
    sequential digital computer
  • Replicates temporal behaviour of the network by
    updating the activation level and output of each
    node for successive time steps
  • These steps are represented by iterations or
    loops
  • Within each loop, the updates for all nodes in a
    layer are performed.

5
Software simulations of ANN (contd)
  • In multilayer ANNs, processing for a layer is
    completed and its output used to calculate states
    of the nodes in the following layer
  • Typical additional features of ANN simulators
  • Configuring the net according to a chosen
    architecture and node operational characteristic
  • Implementation of training phase using a chosen
    training algorithm
  • Tools for visualising and analysing behaviour of
    nets
  • ANN simulators are written in high-level languages
    such as C, C++ and Java.
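
The layer-by-layer update described above can be sketched in a few lines of Python. This is an illustrative sketch only (a tiny 2-2-1 network with made-up weights and sigmoid nodes), not code from the course:

```python
import math

def forward(layers, x):
    """Propagate input x through a list of layers.

    Each layer is (weights, biases), where weights[i][j] is the
    weight from input j to node i.  Sigmoid activation throughout.
    """
    activation = x
    for weights, biases in layers:
        # update every node in this layer before moving to the next,
        # as a sequential simulator does on each iteration
        activation = [
            1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(row, activation)) + b)))
            for row, b in zip(weights, biases)
        ]
    return activation

# a 2-2-1 network with hand-picked weights
net = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer: two nodes
    ([[1.0, 1.0]], [-1.0]),                   # output layer: one node
]
y = forward(net, [1.0, 0.0])
```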

6
Advantages and possible problems with software
simulators
  • Main attraction of ANN simulators is the
    relatively low cost and wide availability of
    ready-made commercial packages
  • They are also compact, flexible and highly
    portable.
  • Writing your own simulator requires programming
    skills and would be time consuming (except that
    you don't have to now!)
  • Training of ANNs using software simulators can be
    slow for larger networks (more than a few hundred
    nodes)

7
Commercially available neural net packages
  • Prewritten shells with convenient user interfaces
  • Cost a few hundred to tens of thousands of
    dollars
  • Allow users to specify the ANN design and
    training parameters
  • Usually provide graphic interfaces to enable
    monitoring of the net's training and operation
  • Likely to provide interfacing with other software
    systems such as spreadsheets and databases.

8
Neurocomputers
  • A dedicated special-purpose digital computer (also
    known as an accelerator board)
  • Optimised to perform operations common in neural
    network simulation
  • Acts as a coprocessor to a host computer and is
    controlled by a program running on the host.
  • Can be tens to thousands of times faster than
    simulators
  • Systems are available performing approx. 1,000
    million connection updates per second for networks
    with 8,192 neurons, e.g. the ACC Neural Network
    Processor

9
Neurocomputers
  • Genobyte's CAM-Brain Machine was developed
    between 1997 and 2000

10
True Networks in Hardware
  • Closer to biological neural networks than
    simulations
  • Consist of synthetic neurons actually fabricated
    on silicon chips
  • Commercially available hardwired ANNs are limited
    to a few thousand neurons per chip [1].
  • Chips connected in parallel to achieve larger
    networks.
  • Problems: interconnection and interference, and
    fixed-valued weights (work is progressing on
    modifiable synapses).
  • [1] Figures more than five years old.

11
Neural Network Development Methodology
  • Aims to add structure and organisation to ANN
    application development, reducing cost and
    increasing accuracy, consistency, user confidence
    and user-friendliness
  • Split development into the following phases
  • The Concept Phase
  • The Design Phase
  • The Implementation Phase
  • The Maintenance Phase

12
Neural Network Development Methodology - the
Concept Phase
  • Involves
  • Validating the proposed application
  • Selecting an appropriate neural paradigm.
  • Application validation
  • Problem characteristics suitable for neural
    network application are
  • Data intensive
  • Multiple interacting parameters
  • Incomplete, erroneous, noisy data
  • Solution function unknown or expensive
  • Requires flexibility, generalisation,
    fault-tolerance, speed

13
ANN Development Methodology - the Concept Phase
(contd)
  • Common examples of applications with the above
    attributes are
  • pattern recognition (eg, printed or handwritten
    character, consumer behaviour, risk patterns),
  • forecasting (eg, stock market), signal (audio,
    video, ultrasound) processing
  • Problems not suitable for ANN-based solutions
    include
  • A mathematically accurate and precise solution is
    available
  • Solution involving deduction and step-wise logic
    appropriate
  • Applications involving explanation or reporting
  • One application area that is unsuitable for ANNs
    is resource management eg, inventory, accounts,
    sales data analysis

14
Selecting an ANN paradigm
  • Decision based on comparison of application
    requirements to capabilities of different
    paradigms
  • eg, the multilayer perceptron is well known
    for its pattern recognition capabilities,
  • Kohonen net more suited for applications
    involving data clustering
  • Choice of paradigm also influenced by the
    training method that can be employed
  • e.g., supervised training requires an adequate
    number of input/correct-output pairs, and training
    may take a relatively long time
  • Technical and economic feasibility assessments
    should be carried out to complete the concept
    phase

15
The Design Phase
  • The design phase specifies initial values and
    conditions at the node, network and training
    levels
  • Decisions to be made at the node level include
  • Types of input: binary (0,1), bipolar (-1,1),
    trivalent (-1, 0, 1), discrete, or
    continuous-valued
  • Transfer function: step or threshold, hyperbolic
    tangent, or sigmoid; consider possible use of
    lookup tables for speeding up calculations
  • Decisions to be made at the network architecture
    level
  • The number and size of layers and their
    connectivity (fully interconnected or sparsely
    interconnected; feedforward or recurrent; other?)
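
The node-level choices above can be illustrated in code. The sketch below shows the three transfer functions mentioned, plus a simple lookup-table approximation of the sigmoid for speeding up calculations (the grid spacing and range are arbitrary assumptions):

```python
import math

def step(x, threshold=0.0):
    """Step / threshold transfer function."""
    return 1.0 if x >= threshold else 0.0

def sigmoid(x):
    """Sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_transfer(x):
    """Hyperbolic-tangent transfer function."""
    return math.tanh(x)

# Lookup-table speed-up: precompute the sigmoid on a grid over
# [-10, 10] and round each input to the nearest grid point at run time.
STEP_SIZE = 0.01
TABLE = {i: sigmoid(i * STEP_SIZE) for i in range(-1000, 1001)}

def sigmoid_lut(x):
    i = max(-1000, min(1000, round(x / STEP_SIZE)))
    return TABLE[i]
```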

16
The Design Phase (contd)
  • 'Size' of a layer is the number of nodes in the
    layer
  • For the input layer, size is determined by number
    of data sources (input vector components) and
    possibly the mathematical transformations done
  • The number of nodes in the output layer is
    determined by the number of classes or decision
    values to be output
  • Finding optimal size of the hidden layer needs
    some experimentation
  • Too few nodes will produce inadequate mapping,
    while too many may result in inadequate
    generalisation

17
The Design Phase (contd)
  • Connectivity
  • Connectivity determines the flow of signals
    between neurons in the same or different layers
  • Some ANN models, such as the multilayer
    perceptron, have only interlayer connections -
    there is no intralayer connection
  • The Hopfield net is an example of a model with
    intralayer connections

18
The Design Phase (contd)
  • Feedback
  • There may be no feedback of output values, eg,
    the multilayer perceptron
  • or
  • There may be feedback as in a recurrent network
    eg, the Hopfield net
  • Other design questions include
  • Setting of parameters for the learning phase,
    e.g., stopping criterion and learning rate.
  • Possible addition of noise to speed up training.
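
As an illustration of the last point, one common way to inject noise is to perturb each training input with small Gaussian noise; this is only a sketch, and the sigma value is an arbitrary assumption:

```python
import random

def add_input_noise(training_set, sigma=0.05, seed=0):
    """Return a copy of the training set with small Gaussian noise
    added to each input component -- one common way of injecting
    noise during training (sigma is an assumed, tunable value)."""
    rng = random.Random(seed)
    return [([x + rng.gauss(0.0, sigma) for x in inputs], target)
            for inputs, target in training_set]

noisy = add_input_noise([([0.0, 1.0], 1), ([1.0, 0.0], 0)])
```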

19
The Implementation phase
  • Typical steps
  • Gathering the training set
  • Selecting the development environment
  • Implementing the neural network
  • Testing and debugging the network
  • Gathering the training set
  • Aims to get the right type of data, in adequate
    amounts and in the right format

20
Gathering training data (contd)
  • How much data to gather?
  • Increasing data amount increases training time
    but may help earlier convergence
  • Quality more important than quantity
  • Collection of data
  • Potential sources - historical records,
    instrument readings, simulation results
  • Preparation of data
  • Involves preprocessing including scaling,
    normalisation, binarisation, mapping to
    logarithmic scale, etc.
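
The preprocessing steps listed above might be sketched as follows (illustrative helper functions, not from the slides):

```python
import math

def min_max_scale(values, lo=0.0, hi=1.0):
    """Scale a list of numbers into the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0   # avoid division by zero on constant data
    return [lo + (hi - lo) * (v - vmin) / span for v in values]

def binarise(values, threshold):
    """Map each value to 0/1 against a threshold."""
    return [1 if v >= threshold else 0 for v in values]

def log_scale(values):
    """Map positive values onto a logarithmic scale."""
    return [math.log(v) for v in values]
```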

21
Gathering training data (contd)
  • Type of data to collect should be representative
    of given problem including routine, unusual and
    boundary-condition cases
  • Mix of good as well as imperfect data but not
    ambiguous or too erroneous.

23
Selecting the development environment
  • Hardware and software aspects
  • Hardware requirements based on
  • speed of operation
  • memory and storage capacity
  • software availability
  • cost
  • compatibility
  • The most popular platforms are workstations and
    high-end PCs (with an accelerator board option)

24
Selecting the development environment
  • Two options in choosing software
  • Custom-coded simulators, which require more
    expertise on the part of the user but provide
    maximum flexibility
  • Commercial development packages which are
    usually easy to use because of a more
    sophisticated interface

25
Selecting the development environment (contd)
  • Selection of the hardware and software environment
    is usually based on the following considerations
  • ANN paradigm to be implemented
  • Speed in training and recall
  • Transportability
  • Vendor support
  • Extensibility
  • Price

26
Implementing the neural network
  • Common steps involved are
  • Selection of appropriate neural paradigm
  • Setting network size
  • Deciding on the learning algorithm
  • Creation of screen displays
  • Determining the halting criteria
  • Collecting data for training and testing
  • Data preparation including preprocessing
  • Organising data into training and test sets
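
The last step, organising data into training and test sets, can be sketched as a simple shuffle-and-split (the 80/20 fraction is an assumed, tunable choice):

```python
import random

def split_data(examples, train_fraction=0.8, seed=0):
    """Shuffle the examples and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = examples[:]          # leave the caller's list untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

data = [(i, i % 2) for i in range(100)]   # (input, label) pairs
train, test = split_data(data)
```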

27
Implementation - Training
  • Training the net, which consists of
  • Loading the training set
  • Initialisation of network weights usually to
    small random values
  • Starting the training process
  • Monitoring the training process until training is
    completed
  • Saving of weight values in a file for use during
    operation mode
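
The training steps above can be illustrated with a deliberately tiny example: a single threshold node trained by the perceptron rule, standing in for a full ANN training algorithm (all names and parameter values below are assumptions for illustration, not the course's actual algorithm):

```python
import json
import random

def train_single_node(training_set, learning_rate=0.1, epochs=100, seed=0):
    """Train one linear threshold node with the perceptron rule."""
    rng = random.Random(seed)
    n_inputs = len(training_set[0][0])
    # initialise weights to small random values
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    bias = 0.0
    for _ in range(epochs):
        total_error = 0
        for inputs, target in training_set:
            output = 1 if sum(w * x for w, x in zip(weights, inputs)) + bias >= 0 else 0
            error = target - output
            total_error += abs(error)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
        if total_error == 0:   # monitoring: stop once all cases are correct
            break
    return weights, bias

# learn the logical AND function, then save the weights for operation mode
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_single_node(data)
saved = json.dumps({"weights": weights, "bias": bias})
```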

28
Implementation Training (contd)
  • Possible problems arising during training
  • Failure to converge to a set of optimal weight
    values
  • Further weight adjustments fail to reduce output
    error, stuck in a local minimum
  • Remedied by resetting the learning parameters and
    reinitialising the weights
  • Overtraining
  • The net fails to generalise, i.e., fails to
    classify less-than-perfect patterns
  • Mix of good and imperfect patterns for training
    helps

29
Implementation Training (contd)
  • Training results may be affected by the method of
    presenting data set to the network.
  • Adjustments may be made by varying the layer
    sizes and fine-tuning the learning parameters.
  • To ensure optimal results, several variations of
    a neural network may be trained and each tested
    for accuracy
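
The idea of training several variations and keeping the most accurate one can be sketched generically; `train_fn` and `evaluate_fn` are assumed callables, and the demonstration at the bottom uses a toy threshold "model" purely to make the sketch runnable:

```python
def select_best(variations, train_fn, evaluate_fn, train_data, test_data):
    """Train one model per variation and keep the most accurate."""
    best_model, best_acc = None, -1.0
    for params in variations:
        model = train_fn(params, train_data)
        acc = evaluate_fn(model, test_data)
        if acc > best_acc:
            best_model, best_acc = model, acc
    return best_model, best_acc

# Toy demonstration: each "variation" is a decision threshold,
# "training" returns it unchanged, and accuracy is measured directly.
# (In practice test_data should be held out from training.)
train_fn = lambda t, data: t
evaluate_fn = lambda t, data: sum((x >= t) == y for x, y in data) / len(data)
samples = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
model, acc = select_best([0.0, 0.5, 1.0], train_fn, evaluate_fn, samples, samples)
```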

30
Implementation - Testing and Debugging
  • Testing can be done by
  • Observing the operational behaviour of the net
  • Analysing the actual weights
  • Studying network behaviour under specific
    conditions
  • Observing operational behaviour
  • The network is treated as a black box and its
    response to a series of test cases is evaluated
  • Test data
  • Should contain training cases as well as new
    cases
  • Routine, unusual as well as boundary condition
    cases should be tried
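
Black-box testing as described above might look like the following sketch (the stand-in "network" is just a majority vote over binary inputs, used only to make the example runnable):

```python
def black_box_test(network_fn, test_cases):
    """Evaluate a trained network's responses on (input, expected) cases."""
    failures = []
    for inputs, expected in test_cases:
        actual = network_fn(inputs)
        if actual != expected:
            failures.append((inputs, expected, actual))
    accuracy = 1.0 - len(failures) / len(test_cases)
    return accuracy, failures

# stand-in "network": majority vote of binary inputs
net = lambda xs: 1 if sum(xs) > len(xs) / 2 else 0
cases = [([1, 1, 0], 1), ([0, 0, 1], 0), ([1, 0, 0], 0), ([1, 1, 1], 1)]
acc, fails = black_box_test(net, cases)
```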

31
Implementation - Testing and Debugging (contd)
  • Testing by weight analysis
  • Weights entering and leaving nodes are analysed
    for relatively small and large values
  • If significant errors are detected in testing,
    debugging would involve examining
  • the training cases for representativeness,
    accuracy and adequacy of number
  • learning algorithm parameters such as the rate at
    which weights are adjusted
  • neural network architecture, node
    characteristics, and connectivity
  • training set-network interface, user-network
    interface
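
Weight analysis can be sketched as a scan for suspiciously small or large magnitudes (the thresholds below are arbitrary assumptions):

```python
def flag_extreme_weights(weights, low=1e-3, high=10.0):
    """Flag connections whose weight magnitude is suspiciously
    small (connection may be unused) or large (node may dominate).

    weights: {(source_node, dest_node): weight_value}
    """
    flagged = []
    for (src, dst), w in weights.items():
        if abs(w) < low:
            flagged.append((src, dst, w, "near-zero"))
        elif abs(w) > high:
            flagged.append((src, dst, w, "very large"))
    return flagged

w = {("u1", "u4"): 0.0001, ("u2", "u4"): 2.5, ("u3", "u4"): 40.0}
report = flag_extreme_weights(w)
```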

32
The Maintenance Phase
  • Consists of
  • placing the neural network in an operational
    environment with possible integration
  • periodic performance evaluation, and maintenance
  • Although often designed as stand-alone systems,
    some neural network systems are integrated with
    other information systems using
  • Loose coupling: as a preprocessor, postprocessor
    or distributed component
  • Tight coupling or full integration: as an embedded
    component

33
The Maintenance Phase
  • Possible ANN operational environments

34
System evaluation
  • Continual evaluation is necessary to
  • ensure satisfactory performance in solving
    dynamic problems
  • check for damaged or retrained networks.
  • Evaluation can be carried out by reusing original
    test procedures with current data.

35
ANN Maintenance
  • Involves modification necessitated by
  • Decreasing accuracy
  • Enhancements
  • System modification falls into two categories
    involving either data or software.
  • Data modification steps
  • Training data is modified or replaced
  • Network retrained and re-evaluated.

36
ANN Maintenance (contd)
  • Software changes include changes in
  • Interfaces
  • cooperating programs
  • the structure of the network.
  • If the network is changed, part of the design and
    most of the implementation phase may have to be
    repeated.
  • Backup copies should be used for maintenance and
    research.

37
A comparison of ANN and ES
  • Similarities between ES and ANN
  • Both aim to create intelligent computer systems
    by mimicking human intelligence, although at
    different levels
  • Design process of neither ES nor ANN is automatic
  • Knowledge extraction in ES is a time and labour
    intensive process
  • ANNs are capable of learning but selection and
    preprocessing of data have to be done carefully.

38
A comparison of ANN and ES (contd)
  • Differences between ANN and ES
  • Differ in aspects of design, operation and use
  • Logic vs. brain
  • ES simulate the human reasoning process based on
    formal logic
  • ANNs are based on modelling the brain, both in
    structure and operation
  • Sequential vs. parallel
  • The nature of processing in ES is sequential
  • ANNs are inherently parallel

39
A comparison of ANN and ES (contd)
  • External and static vs. internal and dynamic
  • Learning is performed external to the ES
  • ANN itself is responsible for its knowledge
    acquisition during the training phase.
  • Learning is always off-line in ES - knowledge
    remains static during operation
  • Learning in ANNs, although mostly off-line, can
    be on-line
  • Deductive vs. inductive inferencing
  • Knowledge in an ES always used in a deductive
    reasoning process
  • An ANN constructs its knowledge base inductively
    from examples, and uses it to produce decision
    through generalisation

40
A comparison of ANN and ES (contd)
  • Knowledge representation: explicit vs. implicit
  • ES store knowledge in explicit form - it is
    possible to inspect and modify individual rules
  • An ANN's knowledge is stored implicitly in the
    interconnection weight values
  • Design issues: simple vs. complex
  • The technical side of ES development is relatively
    simple, without difficult design choices.
  • The ANN design process is often one of trial and
    error

41
A comparison of ANN and ES (contd)
  • User interface: white box vs. black box
  • ES have an explanation capability
  • Difficulty in interpreting an ANN's knowledge base
    effectively makes it a black box to the user
  • State of maturity and recognition:
    well-established vs. early
  • ES are already well established as a methodology in
    commercial applications
  • ANN recognition and development tools are at a
    relatively early stage.

42
Hybrid systems
  • Neuro-symbolic computing utilises the
    complementary nature of computing in neural
    networks (numerical) and expert systems
    (symbolic).
  • Neuro-fuzzy systems combine neural networks with
    fuzzy logic
  • ANNs can also be combined with genetic algorithm
    methodology
  • Hybrid ES-ANN systems
  • The strengths of the ES can be utilised to
    overcome the weaknesses of an ANN based system
    and vice versa.
  • For example, an ANN's extraction of knowledge from
    data can complement an ES's explanation capability

43
Hybrid ES-ANN systems
  • Rule extraction by inference justification in an
    ANN
  • MACIE, an ANN based decision support system
    described in (Gallant 1993)
  • Extracts a single rule that justifies an
    inference in an ANN
  • Inference in an ANN is represented by output of a
    single node
  • This output is based upon incomplete input values
    fed from a number of nodes as shown in the
    diagram below.

44
Hybrid ES-ANN systems (contd)
  • A node ui is defined to be a contributing node to
    node uj if wij ui ≠ 0.

45
Hybrid ES-ANN systems (contd)
  • In this example, the contributing variables are
    u2, u3, u5 and u6.
  • The rule produced in this example is
  • IF u6 = Unknown
  • AND u2 = TRUE
  • AND u3 = FALSE
  • AND u5 = TRUE
  • THEN conclude u7 = TRUE.
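
A loose sketch of this kind of rule extraction follows. It simply lists, for a concluding node, every connected input with its truth value (including unknowns); the actual MACIE algorithm selects a minimal justifying subset of contributing nodes, which is not reproduced here, and all names and weights below are illustrative:

```python
def extract_rule(weights, values, conclusion):
    """Build an IF/THEN rule justifying a node's inference.

    weights: {input_name: weight into the concluding node}
    values:  {input_name: +1 (TRUE), -1 (FALSE) or 0 (Unknown)}
    """
    clauses = []
    for name, w in weights.items():
        if w == 0:
            continue   # not connected: cannot contribute to the inference
        v = values[name]
        if v == 0:
            clauses.append(f"{name} = Unknown")
        else:
            clauses.append(f"{name} = {'TRUE' if v > 0 else 'FALSE'}")
    return "IF " + " AND ".join(clauses) + f" THEN conclude {conclusion}"

# illustrative weights/values loosely matching the slide's example
w = {"u1": 0, "u2": 2, "u3": -3, "u4": 0, "u5": 1, "u6": 2}
v = {"u1": 1, "u2": 1, "u3": -1, "u4": 1, "u5": 1, "u6": 0}
rule = extract_rule(w, v, "u7 = TRUE")
```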

46
Hybrid ES-ANN systems (contd)
  • One approach to hybrid systems divides a problem
    into tasks suitable for either ES or ANN
  • These tasks are then performed by the appropriate
    methodology
  • One example of such a system (Caudill 1991) is an
    intelligent system for delivering packages
  • ES performs the task of producing the best
    loading strategy for packages into trucks
  • ANN works out best route for delivering the
    packages efficiently.

47
Hybrid ES-ANN systems (contd)
  • Hybrid ES-ANN systems with ANNs embedded within
    expert systems
  • ANN used to determine which rule to fire, given
    the current state of facts.
  • Another approach to hybrid ES-ANN uses an ANN as
    a preprocessor
  • One or more ANNs produce classifications.
  • Numerical outputs produced by ANN are interpreted
    symbolically by an ES as facts
  • ES applies the facts for deductive reasoning

48
Case Study
  • Case: Application of ANNs in bankruptcy prediction
    (Coleman et al., AI Review, Summer 1991, in Zahedi
    1993)
  • Predicts banks that were certain to fail within a
    year
  • The prediction certainty is given to bank examiners
    dealing with the bank in question.
  • Developed by NeuralWare's Application Development
    Services and Support Group (ADSS)
  • Software used: the NeuralWorks Professional neural
    network development system.
  • Uses the standard backpropagation (multilayer
    perceptron) network.

49
Case Study (contd)
  • ANN has 11 inputs, each a ratio developed by Peat
    Marwick.
  • Inputs connected to a single hidden layer, which
    in turn is connected to a single node in the
    output layer.
  • Network outputs a single value denoting whether
    the bank would or would not fail within that
    calendar year
  • Employed the hyperbolic-tangent transfer function
    and a proprietary error function created by the
    ADSS staff.
  • Trained on a set of 1,000 examples, 900 of which
    were viable banks and 100 of which were banks
    that had actually gone bankrupt
  • Training consisted of about 50,000 iterations of
    the training set.
  • Correctly predicted 50% of the banks that remained
    viable, and 99% of the banks that actually failed.

50
REFERENCES
  • AI Expert (special issue on ANN), June 1990.
  • BYTE (special issue on ANN), Aug. 1989.
  • Caudill,M., "The View from Now", AI Expert, June
    1992, pp.27-31.
  • Dhar, V. & Stein, R., Seven Methods for
    Transforming Corporate Data into Business
    Intelligence, Prentice Hall, 1997.
  • Kirrmann, H., "Neural Computing: The new gold rush
    in informatics", IEEE Micro, June 1989, pp. 7-9.
  • Lippman, R.P., "An Introduction to Computing with
    Neural Nets", IEEE ASSP Magazine, April 1987
    pp.4-21.
  • Lisboa, P. (Ed.), Neural Networks: Current
    Applications, Chapman & Hall, 1992.
  • Negnevitsky, M., Artificial Intelligence: A Guide
    to Intelligent Systems, Addison-Wesley, 2005.

51
REFERENCES (contd)
  • Bailey, D., Thompson, D., How to Develop Neural
    Network Applications, AI Expert, June 1990, pp.
    38-47.
  • Caudill, M. & Butler, C., Naturally Intelligent
    Systems, MIT Press, 1989, pp. 227-240.
  • Caudill, M., Expert networks, BYTE pp.109-116,
    October 1991.
  • Gallant, S., Neural Network Learning and Expert
    Systems, MIT Press 1993.
  • Medsker,L., Hybrid Intelligent Systems, Kluwer
    Academic Press, Boston 1995
  • Zahedi, F., Intelligent Systems for Business,
    Wadsworth Publishing, Belmont, California, 1993.