1
CITS7212 Computational Intelligence
  • Neural Networks

2
Neural Networks: Nature-Inspired
[Diagram: inputs flow into a neural network, which produces outputs]
3
Neural Networks: Nature-Inspired
  • Inspired by the brain
  • Simple animal brains are still capable of functions that are impossible for computers
  • Computers are strong at
  • Performing complex mathematics
  • Maintaining data
  • Traditional computers struggle to recognize and generalize patterns from the past for future actions
  • Neural networks offer a different way to analyze and recognize patterns within data

4
Neural Networks: A Brief History
  • 1940s
  • McCulloch and Pitts wrote a paper on neurons and modeled a simple neural network
  • Reinforced in The Organization of Behaviour (Hebb, 1949)
  • 1950s
  • Took a backseat as traditional computing took the main stage
  • 1959
  • Multiple ADAptive LINear Elements (MADALINE)
  • Widrow and Hoff of Stanford
  • Removed echoes from telephone lines
  • First neural network used commercially
  • Still in use
  • Results
  • Too much hype
  • Unfulfilled promises
  • Fear of "thinking machines"
  • Halted funding until 1981

5
Neural Networks: A Brief History
  • 1982
  • John Hopfield presented a paper to the National Academy of Sciences
  • Neural networks need not simply model brains
  • They can be used to create useful devices
  • Used charisma and mathematical analysis to champion the technology
  • Japan announced a 5th Generation effort into neural networks
  • Led the US to fear falling behind
  • Funding began to flow again
  • Post 1985
  • Annual meetings hosted by the American Institute of Physics
  • Neural Networks for Computing
  • IEEE's first International Conference on Neural Networks in 1987 drew 1,800 attendees
  • Discussion was ongoing everywhere

6
Neural Networks: The Brain
  • The brain
  • An interconnected network of neurons that collect, process and disseminate electrical signals via synapses
  • Components: neurons and synapses
  • Neuron
  • A cell in the brain that aggregates and disseminates electrical signals
  • Interconnected in vast networks via synapses to provide computational power and intelligence

7
Neural Networks: Representation of the Brain
  • Brain
  • An interconnected network of neurons that collect, process and disseminate electrical signals via synapses
  • Components: neurons and synapses
  • Neural network
  • An interconnected network of units (or nodes) that collect, process and disseminate values via links
  • Components: nodes and links

8
Neural Networks: Units (or Nodes)
  • A unit represents a neuron in the brain
  • Units are the building blocks of neural networks
  • Each unit collects values via its input links
  • Determines its output value through an activation function
  • Disseminates values via its output links

9
Units: Input Function
  • Each input link has two values
  • a_j, the value received as input from node j
  • W_{j,i}, a numeric weight associated with the link connecting node j to node i
  • Determines the strength and sign of the connection
  • Bias link
  • a_0 is reserved for a fixed input of -1
  • It is given a bias weight W_{0,i}
  • Determines the threshold needed for a positive response
  • Node i computes the weighted sum of its inputs (in_i)
  • in_i = Σ_{j=0}^{n} W_{j,i} a_j

10
Units: Activation Function
  • The activation function (g) is applied to the input function value (in_i) to produce the output (a_i) of node i
  • a_i = g(in_i) = g( Σ_{j=0}^{n} W_{j,i} a_j )
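
A minimal sketch (not from the slides) of a single unit computing a_i = g(Σ W_{j,i} a_j); the function name and argument conventions are illustrative assumptions:

  def unit_output(weights, inputs, g):
      """Apply the activation function g to the weighted sum of a unit's inputs.
      weights[0] is the bias weight W_{0,i} and inputs[0] should be the fixed bias input -1."""
      in_i = sum(w * a for w, a in zip(weights, inputs))  # in_i = sum_j W_{j,i} * a_j
      return g(in_i)
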
11
Units: Activation Functions
  • A variety of activation functions exist
  • An activation function needs to meet two desiderata
  • The unit should be active (near 1) when given the "right" inputs
  • The unit should be inactive (near 0) when given the "wrong" inputs
  • The activation function needs to be nonlinear
  • This prevents the neural network from collapsing into a simple linear function
  • Two commonly used functions
  • Threshold function
  • Sigmoid function

12
Activation Functions: Threshold Function
  • Function
  • g(in_i) = 1, if in_i > 0
  •           0, if in_i ≤ 0
  • Useful for classifying inputs into two groups
  • Used to build networks that function as
  • Feature identifiers
  • Computers of Boolean functions
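
A minimal sketch of the threshold activation, usable with the unit sketch above (the name is an assumption):

  def threshold(in_i):
      """Hard threshold: output 1 when the weighted input sum exceeds 0, otherwise 0."""
      return 1 if in_i > 0 else 0

  # e.g. unit_output([0.5, 1.0, 1.0], [-1, 1, 0], threshold) returns 1, since in_i = -0.5 + 1 + 0 > 0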


13
Activation Functions: Sigmoid Function
  • Function
  • g(in_i) = 1 / (1 + e^(-in_i))
  • Also known as the logistic function
  • Its main advantage is that it has a nice derivative
  • g'(in_i) = g(in_i)(1 - g(in_i))
  • Helpful for the weight-learning algorithm to be seen later
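
A minimal sketch of the sigmoid and its derivative (the names are assumptions):

  import math

  def sigmoid(in_i):
      """Logistic function: squashes the weighted input sum into the range (0, 1)."""
      return 1.0 / (1.0 + math.exp(-in_i))

  def sigmoid_derivative(in_i):
      """g'(in_i) = g(in_i) * (1 - g(in_i)); used by the weight-learning rule later."""
      g = sigmoid(in_i)
      return g * (1.0 - g)
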
14
Units: Output
  • The output of node i is a_i
  • The output (a_i) is the direct result of applying the activation function (g) to the input function value (in_i)
  • a_i = g(in_i)
  • a_i = g( Σ_{j=0}^{n} W_{j,i} a_j )

15
Neural Networks: Network Structures
  • Units (nodes) are the building blocks of neural networks
  • The power of neural networks comes from their structure
  • How units are linked together
  • Two main types
  • Feed-forward networks (acyclic)
  • Recurrent networks (cyclic)

16
Feed-forward Networks
  • Units (nodes) are usually arranged in layers
  • Each unit receives input only from units in the preceding layer
  • The network represents a function of its current inputs
  • It has no internal state other than the weights on the links
  • Two types of feed-forward networks
  • Single-layer
  • Multilayer

17
Single-layer Feed-forward Networks
  • Also called a perceptron network
  • All inputs are connected directly to the outputs
  • Input units typically disseminate 1 (on) or 0 (off)
  • Output units use the threshold activation function
  • threshold(in_i) = 1, if in_i > 0
  •                   0, if in_i ≤ 0
  • Perceptron networks can represent some Boolean functions
  • e.g. the majority function with n = 7 input units (fires if more than half of the n inputs are 1)
  • With each input weight set to 1, the bias term is W_{0,i} a_0 = (n/2) × (-1) = -3.5
  • Therefore, to exceed the threshold, 4 or more input units must have a_j = 1
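
A minimal sketch of the 7-input majority perceptron described above, with a weight of 1 on each input and a bias weight of 3.5 on the fixed input a_0 = -1 (the function name is an assumption):

  def majority_7(inputs):
      """inputs: seven 0/1 values. Fires (returns 1) if and only if 4 or more inputs are 1."""
      bias_term = 3.5 * (-1)                          # W_{0,i} * a_0 = (n/2) * (-1)
      in_i = bias_term + sum(1 * a for a in inputs)   # each input weight W_{j,i} = 1
      return 1 if in_i > 0 else 0

  # majority_7([1, 1, 1, 1, 0, 0, 0]) -> 1; majority_7([1, 1, 1, 0, 0, 0, 0]) -> 0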


18
Single-layer Feed-forward Networks: Representation
  • Cannot represent all Boolean functions
  • A threshold perceptron only returns 1 if the weighted sum of its inputs is greater than 0
  • Σ_{j=0}^{n} W_j x_j > 0
  • W · x = 0 defines a hyperplane in the input space
  • One side of it returns 0, the other side returns 1
  • A threshold perceptron can therefore only solve functions that are linearly separable
  • e.g. XOR is not linearly separable, so a single-layer perceptron cannot represent it

19
Neural Network Learning Approaches
  • Hebb's Rule
  • Strengthen the weights between highly active neurons
  • Hopfield's Law
  • An extension of Hebb's rule that increments or decrements weights by a learning rate
  • Kohonen's Learning Law
  • Units compete for the opportunity to learn; the winner can inhibit its competitors or excite its neighbors
  • The Delta Rule
  • Adjust the weights to minimise the difference between the expected output and the actual output over the training set
  • Gradient Descent Rule
  • An extension of the Delta rule that also implements a learning rate

20
Single-layer Feed-forward Networks: Learning
  • Two approaches to training
  • Unsupervised
  • Supervised
  • Aim is to minimise a measure of error on the training set; this becomes an optimization search in weight space
  • The measure of error is the sum of squared errors (E)
  • Err = y - h_W(x)
  • where
  • x is the input
  • h_W(x) is the output of the perceptron on the example
  • y is the true output
  • The weights of the network are adjusted using an update rule
  • W_j ← W_j + α × Err × g'(in) × x_j    (sigmoid perceptron)
  • W_j ← W_j + α × Err × x_j             (threshold perceptron)
  • where
  • α is the learning rate
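
A minimal sketch of the two update rules above, each applied to a single training example (the function names and argument conventions are assumptions):

  import math

  def update_threshold(W, x, y, alpha):
      """Threshold perceptron rule: W_j <- W_j + alpha * Err * x_j."""
      in_ = sum(w * xj for w, xj in zip(W, x))
      err = y - (1 if in_ > 0 else 0)
      return [w + alpha * err * xj for w, xj in zip(W, x)]

  def update_sigmoid(W, x, y, alpha):
      """Gradient-descent rule for a sigmoid unit: W_j <- W_j + alpha * Err * g'(in) * x_j."""
      in_ = sum(w * xj for w, xj in zip(W, x))
      out = 1.0 / (1.0 + math.exp(-in_))
      err = y - out
      return [w + alpha * err * out * (1.0 - out) * xj for w, xj in zip(W, x)]
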
21
Single-layer Feed-forward Networks: Learning
  • Training examples are run through the network one at a time; a full pass through the training set is a cycle or epoch
  • In each epoch, the weights are adjusted to reduce the error
  • If Err = y - h_W(x) is positive, then the network output is too small
  • Weights of positive inputs are increased
  • Weights of negative inputs are decreased
  • (The opposite happens when Err is negative)
  • This continues until a stopping criterion has been reached
  • Weight changes have become very small
  • We run out of time
  • Pseudocode (a sketch follows below)
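
A minimal sketch of this training loop, using the threshold update rule; the initialisation, stopping tolerance, and names are illustrative assumptions rather than the textbook pseudocode:

  import random

  def train_perceptron(examples, n_inputs, alpha=0.1, max_epochs=1000, tol=1e-6):
      """examples: list of (x, y) pairs where x[0] is the fixed bias input -1.
      Repeats epochs until weight changes become very small or we run out of time."""
      W = [random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]   # +1 for the bias weight W_0
      for _ in range(max_epochs):
          max_change = 0.0
          for x, y in examples:
              in_ = sum(w * xj for w, xj in zip(W, x))
              err = y - (1 if in_ > 0 else 0)         # positive Err: the output is too small
              for j in range(len(W)):
                  delta = alpha * err * x[j]          # raise weights on positive inputs, lower on negative
                  W[j] += delta
                  max_change = max(max_change, abs(delta))
          if max_change < tol:                        # stopping criterion: weight changes are very small
              break
      return W
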
22
Single-layer Feed-forward Networks: Performance
  • Better at solving linearly separable functions than decision-tree learning
  • Struggles with the restaurant example, which is not linearly separable
  • The best plane through the data correctly classifies only 65%

23
Multilayer Feed-forward Networks
  • Most applications require at least three layers
  • Input layer (e.g. reads from files or electronic sensors)
  • Hidden layer(s)
  • Output layer (e.g. sends to another process or device)
  • Enlarges the space of hypotheses the network can represent
  • One sufficiently large hidden layer can represent any continuous function of the inputs
  • Two hidden layers can represent discontinuous functions
  • Choosing the correct number of hidden units in advance remains a difficult task

24
Multilayer Feed-forward Networks: Learning
  • Learning is similar to that in a single-layer FFN
  • Differences
  • The output is a vector h_W(x) rather than a single value
  • Each example has an output vector y
  • Major difference from the single-layer case
  • Calculating the error at the hidden layers
  • There is no training data to guide the values in the hidden layers
  • Solution
  • Back-propagate the error from the output layer to the hidden layers

25
Multilayer Feed-forward Networks: Back-propagation
  • An extension of the perceptron learning rule
  • W_j ← W_j + α × Err × g'(in) × x_j
  • To simplify the algorithm, define
  • Δ_i = Err_i × g'(in_i)
  • New representation
  • W_j ← W_j + α × x_j × Δ_i
  • Each hidden node j is responsible for some fraction of the error Δ_i in each of the output nodes to which it connects
  • The Δ_i values are divided among the connections according to the link weight (W_{j,i}) between the hidden node and the output node

26
Multilayer Feed-forward Networks: Back-propagation
  • These values are propagated back to provide the Δ_j values for the hidden layer
  • Δ_j = g'(in_j) Σ_i W_{j,i} Δ_i
  • Weight update rule from the input units to the hidden layer
  • W_{k,j} ← W_{k,j} + α × a_k × Δ_j
  • Follow the same process backwards for networks with more layers

27
Multilayer Feed-forward Networks: Back-propagation
  • Pseudocode (a sketch for a single hidden layer follows below)
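
A minimal sketch of one epoch of back-propagation for a network with a single hidden layer of sigmoid units, following the Δ rules above; the weight layout, initialisation, and names are illustrative assumptions rather than the textbook pseudocode:

  import math

  def g(v):
      """Sigmoid activation."""
      return 1.0 / (1.0 + math.exp(-v))

  def backprop_epoch(examples, W_in, W_out, alpha=0.1):
      """W_in[k][j]: weight from input k to hidden unit j; W_out[j][i]: hidden unit j to output i.
      examples: list of (x, y) pairs of input and target output vectors."""
      for x, y in examples:
          # Forward pass
          in_hidden = [sum(W_in[k][j] * x[k] for k in range(len(x))) for j in range(len(W_out))]
          a_hidden = [g(v) for v in in_hidden]
          in_out = [sum(W_out[j][i] * a_hidden[j] for j in range(len(a_hidden))) for i in range(len(y))]
          a_out = [g(v) for v in in_out]
          # Delta_i = Err_i * g'(in_i) for each output unit
          delta_out = [(y[i] - a_out[i]) * a_out[i] * (1.0 - a_out[i]) for i in range(len(y))]
          # Delta_j = g'(in_j) * sum_i W_{j,i} * Delta_i for each hidden unit
          delta_hidden = [a_hidden[j] * (1.0 - a_hidden[j]) *
                          sum(W_out[j][i] * delta_out[i] for i in range(len(y)))
                          for j in range(len(a_hidden))]
          # Update hidden-to-output weights, then input-to-hidden weights
          for j in range(len(a_hidden)):
              for i in range(len(y)):
                  W_out[j][i] += alpha * a_hidden[j] * delta_out[i]
          for k in range(len(x)):
              for j in range(len(a_hidden)):
                  W_in[k][j] += alpha * x[k] * delta_hidden[j]
      return W_in, W_out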

28
Multilayer Feed-forward Networks: Back-propagation
  • Summary
  • Compute the Δ values for the output units, using the observed error
  • Starting with the output layer, repeat the following for each layer in the network until the earliest layer is reached
  • Propagate the Δ values back to the previous layer
  • Update the weights between the two layers

29
Multilayer Feed-forward Networks: Performance
  • Aims
  • Convergence to something close to the global optimum in weight space
  • A network that gives the highest prediction accuracy on validation sets
  • From the restaurant example
  • (a) The training curve converges to a perfect fit on the training data
  • (b) The network learns well
  • Not as fast as decision-tree learning, for obvious reasons
  • Much improved over the single-layer network
  • Can handle complexity well, but requires the correct network structure
  • Number of hidden layers and hidden units

30
Neural Networks: Construction Guidelines
  • There is no quantifiable approach to the layout of a network for a particular application
  • Three rules (guidelines) followed by designers
  • More complexity in the relationship between the inputs and outputs should lead to an increase in the number of units in the hidden layer(s)
  • If the process being modeled has multiple stages, it may require multiple hidden layers; if not, multiple layers will merely enable memorization (undesirable)
  • The amount of training data sets an upper bound on the number of units in the hidden layer(s):
  • number of hidden units ≈ (number of input-output pairs in the training set) / (scaling factor × number of input and output units in the network)
  • The scaling factor is usually between five (5) and ten (10); larger values are used for noisy data
  • A tradeoff exists: the more units in the hidden layer(s), the more likely memorisation of the training data becomes, but more units also allow more complexity in the relationship between inputs and outputs to be captured
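
As a worked illustration with hypothetical numbers (not from the slides): a network with 10 input units and 2 output units, trained on 1,200 input-output pairs and using a scaling factor of 5, would be bounded at roughly 1200 / (5 × 12) = 20 hidden units.
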
31
Recurrent Networks
  • Recurrent neural networks are an extension of feed-forward networks
  • Feed-forward networks operate on an input space
  • Recurrent networks operate on an input space AND an internal state space
  • The state space is a trace of what has already been processed by the network
  • Allows for a more dynamic system, as its response depends on its initial state, which may in turn depend on previous inputs (the state space)
  • Allows for short-term memory, which brings new possibilities
  • Allows functionality that more closely resembles a brain
  • Learning can be very slow - back-propagation through time (BPTT)

32
Recurrent Networks
  • Simple recurrent network (Elman network)
  • Context units in the input layer
  • Connections from the hidden layer to the context units with a fixed weight of 1
  • Fully recurrent network
  • All units are connected to all other units

[Diagram: a simple recurrent (Elman) network]
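
A minimal sketch of one forward step of an Elman-style network, where the context units hold a copy of the previous hidden activations (copied with a fixed weight of 1); the names and weight layout are illustrative assumptions:

  import math

  def elman_step(x, context, W_ih, W_ch, W_ho):
      """One time step. W_ih[k][j]: input k -> hidden j; W_ch[c][j]: context c -> hidden j;
      W_ho[j][i]: hidden j -> output i. Returns the outputs and the new hidden vector."""
      g = lambda v: 1.0 / (1.0 + math.exp(-v))   # sigmoid activation
      hidden = [g(sum(W_ih[k][j] * x[k] for k in range(len(x))) +
                  sum(W_ch[c][j] * context[c] for c in range(len(context))))
                for j in range(len(W_ho))]
      outputs = [g(sum(W_ho[j][i] * hidden[j] for j in range(len(hidden))))
                 for i in range(len(W_ho[0]))]
      return outputs, hidden   # 'hidden' becomes the next step's context (copied with weight 1)
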
33
Other Types of Neural Networks
  • Kohonen self-organizing network
  • Recurrent Networks
  • Hopfield Network
  • Echo state network
  • Stochastic neural networks
  • Modular neural networks
  • Committee of machines (CoM)
  • Cascading neural networks

34
Neuroevolution
  • The use of evolutionary algorithms for training the network
  • Two methods are available
  • Evolve only the weights on the connections within a fixed network topology
  • Evolve the weights AND the topology of the network itself
  • Adding a link - complexification
  • Removing a link - simplification
  • Software packages are available to support development
  • NeuroEvolution of Augmenting Topologies (NEAT)

35
Neural Networks: Limitations
  • Not the solution for all computing problems
  • For a neural network to be developed, a number of
    requirements need to be met
  • Needs a data set that can characterize the
    problem
  • A large set of data for training and testing the
    network
  • An implementer who understands the problem and
    can decide on the activation functions and
    learning methods to be used
  • Adequate hardware to support the high demands for
    processing power
  • Development of neural networks can be very
    difficult
  • Neural architects emerging
  • Art to the development process

36
Neural Networks: Limitations
  • Neural networks will continue to make mistakes
  • It is hard to ensure a trained network is the optimal one
  • Not used for applications that require error-free results
  • Neural networks frequently appear in applications where humans also struggle to be right all the time
  • Used where accuracy below 100% is still a better result than the alternative system
  • Some examples
  • Picking stocks
  • Approving or denying loans

37
Neural Networks: Application Areas
  • Sensor processing
  • System identification and control
  • Vehicle control
  • Process control
  • Game-playing
  • Backgammon
  • Chess
  • Racing
  • Pattern recognition
  • Radar systems
  • Face identification
  • Object recognition

38
Neural Networks: Application Areas
  • Sequence recognition
  • Gesture
  • Speech
  • Handwritten text
  • Medical diagnosis
  • Financial
  • Automated trading systems
  • Data mining

39
Neural Networks: Commercial Packages
  • All commercial packages claim to be
  • Easy to use
  • Powerful
  • Customizable
  • Packages available
  • NeuroSolutions - http://www.neurosolutions.com/products/ns/
  • Peltarion Synapse - http://www.peltarion.com/

40
Neural Networks: The Future
  • Hybrid systems
  • Greater integration of fuzzy logic into neural networks
  • Hardware specialized for neural networks
  • Greater speed
  • As neural networks use many more neurons, they need more advanced, high-performance hardware
  • This allows greater functionality and performance
  • New applications will emerge
  • Current technologies will be improved
  • Greater sophistication and accuracy with better training methods and network architectures

41
Neural Networks: The Future
  • Neural networks might, in the future, allow
  • Robots that can see, feel, and predict the world around them
  • Widespread use of self-driving cars
  • Composition of music
  • Conversion of handwritten documents to word
    processing documents
  • Discovery of trends in the human genome to aid in
    the understanding of the data compiled by the
    Human Genome Project
  • Self-diagnosis of medical problems
  • and much more!

42
Neural Networks: The Future
43
Neural Networks: Sources
  • Diagrams and pseudocode taken from
  • S. Russell and P. Norvig, Section 20.5, Artificial Intelligence: A Modern Approach, Prentice Hall, 2002
  • Example perceptron algorithm
  • http://lcn.epfl.ch/tutorial/english/perceptron/html/index.html

44
Neural Networks: The End