Artificial Neural Networks : An Introduction - PowerPoint PPT Presentation


1
Artificial Neural Networks An Introduction
  • G.Anuradha

2
Learning Objectives
  • Reasons to study neural computation
  • Comparison between biological neuron and
    artificial neuron
  • Basic models of ANN
  • Different types of connections of NN, Learning
    and activation function
  • Basic fundamental neuron model-McCulloch-Pitts
    neuron and Hebb network

3
Reasons to study neural computation
  • To understand how brain actually works
  • Computer simulations are used for this purpose
  • To understand the style of parallel computation
    inspired by neurons and their adaptive
    connections
  • Different from sequential computation
  • To solve practical problems by using novel
    learning algorithms inspired by brain

4
Fundamental concept
  • NN are constructed and implemented to model the
    human brain.
  • Performs various tasks such as pattern-matching,
    classification, optimization function,
    approximation, vector quantization and data
    clustering.
  • These tasks are difficult for traditional
    computers

5
Biological Neural Network
6
Neuron and a sample of pulse train
7
Biological Neuron
  • Has 3 parts
  • Soma or cell body- cell nucleus is located
  • Dendrites- nerve connected to cell body
  • Axon carries impulses of the neuron
  • End of axon splits into fine strands
  • Each strand terminates into a bulb-like organ
    called synapse
  • Electric impulses are passed between the synapse
    and dendrites
  • Synapses are of two types
  • Inhibitory- impulses hinder the firing of the
    receiving cell
  • Excitatory- impulses cause the firing of the
    receiving cell
  • Neuron fires when the total of the weights to
    receive impulses exceeds the threshold value
    during the latent summation period
  • After carrying a pulse an axon fiber is in a
    state of complete nonexcitability for a certain
    time called the refractory period.

8
How does the brain work
  • Each neuron receives inputs from other neurons
  • Use spikes to communicate
  • The effect of each input line on the neuron is
    controlled by a synaptic weight
  • Positive or negative
  • Synaptic weight adapts so that the whole network
    learns to perform useful computations
  • Recognizing objects, understanding languages,
    making plans, controlling the body
  • There are about 10^11 neurons, each with about 10^4 weights.

9
Modularity and brain
  • Different bits of the cortex do different things
  • Local damage to the brain has specific effects
  • Early brain damage makes function relocate
  • Cortex gives rapid parallel computation plus
    flexibility
  • Conventional computers require very fast central
    processors for long sequential computations

10
Information flow in nervous system
11
ANN
  • ANNs possess a large number of processing elements
    called nodes/neurons which operate in parallel.
  • Each neuron is connected to others by connection
    links.
  • Each link is associated with weights which
    contain information about the input signal.
  • Each neuron has an internal state of its own
    which is a function of the inputs that neuron
    receives- Activation level

12
Comparison between brain and computer
Aspect | Brain | ANN
Speed | A few ms | A few ns; massive parallel processing
Size and complexity | 10^11 neurons, 10^15 interconnections | Depends on designer
Storage capacity | Stores information in its interconnections or synapses; no loss of memory | Contiguous memory locations; loss of memory may happen sometimes
Tolerance | Has fault tolerance | No fault tolerance; information gets disrupted when interconnections are disconnected
Control mechanism | Complicated; involves chemicals in the biological neuron | Simpler in ANN
13
Artificial Neural Networks
14
McCulloch-Pitts Neuron Model
15
McCulloch-Pitts model for AND and OR
16
McCulloch-Pitts model for NOT
17
Advantages and disadvantages of the McCulloch-Pitts
model
  • Advantages
  • Simplistic
  • Substantial computing power
  • Disadvantages
  • Weights and thresholds are fixed
  • Not very flexible

18
Features of McCulloch-Pitts model
  • Allows binary 0,1 states only
  • Operates under a discrete-time assumption
  • Weights and the neurons' thresholds are fixed in
    the model, and there is no interaction among network
    neurons
  • Just a primitive model
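
As a minimal sketch of the fixed-weight, fixed-threshold behaviour described above, the Python below builds AND, OR and NOT from a single McCulloch-Pitts unit. The specific weight and threshold values are hand-picked assumptions for illustration (the slide figures were not transcribed), and the function names are hypothetical.

```python
# McCulloch-Pitts neuron: binary inputs, fixed weights, fixed threshold.
def mp_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted input sum reaches the threshold."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= threshold else 0

# Hand-picked (not learned) parameters; these values are assumptions.
def AND(a, b):
    return mp_neuron([a, b], [1, 1], threshold=2)

def OR(a, b):
    return mp_neuron([a, b], [1, 1], threshold=1)

def NOT(a):
    # An inhibitory (negative) weight with threshold 0 inverts the input.
    return mp_neuron([a], [-1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
print("NOT 0:", NOT(0), "NOT 1:", NOT(1))
```

Because nothing here is learned, changing the logic function means changing the weights or threshold by hand, which is the inflexibility the slides point out.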

19
General symbol of neuron consisting of processing
node and synaptic connections
20
Neuron Modeling for ANN
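$o = f(\mathbf{w}^T \mathbf{x})$, where $f$ is referred to as the activation function. Its domain is the set
of activation values net.
$net = \mathbf{w}^T \mathbf{x}$ is the scalar product of the weight and input vectors.
The neuron, as a processing node, performs the
operation of summation of its weighted inputs.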
21
Binary threshold neurons
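  • There are two equivalent ways to write the
    equations for a binary threshold neuron

$z = \sum_i x_i w_i$, with $y = 1$ if $z \ge \theta$ and $y = 0$ otherwise
$z = b + \sum_i x_i w_i$, with $y = 1$ if $z \ge 0$ and $y = 0$ otherwise
The two forms agree when $\theta = -b$.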
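
The equivalence can be checked numerically. The sketch below assumes the usual pair of formulations: compare the weighted sum against a threshold theta, or add a bias b and compare against zero, with theta = -b. The weights, bias and function names are illustrative assumptions, not values from the slides.

```python
def fires_with_threshold(x, w, theta):
    """Form 1: compare the weighted sum against a threshold theta."""
    z = sum(xi * wi for xi, wi in zip(x, w))
    return 1 if z >= theta else 0

def fires_with_bias(x, w, b):
    """Form 2: fold the threshold into a bias and compare against zero."""
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    return 1 if z >= 0 else 0

w = [2, -1]   # illustrative integer weights (assumed)
b = -1        # bias; the matching threshold is theta = -b = 1
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [fires_with_bias(x, w, b) for x in inputs]
print(outputs)
```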
22
Sigmoid neurons
  • These give a real-valued output that is a smooth
    and bounded function of their total input.
  • Typically they use the logistic function
  • They have nice derivatives which make learning
    easy

[Figure: logistic curve; the output rises smoothly from 0 to 1 and equals 0.5 when the total input is 0]
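
A short sketch of why the derivative is "nice": for the logistic function y = 1 / (1 + e^(-z)), the derivative can be written in terms of the output itself as y(1 - y). The function names below are hypothetical.

```python
import math

def logistic(z):
    """Smooth, bounded output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_derivative(z):
    """dy/dz expressed through the output: y * (1 - y)."""
    y = logistic(z)
    return y * (1.0 - y)

for z in (-4.0, 0.0, 4.0):
    print(z, round(logistic(z), 4), round(logistic_derivative(z), 4))
```

Since the derivative reuses the already computed output, gradient-based learning needs no extra exponentials.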
23
Activation function
  • Bipolar binary and unipolar binary are called
    hard-limiting activation functions and are used in the
    discrete neuron model
  • Unipolar continuous and bipolar continuous are
    called soft-limiting activation functions and have
    sigmoidal characteristics.

24
Activation functions
Bipolar continuous
Bipolar binary functions
25
Activation functions
Unipolar continuous
Unipolar Binary
26
Common models of neurons
Binary perceptrons
Continuous perceptrons
27
Quiz
  • Which of the following tasks are neural networks
    good at?
  • Recognizing fragments of words in a pre-processed
    sound wave.
  • Recognizing badly written characters.
  • Storing lists of names and birth dates.
  • Logical reasoning.

Neural networks are good at finding statistical
regularities that allow them to recognize
patterns. They are not good at flawlessly
applying symbolic rules or storing exact numbers.
28
Basic models of ANN
29
Classification based on interconnections
30
Feed-forward neural networks
  • These are the commonest type of neural network in
    practical applications.
  • The first layer is the input and the last layer
    is the output.
  • If there is more than one hidden layer, we call
    them deep neural networks.
  • They compute a series of transformations that
    change the similarities between cases.
  • The activities of the neurons in each layer are a
    non-linear function of the activities in the
    layer below.

output units
hidden units
input units
31
Feedforward Network
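  • Its output and input vectors are respectively
    $\mathbf{o} = [o_1\ o_2\ \dots\ o_m]^T$ and $\mathbf{x} = [x_1\ x_2\ \dots\ x_n]^T$
  • Weight $w_{ij}$ connects the $i$th neuron with the $j$th
    input. The activation rule of the $i$th neuron is
    $o_i = f(\mathbf{w}_i^T \mathbf{x})$, $i = 1, 2, \dots, m$

where $\mathbf{w}_i = [w_{i1}\ w_{i2}\ \dots\ w_{in}]^T$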
EXAMPLE
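
As a rough illustration of a single feedforward layer, the sketch below computes each output as an activation function applied to the weighted sum of the inputs, with row i of the weight matrix holding the weights of neuron i. The matrix values, the input and the function name are assumptions made for the example.

```python
import math

def layer_forward(W, x, f):
    """Apply f to the weighted input sum of every neuron (one row of W each)."""
    return [f(sum(w_ij * x_j for w_ij, x_j in zip(row, x))) for row in W]

# Assumed example: 3 neurons, 2 inputs.
W = [[1.0, -1.0],
     [0.5, 0.5],
     [0.0, 1.0]]
x = [1.0, 2.0]

net = layer_forward(W, x, lambda v: v)   # identity: shows the raw net inputs
out = layer_forward(W, x, math.tanh)     # a smooth non-linear activation
print(net, out)
```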
32
Multilayer feed forward network
Can be used to solve complicated problems
33
Feedback network
When outputs are directed back as inputs to same
or preceding layer nodes it results in the
formation of feedback networks
34
Lateral feedback
If the feedback of the output of the processing
elements is directed back as input to the
processing elements in the same layer then it is
called lateral feedback
35
Recurrent networks
  • These have directed cycles in their connection
    graph.
  • That means you can sometimes get back to where
    you started by following the arrows.
  • They can have complicated dynamics and this can
    make them very difficult to train.
  • There is a lot of interest at present in finding
    efficient ways of training recurrent nets.
  • They are more biologically realistic.

Recurrent nets with multiple hidden layers are
just a special case that has some of the
hidden-to-hidden connections missing.
36
Recurrent neural networks for modeling sequences
time →
  • Recurrent neural networks are a very natural way
    to model sequential data
  • They are equivalent to very deep nets with one
    hidden layer per time slice.
  • Except that they use the same weights at every
    time slice and they get input at every time
    slice.
  • They have the ability to remember information in
    their hidden state for a long time.
  • But it's very hard to train them to use this
    potential.

[Figure: the network unrolled over three time slices, each with input, hidden and output units]
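
The unrolling idea can be sketched with a one-unit recurrent net: the same weights are reused at every time slice, and the hidden state carries information forward. This is a toy with scalar weights chosen arbitrarily; the names `rnn_step` and `run_rnn` and all parameter values are assumptions.

```python
import math

def rnn_step(x_t, h_prev, w_xh, w_hh, w_hy):
    """One time slice: new hidden state from the input and the previous state."""
    h_t = math.tanh(w_xh * x_t + w_hh * h_prev)
    y_t = w_hy * h_t
    return h_t, y_t

def run_rnn(inputs, w_xh=0.5, w_hh=0.9, w_hy=1.0, h0=0.0):
    """Unroll over the sequence, sharing the same weights at every step."""
    h = h0
    outputs = []
    for x_t in inputs:
        h, y = rnn_step(x_t, h, w_xh, w_hh, w_hy)
        outputs.append(y)
    return outputs

outs = run_rnn([1.0, 0.0, 0.0])
print(outs)
```

The input is zero after the first step, yet later outputs stay non-zero because the hidden state remembers it.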
37
An example of what recurrent neural nets can now
do (to whet your interest!)
  • Ilya Sutskever (2011) trained a special type of
    recurrent neural net to predict the next
    character in a sequence.
  • After training for a long time on a string of
    half a billion characters from English Wikipedia,
    he got it to generate new text.
  • It generates by predicting the probability
    distribution for the next character and then
    sampling a character from this distribution.
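
The generate-by-sampling loop can be sketched independently of the network. The code below swaps in a simple character-bigram count table as a stand-in for the recurrent net (an assumption made purely to keep the example small); only the "predict a distribution, then sample from it" step mirrors the description above.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count which character follows which (stand-in for a trained model)."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(text, text[1:]):
        counts[cur][nxt] += 1
    return counts

def next_char_distribution(counts, cur):
    """Probability distribution over the next character."""
    total = sum(counts[cur].values())
    return {ch: n / total for ch, n in counts[cur].items()}

def generate(counts, start, length, rng):
    out = [start]
    for _ in range(length - 1):
        dist = next_char_distribution(counts, out[-1])
        if not dist:          # no known continuation
            break
        chars = list(dist)
        probs = [dist[c] for c in chars]
        # Sample one character according to the predicted probabilities.
        out.append(rng.choices(chars, weights=probs, k=1)[0])
    return "".join(out)

counts = train_bigram("abababab")
print(generate(counts, "a", 6, random.Random(0)))
```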

38
Some text generated one character at a time by
Ilya Sutskever's recurrent neural network
In 1974 Northern Denver had been overshadowed by
CNL, and several Irish intelligence agencies in
the Mediterranean region. However, on the
Victoria, Kings Hebrew stated that Charles
decided to escape during an alliance. The
mansion house was completed in 1882, the second
in its bridge are omitted, while closing is the
proton reticulum composed below it aims, such
that it is the blurring of appearing on any
well-paid type of box printer.
39
Symmetrically connected networks
  • These are like recurrent networks, but the
    connections between units are symmetrical (they
    have the same weight in both directions).
  • John Hopfield (and others) realized that
    symmetric networks are much easier to analyze
    than recurrent networks.
  • They are also more restricted in what they can
    do, because they obey an energy function.
  • For example, they cannot model cycles.
  • Symmetrically connected nets without hidden units
    are called Hopfield nets.
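
A small sketch of a Hopfield net may make the energy function concrete. It assumes bipolar (+1/-1) states, symmetric weights with zero self-connections set by an outer-product (Hebbian) rule, and the usual energy E = -1/2 Σ w_ij s_i s_j; none of these specifics appear on the slide, and the function names are hypothetical.

```python
def train_hopfield(patterns):
    """Symmetric weights (W[i][j] == W[j][i]) with no self-connections."""
    n = len(patterns[0])
    W = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def energy(W, s):
    n = len(s)
    return -0.5 * sum(W[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def recall(W, s, sweeps=5):
    """Asynchronous updates: each unit moves to the state that lowers the energy."""
    s = list(s)
    for _ in range(sweeps):
        for i in range(len(s)):
            net = sum(W[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if net >= 0 else -1
    return s

stored = [1, -1, 1, -1, 1, -1]
W = train_hopfield([stored])
noisy = [-1, -1, 1, -1, 1, -1]      # first unit flipped
restored = recall(W, noisy)
print(restored, energy(W, noisy), energy(W, restored))
```

The state settles into a minimum of the energy rather than cycling, which is the restriction mentioned above.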

40
Symmetrically connected networks with hidden
units
  • These are called Boltzmann machines.
  • They are much more powerful models than Hopfield
    nets.
  • They are less powerful than recurrent neural
    networks.
  • They have a beautifully simple learning
    algorithm.

41
Basic models of ANN
42
Learning
  • It is a process by which an NN adapts itself to a
    stimulus by making proper parameter adjustments,
    resulting in the production of desired response
  • Two kinds of learning
  • Parameter learning- connection weights are
    updated
  • Structure Learning- change in network structure

43
Training
  • The process of modifying the weights in the
    connections between network layers with the
    objective of achieving the expected output is
    called training a network.
  • This is achieved through
  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning

44
Classification of learning
  • Supervised learning-
  • Learn to predict an output when given an input
    vector.
  • Unsupervised learning
  • Discover a good internal representation of the
    input.
  • Reinforcement learning
  • Learn to select an action to maximize payoff.

45
Supervised Learning
  • Child learns from a teacher
  • Each input vector requires a corresponding target
    vector.
  • Training pair = (input vector, target vector)

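[Figure: the input X feeds the neural network (weights W), which produces the actual output Y; an error signal generator compares Y with the desired output D and feeds the error (D - Y) signals back to the network]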
46
Two types of supervised learning
  • Each training case consists of an input vector x
    and a target output t.
  • Regression: The target output is a real number or
    a whole vector of real numbers.
  • The price of a stock in 6 months' time.
  • The temperature at noon tomorrow.
  • Classification: The target output is a class
    label.
  • The simplest case is a choice between 1 and 0.
  • We can also have multiple alternative labels.

47
Unsupervised Learning
  • How a fish or tadpole learns
  • All similar input patterns are grouped together
    as clusters.
  • If a matching input pattern is not found a new
    cluster is formed
  • One major aim is to create an internal
    representation of the input that is useful for
    subsequent supervised or reinforcement learning.
  • It provides a compact, low-dimensional
    representation of the input.
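
The "join a matching cluster, otherwise form a new one" behaviour can be sketched with a simple leader-clustering loop. This is an illustrative stand-in rather than a specific network from the slides; the Euclidean distance measure, the `max_distance` parameter and the function name are assumptions.

```python
def leader_cluster(points, max_distance):
    """Assign each point to the nearest existing cluster, or start a new one."""
    centers, labels = [], []
    for p in points:
        best, best_d = None, None
        for k, c in enumerate(centers):
            d = sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5
            if best_d is None or d < best_d:
                best, best_d = k, d
        if best is None or best_d > max_distance:
            centers.append(list(p))          # no match found: new cluster
            labels.append(len(centers) - 1)
        else:
            labels.append(best)              # similar pattern: join the cluster
    return centers, labels

points = [(0, 0), (0.1, 0.2), (5, 5), (5.1, 4.9), (0.2, 0.1)]
centers, labels = leader_cluster(points, max_distance=1.0)
print(labels)
```

No target labels are supplied; the grouping comes only from similarity between inputs.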

48
Self-organizing
  • In unsupervised learning there is no feedback
  • The network must discover patterns, regularities and
    features in the input data, and relations between the
    input data and the output
  • While doing so, the network may change its
    parameters
  • This process is called self-organizing

49
Reinforcement Learning
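[Figure: the input X feeds the neural network (weights W), which produces the actual output Y; an error signal generator receives the reinforcement signal R and sends error signals back to the network]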
50
When is reinforcement learning used?
  • If less information is available about the target
    output values (critic information)
  • Learning based on this critic information is
    called reinforcement learning and the feedback
    sent is called reinforcement signal
  • Feedback in this case is only evaluative and not
    instructive

51
Basic models of ANN
52
Activation Function
  • Identity Function
  • f(x) = x for all x
  • Binary Step function
  • Bipolar Step function
  • Sigmoidal Functions- Continuous functions
  • Ramp functions-

53
Some learning algorithms we will learn are
  • Supervised
  • Adaline, Madaline
  • Perceptron
  • Back Propagation
  • multilayer perceptrons
  • Radial Basis Function Networks
  • Unsupervised
  • Competitive Learning
  • Kohonen self organizing map
  • Learning vector quantization
  • Hebbian learning

54
Neural processing
  • Recall- processing phase for a NN and its
    objective is to retrieve the information. The
    process of computing o for a given x
  • Basic forms of neural information processing
  • Auto association
  • Hetero association
  • Classification

55
Neural processing-Autoassociation
  • Set of patterns can be stored in the network
  • If a pattern similar to a member of the stored
    set is presented, an association with the input
    of closest stored pattern is made

56
Neural Processing- Heteroassociation
  • Associations between pairs of patterns are stored
  • Distorted input pattern may cause correct
    heteroassociation at the output

57
Neural processing-Classification
  • Set of input patterns is divided into a number of
    classes or categories
  • In response to an input pattern from the set, the
    classifier is supposed to recall the information
    regarding class membership of the input pattern.

58
Important terminologies of ANNs
  • Weights
  • Bias
  • Threshold
  • Learning rate
  • Momentum factor
  • Vigilance parameter
  • Notations used in ANN

59
Weights
  • Each neuron is connected to every other neuron by
    means of directed links
  • Links are associated with weights
  • Weights contain information about the input
    signal and are represented as a matrix
  • Weight matrix also called connection matrix

60
Weight matrix
  • W


61
Weights contd
  • wij is the weight from processing element i
    (source node) to processing element j
    (destination node)

62
Activation Functions
  • Used to calculate the output response of a
    neuron.
  • The activation function is applied to the sum of the
    weighted input signals to obtain the response.
  • Activation functions can be linear or non linear
  • Already covered:
  • Identity function
  • Single/binary step function
  • Discrete/continuous sigmoidal function.

63
Bias
  • Bias is like another weight. It's included by
    adding a component x0 = 1 to the input vector X.
  • X = (1, X1, X2, ..., Xi, ..., Xn)
  • Bias is of two types
  • Positive bias increases the net input
  • Negative bias decreases the net input
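
A quick check of the "bias is another weight" idea: prepending a constant 1 to the input and the bias to the weight vector gives the same net input as adding the bias separately. The numeric values and function names are illustrative assumptions.

```python
def net_input(x, w, b):
    """Net input with an explicit bias term."""
    return b + sum(xi * wi for xi, wi in zip(x, w))

def net_input_augmented(x_aug, w_aug):
    """Net input when the bias is treated as the weight on a constant input 1."""
    return sum(xi * wi for xi, wi in zip(x_aug, w_aug))

x = [0.5, -0.5]
w = [2.0, -1.0]
b = 0.5

x_aug = [1.0] + x      # X = (1, X1, ..., Xn)
w_aug = [b] + w        # the bias becomes weight w0

print(net_input(x, w, b), net_input_augmented(x_aug, w_aug))
```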

64
Why is bias required?
  • The relationship between input and output is given
    by the equation of a straight line, y = mx + c

[Figure: input X produces output Y = mX + C, where C is the bias]
65
Threshold
  • Set value based upon which the final output of
    the network may be calculated
  • Used in activation function
  • The activation function using threshold can be
    defined as

66
Learning rate
  • Denoted by α.
  • Used to control the amount of weight adjustment
    at each step of training
  • Learning rate ranging from 0 to 1 determines the
    rate of learning in each time step

67
Other terminologies
  • Momentum factor
  • Convergence is made faster when a momentum factor is
    added to the weight update process.
  • Vigilance parameter
  • Denoted by ρ
  • Used to control the degree of similarity required
    for patterns to be assigned to the same cluster

68
Neural Network Learning rules
c: learning constant
69
Hebbian Learning Rule
FEED FORWARD UNSUPERVISED LEARNING
  • The learning signal is equal to the neuron's
    output

70
Features of Hebbian Learning
  • Feedforward unsupervised learning
  • When an axon of a cell A is near enough to
    excite a cell B and repeatedly and persistently
    takes part in firing it, some growth process or
    change takes place in one or both cells,
    increasing A's efficiency in firing B
  • If oi xj is positive, the result is an increase in
    weight; otherwise the weight decreases

71
(No Transcript)
72
Perceptron Learning rule
  • Learning signal is the difference between the
    desired and actual neuron's response
  • Learning is supervised

73
Example
74
Quiz
  • Suppose we have a 2D input x = (0.5, -0.5) connected
    to a neuron with weights w = (2, -1) and bias b = 0.5.
    Furthermore, the target for x is t = 0. In this case
    we use a binary threshold neuron for the output
    so that
  • y = 1 if x^T w + b > 0 and 0 otherwise
  • What will be the weights and bias after 1
    iteration of the perceptron learning algorithm?
  • w = (1.5, -0.5), b = -1.5
  • w = (1.5, -0.5), b = -0.5
  • w = (2.5, -1.5), b = 0.5
  • w = (-1.5, 0.5), b = 1.5
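
One way to work the quiz is to run a single perceptron update in code. The sketch assumes a learning rate of 1 and treats the bias as a weight on a constant input of 1, which is the common convention but is not stated on the slide; the function name is hypothetical.

```python
def perceptron_step(x, w, b, t, lr=1.0):
    """One update: move the weights by lr * (target - output) * input."""
    net = sum(xi * wi for xi, wi in zip(x, w)) + b
    y = 1 if net > 0 else 0
    err = t - y
    w_new = [wi + lr * err * xi for wi, xi in zip(w, x)]
    b_new = b + lr * err          # bias sees a constant input of 1
    return w_new, b_new, y

w_new, b_new, y = perceptron_step(x=[0.5, -0.5], w=[2.0, -1.0], b=0.5, t=0)
print(y, w_new, b_new)
```

Here the net input is 2.0, so the neuron outputs 1 while the target is 0, and the input vector is subtracted from the weights.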

75
Delta Learning Rule
  • Only valid for continuous activation functions
  • Used in supervised training mode
  • Learning signal for this rule is called delta
  • The aim of the delta rule is to minimize the
    error over all training patterns

76
Delta Learning Rule Contd.
Learning rule is derived from the condition of
least squared error. Calculating the gradient
vector with respect to wi
Minimization of error requires the weight changes
to be in the negative gradient direction
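
The equations on this slide were not transcribed. As a hedged reconstruction of the standard derivation, assuming the squared error of neuron i, a differentiable activation f, and the learning constant c used in these slides:

```latex
E = \frac{1}{2}\,(d_i - o_i)^2
  = \frac{1}{2}\,\bigl[d_i - f(\mathbf{w}_i^T \mathbf{x})\bigr]^2

\nabla E = -(d_i - o_i)\, f'(\mathbf{w}_i^T \mathbf{x})\, \mathbf{x}

\Delta \mathbf{w}_i = -c\,\nabla E
  = c\,(d_i - o_i)\, f'(net_i)\, \mathbf{x},
  \qquad net_i = \mathbf{w}_i^T \mathbf{x}
```

The factor f'(net_i) is why the rule needs a continuous (differentiable) activation function.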
77
Widrow-Hoff learning Rule
  • Also called the least mean square (LMS) learning rule
  • Introduced by Widrow (1962), used in supervised
    learning
  • Independent of the activation function
  • Special case of the delta learning rule wherein the
    activation function is the identity function, i.e.
    f(net) = net
  • Minimizes the squared error between the desired
    output value di and neti
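
A minimal sketch of the rule, assuming the update Δw = c (d - net) x applied one sample at a time. The training data, the learning constant and the number of epochs are made-up values for illustration, and `lms_train` is a hypothetical name.

```python
def lms_train(samples, n_inputs, c=0.1, epochs=200):
    """Widrow-Hoff / LMS: reduce the squared error between d and net = w . x."""
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for x, d in samples:
            net = sum(wi * xi for wi, xi in zip(w, x))
            err = d - net                      # no activation function involved
            w = [wi + c * err * xi for wi, xi in zip(w, x)]
    return w

# Assumed targets generated by d = 2*x1 - 1*x2.
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
w = lms_train(samples, n_inputs=2)
print(w)
```

The learned weights approach the generating coefficients because the error is measured on the net input itself.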

78
Winner-Take-All learning rules
79
Winner-Take-All Learning rule Contd
  • Can be explained for a layer of neurons
  • Example of competitive learning and used for
    unsupervised network training
  • Learning is based on the premise that one of the
    neurons in the layer has a maximum response due
    to the input x
  • This neuron is declared the winner with a weight
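
A sketch of one winner-take-all step, under the commonly stated form of the rule: the winner is the neuron with the largest response w_i . x, and only its weight vector moves toward the input, Δw_m = α (x - w_m). The update formula, the step size α and the starting weights are assumptions, since the slide's equations were not transcribed; picking the winner by dot product also presumes weight vectors of comparable length.

```python
def winner_take_all_step(W, x, alpha=0.5):
    """Update only the winning neuron's weights; return the winner's index."""
    responses = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]
    m = max(range(len(W)), key=lambda i: responses[i])
    W[m] = [w + alpha * (x_j - w) for w, x_j in zip(W[m], x)]
    return m

W = [[1.0, 0.0],
     [0.0, 1.0]]
x = [0.8, 0.6]
winner = winner_take_all_step(W, x)
print(winner, W)
```

Repeating this over many inputs pulls each weight vector toward the inputs it wins, which is how the layer forms clusters without targets.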

80
(No Transcript)
81
Summary of learning rules
82
Linear Separability
  • Separation of the input space into regions is
    based on whether the network response is positive
    or negative
  • Line of separation is called linear-separable
    line.
  • Examples:
  • The AND and OR functions are linearly separable
  • The EXOR function is linearly inseparable
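
To make the distinction tangible, the sketch below searches a small grid of weights and biases for a line w1*x1 + w2*x2 + b = 0 that reproduces each truth table. The grid and the function name are assumptions. Finding no line on a grid is not a proof, but it agrees with the known result that no separating line exists for EXOR.

```python
from itertools import product

def find_separating_line(truth_table, grid):
    """Return (w1, w2, b) that classifies every row correctly, or None."""
    for w1, w2, b in product(grid, repeat=3):
        if all((1 if w1 * x1 + w2 * x2 + b > 0 else 0) == t
               for (x1, x2), t in truth_table.items()):
            return (w1, w2, b)
    return None

GRID = [i / 2 for i in range(-4, 5)]   # -2.0 to 2.0 in steps of 0.5

AND_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
OR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

for name, table in (("AND", AND_TABLE), ("OR", OR_TABLE), ("EXOR", XOR_TABLE)):
    print(name, find_separating_line(table, GRID))
```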

83
Hebb Network
  • Hebb learning rule is the simplest one
  • The learning in the brain is performed by the
    change in the synaptic gap
  • When an axon of cell A is near enough to excite
    cell B and repeatedly keeps firing it, some growth
    process takes place in one or both cells
  • According to Hebb rule, weight vector is found to
    increase proportionately to the product of the
    input and learning signal.

84
Flow chart of Hebb training algorithm
  • Start
  • Initialize weights
  • For each training pair s:t (if none remain, stop)
  • Activate input: xi = si
  • Activate output: y = t
  • Weight update
  • Bias update: b(new) = b(old) + y
  • Stop
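
The flow chart can be sketched in a few lines. The bias update follows the slide; the weight update wi(new) = wi(old) + xi * y is the usual Hebb rule and is an assumption here, since its formula was not transcribed. The bipolar AND training set and the function name are also illustrative choices.

```python
def hebb_train(pairs, n_inputs):
    """One pass of Hebb training over (s, t) pairs."""
    w = [0] * n_inputs                 # initialize weights
    b = 0
    for s, t in pairs:                 # for each s:t
        x = list(s)                    # activate input:  xi = si
        y = t                          # activate output: y = t
        w = [wi + xi * y for wi, xi in zip(w, x)]   # weight update (assumed form)
        b = b + y                      # bias update: b(new) = b(old) + y
    return w, b

# Bipolar AND: inputs and targets take the values +1 / -1.
and_pairs = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b = hebb_train(and_pairs, n_inputs=2)
print(w, b)

def classify(x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

After one pass the learned weights and bias classify all four bipolar AND patterns correctly.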
85
(No Transcript)