Title: Artificial Intelligence Methods
1. Artificial Intelligence Methods
- Neural Networks
- Lecture 3
- Rakesh K. Bissoondeeal
2. Supervised learning in single layer networks
- Learning in perceptron
  - perceptron learning rule
- Learning in Adaline
  - Widrow-Hoff learning rule (delta rule, least mean square)
3. Issue common to single layer networks
- Single layer networks can solve only linearly separable problems
- Linear separability
  - Two categories of patterns are linearly separable if their members can be separated by a single straight line
4. Linearly separable
- Consider a system like AND
  x1  x2  x1 AND x2
   1   1       1
   0   1       0
   1   0       0
   0   0       0
[Figure: the four input points plotted in the (x1, x2) plane, with a decision boundary separating the single point (1, 1) from the other three, e.g. the line x1 + x2 = 1.5.]
5. Linearly inseparable - XOR
- Consider a system like XOR
  x1  x2  x1 XOR x2
   1   1       0
   0   1       1
   1   0       1
   0   0       0
[Figure: the four input points plotted in the (x1, x2) plane; no single straight line separates the points with output 1 from those with output 0.]
6. Single layer perceptron
- A perceptron neuron has the step function as its transfer function
- Output is either 1 or 0
  - 1 when the net input to the transfer function is greater than or equal to 0
  - 0 otherwise, i.e., when the net input is less than 0
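To make this concrete, here is a minimal Python sketch (my own illustration, not from the lecture) of a single perceptron neuron with the step transfer function; the weights (1, 1) and bias -1.5 are hand-picked example values that implement the AND function of slide 4.

# Single perceptron neuron with a step transfer function (illustrative sketch).
def step(net):
    # Step transfer function: 1 if the net input is >= 0, else 0.
    return 1 if net >= 0 else 0

def perceptron(p, w, b):
    # Net input is the weighted sum of the inputs plus the bias.
    net = sum(wi * pi for wi, pi in zip(w, p)) + b
    return step(net)

# Example weights and bias chosen by hand to realise AND;
# the corresponding decision boundary is the line x1 + x2 = 1.5.
w, b = [1.0, 1.0], -1.5
for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(p, perceptron(p, w, b))   # outputs 0, 0, 0, 1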
7. Single layer perceptron
A bias acts as a weight on a connection from a unit whose value is always one. The bias shifts the function f by b units to the left. If the bias is not included, the decision boundary is forced to pass through the origin, and many linearly separable functions would become linearly inseparable. For example, with the step function above and no bias, the input (0, 0) always gives a net input of 0 and hence an output of 1, so even AND could not be represented.
[Figure: a perceptron neuron with inputs x1 and x2, weights w1 and w2, a bias weight b on a constant input of 1, and transfer function f.]
8. Perceptron learning rule
- Supervised learning
  - We have both inputs and outputs
- Let p_i = input i
- a = output of network
- t = target
- E.g. AND function
  x1  x2  x1 AND x2
   1   1       1
   0   1       0
   1   0       0
   0   0       0
- We train the network with the aim that a new (unseen) input similar to an old (seen) pattern will be classified correctly.
9. Perceptron learning rule
- 3 cases to consider
- Case 1
  - an input vector is presented to the network and the output of the network is correct
  - a = t and e = t - a = 0
  - the weights are not changed
10. Perceptron learning rule
- Case 2: If the neuron output is 0 and should have been 1, then
  - a = 0 and t = 1,
  - e = t - a = 1 - 0 = 1
  - the inputs are added to their corresponding weights
- Case 3: If the neuron output is 1 and should have been 0, then
  - a = 1 and t = 0,
  - e = t - a = 0 - 1 = -1
  - the inputs are subtracted from their corresponding weights
11. Perceptron learning rule
- The perceptron learning rule can be more conveniently represented as
  - w(new) = w(old) + LR · e · p   (LR = learning rate)
  - b(new) = b(old) + LR · e
- Convergence
  - The perceptron learning rule will converge to a solution in a finite number of steps if a solution exists. A solution exists for all classification problems that are linearly separable.
12. Perceptron Learning Algorithm
- While epoch produces an error
  - Present network with next inputs from epoch
  - e = t - a
  - If e ≠ 0 then
    - w_j = w_j + LR · p_j · e
    - b_j = b_j + LR · e
  - End If
- End While
Epoch: presentation of the entire training set to the neural network. In the case of the AND function an epoch consists of four sets of inputs being presented to the network (i.e. (0,0), (0,1), (1,0), (1,1)).
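A minimal Python sketch of this algorithm (my own illustration; the function name, the cap on the number of epochs and other details are assumptions), trained here on the AND function used as the running example.

def step(net):
    return 1 if net >= 0 else 0

def train_perceptron(patterns, targets, lr=1.0, max_epochs=100):
    # Perceptron learning rule: repeat whole epochs until an entire
    # epoch produces no error (or the epoch cap is reached).
    w = [0.0] * len(patterns[0])   # initial weights
    b = 0.0                        # initial bias
    for _ in range(max_epochs):
        epoch_error = False
        for p, t in zip(patterns, targets):
            a = step(sum(wj * pj for wj, pj in zip(w, p)) + b)
            e = t - a
            if e != 0:
                w = [wj + lr * e * pj for wj, pj in zip(w, p)]
                b = b + lr * e
                epoch_error = True
        if not epoch_error:
            break
    return w, b

# Train on the AND function: one epoch = the four input pairs below.
patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
print(train_perceptron(patterns, targets, lr=1.0))

Because AND is linearly separable, the loop stops after a small number of epochs with weights and a bias that classify all four patterns correctly.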
13. Example
  x1  x2   t
   2   2   0
   1  -2   1
  -2   2   0
  -1   1   1
- Learning rate = 1
- Initial weights = (0, 0)
- Bias = 0
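As an illustration (not part of the original slides), the first epoch with these settings, assuming the patterns are presented in the listed order and the step transfer function of slide 6 is used:
- p = (2, 2), t = 0: net = 0·2 + 0·2 + 0 = 0, so a = 1 and e = 0 - 1 = -1; the weights become (0, 0) + 1·(-1)·(2, 2) = (-2, -2) and the bias becomes 0 + 1·(-1) = -1
- p = (1, -2), t = 1: net = (-2)·1 + (-2)·(-2) + (-1) = 1, so a = 1 and e = 0; no change
- p = (-2, 2), t = 0: net = (-2)·(-2) + (-2)·2 + (-1) = -1, so a = 0 and e = 0; no change
- p = (-1, 1), t = 1: net = (-2)·(-1) + (-2)·1 + (-1) = -1, so a = 0 and e = 1; the weights become (-3, -1) and the bias becomes 0
Training then continues with further epochs until a whole epoch produces no error.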
14. Adaline
- Adaline = Adaptive Linear Filter
- Similar to the perceptron, but has the identity function (f(x) = x) as transfer function instead of the step function
- Uses the Widrow-Hoff learning rule (delta rule, least mean square - LMS)
- More powerful than the perceptron learning rule
- The rule provides the basis for the backpropagation algorithm, which can learn with many interconnected neurons and layers
15. Adaline
- The LMS learning rule adjusts the weights and biases so as to minimise the mean squared error for each pattern
- It is based on the gradient descent algorithm
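As a brief sketch of where the factor of 2 in the update rule on slide 17 comes from (a standard derivation, added here for completeness): for a single pattern the squared error is E = e^2, and a gradient descent step of size LR on E gives the Widrow-Hoff updates.

E = e^2, \qquad e = t - a, \qquad a = \sum_i w_i p_i + b
\frac{\partial E}{\partial w_i} = -2 e p_i, \qquad \frac{\partial E}{\partial b} = -2 e
w_i(\mathrm{new}) = w_i(\mathrm{old}) - LR \frac{\partial E}{\partial w_i} = w_i(\mathrm{old}) + 2 \cdot LR \cdot e \cdot p_i
b(\mathrm{new}) = b(\mathrm{old}) - LR \frac{\partial E}{\partial b} = b(\mathrm{old}) + 2 \cdot LR \cdot e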
16. Gradient Descent
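A minimal, hypothetical Python sketch (not from the lecture) of the gradient descent idea: repeatedly move a parameter a small step against the gradient of the error, here minimising E(w) = (w - 3)^2.

# Gradient descent on a simple quadratic error E(w) = (w - 3)**2.
def grad(w):
    return 2 * (w - 3)        # dE/dw

w, lr = 0.0, 0.1              # initial guess and learning rate
for _ in range(50):
    w = w - lr * grad(w)      # step in the direction of the negative gradient
print(w)                      # close to 3, the minimiser of E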
17. The ADALINE
- The training algorithm goes through all the training examples a number of times, until a stopping criterion is reached
Step 1: Initialise all weights and set the learning rate
  w_i = small random values, LR = 0.2 (for example)
Step 2: While the stopping condition is false (for example, error > 0.01), update the bias and weights
  b(new) = b(old) + 2 · LR · e
  w_i(new) = w_i(old) + 2 · LR · e · p_i
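A minimal Python sketch of this procedure (my own illustration; the function name, stopping details and the training data are assumptions). The data here is generated from a known linear rule, so the Adaline can fit it exactly and the learned parameters should approach that rule.

def train_adaline(patterns, targets, lr=0.2, max_epochs=1000, tol=1e-4):
    # Widrow-Hoff / LMS rule: linear output a = sum(w_i * p_i) + b, then after
    # each pattern update b <- b + 2*lr*e and w_i <- w_i + 2*lr*e*p_i.
    w = [0.0] * len(patterns[0])
    b = 0.0
    for _ in range(max_epochs):
        sq_error = 0.0
        for p, t in zip(patterns, targets):
            a = sum(wi * pi for wi, pi in zip(w, p)) + b   # identity transfer function
            e = t - a
            sq_error += e * e
            b = b + 2 * lr * e
            w = [wi + 2 * lr * e * pi for wi, pi in zip(w, p)]
        if sq_error / len(patterns) < tol:   # stop once the mean squared error is small
            break
    return w, b

# Hypothetical training data from the linear rule t = 2*x1 - x2 + 0.5.
patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [2 * x1 - x2 + 0.5 for (x1, x2) in patterns]
print(train_adaline(patterns, targets))   # weights roughly (2, -1), bias roughly 0.5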
18. Comparison of the Perceptron and Adaline learning rules
- One corrects a binary error, the other minimises a continuous error
- The perceptron rule converges after a finite number of iterations if the problem is linearly separable; LMS converges asymptotically towards the minimum error, possibly requiring unbounded time
19. Recommended Reading
- Fundamentals of Neural Networks: Architectures, Algorithms and Applications, L. Fausett, 1994.
- Artificial Intelligence: A Modern Approach, S. Russell and P. Norvig, 1995.
- An Introduction to Neural Networks, 2nd Edition, Morton, I.M.