Title: Multilayer Feedforward Networks
Slide 1: Chapter 3 - Multilayer Feedforward Networks
Slide 2: Multilayer Feedforward Network Structure
[Diagram: an N-layer network. Input nodes form Layer 0, hidden nodes form Layers 1 through N-1, output nodes form Layer N, and adjacent layers are joined by connections.]
Slide 3: Multilayer Feedforward Network Structure (cont.)
[Diagram: a feedforward network with input nodes x1, x2, x3 and output node o1.]
Slide 4: Multilayer Feedforward Network Structure (cont.)
Notation: a superscript index denotes the layer number. For example, $w^{(n)}_{j,i}$ is the weight connecting Node j of Layer n to Node i of Layer n-1, and $\theta^{(n)}_j$ is the bias of Node j of Layer n. The output of Node j of Layer n is
$o^{(n)}_j = f\Big(\sum_i w^{(n)}_{j,i}\, o^{(n-1)}_i + \theta^{(n)}_j\Big)$
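As a quick illustration of this notation, here is a minimal Matlab sketch (not from the slides; all weights and inputs are made-up values) of the forward computation through one layer:

f = @(s) double(s >= 0);             % example activation: a step function
o_prev = [1; 0; 1];                  % hypothetical outputs of Layer n-1
W = [0.5 -1.0 0.2; 1.0 1.0 -0.5];    % hypothetical weights, W(j,i) = w^(n)_{j,i}
theta = [-0.1; 0.3];                 % hypothetical biases of Layer n
s = W*o_prev + theta;                % weighted input sum at each Node j
o = f(s)                             % outputs o_j^(n) of Layer n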
Slide 5: Multilayer Perceptron: How it works
The XOR function:

x1  x2 | y
 0   0 | 0
 0   1 | 1
 1   0 | 1
 1   1 | 0

[Diagram: a 2-layer network. Inputs x1 and x2 feed the Layer-1 nodes y1 and y2, which feed the Layer-2 output node o; f( ) is a step function.]
Slide 6: Multilayer Perceptron: How it works (cont.)
Outputs at Layer 1:

x1  x2 | y1  y2
 0   0 |  0   0
 0   1 |  1   0
 1   0 |  1   0
 1   1 |  1   1
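These tables can be reproduced with a short Matlab sketch. The weights below are hypothetical values chosen so that y1 acts like OR, y2 like AND, and the output computes y1 AND NOT y2 (= XOR); the slides do not give the actual weights:

step = @(s) double(s > 0);           % f( ): step function
X = [0 0; 0 1; 1 0; 1 1];            % all four XOR input patterns
y1 = step(X(:,1) + X(:,2) - 0.5);    % Layer-1 node 1 (acts like OR)
y2 = step(X(:,1) + X(:,2) - 1.5);    % Layer-1 node 2 (acts like AND)
o = step(y1 - y2 - 0.5);             % Layer 2: y1 AND NOT y2 = XOR
disp([X y1 y2 o])                    % columns: x1 x2 y1 y2 o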
Slide 7: Multilayer Perceptron: How it works (cont.)
Inside Layer 1: [Plot of the Layer-1 outputs in the y1-y2 plane: (0,0) and (1,1) belong to class 0, while both class-1 patterns map to (1,0). Linearly separable!]
Slide 8: Multilayer Perceptron: How it works (cont.)
Inside the output layer: [Diagram: the output node o takes y1 and y2 as its inputs.]
In the y1-y2 space the patterns are linearly separable, so a single line (L3) can separate class 0 from class 1.
Slide 9: Multilayer Perceptron: How it works (cont.)
- Why hidden layers? Each hidden layer transforms the data from the previous layer so that, by the time they reach the output layer, the patterns are linearly separable; the output layer then only has to solve a linearly separable problem.
- For data that are not linearly separable, one hidden layer (with enough nodes) is sufficient to transform them into linearly separable data.
- The activation function of each layer must be nonlinear; a thresholding function is one example. With linear activations, stacked layers collapse into a single layer, as the sketch below illustrates.
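A minimal Matlab sketch of that last point (illustrative values, not from the slides): with purely linear activations, two layers are equivalent to one.

W1 = [1 -1; 0.5 2]; W2 = [1 1];      % hypothetical weights of two linear layers
x = [0.3; 0.7];
disp(W2*(W1*x))                      % output of the two-layer linear network
disp((W2*W1)*x)                      % identical output from a single linear layer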
Slide 10: Backpropagation Algorithm
2-Layer case.
[Diagram: a 2-layer network with an input layer (Layer 0), a hidden layer (Layer 1), and an output layer (Layer 2).]
Slide 11: Backpropagation Algorithm (cont.)
2-Layer case. Let $e_k = d_k - o_k$ be the error at output Node k and $e^2 = \sum_k e_k^2$. The derivative of $e^2$ with respect to $w^{(2)}_{k,j}$ is
$\frac{\partial e^2}{\partial w^{(2)}_{k,j}} = -2\, e_k\, f'(s^{(2)}_k)\, o^{(1)}_j$
and the derivative of $e^2$ with respect to $\theta^{(2)}_k$ is
$\frac{\partial e^2}{\partial \theta^{(2)}_k} = -2\, e_k\, f'(s^{(2)}_k)$
where $s^{(2)}_k$ is the weighted input sum at Node k.
Slide 12: Backpropagation Algorithm (cont.)
2-Layer case. The derivative of $e^2$ with respect to $w^{(1)}_{j,i}$ is
$\frac{\partial e^2}{\partial w^{(1)}_{j,i}} = -2 \Big[\sum_k e_k\, f'(s^{(2)}_k)\, w^{(2)}_{k,j}\Big] f'(s^{(1)}_j)\, x_i$
Slide 13: Backpropagation Algorithm (cont.)
Consider the derivative of $e^2$ with respect to $w^{(1)}_{j,i}$, the weight connecting Node j of the current layer (Layer 1) to Node i of the lower layer (Layer 0). Its factors are:
- $w^{(2)}_{k,j}$: weight between upper Node k and Node j of the current layer
- $e_k$: error from upper Node k
- $f'(s^{(2)}_k)$: derivative of upper Node k
- $x_i$: input from lower Node i
- $f'(s^{(1)}_j)$: derivative of Node j of the current layer
The bracketed sum $\sum_k e_k\, f'(s^{(2)}_k)\, w^{(2)}_{k,j}$ is the error propagated back (back propagation) to Node j of the current layer.
Slide 14: Backpropagation Algorithm (cont.)
Summary. The derivative of $e^2$ with respect to $w^{(2)}_{k,j}$ is
$\frac{\partial e^2}{\partial w^{(2)}_{k,j}} = -2\, e_k\, f'(s^{(2)}_k)\, o^{(1)}_j$
(error at current node, derivative of current node, input from lower node), and the derivative of $e^2$ with respect to $w^{(1)}_{j,i}$ is
$\frac{\partial e^2}{\partial w^{(1)}_{j,i}} = -2\, e^{(1)}_j\, f'(s^{(1)}_j)\, x_i, \qquad e^{(1)}_j = \sum_k e_k\, f'(s^{(2)}_k)\, w^{(2)}_{k,j}$
Notice that the derivatives of the error with respect to the weights of the two layers have the same form: the product of an error term, a derivative term, and an input term.
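A minimal Matlab sketch of these two gradients for a tiny 2-layer network (all weights are hypothetical; f is the logistic function, so f' = f(1-f)):

f = @(s) 1./(1+exp(-s)); fp = @(s) f(s).*(1-f(s));
x = [1; 0]; d = 1;                       % input pattern and desired output
W1 = [0.1 0.4; -0.3 0.2]; th1 = [0.05; -0.1];   % Layer 1 (hidden)
W2 = [0.7 -0.5];          th2 = 0.2;            % Layer 2 (output)
s1 = W1*x + th1;  o1 = f(s1);            % forward pass, hidden layer
s2 = W2*o1 + th2; o2 = f(s2);            % forward pass, output layer
e = d - o2;                              % error at the output node
dW2 = -2*(e.*fp(s2))*o1';                % error x derivative x input (Layer 2)
e1 = W2'*(e.*fp(s2));                    % error back-propagated to Layer 1
dW1 = -2*(e1.*fp(s1))*x';                % error x derivative x input (Layer 1)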
Slide 15: Backpropagation Algorithm (cont.)
General case. The derivative of $e^2$ with respect to the weight $w^{(n)}_{j,i}$ connecting Node j in Layer n (the current layer) to Node i in Layer n-1 (the lower layer) is
$\frac{\partial e^2}{\partial w^{(n)}_{j,i}} = -2\, e^{(n)}_j\, f'(s^{(n)}_j)\, o^{(n-1)}_i$
where $o^{(n-1)}_i$ is the input from Node i of the lower layer, $s^{(n)}_j$ is the weighted input sum at Node j, $e^{(n)}_j$ is the error at Node j of Layer n, and $f'(s^{(n)}_j)$ is the derivative of Node j.
Slide 16: Backpropagation Algorithm (cont.)
General case. The derivative of $e^2$ with respect to the bias $\theta^{(n)}_j$ of Node j in Layer n (the current layer) is
$\frac{\partial e^2}{\partial \theta^{(n)}_j} = -2\, e^{(n)}_j\, f'(s^{(n)}_j)$
where $s^{(n)}_j$ is the weighted input sum at Node j, $e^{(n)}_j$ is the error at Node j of Layer n, and $f'(s^{(n)}_j)$ is the derivative of Node j.
Slide 17: Backpropagation Algorithm (cont.)
General case. The error at Node j of Layer n is defined recursively from the layer above,
$e^{(n)}_j = \sum_k e^{(n+1)}_k\, f'(s^{(n+1)}_k)\, w^{(n+1)}_{k,j}$
while at the output layer (Layer N) the error at Node k is simply $e^{(N)}_k = d_k - o_k$.
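In code, this recursion is one backward sweep. A Matlab sketch of the general case, assuming cell arrays W{n} and th{n} hold the weights and biases of Layer n, and f, fp, x, d, N are already defined:

o = cell(1,N+1); s = cell(1,N); e = cell(1,N);
o{1} = x;                            % o{1} holds the Layer-0 outputs (inputs)
for n = 1:N                          % forward pass, storing sums and outputs
    s{n} = W{n}*o{n} + th{n};
    o{n+1} = f(s{n});
end
e{N} = d - o{N+1};                   % error at the output layer (Layer N)
for n = N-1:-1:1                     % propagate the error backwards
    e{n} = W{n+1}'*(e{n+1}.*fp(s{n+1}));
end
for n = 1:N                          % gradients: error x derivative x input
    dW{n} = -2*(e{n}.*fp(s{n}))*o{n}';
    dth{n} = -2*(e{n}.*fp(s{n}));
end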
Slide 18: Updating Weights: Gradient Descent Method
Updating weights and bias:
$w(t+1) = w(t) - \eta\,\frac{\partial E}{\partial w}, \qquad \theta(t+1) = \theta(t) - \eta\,\frac{\partial E}{\partial \theta}$
where $\eta$ is the learning rate.
Slide 19: Updating Weights: Gradient Descent with Momentum Method
$\Delta w(t) = -\eta\,\frac{\partial E}{\partial w} + \beta\,\Delta w(t-1), \qquad 0 < \beta < 1$
where $\beta\,\Delta w(t-1)$ is the momentum term. For a constant gradient, $\Delta w$ will converge to
$\Delta w = -\frac{\eta}{1-\beta}\,\frac{\partial E}{\partial w}$
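A toy Matlab sketch of the momentum update, minimizing E(w) = w^2 (a made-up one-dimensional example, not from the slides):

eta = 0.1; beta = 0.9;               % learning rate and momentum, 0 < beta < 1
w = 5; dw = 0;
for t = 1:100
    grad = 2*w;                      % dE/dw for E = w^2
    dw = -eta*grad + beta*dw;        % momentum term reuses the previous step
    w = w + dw;
end
disp(w)                              % w has moved close to the minimum at 0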
Slide 20: Updating Weights: Newton's Method
From the Taylor series,
$E(w + \Delta w) \approx E(w) + g^{T}\Delta w + \tfrac{1}{2}\,\Delta w^{T} H\,\Delta w$
where H is the Hessian matrix, $g = \nabla E$ is the gradient, and E is the error function ($e^2$). Minimizing the right-hand side with respect to $\Delta w$ gives
$\Delta w = -H^{-1} g$, or $w(t+1) = w(t) - H^{-1} g$ (Newton's method).
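For a quadratic error function the Newton update reaches the minimum in a single step, as this toy Matlab sketch shows (A and b are made-up values):

A = [3 1; 1 2]; b = [1; -1];         % E(w) = w'*A*w/2 - b'*w
w = [0; 0];
g = A*w - b;                         % gradient of E at w
H = A;                               % Hessian of E
w = w - H\g;                         % Newton step: dw = -inv(H)*g
disp(w)                              % equals A\b, the exact minimizer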
Slide 21: Updating Weights: Newton's Method (cont.)
Advantages:
- Fast (quadratic convergence).
Disadvantages:
- Computationally expensive (requires the computation of an inverse matrix at each iteration).
- The Hessian matrix is difficult to compute.
Slide 22: Levenberg-Marquardt Backpropagation Method
$\Delta w = -\big(J^{T} J + \mu I\big)^{-1} J^{T} e$
where J is the Jacobian matrix, e is the vector of all errors, I is the identity matrix, and $\mu$ is the learning rate.
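A sketch of one Levenberg-Marquardt step in Matlab, assuming J, err (the error vector), mu, and the weight vector w are already available:

dw = -(J'*J + mu*eye(size(J,2))) \ (J'*err);   % the update above
w = w + dw;

For large mu this behaves like gradient descent with a small step; for small mu it approaches the (approximate) Newton step.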
Slide 23: Example: Application of MLP for classification
Matlab command: create the training data. The input patterns x1 and x2 are generated from random numbers; the desired output o is 1 if (x1, x2) lies in a circle of radius 1 centered at the origin, and 0 otherwise.

x = randn(2,200);                 % input patterns
o = (x(1,:).^2 + x(2,:).^2) < 1;  % desired outputs

[Scatter plot of the training data in the x1-x2 plane.]
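One way to reproduce the scatter plot (not part of the original commands):

plot(x(1,o==1), x(2,o==1), 'r.', x(1,o==0), x(2,o==0), 'b.')
xlabel('x1'), ylabel('x2'), axis equal   % class 1 inside the circle, class 0 outside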
Slide 24: Example: Application of MLP for classification (cont.)
Matlab command: create a 2-layer network.

PR = [min(x(1,:)) max(x(1,:)); min(x(2,:)) max(x(2,:))];  % range of inputs
S1 = 10; S2 = 1;                 % no. of nodes in Layers 1 and 2
TF1 = 'logsig'; TF2 = 'logsig';  % activation functions of Layers 1 and 2
BTF = 'traingd';                 % training function
BLF = 'learngd';                 % learning function
PF = 'mse';                      % cost function
net = newff(PR,[S1 S2],{TF1 TF2},BTF,BLF,PF);  % create the network
Slide 25: Example: Application of MLP for classification (cont.)
Matlab command: train the network.

net.trainParam.epochs = 2000;  % no. of training rounds
net.trainParam.goal = 0.002;   % maximum desired error
net = train(net,x,o);          % training command
y = sim(net,x);                % compute network outputs (continuous)
netout = y > 0.5;              % convert to binary outputs
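The classification errors quoted on the following slides can be counted with one extra line (not part of the original commands):

err = sum(netout ~= o);                        % no. of misclassified patterns
fprintf('Classification Error: %d/%d\n', err, length(o))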
Slide 26: Example: Application of MLP for classification (cont.)
Network structure: [Diagram: input nodes x1 and x2; 10 hidden nodes (sigmoid); one output node (sigmoid) followed by a threshold unit for binary output.]
Slide 27: Example: Application of MLP for classification (cont.)
Initial weights of the hidden-layer nodes (10 nodes), displayed as the lines $w_1 x_1 + w_2 x_2 + \theta = 0$.
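These lines can be drawn from the network object directly; a sketch assuming the toolbox stores the hidden-layer weights in net.IW{1,1} and biases in net.b{1}:

W = net.IW{1,1}; th = net.b{1};
x1 = linspace(min(x(1,:)), max(x(1,:)), 2);
hold on
for j = 1:size(W,1)                          % one line per hidden node
    plot(x1, -(W(j,1)*x1 + th(j))/W(j,2))    % solve w1*x1 + w2*x2 + theta = 0 for x2
end                                          % (assumes W(j,2) is nonzero)
hold off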
Slide 28: Example: Application of MLP for classification (cont.)
Training algorithm: gradient descent method.
[Plot: MSE vs. training epochs.]
Slide 29: Example: Application of MLP for classification (cont.)
Results obtained using the gradient descent method.
Classification error: 40/200.
Slide 30: Example: Application of MLP for classification (cont.)
Training algorithm: gradient descent with momentum method.
[Plot: MSE vs. training epochs.]
Slide 31: Example: Application of MLP for classification (cont.)
Results obtained using the gradient descent with momentum method.
Classification error: 40/200.
Slide 32: Example: Application of MLP for classification (cont.)
Training algorithm: Levenberg-Marquardt backpropagation.
[Plot: MSE vs. training epochs; the goal is reached within only 10 epochs!]
Slide 33: Example: Application of MLP for classification (cont.)
Results obtained using Levenberg-Marquardt backpropagation. In the resulting network some hidden nodes are unused; only 6 hidden nodes are adequate!
Classification error: 0/200.
Slide 34: Example: Application of MLP for classification (cont.)
Remarks:
- In classification, each hidden node forms part of the boundary between the classes, i.e., a local segment of the decision boundary.
- The output layer combines the outputs of all the hidden nodes to form the global boundary that separates the classes.
Slide 35: Example: Application of MLP for function approximation
Function to be approximated:

x = [0:0.01:4];
y = (sin(2*pi*x)+1).*exp(-x.^2);
Slide 36: Example: Application of MLP for function approximation (cont.)
Matlab command: create a 2-layer network.

PR = [min(x) max(x)];             % range of inputs
S1 = 6; S2 = 1;                   % no. of nodes in Layers 1 and 2
TF1 = 'logsig'; TF2 = 'purelin';  % activation functions of Layers 1 and 2
BTF = 'trainlm';                  % training function
BLF = 'learngd';                  % learning function
PF = 'mse';                       % cost function
net = newff(PR,[S1 S2],{TF1 TF2},BTF,BLF,PF);  % create the network
Slide 37: Example: Application of MLP for function approximation (cont.)
Network structure: [Diagram: one input node x; 6 hidden nodes (sigmoid); one linear output node producing y.]
Slide 38: Example: Application of MLP for function approximation (cont.)
Initial weights of the hidden nodes, displayed in terms of the activation function of each node (sigmoid function).
Slide 39: Example: Application of MLP for function approximation (cont.)
Final weights of the hidden nodes after training, displayed in terms of the activation function of each node (sigmoid function).
Slide 40: Example: Application of MLP for function approximation (cont.)
The weighted summation of all outputs from the first-layer nodes yields the function approximation.
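This summation can be reproduced explicitly from the trained network; a sketch assuming the toolbox fields net.IW, net.LW, and net.b:

h = logsig(net.IW{1,1}*x + net.b{1}*ones(1,length(x)));  % hidden-node outputs
ya = net.LW{2,1}*h + net.b{2};                           % weighted summation
plot(x, y, 'b', x, ya, 'r--')                            % target vs. approximation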
Slide 41: Example: Application of MLP for function approximation (cont.)
Matlab command: create a 2-layer network, this time with only 3 hidden nodes.

PR = [min(x) max(x)];             % range of inputs
S1 = 3; S2 = 1;                   % no. of nodes in Layers 1 and 2
TF1 = 'logsig'; TF2 = 'purelin';  % activation functions of Layers 1 and 2
BTF = 'trainlm';                  % training function
BLF = 'learngd';                  % learning function
PF = 'mse';                       % cost function
net = newff(PR,[S1 S2],{TF1 TF2},BTF,BLF,PF);  % create the network
Slide 42: Example: Application of MLP for function approximation (cont.)
Network structure. The number of hidden nodes is too small!
[Plot: function approximated using the network.]
Slide 43: Example: Application of MLP for function approximation (cont.)
Matlab command: create a 2-layer network with radial basis activation in Layer 1.

PR = [min(x) max(x)];             % range of inputs
S1 = 5; S2 = 1;                   % no. of nodes in Layers 1 and 2
TF1 = 'radbas'; TF2 = 'purelin';  % activation functions of Layers 1 and 2
BTF = 'trainlm';                  % training function
BLF = 'learngd';                  % learning function
PF = 'mse';                       % cost function
net = newff(PR,[S1 S2],{TF1 TF2},BTF,BLF,PF);  % create the network
Slide 44: Example: Application of MLP for function approximation (cont.)
Initial weights of the hidden nodes, displayed in terms of the activation function of each node (radial basis function).
Slide 45: Example: Application of MLP for function approximation (cont.)
Final weights of the hidden nodes after training, displayed in terms of the activation function of each node (radial basis function).
Slide 46: Example: Application of MLP for function approximation (cont.)
[Plot: function approximated using the network.]
Slide 47: Example: Application of MLP for function approximation (cont.)
Remarks:
- In function approximation, each hidden node approximates the function locally; each node is active only over a certain range of the input.
- Combining the outputs of all the hidden nodes yields the global function over the entire input range.