Title: Connectionist Machine Learning IIa
Connectionist Machine Learning IIa
- Basics
- Backpropagation Algorithm
- Momentum
- Summary
Basics
In contrast to perceptrons, multilayer networks
can learn multiple decision boundaries. In
addition, the boundaries may be nonlinear.
[Figure: a multilayer network with input nodes, internal nodes, and output nodes]
Example
[Figure: a dataset in the (x1, x2) plane that requires multiple nonlinear decision boundaries]
One Single Unit
To make nonlinear partitions of the input space, we need to define each unit as a nonlinear function (unlike the perceptron). One solution is to use the sigmoid unit.
[Figure: a sigmoid unit with inputs x1, …, xn and weights w0, w1, …, wn; a summation node computes net = Σ_i w_i x_i + w0 and the unit outputs O = σ(net) = 1 / (1 + e^(−net))]
One Single Unit
The sigmoid or squashing function.
[Figure: the S-shaped curve of σ(net) = 1 / (1 + e^(−net)) plotted against net]
More Precisely
O(x1, x2, …, xn) = σ(W · X)
where σ(W · X) = 1 / (1 + e^(−W · X))
Function σ is called the sigmoid or logistic function. It has the following property:
dσ(y)/dy = σ(y) (1 − σ(y))
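As a concrete illustration of the sigmoid unit and the derivative property above, here is a minimal NumPy sketch (the function names are mine, not from the slides):

```python
import numpy as np

def sigmoid(net):
    """Logistic (sigmoid) function: 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + np.exp(-net))

def sigmoid_derivative(net):
    """d sigma(y)/dy = sigma(y) * (1 - sigma(y))."""
    s = sigmoid(net)
    return s * (1.0 - s)

# Output of a single sigmoid unit on input x with weights w and bias w0:
x = np.array([0.5, -1.2])
w = np.array([0.8, 0.3])
w0 = 0.1
o = sigmoid(w @ x + w0)
```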
Connectionist Machine Learning IIa
- Basics
- Backpropagation Algorithm
- Momentum
- Summary
Many weights need adjustment
Multilayer networks need many weights to be
adjusted
[Figure: a multilayer network; every link between input, internal, and output nodes carries an adjustable weight]
Backpropagation Algorithm
Goal: to learn the weights for all links in an interconnected multilayer network. We begin by defining our measure of error:
E(W) = ½ Σ_d Σ_k (t_kd − o_kd)²
where k varies over the output nodes and d over the training examples. The idea is again to use gradient descent over the space of weights to find a global minimum.
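A small sketch of how E(W) could be computed, assuming targets and outputs are stored as arrays with one row per training example:

```python
import numpy as np

def network_error(targets, outputs):
    """E(W) = 1/2 * sum over examples d and output nodes k of (t_kd - o_kd)^2."""
    return 0.5 * np.sum((targets - outputs) ** 2)

# Example with two training examples and two output nodes:
targets = np.array([[1.0, 0.0], [0.0, 1.0]])
outputs = np.array([[0.8, 0.1], [0.3, 0.7]])
err = network_error(targets, outputs)  # 0.5 * (0.04 + 0.01 + 0.09 + 0.09) = 0.115
```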
Output Nodes
[Figure: a multilayer network with its output nodes highlighted]
Algorithm
The idea is again to use gradient descent over the space of weights to find a global minimum (no guarantee).
- Create a network with n_in input nodes, n_hidden internal nodes, and n_out output nodes.
- Initialize all weights to small random numbers (see the sketch below).
- Until the error is small, do:
  - For each example X, do:
    - Propagate example X forward through the network.
    - Propagate errors backward through the network.
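A minimal sketch of the creation and initialization steps for a 2-3-1 network; the layer sizes, the [-0.05, 0.05] range, and the bias-in-last-column layout are my own assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
n_in, n_hidden, n_out = 2, 3, 1

# Initialize all weights to small random numbers; the extra column holds
# each node's bias weight w0.
W_hidden = rng.uniform(-0.05, 0.05, size=(n_hidden, n_in + 1))
W_out = rng.uniform(-0.05, 0.05, size=(n_out, n_hidden + 1))
```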
Propagating Forward
Given example X, compute the output of every node until we reach the output nodes (a sketch follows).
[Figure: example X enters at the input nodes; each internal and output node computes the sigmoid function of its weighted inputs]
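A sketch of the forward pass through one hidden layer, under the same layout assumptions as the initialization sketch above:

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def forward(x, W_hidden, W_out):
    """Compute every node's output, layer by layer, for one example x."""
    x_b = np.append(x, 1.0)             # constant input 1 for the bias weight w0
    hidden = sigmoid(W_hidden @ x_b)    # outputs of the internal nodes
    hidden_b = np.append(hidden, 1.0)
    output = sigmoid(W_out @ hidden_b)  # outputs of the output nodes
    return hidden_b, output

# Push one input through the randomly initialized 2-3-1 network from above:
rng = np.random.default_rng(42)
W_hidden = rng.uniform(-0.05, 0.05, size=(3, 3))
W_out = rng.uniform(-0.05, 0.05, size=(1, 4))
hidden_b, output = forward(np.array([0.5, -1.2]), W_hidden, W_out)
```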
Error: Output Nodes
[Figure: at each output node, the estimation o_k is compared with the value t_k of the target function]
Propagating Error Backward
- For each output node k, compute the error:
  δ_k = O_k (1 − O_k)(t_k − O_k)
- Update each network weight (as sketched below):
  W_ji ← W_ji + ΔW_ji
  where ΔW_ji = η δ_j x_ji (x_ji and W_ji denote the input and the weight from node i to node j).
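Continuing the forward-pass sketch, the output-node error and weight update could look like this; the numeric values and the learning rate η = 0.1 are illustrative assumptions:

```python
import numpy as np

# Suppose the forward pass produced these values:
hidden_b = np.array([0.6, 0.4, 0.7, 1.0])  # hidden outputs plus bias input
output = np.array([0.8])                   # output-node activations O_k
target = np.array([1.0])                   # target values t_k
W_out = np.zeros((1, 4))

# delta_k = O_k (1 - O_k)(t_k - O_k)
delta_out = output * (1.0 - output) * (target - output)

# W_ji <- W_ji + eta * delta_j * x_ji
eta = 0.1
W_out += eta * np.outer(delta_out, hidden_b)
```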
Error: Intermediate Nodes
[Figure: the error estimate at an intermediate node is assembled from the errors of the output nodes it feeds into]
Propagating Error Backward
- For each hidden unit h, calculate the error:
  δ_h = O_h (1 − O_h) Σ_k W_kh δ_k
- Update each network weight (a full sketch of the backward pass follows):
  W_ji ← W_ji + ΔW_ji
  where ΔW_ji = η δ_j x_ji (x_ji and W_ji denote the input and the weight from node i to node j).
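Putting both backward-pass rules together, a complete single-example training step might look like the following sketch; the bias-in-last-column layout and the value of eta remain assumptions:

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def backprop_step(x, target, W_hidden, W_out, eta=0.1):
    """One forward pass plus one backward pass for a single example."""
    # Forward: compute every node's output.
    x_b = np.append(x, 1.0)
    hidden = sigmoid(W_hidden @ x_b)
    hidden_b = np.append(hidden, 1.0)
    output = sigmoid(W_out @ hidden_b)

    # Output nodes: delta_k = O_k (1 - O_k)(t_k - O_k)
    delta_out = output * (1.0 - output) * (target - output)

    # Hidden units: delta_h = O_h (1 - O_h) * sum_k W_kh delta_k
    # (the bias column of W_out is dropped; no hidden node feeds it)
    delta_hidden = hidden * (1.0 - hidden) * (W_out[:, :-1].T @ delta_out)

    # Weight updates: W_ji <- W_ji + eta * delta_j * x_ji
    W_out += eta * np.outer(delta_out, hidden_b)
    W_hidden += eta * np.outer(delta_hidden, x_b)
    return W_hidden, W_out

# One training step on the 2-3-1 network from the earlier sketches:
rng = np.random.default_rng(42)
W_hidden = rng.uniform(-0.05, 0.05, size=(3, 3))
W_out = rng.uniform(-0.05, 0.05, size=(1, 4))
W_hidden, W_out = backprop_step(np.array([0.5, -1.2]), np.array([1.0]),
                                W_hidden, W_out)
```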
Connectionist Machine Learning IIa
- Basics
- Backpropagation Algorithm
- Momentum
- Summary
Adding Momentum
- The weight update rule can be modified so as to depend on the previous iteration. At iteration n we have the following (see the sketch below):
  ΔW_ji(n) = η δ_j x_ji + α ΔW_ji(n − 1)
- where α (0 < α < 1) is a constant called the momentum.
- It increases the speed along a local minimum.
- It increases the speed along flat regions.
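A sketch of the momentum-modified update for one weight matrix, reusing names from the backprop_step sketch; eta, alpha, and the example values are assumptions:

```python
import numpy as np

eta, alpha = 0.1, 0.9

delta_out = np.array([0.032])              # delta_j for the output node
hidden_b = np.array([0.6, 0.4, 0.7, 1.0])  # inputs x_ji to that node
W_out = np.zeros((1, 4))
prev_delta_W_out = np.zeros_like(W_out)    # Delta W_ji(n - 1)

# Delta W_ji(n) = eta * delta_j * x_ji + alpha * Delta W_ji(n - 1)
delta_W_out = eta * np.outer(delta_out, hidden_b) + alpha * prev_delta_W_out
W_out += delta_W_out
prev_delta_W_out = delta_W_out             # remembered for iteration n + 1
```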
Adding Momentum
[Figure: the error surface E(W) plotted over weight W, with a flat region labeled "Flat region: where do we go?"]
Remarks on Backpropagation
1. It implements a gradient descent search over the weight space.
2. It may become trapped in local minima.
3. In practice, it is very effective.
4. How to avoid local minima?
  - Add momentum.
  - Use stochastic gradient descent.
  - Train different networks with different initial values for the weights.
Representational Power
- Boolean functions: every Boolean function can be represented with a network having two layers of units.
- Continuous functions: every bounded continuous function can be approximated with a network having two layers of units.
- Arbitrary functions: any arbitrary function can be approximated with a network having three layers of units.
Connectionist Machine Learning IIa
- Basics
- Backpropagation Algorithm
- Momentum
- Summary
Summary
- In multilayer neural networks, the output of each node is a sigmoid or squashing function.
- In propagating error backward, intermediate nodes compute a weighted sum of the error factors of the output nodes.
- Momentum helps increase the speed along a local minimum and along flat regions.
- Any arbitrary function can be approximated with a network having three layers of units.