Title: CENG 569 Spring 2006 NEUROCOMPUTING
1 CENG 569 Spring 2006 NEUROCOMPUTING
Erol Sahin
Dept. of Computer Engineering
Middle East Technical University
Inonu Bulvari, 06531, Ankara, TURKEY
- Week 1
- Introduction to neural nets: the beginnings and the basics
- Course objectives
2 What's our motivation?
- Science: Model how biological neural systems, like the human brain, work.
- How do we see?
- How is information stored in/retrieved from memory?
- How do you learn not to touch fire?
- How do your eyes adapt to the amount of light in the environment?
- Related fields: Neuroscience, Computational Neuroscience, Psychology, Psychophysiology, Cognitive Science, Medicine, Math, Physics.
3 What's our motivation?
- Engineering: Design information processing systems that are as good as the biological ones.
- How can we design a vision system that can see?
- How can we store data such that it can be retrieved fast?
- How can we make a robot not repeat an action that burned it before?
- How can we make an artificial retina that automatically adapts to the amount of light?
- Related fields: Computer Science, Statistics, Electronics Engineering, Mechanical Engineering.
4 The biological neuron - simplified
- The basic information processing element of neural systems. The neuron
- receives input signals generated by other neurons through its dendrites,
- integrates these signals in its body,
- then generates its own signal (a series of electric pulses) that travels along the axon, which in turn makes contact with the dendrites of other neurons.
- The points of contact between neurons are called synapses.
5 The biological neuron - 1
- The pulses generated by the neuron travel along the axon as an electrical wave.
- Once these pulses reach the synapses at the end of the axon, they open chemical vesicles that excite the other neuron.
6 The biological neuron - 2
7 What is this course about? - 1
- Introduce and review computational models that are often classified as neural networks.
- The first part of the course will cover
- Perceptron, Adaline
- Multi-layer perceptrons, and the back-propagation learning algorithm
- Hopfield model, Boltzmann machine
- Unsupervised learning models
- Kohonen's self-organizing maps
- Radial basis functions
- Adaptive Resonance Theory models
- Support vector machines
- Properties and applications of these models will be reviewed.
- Several programming projects will be given.
8 What is this course about? - 2
- In the second part of the course, we will cover more biologically plausible models of neural circuits
- Hodgkin-Huxley model of the biological neuron
- Feed-forward shunting networks and their properties
- Recurrent shunting networks and their properties
- Classical and operant conditioning and their neural models
9 Today's topics
- Brief history
- McCulloch-Pitts neuron
- Perceptron
- Adaline
10 Brief History
- Old Ages
- Association (William James 1890)
- McCulloch-Pitts Neuron (1943, 1947)
- Perceptrons (Rosenblatt 1958, 1962)
- Adaline/LMS (Widrow and Hoff 1960)
- Perceptrons book (Minsky and Papert 1969)
- Dark Ages
- Self-organization in visual cortex (von der Malsburg 1973)
- Backpropagation (Werbos 1974)
- Foundations of Adaptive Resonance Theory (Grossberg 1976)
- Neural Theory of Association (Amari 1977)
11 Brief History (cont'd)
- Modern Ages
- Adaptive Resonance Theory (Grossberg 1980)
- Hopfield model (Hopfield 1982, 1984)
- Self-organizing maps (Kohonen 1982)
- Reinforcement learning (Sutton and Barto 1983)
- Simulated Annealing (Kirkpatrick et al. 1983)
- Boltzmann machines (Ackley, Hinton, Sejnowski 1985)
- Backpropagation (Rumelhart, Hinton, Williams 1986)
- ART networks (Carpenter, Grossberg 1992)
- Support Vector Machines
12 William James
- William James (1890): Association
- "Mental facts cannot properly be studied apart from the physical environment of which they take cognizance."
- Principle of association: "When two brain processes are active together or in immediate succession, one of them, on reoccurring, tends to propagate its excitement into the other."
13 McCulloch-Pitts neuron model (1943)
- f() is called the activation function.
- θ is also called the bias.
- x1, x2, ..., xN are the inputs at time t-1.
- Inputs are binary.
- At each interval the neuron can fire at most once.
- Positive weights (wi) correspond to excitatory synapses and negative weights correspond to inhibitory synapses.
- θ is the threshold for the neuron to fire.
- f() is a non-linear hard-limiting function in the original McCulloch-Pitts model.
- No learning mechanism!
- Later f() was replaced by other continuous squashing functions.
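In symbols, the unit computes y(t) = f(Σi wi xi(t-1) - θ), with f a hard-limiting step function. A minimal Python sketch of such a unit follows; the function name, the specific weights, and the threshold value are illustrative choices, not taken from the slides.

    def mp_neuron(inputs, weights, theta):
        # Weighted sum of binary inputs; the unit fires (1) only if the
        # sum reaches the threshold theta (hard-limiting activation).
        s = sum(w * x for w, x in zip(weights, inputs))
        return 1 if s >= theta else 0

    # Illustrative unit: two excitatory inputs (+1) and one inhibitory input (-2).
    # An active inhibitory input keeps the unit from firing.
    print(mp_neuron([1, 1, 0], weights=[1, 1, -2], theta=2))  # fires: 1
    print(mp_neuron([1, 1, 1], weights=[1, 1, -2], theta=2))  # inhibited: 0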
14 McCulloch-Pitts
- McCulloch and Pitts (1943) showed that networks made from these neurons can implement logic functions, such as AND, OR, and XOR. Therefore these networks are universal computation devices.
- Homework
- Build AND, OR and INVERT gates using McCulloch-Pitts neurons.
- What aspects of the McCulloch-Pitts neuron are different from the biological neuron?
15 McCulloch-Pitts
- McCulloch-Pitts (1943): the first computational neuron model.
- Showed that networks made from these neurons can implement logic functions, such as AND, OR, and XOR. Therefore these networks are universal computation devices.
- Assumptions made
- Neuron activation is binary.
- At least a certain number of excitatory inputs are needed to excite the neuron.
- Even a single inhibitory input can inhibit the neuron.
- No delay.
- Network structure is fixed.
- No adaptation!
16 Hebb's Learning Law
- In 1949, Donald Hebb formulated William James' principle of association into a mathematical form.
- If the activations of two neurons, y1 and y2, are both on (1), then the weight between the two neurons grows (off = 0).
- Otherwise the weight between them remains the same.
- However, when a bipolar activation scheme {-1, 1} is used, the weights can also decrease when the activations of the two neurons do not match.
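In symbols (a standard reconstruction consistent with the description above; η denotes a learning rate that the slide does not name):

    Δw12 = η · y1 · y2

With binary {0, 1} activations the product y1·y2 is 1 only when both neurons are on, so the weight can only grow; with bipolar {-1, 1} activations the product is negative when the two activations disagree, so the weight can also decrease.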
17 Perceptron - history
- Proposed by Rosenblatt et al. (1958-1962).
- A large class of neural models that incorporate learning.
- The Mark I perceptron was built with a retina of 20x20 receptors.
- Learned to recognize letters.
- Created excitement, and hype. (ENIAC was built in 1945.)
18 Perceptron - structure
- "A perceptron is a network of S, A and R units with a variable interaction which depends on the sequence of the past activity states of the network" (Rosenblatt 1962).
- S: Sensory unit
- A: Association unit
- V: Variable interaction matrix
- R: The learning unit, a.k.a. the perceptron
- Learning: Setting the weights of V such that the network correctly classifies the input patterns.
19 Perceptron - clearer structure
[Figure: retina feeding associative units through fixed weights, which feed a response unit through variable weights; the response unit uses a step activation function.]
Slide adapted from Dr. Nigel Crook, Oxford Brookes University.
20 Perceptron - neuron
- The perceptron's activation y depends on the (linear) sum of inputs (including a bias) converging on the neuron through weighted pathways.
- x1, x2, ..., xN can be continuous.
- It can learn!
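Written out (a standard reconstruction; b denotes the bias, f the hard-limiting activation function of the earlier slides, so b plays the role of -θ):

    y = f(w1·x1 + w2·x2 + ... + wN·xN + b),  with f(a) = 1 if a > 0, else 0.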
21 Perceptron - activation
[Figure: two input units X1, X2 connected to two output units Y1, Y2 through weights w1,1, w1,2, w2,1, w2,2.]
Slide adapted from Dr. Nigel Crook, Oxford Brookes University.
22 Classification problem
- Imagine that there are two groups of patterns: one group is classified as A, whereas the other is classified as B.
- From a given set of example patterns whose categories are known a priori, how can one learn to correctly classify input patterns, both those that were seen before and those that were not?
- Not an easy problem!
23 Input space representation
Slide adapted from Dr. Nigel Crook, Oxford Brookes University.
24 Perceptron learning
- If the perceptron classified the input correctly, then do nothing.
- If not, the weights of the active units are adjusted (incremented or decremented, depending on the direction of the error).
- Learning is guaranteed for all the problems that the perceptron can classify!
- For each input pattern X, compute the (binary) output y, and compare it against the desired output d.
- Activation
- If X·W - θ > 0, then y = 1, else y = 0.
- Learning
- If y = d, that is, the output is correct,
- then W(t+1) = W(t).
- If not, that is, the output is incorrect,
- then W(t+1) = W(t) + ΔW(t), where ΔW(t) = η (d - y) X.
- (A minimal code sketch of this procedure follows below.)
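The sketch below is one way this rule can be written in Python; the function name, the learning rate, the toy data, and the choice to fold the bias into the weight vector as an extra input are all illustrative, not prescribed by the slide.

    def train_perceptron(patterns, targets, n_inputs, eta=0.1, epochs=100):
        # One extra weight acts as the bias (the threshold folded into the weights).
        w = [0.0] * (n_inputs + 1)
        for _ in range(epochs):
            errors = 0
            for x, d in zip(patterns, targets):
                xb = list(x) + [1.0]                      # append constant bias input
                y = 1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else 0
                if y != d:                                # update only on mistakes
                    w = [wi + eta * (d - y) * xi for wi, xi in zip(w, xb)]
                    errors += 1
            if errors == 0:                               # no mistakes: stop early
                break
        return w

    # Illustrative use on a small linearly separable problem (made-up data).
    w = train_perceptron([(0, 0), (0, 1), (1, 0), (1, 1)], [0, 0, 0, 1], n_inputs=2)
    print(w)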
25 Perceptron - intuition
- A perceptron defines a hyperplane in the input space: a line in 2-D (two inputs), a plane in 3-D (three inputs), and, in general, an (N-1)-dimensional hyperplane for N inputs.
- The perceptron is a linear classifier: its output is 0 on one side of the hyperplane, and 1 on the other.
- Given a linearly separable problem, the perceptron learning rule guarantees convergence.
26 Adaline - history
- "An adaptive pattern classification machine (called Adaline, for adaptive linear) . . ."
- Proposed by Widrow and Hoff.
- "During a training phase, crude geometric patterns are fed to the machine by setting the toggle switches in the 4x4 input switch array. Setting another toggle switch (the reference switch) tells the machine whether the desired output for the particular input pattern is +1 or -1. The system learns something from each pattern and accordingly experiences a design change." (Widrow and Hoff 1960)
27 Adaline - Widrow-Hoff Learning
- The learning idea is as follows:
- Define an error function that measures the performance of the network in terms of the weights, input, output, and desired output.
- Take the derivative of this function with respect to the weights, and modify the weights such that the error is decreased.
- Also known as the Least Mean Square (LMS) error algorithm, the Widrow-Hoff rule, or the Delta rule.
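Concretely, for a single output unit and a single pattern this works out as follows (a standard reconstruction; the slide itself does not write the derivative out):

    E = (t - y_in)²,  with  y_in = b + Σj xj wj
    ∂E/∂wj = -2 (t - y_in) xj
    Δwj = η (t - y_in) xj   (a step down the gradient, with the constant factor absorbed into the learning rate η)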
28 The ADALINE
- The Widrow-Hoff rule (also known as the Delta Rule)
- Minimizes the error between the desired output t and the net input y_in.
- Minimizes the squared error for each pattern: E = (t - y_in)²
- Example: if s = 1 and t = 0.5, then the graph of E against w1,1 is a parabola with its minimum at w1,1 = 0.5.
- Gradient descent: wij(new) = wij(old) + η (ti - y_in_i) xj
[Figure: the squared error E = (t - y_in)² plotted against w1,1, with tick marks at 0, 0.1, 0.25, 0.5, 0.9 and 1.]
Slide adapted from Dr. Nigel Crook, Oxford Brookes University.
29 The ADALINE learning algorithm
Step 0: Initialize all weights and set the learning rate:
        wij = (small random values), η = 0.2 (for example)
Step 1: While the stopping condition is false:
  Step 1.1: For each training pair s:t:
    Step 1.1.1: Set activations on the input units: xj = sj
    Step 1.1.2: Compute the net input to the output units: y_in_i = bi + Σj xj wij
    Step 1.1.3: Update bias and weights:
                bi(new) = bi(old) + η (ti - y_in_i)
                wij(new) = wij(old) + η (ti - y_in_i) xj
Slide adapted from Dr. Nigel Crook, Oxford Brookes University.
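A minimal Python sketch of this loop for a single output unit; the learning rate, the fixed number of sweeps used as a stopping condition, the zero initialization, and the example data are illustrative choices, not prescribed by the slide.

    def train_adaline(patterns, targets, n_inputs, eta=0.2, epochs=50):
        w = [0.0] * n_inputs   # the slide suggests small random values; zeros keep the sketch short
        b = 0.0
        for _ in range(epochs):                     # crude stopping condition: fixed number of sweeps
            for s, t in zip(patterns, targets):     # Step 1.1: each training pair s:t
                x = list(s)                         # Step 1.1.1: input activations xj = sj
                y_in = b + sum(xj * wj for xj, wj in zip(x, w))               # Step 1.1.2: net input
                b += eta * (t - y_in)                                         # Step 1.1.3: update bias
                w = [wj + eta * (t - y_in) * xj for wj, xj in zip(w, x)]      # ...and weights
        return w, b

    # Illustrative use (made-up data): learn t = 0.5*x1 + 0.25*x2.
    w, b = train_adaline([(1, 0), (0, 1), (1, 1)], [0.5, 0.25, 0.75], n_inputs=2)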
30 Adaptive filter
- F1 registers the input pattern.
- Signals Si are modulated through weighted connections.
- F2 computes the pattern match between the input and the weights.
- Σi xi wij = X · Wj = |X| |Wj| cos(X, Wj)
31 Adaptive filter elements
- The dot product computes the projection of one vector on another. The term |X| |Wj| denotes the energy, whereas cos(X, Wj) denotes the pattern.
- If both vectors are normalized (|X| = |Wj| = 1), then X · Wj = cos(X, Wj). This indicates how well the weight vector of the neuron matches the input vector.
- The neuron with the largest activity at F2 has the weights that are closest to the input. This property is inherent in computational models of neurons, and can be considered a good model for biological neurons.
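A small Python illustration of this matching step; the function names and the vector values are made up, and the winner is simply the F2 unit whose normalized weight vector has the largest dot product (i.e., cosine) with the normalized input.

    import math

    def normalize(v):
        n = math.sqrt(sum(vi * vi for vi in v))
        return [vi / n for vi in v]

    def best_match(x, weight_vectors):
        # Activity of each F2 unit is the dot product of the normalized input
        # with its normalized weight vector, i.e. cos(X, Wj).
        xn = normalize(x)
        activities = [sum(a * b for a, b in zip(xn, normalize(w))) for w in weight_vectors]
        return max(range(len(activities)), key=lambda j: activities[j])

    # Illustrative input and two weight vectors; unit 0 matches better.
    print(best_match([1.0, 0.2], [[0.9, 0.1], [0.1, 0.9]]))  # 0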
32 Course Organization
33 Teaching staff
- Asst. Prof. Erol Sahin (erol_at_ceng.metu.edu.tr)
- Location: B-106, Tel: 210 5539
- E-mail: erol_at_ceng.metu.edu.tr
- Office hours: By appointment.
- Lectures: Monday 13:40-16:30 (BMB4)
34 Books
- Introduction to the Theory of Neural Computation, by John Hertz, Anders Krogh, and Richard G. Palmer, Santa Fe Institute Studies in the Sciences of Complexity.
- Neurocomputing: Foundations of Research, edited by J. A. Anderson and E. Rosenfeld, MIT Press, Cambridge, 1988.
- Self-Organization and Associative Memory, by T. Kohonen, Springer-Verlag, 1988. Available at the Reserve section of the library.
- Parallel Distributed Processing I and II, by J. McClelland and D. Rumelhart, MIT Press, Cambridge, MA, 1986.
- Pattern Recognition by Self-Organizing Neural Networks, edited by G. A. Carpenter and S. Grossberg, MIT Press, Cambridge, MA, 1994.
- Principles of Neural Science, by E. R. Kandel, J. H. Schwartz, and T. M. Jessell, Appleton & Lange, 1991.
- Also other complementary articles that will be made available.
35 Workload and Grading
36 Weekly reading assignments
- You will be given weekly readings to be read before the next class.
- A one- to two-page summary of these readings will be required. Sometimes, you will be handed questions to answer, or problems to solve.
37 Projects
- You will be asked to simulate four neural network models and apply them to a given problem.
- For each project, you will be asked to submit a 4-5 page report, at conference-paper quality.
- Your project report will be graded based on its
- Style,
- Writing,
- Results and their analysis,
- Discussion of results.
38 Presentation
- You will be asked to review one or more papers and make a 15-minute presentation on them.
- If you already have a topic you are interested in, you can propose it in advance.
39 Communication
- These slides will be available at the course webpage: http://kovan.ceng.metu.edu.tr/erol/Courses/CENG569/
- Announcements about the course will be made on the web site. (The CENG569 newsgroup at news://metu.ceng.course.569 can be used for other discussions regarding the course.)
- If you have a specific question, you can send an e-mail to me. However, make sure that the subject line starts with CENG569 (capital letters, no spaces) to get a faster reply.
40 Policies
- Late assignments
- Reading assignments are due within the first 15 minutes of each class. Late submissions are not accepted.
- Project reports submitted within 1 week of the due time will get 80%, within 2 weeks 60%. Reports will not be accepted afterwards.
- Academic dishonesty
- All assignments submitted should be fully your own. We have a zero tolerance policy on cheating and plagiarism. Your work will be regularly checked for such misconduct, and the following punishment policy will be applied.
41 Cheating
- What is cheating?
- Sharing code: either by copying, retyping, looking at, or supplying a copy of a file.
- What is NOT cheating?
- Helping others use systems or tools.
- Helping others with high-level design issues.
- Helping others debug their code.
42 Good Luck!
43 Homework - 1
- In half a page, describe the information processing aspects of the biological neuron.
- Build AND, OR and INVERT gates using McCulloch-Pitts neurons.
- What aspects of the McCulloch-Pitts neuron are different from the biological neuron?