CENG 569 Spring 2006 NEUROCOMPUTING - PowerPoint PPT Presentation


Title: CENG 569 Spring 2006 NEUROCOMPUTING

Erol Sahin
Dept. of Computer Engineering
Middle East Technical University
Inonu Bulvari, 06531, Ankara, TURKEY
  • Week 1
  • Introduction to neural nets: the beginnings and
    the basics
  • Course objectives

What's our motivation?
  • Science: Model how biological neural systems,
    like the human brain, work.
  • How do we see?
  • How is information stored in/retrieved from
  • How do you learn not to touch fire?
  • How do your eyes adapt to the amount of light in
    the environment?
  • Related fields: Neuroscience, Computational
    Neuroscience, Psychology, Psychophysiology,
    Cognitive Science, Medicine, Math, Physics.

What's our motivation?
  • Engineering: Design information processing
    systems that are as good as the biological ones.
  • How can we design a vision system that can see?
  • How can we store data such that it can be
    retrieved fast?
  • How can we make a robot not repeat an action
    that burned it before?
  • How can we make an artificial retina that
    automatically adapts to the amount of light?
  • Related fields: Computer Science, Statistics,
    Electronics Engineering, Mechanical Engineering.

The biological neuron - simplified
  • The basic information processing element of
    neural systems. The neuron:
  • receives input signals generated by other neurons
    through its dendrites,
  • integrates these signals in its body,
  • then generates its own signal (a series of
    electric pulses) that travels along the axon, which
    in turn makes contacts with dendrites of other
    neurons.
  • The points of contact between neurons are called
    synapses.

The biological neuron - 1
  • The pulses generated by the neuron travel along
    the axon as an electrical wave.
  • Once these pulses reach the synapses at the end
    of the axon, they open up chemical vesicles, exciting
    the other neuron.

The biological neuron - 2

What this course is about? - 1
  • Introduce and review computational models that
    are often classified as neural networks.
  • The first part of the course will cover
  • Perceptron, Adaline
  • Multi-layer perceptrons, and the back-propagation
    learning algorithm
  • Hopfield model, Boltzmann machine
  • Unsupervised learning models
  • Kohonen's self-organizing maps
  • Radial basis functions
  • Adaptive Resonance Theory models
  • Support vector machines
  • Properties and applications of these models will
    be reviewed.
  • Several programming projects will be given.

What this course is about? - 2
  • In the second part of the course, we will cover
    more biologically-plausible models of neural
  • Hodgkin-Huxley model of the biological neuron
  • Feed-forward shunting networks and their
  • Recurrent shunting networks and their properties
  • Classical and operant conditioning and their
    neural models

Today's topics
  • Brief history
  • McCulloch-Pitts neuron
  • Perceptron
  • Adaline

Brief History
  • Old Ages
  • Association (William James 1890)
  • McCulloch-Pitts Neuron (1943,1947)
  • Perceptrons (Rosenblatt 1958,1962)
  • Adaline/LMS (Widrow and Hoff 1960)
  • Perceptrons book (Minsky and Papert 1969)
  • Dark Ages
  • Self-organization in visual cortex (von der
    Malsburg 1973)
  • Backpropagation (Werbos, 1974)
  • Foundations of Adaptive Resonance Theory
    (Grossberg 1976)
  • Neural Theory of Association (Amari 1977)

  • Modern Ages
  • Adaptive Resonance Theory (Grossberg 1980)
  • Hopfield model (Hopfield 1982, 1984)
  • Self-organizing maps (Kohonen 1982)
  • Reinforcement learning (Sutton and Barto 1983)
  • Simulated Annealing (Kirkpatrick et al. 1983)
  • Boltzmann machines (Ackley, Hinton, Terrence
  • Backpropagation (Rumelhart, Hinton, Williams
  • ART-networks (Carpenter, Grossberg 1992)
  • Support Vector Machines

William James
  • William James (1890): Association
  • Mental facts cannot properly be studied apart
    from the physical environment of which they take
  • Principle of association: When two brain
    processes are active together or in immediate
    succession, one of them, on reoccurring, tends to
    propagate its excitement into the other.

McCulloch-Pitts neuron model (1943)
  • f() is called the activation function.
  • $\theta$ is also called the bias.
  • $x_1, x_2, \ldots, x_N$ are the inputs at time $t-1$.
  • Inputs are binary.
  • At each interval the neuron can fire at most
    once.
  • Positive weights ($w_i$) correspond to excitatory
    synapses and negative weights correspond to
    inhibitory synapses.
  • $\theta$ is the threshold for the neuron to fire.
  • f() is a non-linear hard-limiting function in
    the original McCulloch-Pitts model.
  • No learning mechanism!
  • Later, f() was replaced by other continuous
    squashing functions.
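
The unit described above can be sketched in a few lines of Python. This is only an illustration: the function name, the tie-handling (`>=`), and the 2-out-of-3 majority example are assumptions, not taken from the slides.

```python
from typing import Sequence


def mcculloch_pitts(inputs: Sequence[int], weights: Sequence[float], theta: float) -> int:
    """Hard-limiting unit: output 1 if the weighted input sum reaches theta, else 0.

    Assumption: the neuron fires when the sum is >= theta; the slides do not
    say how a sum exactly equal to the threshold is treated.
    """
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= theta else 0


# Hypothetical example: three excitatory inputs, fires when at least two are on.
majority_weights = [1.0, 1.0, 1.0]
majority_theta = 2.0

for pattern in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]:
    print(pattern, "->", mcculloch_pitts(pattern, majority_weights, majority_theta))
```

A negative weight plays the role of an inhibitory synapse: with weights `[1.0, -1.0]` and threshold `1.0`, the unit fires for input `(1, 0)` but not for `(1, 1)`.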

  • McCulloch and Pitts (1943) showed that networks
    made from these neurons can implement logic
    functions, such as AND, OR, XOR. Therefore these
    networks are universal computation devices.
  • Homework
  • Build AND, OR and INVERT gates using
    McCulloch-Pitts neurons.
  • What aspects of the McCulloch-Pitts neuron are
    different from the biological neuron?

  • McCulloch-Pitts (1943): The first computational
    neuron model.
  • Showed that networks made from these neurons can
    implement logic functions, such as AND, OR, XOR.
    Therefore these networks are universal
    computation devices.
  • Assumptions made:
  • Neuron activation is binary.
  • At least a certain number of excitatory inputs
    are needed to excite the neuron.
  • Even a single inhibitory input can inhibit the
    neuron.
  • No delay.
  • Network structure is fixed.
  • No adaptation!

Hebb's Learning Law
  • In 1949, Donald Hebb formulated William James's
    principle of association in mathematical form.
  • If the activations of the two neurons, $y_1$ and $y_2$, are
    both on (1), then the weight between the two
    neurons grows (off = 0).
  • Otherwise, the weight between them remains the same.
  • However, when a bipolar {-1, 1} activation scheme
    is used, the weights can also decrease when
    the activations of the two neurons do not match.
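
A minimal sketch of this rule, assuming the common product form (weight change = `eta * y1 * y2`). The slides state the rule only in words, so the formula, the function name, and the learning rate `eta` are assumptions.

```python
def hebb_update(w: float, y1: int, y2: int, eta: float = 1.0) -> float:
    """Return the weight after one Hebbian step: w + eta * y1 * y2."""
    return w + eta * y1 * y2


# Binary (0/1) activations: the weight grows only when both neurons are on.
binary = {(a, b): hebb_update(0.0, a, b) for a in (0, 1) for b in (0, 1)}

# Bipolar (-1/+1) activations: the weight also shrinks when the activations differ.
bipolar = {(a, b): hebb_update(0.0, a, b) for a in (-1, 1) for b in (-1, 1)}

print("binary: ", binary)
print("bipolar:", bipolar)
```

With the product form, the binary and bipolar cases on the slide need no separate code: the sign of `y1 * y2` decides whether the weight grows, shrinks, or stays the same.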

Perceptron - history
  • Proposed by Rosenblatt et al. (1958-1962).
  • A large class of neural models that incorporate
  • The Mark I perceptron was built with a retina of
    20x20 receptors.
  • Learned to recognize letters.
  • Created excitement, and hype. (ENIAC was built in

Perceptron - structure
  • A perceptron is a network of S, A and R units
    with a variable interaction which depends on the
    sequence of the past activity states of the
    network (Rosenblatt 1962).
  • S: Sensory unit
  • A: Association unit
  • V: Variable interaction matrix
  • R: Response unit (the learning unit, a.k.a. the perceptron)
  • Learning: Setting the weights of V such that the
    network correctly classifies the input patterns.

Perceptron - clearer structure
[Figure: perceptron structure. Labels: associative units, response unit, variable weights, fixed weights, step activation function.]
Slide adapted from Dr. Nigel Crook from Oxford
Brookes University

Perceptron - neuron
  • Perceptron's activation y depends on the (linear)
    sum of inputs (including a bias) converging on
    the neuron through weighted pathways.
  • $x_1, x_2, \ldots, x_N$ can be continuous.
  • It can learn!

Perceptron - activation
Slide adapted from Dr. Nigel Crook from Oxford
Brookes University

Classification problem
  • Imagine that there are two groups of patterns: one
    group is classified as A, whereas the other is
    classified as B.
  • From a given set of example patterns whose
    categories are known a priori, how can one learn
    to correctly classify the input patterns, both
    those that were seen before and those that were
    not seen before?
  • Not an easy problem!

Input space representation
Slide adapted from Dr. Nigel Crook from Oxford
Brookes University

Perceptron learning
  • If the perceptron classifies the input correctly,
    then do nothing.
  • If not, the weights of the active units are
  • Learning is guaranteed for all the problems that
    the perceptron can classify!
  • For each input pattern X, compute the (binary)
    output y, and compare it against the desired
    output d.
  • Activation:
  • If $X \cdot W - \theta > 0$,
  • then $y = 1$, else $y = 0$.
  • Learning:
  • If $y = d$, that is, the output is correct,
  • then $W(t+1) = W(t)$.
  • If not, that is, the output is incorrect,
  • then $W(t+1) = W(t) + \Delta W(t)$, where $\Delta W(t) = \pm X$
    ($+X$ if $d = 1$, $-X$ if $d = 0$).
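
The procedure above can be sketched as follows. The function names, the AND training set, the epoch limit, and the zero initialization are illustrative assumptions. The slide does not say how the threshold is set; here it is assumed to be learned as one more weight attached to a constant input of -1.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]


def predict(x: Sequence[float], w: Sequence[float], theta: float) -> int:
    """Activation: y = 1 if X.W - theta > 0, else 0."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net - theta > 0 else 0


def train_perceptron(samples: List[Sample], n_inputs: int,
                     max_epochs: int = 100) -> Tuple[List[float], float]:
    """Error-driven learning: change the weights only when the output is wrong."""
    w = [0.0] * n_inputs
    theta = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, d in samples:
            y = predict(x, w, theta)
            if y == d:
                continue  # correct output: W(t+1) = W(t)
            sign = d - y  # +1 if the unit should have fired, -1 if it should not
            w = [wi + sign * xi for wi, xi in zip(w, x)]  # W(t+1) = W(t) +/- X
            theta -= sign  # assumption: threshold treated as a weight on input -1
            errors += 1
        if errors == 0:  # every pattern classified correctly
            break
    return w, theta


# Hypothetical training set: logical AND, which is linearly separable.
and_samples: List[Sample] = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, theta = train_perceptron(and_samples, n_inputs=2)
print("weights:", w, "threshold:", theta)
print([predict(x, w, theta) for x, _ in and_samples])
```

Because AND is linearly separable, the loop stops once a full pass produces no errors.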

Perceptron - intuition
  • A perceptron with N inputs defines an (N-1)-dimensional
    hyperplane in the input space: a line in 2-D (two
    inputs), a plane in 3-D (three inputs).
  • The perceptron is a linear classifier: its
    output is 0 on one side of the hyperplane and 1 on
    the other.
  • Given a linearly separable problem, the
    perceptron learning rule guarantees convergence.
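
For two inputs the separating line can be written out explicitly. The sketch below assumes the activation convention "fire if the weighted sum minus the threshold is positive"; the weights and threshold are made-up values chosen only to illustrate the geometry.

```python
def side(x1: float, x2: float, w1: float, w2: float, theta: float) -> int:
    """Perceptron output for a 2-D input: 1 on one side of the line, 0 on the other."""
    return 1 if w1 * x1 + w2 * x2 - theta > 0 else 0


def boundary_x2(x1: float, w1: float, w2: float, theta: float) -> float:
    """x2 coordinate of the separating line w1*x1 + w2*x2 = theta (requires w2 != 0)."""
    return (theta - w1 * x1) / w2


# Hypothetical parameters: the line x1 + x2 = 1.5.
w1, w2, theta = 1.0, 1.0, 1.5

print(boundary_x2(0.5, w1, w2, theta))  # point on the line above x1 = 0.5
print(side(1.0, 1.0, w1, w2, theta))    # above the line
print(side(0.0, 0.0, w1, w2, theta))    # below the line
```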

Adaline - history
  • "An adaptive pattern classification machine (called
    'Adaline', for adaptive linear) ..."
  • Proposed by Widrow and Hoff.
  • "During a training phase, crude geometric
    patterns are fed to the machine by setting the
    toggle switches in the 4x4 input switch array.
    Setting another toggle switch (the reference
    switch) tells the machine whether the desired
    output for the particular input pattern is +1 or
    -1. The system learns something from each
    pattern and accordingly experiences a design
    change." (Widrow and Hoff 1960)

Adaline - Widrow-Hoff Learning
  • The learning idea is as follows:
  • Define an error function that measures the
    performance of the network in terms of the
    weights, input, output and desired output.
  • Take the derivative of this function with respect
    to the weights, and modify the weights
    accordingly such that the error is decreased.
  • Also known as the Least Mean Square (LMS) error
    algorithm, the Widrow-Hoff rule, the Delta rule.

  • The Widrow-Hoff rule (also known as the Delta rule):
  • Minimizes the error between the desired output $t$ and
    the net input $y\_in$.
  • Minimizes the squared error for each pattern:
    $E = (t - y\_in)^2$
  • Example: if $s = 1$ and $t = 0.5$, then the graph of
    $E$ against $w_{1,1}$ would be a parabola.
  • Gradient descent:
    $w_{ij}(\text{new}) = w_{ij}(\text{old}) + \alpha (t_i - y\_in_i) x_j$

Slide adapted from Dr. Nigel Crook from Oxford
Brookes University
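
The example values s = 1 and t = 0.5 can be used to watch the delta rule walk down the error curve. The sketch assumes a single weight with no bias, a learning rate of 0.2, and a start at w = 0; the function names are illustrative. The exact gradient of the squared error carries a factor of 2, which the rule absorbs into the learning rate.

```python
def squared_error(w: float, s: float, t: float) -> float:
    """E = (t - y_in)^2 with y_in = w * s (single weight, no bias: an assumption)."""
    y_in = w * s
    return (t - y_in) ** 2


def delta_rule_step(w: float, s: float, t: float, alpha: float) -> float:
    """One update: w(new) = w(old) + alpha * (t - y_in) * x, with x = s."""
    y_in = w * s
    return w + alpha * (t - y_in) * s


s, t, alpha = 1.0, 0.5, 0.2
w = 0.0
history = []
for _ in range(50):
    history.append(squared_error(w, s, t))
    w = delta_rule_step(w, s, t, alpha)

print("final weight:", round(w, 4))
print("first error:", history[0], "last error:", history[-1])
```

Each step shrinks the distance between w and 0.5 by the factor (1 - alpha), so the error decreases monotonically toward the bottom of the curve.
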
The ADALINE learning algorithm
Step 0: Initialize all weights and set the learning rate:
          $w_{ij}$ = small random values, $\alpha = 0.2$ (for example)
Step 1: While the stopping condition is false:
  Step 1.1: For each training pair $s : t$:
    Step 1.1.1: Set activations on input units:
                  $x_j = s_j$
    Step 1.1.2: Compute net input to output units:
                  $y\_in_i = b_i + \sum_j x_j w_{ij}$
    Step 1.1.3: Update bias and weights:
                  $b_i(\text{new}) = b_i(\text{old}) + \alpha (t_i - y\_in_i)$
                  $w_{ij}(\text{new}) = w_{ij}(\text{old}) + \alpha (t_i - y\_in_i) x_j$
Slide adapted from Dr. Nigel Crook from Oxford
Brookes University
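
A sketch of the steps above for a single output unit (so the index i is dropped). Several choices are assumptions rather than part of the slide: the stopping condition is a fixed number of epochs, the random initialization range and seed are arbitrary, and the bipolar AND training set and the sign-based `classify` helper are only for illustration.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[Sequence[float], float]


def train_adaline(pairs: List[Pair], alpha: float = 0.2, epochs: int = 100,
                  seed: int = 0) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    n_inputs = len(pairs[0][0])
    # Step 0: small random weights and bias.
    w = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    b = rng.uniform(-0.1, 0.1)
    # Step 1: assumed stopping condition is a fixed epoch count.
    for _ in range(epochs):
        # Step 1.1: loop over training pairs s:t.
        for s, t in pairs:
            x = list(s)                                       # Step 1.1.1
            y_in = b + sum(xj * wj for xj, wj in zip(x, w))   # Step 1.1.2
            err = t - y_in
            b += alpha * err                                  # Step 1.1.3 (bias)
            w = [wj + alpha * err * xj for wj, xj in zip(w, x)]  # Step 1.1.3 (weights)
    return w, b


def classify(x: Sequence[float], w: Sequence[float], b: float) -> int:
    """Hypothetical readout: threshold the net input at zero to get +1 or -1."""
    y_in = b + sum(xj * wj for xj, wj in zip(x, w))
    return 1 if y_in >= 0 else -1


# Hypothetical training set: AND with bipolar inputs and targets.
and_pairs: List[Pair] = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b = train_adaline(and_pairs)
print("weights:", w, "bias:", b)
print([classify(s, w, b) for s, _ in and_pairs])
```

Unlike the perceptron rule, the update uses the net input rather than the thresholded output, so the weights keep adjusting even when every pattern is already on the correct side.
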
Adaptive filter
  • F1 registers the input pattern.
  • Signals Si are modulated through weighted
  • F2 computes the pattern match between the input
    and the weights.
  • $\sum_i x_i w_{ij} = X \cdot W_j = \|X\| \, \|W_j\| \cos(X, W_j)$

Adaptive filter elements
  • The dot product computes the projection of one
    vector on another. The term $\|X\| \, \|W_j\|$ denotes the
    energy, whereas $\cos(X, W_j)$ denotes the pattern.
  • If both vectors are normalized ($\|X\| = \|W_j\| =
    \|X\| \, \|W_j\| = 1$), then $X \cdot W_j = \cos(X, W_j)$. This
    indicates how well the weight vector of the
    neuron matches the input vector.
  • The neuron with the largest activity at F2 has
    the weights that are closest to the input.

This property is inherent in computational models
of neurons, and can be considered a good model
for the biological neurons.
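
The pattern-matching property can be checked numerically. The input and the three weight vectors below are made up for illustration, and the helper names are assumptions; the point is only that, after normalization, the dot product equals the cosine and the most active F2 unit is the one whose weights point closest to the input.

```python
import math
from typing import List, Sequence


def dot(x: Sequence[float], w: Sequence[float]) -> float:
    return sum(xi * wi for xi, wi in zip(x, w))


def norm(v: Sequence[float]) -> float:
    return math.sqrt(dot(v, v))


def cosine(x: Sequence[float], w: Sequence[float]) -> float:
    """cos(X, W) = X.W / (|X| |W|)."""
    return dot(x, w) / (norm(x) * norm(w))


def normalize(v: Sequence[float]) -> List[float]:
    n = norm(v)
    return [vi / n for vi in v]


# Hypothetical input pattern (F1) and weight vectors of three F2 units.
x = normalize([3.0, 4.0])
weight_vectors = [normalize(w) for w in ([1.0, 0.0], [3.0, 4.1], [-4.0, 3.0])]

# Activity of each F2 unit is its dot product with the input.
activities = [dot(x, w) for w in weight_vectors]
winner = max(range(len(activities)), key=activities.__getitem__)

print("activities:", activities)
print("winner index:", winner)
```
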
Course Organization

Teaching staff
  • Asst. Prof. Erol Sahin (erol_at_ceng.metu.edu.tr)
  • Location: B-106, Tel: 210 5539
  • E-mail: erol_at_ceng.metu.edu.tr
  • Office hours: By appointment.
  • Lectures: Monday 13:40-16:30 (BMB4)

  • Introduction to the Theory of Neural Computation
    by John Hertz, Anders Krogh, Richard G. Palmer,
    Santa Fe Institute Studies in the Sciences of
  • Neurocomputing: Foundations of Research, edited by
    J. A. Anderson and E. Rosenfeld, MIT Press,
    Cambridge, 1988.
  • Self-Organization and Associative Memory, by T.
    Kohonen, Springer-Verlag, 1988. Available at the
    Reserve section of the library.
  • Parallel Distributed Processing I and II, by J.
    McClelland and D. Rumelhart, MIT Press,
    Cambridge, MA, 1986.
  • Pattern Recognition by Self-Organizing Neural
    Networks, edited by G.A. Carpenter and S.
    Grossberg, MIT Press, Cambridge, MA, 1994.
  • Principles of Neural Science, E.R. Kandel, J.H.
    Schwartz, and T.M. Jessell, Appleton & Lange, 1991.
  • Also other complementary articles that will be
    made available

Workload and Grading
Weekly reading assignments
  • You will be given weekly readings to be read
    before the next class.
  • A one- to two-page summary of these readings will be
    required. Sometimes, you will be handed some
    questions to answer, or problems to solve.

  • You will be asked to simulate four neural network
    models and apply them to a given problem.
  • For each project, you will be asked to submit a
    4-5 page report of conference-paper quality.
  • Your project report will be graded based on its:
  • Style,
  • Writing,
  • Results and their analysis,
  • Discussion of results.

  • You will be asked to review one or more papers
    and make a 15-minute presentation on them.
  • If you already have a topic you are interested in,
    you can also propose it in advance.

  • These slides will be available at the course
    webpage http//kovan.ceng.metu.edu.tr/erol/Course
  • Announcements about the course will be made on
    the web site (The CENG569 newsgroup at
    news//metu.ceng.course.569 can be used for other
    discussions regarding the course)
  • If you have a specific question, you can send an
    e-mail to me. However, make sure that the subject
    line starts with CENG569 (capital letters, no
    spaces) to get a faster reply.

  • Late assignments
  • Reading assignments are due within 15 minutes of
    each class. Late submissions are not accepted.
  • Project reports submitted within 1 week of their
    due time will get 80%, within 2 weeks 60%.
    Reports will not be accepted afterwards.
  • Academic dishonesty
  • All assignments submitted should be fully your
    own. We have a zero tolerance policy on cheating
    and plagiarism. Your work will be regularly
    checked for such misconduct and the following
    punishment policy will be applied

  • What is cheating?
  • Sharing code either by copying, retyping,
    looking at, or supplying a copy of a file.
  • What is NOT cheating?
  • Helping others use systems or tools.
  • Helping others with high-level design issues.
  • Helping others debug their code.

Good Luck!

Homework - 1
  • In half a page, describe the information
    processing aspects of the biological neuron.
  • Build AND, OR and INVERT gates using
    McCulloch-Pitts neurons.
  • What aspects of the McCulloch-Pitts neuron are
    different from the biological neuron?