Title: Neural networks
1 Neural networks
- Eric Postma
- IKAT
- Universiteit Maastricht
2 Overview
- Introduction: the biology of neural networks
- the biological computer
- brain-inspired models
- basic notions
- Interactive neural-network demonstrations
- Perceptron
- Multilayer perceptron
- Kohonen's self-organising feature map
- Examples of applications
3 A typical AI agent
4 Two types of learning
- Supervised learning
- curve fitting, surface fitting, ...
- Unsupervised learning
- clustering, visualisation...
5 An input-output function
6 Fitting a surface to four points
7 (Artificial) neural networks
- The digital computer versus the neural computer
8 The Von Neumann architecture
9 The biological architecture
10 Digital versus biological computers
- 5 distinguishing properties
- speed
- robustness
- flexibility
- adaptivity
- context-sensitivity
11 Speed: the hundred time steps argument
- "The critical resource that is most obvious is time. Neurons whose basic computational speed is a few milliseconds must be made to account for complex behaviors which are carried out in a few hundred milliseconds (Posner, 1978). This means that entire complex behaviors are carried out in less than a hundred time steps." (Feldman and Ballard, 1982)
12 Graceful Degradation
(figure: performance as a function of damage)
13 Flexibility: the Necker cube
14 Vision: constraint satisfaction
15 Adaptivity
- Processing implies learning in biological computers, whereas processing does not imply learning in digital computers
16 Context-sensitivity
- patterns
- emergent properties
17 Robustness and context-sensitivity: coping with noise
18 The neural computer
- Is it possible to develop a model after the natural example?
- Brain-inspired models: models based on a restricted set of structural and functional properties of the (human) brain
19 The Neural Computer (structure)
20 Neurons, the building blocks of the brain
21 Neural activity
(figure: neural output activity as a function of input)
22 Synapses, the basis of learning and memory
23 Learning: Hebb's rule
(figure: neuron 1 connected to neuron 2 through a synapse)
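The slide shows only the two neurons and their synapse; below is a minimal Python sketch of the usual formulation of Hebb's rule, Δw = η·a1·a2, where the learning rate η and the activity values are illustrative assumptions, not taken from the slide.

```python
# A minimal sketch of Hebbian learning, assuming the common formulation
# delta_w = eta * a1 * a2; eta and the activity pairs are illustrative.
eta = 0.1
w = 0.0                    # strength of the synapse between neuron 1 and neuron 2

for a1, a2 in [(1.0, 1.0), (1.0, 0.0), (0.5, 1.0)]:   # example activity pairs
    w += eta * a1 * a2     # the weight grows only when both neurons are active

print(w)
```

The weight increases only when the two neurons are active together, which is the correlational learning the synapse figure illustrates.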
24 Connectivity
- An example:
- The visual system is a feedforward hierarchy of neural modules
- Every module is (to a certain extent) responsible for a certain function
25 (Artificial) Neural Networks
- Neurons
- activity
- nonlinear input-output function
- Connections
- weight
- Learning
- supervised
- unsupervised
26 Artificial Neurons
- input (vector): i1, i2, i3, ...
- summation (excitation): e
- output (activation): a = f(e)
(figure: inputs i1, i2, i3 converging on a unit with excitation e and activation a = f(e))
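A minimal Python sketch of such an artificial neuron; the concrete weight values and the sigmoid form of f are illustrative assumptions (weights and the sigmoid appear on later slides).

```python
import numpy as np

def sigmoid(e):
    """Nonlinear input-output function f(e)."""
    return 1.0 / (1.0 + np.exp(-e))

# Inputs i1, i2, i3 and (assumed) connection weights.
i = np.array([0.2, -0.5, 1.0])
w = np.array([0.4, 0.1, 0.7])

e = np.dot(w, i)      # summation (excitation)
a = sigmoid(e)        # output (activation)
print(e, a)
```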
27 Input-output function
(figure: the nonlinear function a = f(e); the activation a approaches 0 for strongly negative excitation e and approaches its upper bound for strongly positive e)
28 Artificial Connections (Synapses)
- wAB: the weight of the connection from neuron A to neuron B
29 The Perceptron
30 Learning in the Perceptron
- Delta learning rule (stated below)
- based on the difference between the desired output t and the actual output o, given input x
- Global error E
- a function of the differences between the desired and actual outputs
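In symbols, with learning rate η (the symbol is an assumption; the quantities are those named above):

```latex
\Delta w_i = \eta\,(t - o)\,x_i,
\qquad
E = \tfrac{1}{2}\sum_{p}\bigl(t_p - o_p\bigr)^{2}
```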
31 Gradient Descent
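A small Python sketch of gradient descent with the delta rule above; the OR dataset, learning rate, and number of epochs are assumptions chosen purely for illustration.

```python
import numpy as np

# Gradient descent with the delta rule on the OR problem.
# A constant bias input of 1 is appended to each pattern.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
t = np.array([0, 1, 1, 1], dtype=float)

eta = 0.1
w = np.zeros(3)

for epoch in range(200):
    for x, target in zip(X, t):
        o = np.dot(w, x)             # actual (linear) output
        w += eta * (target - o) * x  # delta rule: step down the gradient of E

print(X @ w)   # thresholding these outputs at 0.5 should recover OR
```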
32 Linear decision boundaries
33 The history of the Perceptron
- Rosenblatt (1959)
- Minsky & Papert (1961)
- Rumelhart & McClelland (1986)
34 The multilayer perceptron
(figure: network with an input layer, a hidden layer, and an output layer)
35 Training the MLP
- supervised learning
- each training pattern: input + desired output
- in each epoch: present all patterns
- at each presentation: adapt the weights
- after many epochs: convergence to a local minimum (see the sketch below)
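A minimal numpy sketch of this procedure, training a small MLP by backpropagation (the rule derived in the mathematics section further on); the XOR task, the 2-3-1 architecture, the learning rate and the number of epochs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):                       # sigmoid activation
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(0, 1, (2, 3)); b1 = np.zeros(3)   # input  -> hidden
W2 = rng.normal(0, 1, (3, 1)); b2 = np.zeros(1)   # hidden -> output
eta = 0.5

for epoch in range(10000):               # many epochs ...
    for x, t in zip(X, T):               # ... presenting every pattern
        h = f(x @ W1 + b1)               # forward pass
        o = f(h @ W2 + b2)
        delta_o = (t - o) * o * (1 - o)            # output error term
        delta_h = (delta_o @ W2.T) * h * (1 - h)   # back-propagated hidden error
        W2 += eta * np.outer(h, delta_o); b2 += eta * delta_o   # adapt weights
        W1 += eta * np.outer(x, delta_h); b1 += eta * delta_h

# usually close to the XOR targets; training may end in a local minimum,
# as the slide notes
print(f(f(X @ W1 + b1) @ W2 + b2))
```

Each epoch presents every pattern once and adapts the weights after every presentation, exactly as the slide describes.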
36 Phoneme recognition with an MLP
- output: pronunciation
- input: frequencies
37 Non-linear decision boundaries
38 Compression with an MLP: the autoencoder
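A sketch of the autoencoder idea: an MLP trained to reproduce its own input through a narrow hidden layer, whose activations then form the compressed hidden representation of the next slide. The use of scikit-learn, the toy data and the bottleneck size are assumptions for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))            # data that really lives on 2 dimensions
X = latent @ rng.normal(size=(2, 8))          # embedded in an 8-dimensional input space

autoencoder = MLPRegressor(hidden_layer_sizes=(2,),   # 2-unit bottleneck
                           max_iter=3000, random_state=0)
autoencoder.fit(X, X)                         # target = input: compression
print(autoencoder.score(X, X))                # reconstruction quality (R^2)
```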
39 Hidden representation
40 Learning in the MLP
41 Preventing Overfitting
- GENERALISATION: performance on the test set
- Early stopping
- Training, Test, and Validation set
- k-fold cross-validation
- leave-one-out procedure (these procedures are sketched below)
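A sketch of these procedures, assuming scikit-learn and a standard toy dataset purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, KFold, LeaveOneOut
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)

# Early stopping: part of the training data is held out as a validation set
# and training stops when the validation score no longer improves.
net = MLPClassifier(hidden_layer_sizes=(10,), early_stopping=True,
                    validation_fraction=0.2, max_iter=1000, random_state=0)

# k-fold cross-validation (here k = 5) ...
print(cross_val_score(net, X, y,
                      cv=KFold(n_splits=5, shuffle=True, random_state=0)).mean())
# ... and the leave-one-out procedure (k = number of patterns).
print(cross_val_score(net, X, y, cv=LeaveOneOut()).mean())
```

In every case generalisation is estimated on data the network was not trained on.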
42 Image Recognition with the MLP
44 Hidden Representations
45 Other Applications
- Practical
- OCR
- financial time series
- fraud detection
- process control
- marketing
- speech recognition
- Theoretical
- cognitive modeling
- biological modeling
46 Some mathematics
47 Perceptron
48 Derivation of the delta learning rule
- target output t
- actual output o
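For a single linear output unit, the derivation runs along these standard lines (a sketch, not necessarily the exact steps on the original slide; η is the learning rate):

```latex
E = \tfrac{1}{2}(t - o)^{2}, \qquad o = \sum_i w_i x_i,
\qquad
\frac{\partial E}{\partial w_i}
  = -(t - o)\,\frac{\partial o}{\partial w_i}
  = -(t - o)\,x_i,
\qquad
\Delta w_i = -\eta\,\frac{\partial E}{\partial w_i} = \eta\,(t - o)\,x_i .
```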
49 MLP
50 Sigmoid function
- may also be the tanh function (range (-1, 1) instead of (0, 1))
- derivative: f'(x) = f(x)(1 - f(x))
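For completeness, the sigmoid and a short derivation of that derivative identity:

```latex
f(x) = \frac{1}{1 + e^{-x}}, \qquad
f'(x) = \frac{e^{-x}}{(1 + e^{-x})^{2}}
      = \frac{1}{1 + e^{-x}}\cdot\frac{e^{-x}}{1 + e^{-x}}
      = f(x)\,\bigl(1 - f(x)\bigr).
```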
51 Derivation of the generalized delta rule
52 Error function (LMS)
53 Adaptation of the hidden-output weights
54 Adaptation of the input-hidden weights
55 Forward and Backward Propagation
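A standard formulation of the generalized delta rule for one hidden layer, matching the slide titles above (learning rate η, hidden activations a_j, output activations o_k; this reconstruction is an assumption, not the original slide content):

```latex
E = \tfrac{1}{2}\sum_k (t_k - o_k)^{2} \quad\text{(LMS error function)}

\delta_k = (t_k - o_k)\,f'(\mathrm{net}_k), \qquad
\Delta w_{jk} = \eta\,\delta_k\,a_j \quad\text{(hidden-output weights)}

\delta_j = f'(\mathrm{net}_j)\sum_k \delta_k\,w_{jk}, \qquad
\Delta w_{ij} = \eta\,\delta_j\,a_i \quad\text{(input-hidden weights)}
```

The forward pass computes the net inputs and activations layer by layer; the backward pass propagates the δ terms from the output layer back to the hidden layer.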
56 Decision boundaries of Perceptrons
Straight lines (surfaces); linearly separable
57 Decision boundaries of MLPs
Convex areas (open or closed)
58 Decision boundaries of MLPs
Combinations of convex areas
59 Learning and representing similarity
60 Alternative conception of neurons
- Neurons do not take the weighted sum of their inputs (as in the perceptron), but measure the similarity of the weight vector to the input vector
- The activation of the neuron is a measure of similarity: the more similar the weight vector is to the input, the higher the activation
- Neurons represent prototypes (one common formalisation is sketched below)
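The slides state only the principle; one common way to formalise it (the Gaussian choice and the width σ are assumptions) is

```latex
a_j = \exp\!\left(-\,\frac{\lVert \mathbf{x} - \mathbf{w}_j \rVert^{2}}{2\sigma^{2}}\right)
```

so the activation is maximal when the input x matches the prototype w_j and falls off with distance.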
61 Coarse Coding
62 2nd-order isomorphism
63 Prototypes for preprocessing
64 Kohonen's SOFM (Self-Organizing Feature Map)
- Unsupervised learning
- Competitive learning
(figure: an n-dimensional input feeding a layer of output neurons, one of which is the winner)
65 Competitive learning
- Determine the winner: the neuron whose weight vector has the smallest distance to the input vector
- Move the weight vector w of the winning neuron towards the input i (see the update rule below)
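In symbols (with learning rate η as an assumption), the winner c and its weight update are

```latex
c = \arg\min_j \lVert \mathbf{i} - \mathbf{w}_j \rVert,
\qquad
\Delta\mathbf{w}_c = \eta\,(\mathbf{i} - \mathbf{w}_c).
```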
66 Kohonen's idea
- Impose a topological order onto the competitive neurons (e.g., a rectangular map)
- Let neighbours of the winner share the prize (the postcode-lottery principle); a sketch follows below
- After learning, neurons with similar weights tend to cluster on the map
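A compact numpy sketch of such a map on a rectangular grid, with a Gaussian neighbourhood and a learning rate and neighbourhood size that decay over training; the grid size, the decay schedules and the input data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w, dim = 10, 10, 2
weights = rng.uniform(size=(grid_h, grid_w, dim))         # random initial weights
rows, cols = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij")
X = rng.uniform(size=(2000, dim))                         # uniformly distributed inputs

n_steps = 10000
for t in range(n_steps):
    x = X[rng.integers(len(X))]
    eta = 0.5 * (1 - t / n_steps) + 0.01                  # decaying learning rate
    sigma = 3.0 * (1 - t / n_steps) + 0.5                 # shrinking neighbourhood
    # winner: the unit whose weight vector is closest to the input
    dists = np.linalg.norm(weights - x, axis=2)
    wr, wc = np.unravel_index(np.argmin(dists), dists.shape)
    # neighbours of the winner share the prize via a Gaussian on the grid
    grid_dist2 = (rows - wr) ** 2 + (cols - wc) ** 2
    h = np.exp(-grid_dist2 / (2 * sigma ** 2))
    weights += eta * h[:, :, None] * (x - weights)

print(weights.reshape(-1, dim)[:5])
```

Because neighbours of the winner are also pulled towards the input, units that end up close on the grid acquire similar weight vectors, which is the clustering the slide describes.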
67 Topological order: neighbourhoods
- Square grid: winner (red) and its nearest neighbours
- Hexagonal grid: winner (red) and its nearest neighbours
68 A simple example
- A topological map of 2 x 3 neurons and two inputs
(figure: the input, the weights, and their visualisation)
69 Weights before training
70 Input patterns (note the 2D distribution)
71 Weights after training
72 Another example
- Input: uniformly randomly distributed points
- Output: map of 202 neurons
- Training: starting with a large learning rate and neighbourhood size, both are gradually decreased to facilitate convergence
74 Dimension reduction
75 Adaptive resolution
76 Application of SOFM
- examples (input)
- SOFM after training (output)
77 Visual features (biologically plausible)
78 Relation with statistical methods (1)
- Principal Components Analysis (PCA)
(figure: projections of the data onto the first two principal components, pca1 and pca2)
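A small numpy sketch of the PCA side of this comparison: centre the data and project it onto the first two principal components pca1 and pca2 (the toy data are an assumption).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))   # correlated toy data

Xc = X - X.mean(axis=0)                    # centre the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
projection = Xc @ Vt[:2].T                 # coordinates along pca1 and pca2
explained = S[:2] ** 2 / np.sum(S ** 2)    # fraction of variance they explain
print(projection[:3], explained)
```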
79 Relation with statistical methods (2)
- Multi-Dimensional Scaling (MDS)
- Sammon Mapping
80 Image Mining: the right feature
81 Fractal dimension in art
Jackson Pollock (Jack the Dripper)
82 Taylor, Micolich and Jonas (1999). Fractal analysis of Pollock's drip paintings. Nature, 399, 422 (3 June).
(figure: estimated fractal dimensions, with the range for natural images indicated)
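Taylor et al. estimate the fractal dimension of the drip patterns by box counting; a generic sketch of that estimator on a binary image (the toy image and the box sizes are assumptions, not their data):

```python
import numpy as np

def box_counting_dimension(img, sizes=(2, 4, 8, 16, 32)):
    """Estimate the box-counting (fractal) dimension of a binary image:
    count the boxes of side s that contain any 'paint', then fit the slope
    of log N(s) against log(1/s)."""
    counts = []
    for s in sizes:
        h, w = img.shape[0] // s * s, img.shape[1] // s * s
        boxes = img[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(np.count_nonzero(boxes.any(axis=(1, 3))))
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

rng = np.random.default_rng(0)
toy = rng.random((256, 256)) < 0.1          # toy 'drip' pattern, not a Pollock
print(box_counting_dimension(toy))
```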
83 Our Van Gogh research
- Two painters
- Vincent Van Gogh paints Van Gogh
- Claude-Emile Schuffenecker paints Van Gogh
84 Sunflowers
- Is it made by
- Van Gogh?
- Schuffenecker?
85 Approach
- Select appropriate features (skipped here, but very important!)
- Apply neural networks
87 Training Data
Schuffenecker (5000 textures)
88 Results
- Generalisation performance
- 96% correct classification on untrained data
89 Results, cont.
- Trained art-expert network applied to the Yasuda sunflowers
- 89% of the textures are classified as a genuine Van Gogh
90 A major caveat
- Not only the painters are different,
- but also the material, and maybe many other things