Title: Connectionism in 2 hours
1Connectionism in 2 hours
- Christer Johansson
- Computational Linguistics
- Bergen University
2Why should Linguists be interested in
Connectionism?
- Alternative to good old AI (rules)
- Learning - knowledge is acquired
- Biological plausibility (?)
- Practical applications: handles uncertain data and
only needs representative exemplars.
3The main point of Connectionism
- There is no central processor
4Processing is interaction
- Each neuron is a simple processor that receives
information from other neurons and sends out
activation or deactivation to other neurons,
depending on how much it was activated.
- Processing is an emergent phenomenon arising from the
activity of large quantities of such simple
processors.
5Hebbian Learning (1949)
- Neurons that fire together wire together
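Hebb's principle is often written as a weight change proportional to the product of the two neurons' activities. The slide gives no formula, so the following is only a minimal sketch, assuming the plain rule delta_w = eta * pre * post; the function name `hebbian_update` and the learning rate `eta` are illustrative choices.

```python
def hebbian_update(weight, pre, post, eta=0.1):
    """Return the new weight after one plain Hebbian step.

    Assumed rule (not taken from the slides): delta_w = eta * pre * post.
    """
    return weight + eta * pre * post


# Two neurons that are repeatedly active together: the connection grows.
w_together = 0.0
for _ in range(10):
    w_together = hebbian_update(w_together, pre=1.0, post=1.0)

# The sending neuron fires but the receiving one stays silent: no change.
w_apart = 0.0
for _ in range(10):
    w_apart = hebbian_update(w_apart, pre=1.0, post=0.0)

print(w_together, w_apart)
```

Note that this plain rule only ever strengthens co-active connections; practical variants add some form of decay or normalization.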
6Processing is subsymbolic
- The information neurons work with is without
(much) content.
- This allows for easy interaction between
different modalities (the McGurk effect).
7Biological inspirations
- Some numbers
- The human brain contains about 10 billion nerve
cells (neurons); more recent estimates are closer
to 86 billion
- Each neuron is connected to the others through
about 10,000 synapses on average (some have more,
some far fewer)
- Properties of the brain
- It can learn, reorganize itself from experience
- It adapts to the environment
- It is robust and fault tolerant
8Connectionism vs AI
9Connectionist Successes
- Graceful degradation
- Connectionist models can be damaged and still
keep (some) functionality.
- Models of reaction time studies.
- The learning path emerges from complexity in the data
and the learning law. U-shaped learning is common.
10Not so good (yet)
- Systematicity, information structure and
encapsulation of information.
- If you understand "the boy ate the fish" you also
understand "the fish ate the boy".
- Typically a neural net would allow global
information to affect the interpretation (thus
disregarding structural information).
- The tomato ate the boy.
- I love apples.
11Not so good (yet)
- Fast Mapping
- A child may observe the use of a word once or
twice, and still be able to use the word
correctly several weeks later.
- Sound is mapped to meaning fast
- The mapping degrades slowly
- Neural networks do not typically show these
behaviors.
- Radial basis networks? Instance-based learning.
12Philosophical issues
- Is connectionism a better model of intelligence
(mind) than symbolic AI?
- What do we mean by better? Researchers do not
agree on what intelligence is.
- Are none of the models correct?
- AI models handle symbolic information better but
have little to say about how symbolic behavior emerges.
- Connectionist models are better at pattern
recognition (noisy input, missing values,
redundancy).
13Other arguments for connectionism
- Many relevant phenomena have been modeled. The
activity in itself leads to new knowledge, and
some insights into possible mechanisms.
- Models of aphasia, dyslexia, etc., many with
detailed predictions, and even implications for
remedies.
- Interaction between information sources is taken
seriously.
14Connectionism in Linguistics
- Case: rule-based behavior. Past tense: U-shaped
learning
15Chomsky and Pinker
- Learning language is done by acquiring rules
(setting the parameters in a fixed format), which
are processed by a specific, fixed mechanism and
expressed in an internal language (compare
machine language).
- Chomsky: main interest is the description of language
complexity.
- Pinker: tries to push the point that language
depends on innate machinery; we only acquire the
data the machinery processes and set the
parameters of the processor.
- (Neural networks have an innate mechanism for
how to learn from input).
16U-shaped Learning
- Children often use correct past tense forms
before over-generalizing the rule, and later
recover to correct usage.
- Neural networks have a tendency toward similar
behavior.
- At first the free capacity in the net allows
memorization.
- When the regularity is discovered it is applied
generally, and interacts with previous knowledge,
which then gets an error signal that makes it
possible to recover.
17U-shaped learning
- On the downside for connectionism:
- The input needs to contain a clear signal.
- This means some preprocessing, and in effect
building the solution into the representation.
- Still, U-shaped learning seems a fairly
robust characteristic of many different neural
models.
18U-shaped learning
- Pinker has proposed a so-called dual-route model.
He argues that:
- Regular forms are handled by symbolic processing
- Irregular forms are handled by association, à la
connectionism.
- But this makes little sense: if we are allowed to
separate the regular from the irregular, then a
neural network can easily learn the regular
alternations (as well as the irregulars, but it
cannot learn gaps).
19Fodor Modularity
- Fodor, among others, argues that
compositionality and systematicity can only be
achieved by symbolic machinery.
- /S//P//I//L/ > spill + ed (past) = spilled.
- The kangaroo jumped over the elephant. >
- The elephant jumped over the kangaroo.
20Not Truth based
- Adam loves Eve.
- does NOT imply
- Eve loves Adam.
- But still if the first is understood, the second
should also be understood. (Role of Syntax).
21Fodor Modularity
- Information needs to be encapsulated.
- Different levels should not interact (each level
is encapsulated).
- Leaky modules
- Modules in the brain: are different anatomical
areas specialized for
- different general tasks?
- functionally specific tasks (say, syntactic
processing)?
22Outline
- The rest of the talk falls into two categories:
- Biology: neurons, the brain and language areas
- Technology: practical applications of neural
networks
23Looking at the Brain
- What about the argument for specialized modules
for language?
- Broca's area.
24Biological neuron
- A neuron has
- input (via dendrites)
- output (via the axon)
- Information is mediated from the dendrites to
the axon via the cell body
- The axon connects to dendrites (of other neurons) via
synapses
- Synapses vary in strength
- Synapses may be excitatory or inhibitory
25Neurons
[Figure: a neuron, with labels: cell machinery, surface structure]
26Schematic Neuron
[Figure: schematic neuron, with labels: input, weights, summation function, output]
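The schematic neuron (inputs, weights, summation function, output) can be sketched in a few lines. This is only an illustration: the step threshold as the output function and the particular weight values are assumptions, not something the slide specifies.

```python
def neuron_output(inputs, weights, threshold=0.0):
    """Weighted sum of the inputs followed by a step (threshold) output."""
    total = sum(x * w for x, w in zip(inputs, weights))  # summation function
    return 1 if total > threshold else 0


# Positive weights act like excitatory synapses, negative ones like inhibitory.
weights = [0.6, 0.6, -1.0]

fires = neuron_output([1, 1, 0], weights, threshold=1.0)       # two excitatory inputs
suppressed = neuron_output([1, 1, 1], weights, threshold=1.0)  # inhibitory input also active
print(fires, suppressed)
```

Replacing the step with a smooth function such as a sigmoid gives the kind of unit typically used in trainable networks.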
27Example Neurons
Neurons come in a variety of flavours.
28Neuronal organisation
Neurons are organised into hierarchical
layers. Within each layer we often have
inhibitory connections.
29The Brain
30Outline
31Neuroimaging Confirmation: YES
- reading complex sentences vs. letter strings
32Points
- confirmation of left hemisphere dominance
- confirmation of classical language areas
- modification
- involvement of additional areas
33Broca's area
- Broca's area is involved in the comprehension
of complex sentences
34Simple Sentences vs. Passive Fixation
35The role of Broca's area?
- That Broca's area is involved does not mean that
syntactic processing is located in the left
inferior frontal lobe:
- simple sentences do not reliably activate this
area
- other tasks with similar cognitive components
also activate this area
36Wijers et al WM task
37Wijers et al WM task
38Wijers et al WM task
39Conclusions
- Language areas are not specific to language.
- Language may depend on interaction
- Modules? (The brain is functionally structured)
- Neurons? (All neurons may contribute)
40Technical Applications
41Properties of Neural Networks
- Supervised networks are universal approximators
- Theorem: any continuous function on a bounded
domain can be approximated to arbitrary precision
by a neural network with a finite number of hidden
neurons.
- This could be useful :)
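The approximation property can be illustrated by construction rather than by training. The sketch below assumes a one-dimensional continuous target (here `math.sin` on [0, 2*pi]) and hand-sets one hidden layer of steep sigmoid units, each acting as a soft step. The names `build_network` and `max_error` and the `steepness` parameter are hypothetical; this shows the idea behind the theorem and is neither a proof nor a learning algorithm.

```python
import math


def sigmoid(z):
    # Logistic function written with tanh so large |z| cannot overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * z))


def build_network(f, a, b, n_hidden, steepness=10.0):
    """Hand-set a one-hidden-layer sigmoid network approximating f on [a, b].

    Each hidden unit is a steep sigmoid placed at the midpoint between two
    grid points; its output weight is the change in f across that step.
    """
    h = (b - a) / n_hidden
    grid = [a + i * h for i in range(n_hidden + 1)]
    k = steepness / h  # shared input weight of the hidden units
    units = [(0.5 * (grid[i - 1] + grid[i]), f(grid[i]) - f(grid[i - 1]))
             for i in range(1, n_hidden + 1)]
    bias = f(grid[0])

    def network(x):
        return bias + sum(v * sigmoid(k * (x - m)) for m, v in units)

    return network


def max_error(f, net, a, b, samples=1000):
    """Largest absolute error over an evenly spaced sample of [a, b]."""
    xs = [a + (b - a) * i / samples for i in range(samples + 1)]
    return max(abs(f(x) - net(x)) for x in xs)


a, b = 0.0, 2.0 * math.pi
errors = {}
for n in (10, 100):
    net = build_network(math.sin, a, b, n)
    errors[n] = max_error(math.sin, net, a, b)
    print(n, "hidden units, max error:", round(errors[n], 4))
```

Adding hidden units shrinks the steps and therefore the error, which is the sense in which a finite hidden layer can reach any desired precision.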
42Other properties
- Adaptivity
- Adapt weights to environment (examples)
- Easily retrainable
- Generalization ability
- May counteract lack of data
- Fault tolerance
- Graceful degradation of performance if damaged.
- Damage might also be faulty input (noise,
missing values, etc.)
- The information is distributed within the entire
net.
43Classification (Discrimination)
- Estimation of the probability that a certain
object belongs to a specific class
- Can be used for data mining
- Applications: economics, speech and visual pattern
recognition, sociology, etc.
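One common way to get class probabilities out of a network is a softmax over its output activations. The slides do not say which output function is meant, so treat this as an assumption; the class labels and scores below are made-up illustrative values.

```python
import math


def softmax(scores):
    """Turn raw output activations into probabilities that sum to 1."""
    top = max(scores)  # subtracting the max keeps exp() in a safe range
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical output activations of a classifier for three digit classes.
classes = ["0", "1", "2"]
scores = [0.5, 2.0, -1.0]

probabilities = softmax(scores)
best = classes[probabilities.index(max(probabilities))]
print(dict(zip(classes, probabilities)), "->", best)
```

The object is then assigned to the class with the highest estimated probability, while the full distribution remains available when uncertainty matters.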
44Example
Examples of handwritten postal codes drawn from
a database available from the US Postal Service
45What do we need to use NNs?
- Determine what the input should be (what
information is available, what information do we need?)
- A representative collection of data for the
learning and testing phases of the neural network
- Find an optimum number of hidden nodes
- Estimate the parameters (learning: running the
algorithm)
- Evaluate the performance of the network
- If (when) performance is not satisfactory:
review (all) the preceding points
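The steps above (representative data split into a learning and a testing set, parameter estimation, evaluation) can be sketched end to end. To keep the example short it swaps in a single perceptron, so the hidden-node step is skipped. The toy data set, the function names and the 150/50 split are all assumptions made for illustration.

```python
import random


def make_data(n, rng):
    """Points in the unit square, labelled 1 if x1 + x2 > 1, else 0.

    Points closer than 0.1 to the boundary are skipped so the two classes
    are cleanly separable (an assumption that keeps the example small).
    """
    data = []
    while len(data) < n:
        x1, x2 = rng.random(), rng.random()
        if abs(x1 + x2 - 1.0) >= 0.1:
            data.append(((x1, x2), 1 if x1 + x2 > 1.0 else 0))
    return data


def predict(weights, x):
    s = weights[0] * x[0] + weights[1] * x[1] + weights[2]  # last weight is the bias
    return 1 if s > 0 else 0


def train(data, max_epochs=2000, rate=0.1):
    """Perceptron rule: weights change only on misclassified examples."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        errors = 0
        for x, target in data:
            delta = target - predict(weights, x)
            if delta != 0:
                errors += 1
                weights[0] += rate * delta * x[0]
                weights[1] += rate * delta * x[1]
                weights[2] += rate * delta
        if errors == 0:  # a full pass without mistakes: stop early
            break
    return weights


def accuracy(weights, data):
    return sum(predict(weights, x) == t for x, t in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_data(200, rng)
train_set, test_set = data[:150], data[150:]  # learning phase vs. testing phase

weights = train(train_set)
print("train accuracy:", accuracy(weights, train_set))
print("test accuracy:", accuracy(weights, test_set))
```

If the test accuracy were unsatisfactory, the review step would mean going back to the input representation, the data collection or the model size.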
46What are NNs used for?
- Prediction
- The weather tomorrow
- Classification
- X-ray shows cancer or not?
- Association / error correction
- Associate a pattern with another pattern /
itself.
- Filtering
- Take noise / echo out of telephone signal
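Associating a pattern with itself gives error correction: a corrupted version of a stored pattern is pulled back to the original. The slides do not name a model, so the sketch below assumes a small Hopfield-style auto-associator with Hebbian outer-product weights and a single stored pattern; the pattern values are arbitrary.

```python
def store(pattern):
    """Hebbian outer-product weights for one pattern of +1/-1 values."""
    n = len(pattern)
    return [[0 if i == j else pattern[i] * pattern[j] for j in range(n)]
            for i in range(n)]


def recall(weights, state):
    """One synchronous update: each unit takes the sign of its weighted input."""
    return [1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
            for row in weights]


stored = [1, -1, 1, 1, -1, -1, 1, -1]
weights = store(stored)

noisy = list(stored)
noisy[0] = -noisy[0]  # corrupt two of the eight units
noisy[5] = -noisy[5]

restored = recall(weights, noisy)
print(restored == stored)
```

With only one stored pattern the recovery is easy; capacity and interference between several stored patterns are where the interesting limits of such networks show up.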
47What are NNs used for in Language Technology?
- Text-to-speech
- NETtalk
- Speech Recognition (as part of larger systems)
- Estimate probability distributions
- Word <-> document association
- Information Retrieval