1
Systematicity in sentence processing by recurrent
neural networks
  • Stefan Frank
  • Nijmegen Institute for Cognition and Information
  • Radboud University Nijmegen
  • The Netherlands

2
"Please make it heavy on computers and AI and
light on the psycho stuff" (Konstantopoulos,
personal communication, December 23, 2005)
3
Systematicity in language
  • Imagine you meet someone who only knows two
    sentences of English

Could you please tell me where the toilet is?
I can't find my hotel.
So (s)he does not know:
Could you please tell me where my hotel is?
I can't find the toilet.
This person has no knowledge of English but has
simply memorized some lines from a phrase book.
4
Systematicity in language
  • Human language behavior is (more or less)
    systematic: if you know some sentences, you know
    many.
  • Sentences are not atomic but made up of words.
  • Likewise, words can be made up of morphemes
    (e.g., un + clear = unclear, un + stable =
    unstable, ...).
  • It seems like language results from applying a
    set of rules (grammar, morphology) to symbols
    (words, morphemes).

5
Systematicity in language
  • The Classical symbol-system hypothesis: the mind
    contains word-like symbols that are manipulated by
    structure-sensitive processes (Fodor & Pylyshyn,
    1988). E.g., for dealing with language:
  • boy and girl are nouns (N)
  • loves and sees are verbs (V)
  • N V N is a possible sentence structure
  • This hypothesis explains the systematicity found
    in language: if you know the N V N structure, you
    know all N V N sentences (boy sees girl, girl
    loves boy, boy sees boy, ...).

6
Some issues for the Classical theory
  • Lack of systematic behavior: Why are people often
    so unsystematic in practice?

The boy plays. OK
The boy who the girl likes plays. OK
The boy who the girl who the man sees likes plays. OK?
The athlete who the coach who the sponsor hired trained won. OK!
7
Some issues for the Classical theory
  • Lack of systematic behavior: Why are people often
    so unsystematic in practice?
  • Lack of systematicity in language: Why are there
    exceptions to rules?

help + full = helpful; help + less = helpless
meaning + full = meaningful; meaning + less = meaningless
beauty + full = beautiful; beauty + less = ugly
8
Some issues for the Classical theory
  • Lack of systematic behavior: Why are people often
    so unsystematic in practice?
  • Lack of systematicity in language: Why are there
    exceptions to rules?
  • Development: How do children learn the rules from
    what they hear?

The Classical theory has answers to these
questions, but no explanations.
9
Connectionism
  • The state of mind is represented as a pattern
    of activity over a large number of simple,
    quantitative (i.e., non-logical) processing units
    (neurons).
  • These units are connected by weighted links,
    forming a (neural) network through which
    activation moves around.
  • The connection weights are adjusted to the
    network's input and task (see the sketch below).
  • The network develops its own internal
    representation of the input.
  • It should generalize to new (test) inputs.

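As a concrete illustration (not from the slides), a minimal Python
sketch of one such unit: its activation is a squashed weighted sum of
its inputs, and a hypothetical delta-rule step nudges its incoming
weights toward a target output.

    import numpy as np

    def unit_activation(inputs, weights):
        """Activation of one unit: logistic squashing of its weighted input sum."""
        return 1.0 / (1.0 + np.exp(-np.dot(weights, inputs)))

    def delta_rule_update(inputs, weights, target, lr=0.1):
        """Hypothetical delta-rule step: move the unit's output toward the target."""
        out = unit_activation(inputs, weights)
        return weights + lr * (target - out) * out * (1.0 - out) * inputs

    # Example: one unit with three weighted input links
    x = np.array([1.0, 0.0, 1.0])
    w = np.array([0.2, -0.4, 0.1])
    w = delta_rule_update(x, w, target=1.0)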
10
Connectionism and the Classical issues
  • Lack of systematic behavior: Systematicity is
    built on top of an unsystematic architecture.
  • Lack of systematicity in language: "Beautiless"
    is expected statistically but never occurs, so
    the network learns it doesn't exist.
  • Development: The network adapts to its input.

But can neural networks explain systematicity, or
even behave systematically?
11
Connectionism and systematicity
  • Fodor & Pylyshyn (1988): Neural networks cannot
    be systematic. They only learn to associate
    examples rather than becoming sensitive to
    structure.
  • Systematicity: knowing X → knowing Y.
    Generalization: training on X → learning Y.
    So, systematicity equals generalization
    (Hadley, 1994).
  • Demonstrations of connectionist systematicity:
  • require many training examples but use only few
    tests
  • are not robust: oversensitive to training details
  • only display weak systematicity: words occur in
    the same syntactic positions in training and
    test sentences

12
Simple Recurrent Networks: Elman (1990)
Feedforward networks have long-term memory (LTM)
but no short-term memory (STM). So how to process
sequential input, like the words of a sentence?
[Figure: feedforward network with input, hidden, and output layers]
13
SRNs and systematicity: Van der Velde et al. (2004)
  • An SRN processed a minilanguage with:
  • 18 words (boy, girl, loves, sees, who, ., ...)
  • 3 sentence types:
  • N V N . (boy sees girl.)
  • N V N who V N . (boy sees girl who loves boy.)
  • N who N V V N . (boy who girl sees loves boy.)
  • Nouns and verbs were divided into four groups,
    each with two nouns and two verbs.
  • In training sentences, nouns and verbs were from
    the same group (< 0.44 of sentences used for
    training).
  • In test sentences, nouns and verbs came from
    different groups (a generator sketch follows
    below). Note: weak systematicity only.

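To make the setup concrete, a hypothetical Python sketch of such a
sentence generator; the group assignments and all words beyond boy,
girl, loves, and sees are placeholders, not the actual lexicon used.

    import random

    # Four groups, each with two nouns and two verbs (placeholder words)
    groups = [
        (["boy", "girl"], ["loves", "sees"]),
        (["man", "woman"], ["hears", "likes"]),
        (["dog", "cat"], ["chases", "bites"]),
        (["bird", "fish"], ["follows", "watches"]),
    ]

    def sentence(noun_group, verb_group, kind):
        """Build one sentence of the given type from the chosen groups."""
        n = lambda: random.choice(groups[noun_group][0])
        v = lambda: random.choice(groups[verb_group][1])
        if kind == "N V N":
            words = [n(), v(), n()]
        elif kind == "N V N who V N":
            words = [n(), v(), n(), "who", v(), n()]
        else:  # "N who N V V N"
            words = [n(), "who", n(), v(), v(), n()]
        return " ".join(words) + " ."

    # Training: nouns and verbs from the same group; test: from different groups
    train_sentence = sentence(0, 0, "N V N who V N")
    test_sentence = sentence(0, 2, "N V N who V N")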
14
SRNs and systematicity: Van der Velde et al. (2004)
  • SRNs fail on test sentences, so:
  • They do not generalize to structurally similar
    sentences
  • They cannot learn systematic behavior from a
    small training set
  • They do not form good models of human language
    behavior
  • But:
  • what does it mean to "fail"? Maybe the network
    was more than completely non-systematic?
  • was the size of the network appropriate?
  • larger network → more STM → better processing?
  • smaller network → less LTM → better
    generalization?
  • was the language complex enough? With more
    different words there is more reason to abstract
    to syntactic types (nouns, verbs).

15
SRNs and systematicity: replication of Van der
Velde et al. (2004)
  • What if a network does not generalize at all?
    When given a new sentence, it can only use the
    last word, because combining words requires
    generalization.
  • This hypothetical, unsystematic network serves as
    the baseline for rating SRN performance.
  • Performance = 1: the network never makes
    ungrammatical predictions.
  • Performance = 0: the network does not generalize
    at all, but gives the best possible output based
    on the last word.
  • Performance = -1: the network only makes
    ungrammatical predictions.
  • Positive performance indicates systematicity
    (a scoring sketch follows below).

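The slides only fix the three anchor points (1, 0, -1); here is a
minimal sketch under the assumption that performance interpolates
between the network's grammatical prediction mass and that of the
last-word baseline.

    def performance(net_grammatical_mass, baseline_grammatical_mass):
        """Hypothetical scaling of the network's grammatical prediction mass.

        1  -> all prediction mass on grammatical next words
        0  -> no better than the last-word baseline
        -1 -> all prediction mass on ungrammatical next words
        """
        g, b = net_grammatical_mass, baseline_grammatical_mass
        if g >= b:
            return (g - b) / (1.0 - b)   # above baseline, up to 1
        return (g - b) / b               # below baseline, down to -1

    # Example: the network assigns 0.9 of its prediction mass to grammatical
    # words, the last-word baseline 0.6 -> performance = 0.75
    print(performance(0.9, 0.6))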
16
Network architecture
[Figure: network architecture. Input layer: w = 18 units (one for each word) → recurrent hidden layer: n = 20 units → hidden layer: 10 units → output layer: w = 18 units (one for each word).]
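A minimal numpy sketch of a forward pass through this architecture,
using the layer sizes from the figure; the one-hot word coding, weight
initialization, and logistic activations are assumptions about details
not shown here.

    import numpy as np

    w_lex, n_rec, n_hid = 18, 20, 10                 # layer sizes from the figure
    rng = np.random.default_rng(0)

    W_in  = rng.normal(0, 0.1, (n_rec, w_lex))       # input -> recurrent layer
    W_rec = rng.normal(0, 0.1, (n_rec, n_rec))       # recurrent layer -> itself
    W_hid = rng.normal(0, 0.1, (n_hid, n_rec))       # recurrent -> hidden layer
    W_out = rng.normal(0, 0.1, (w_lex, n_hid))       # hidden -> output layer

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def process_sentence(word_indices):
        """Feed one-hot word vectors through the SRN, one word at a time."""
        state = np.zeros(n_rec)                      # recurrent state = STM
        for i in word_indices:
            x = np.zeros(w_lex)
            x[i] = 1.0                               # one-hot coding of the word
            state = sigmoid(W_in @ x + W_rec @ state)
            hidden = sigmoid(W_hid @ state)
            output = sigmoid(W_out @ hidden)         # scores for the next word
        return output

    print(process_sentence([0, 4, 1, 17]))           # e.g. "boy sees girl ."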
17
SRN Results
Positive performance at each word of each test
sentence type, so there is some systematicity.
18
SRN Results: effect of recurrent layer size
[Figure: performance for the three sentence types (N V N; N V N who V N; N who N V V N) as a function of recurrent layer size]
Larger networks (n = 40) do better, but very
large ones (n = 100) overfit.
19
SRN performance and memory
  • SRNs do show systematicity to some extent.
  • But their performance is limited
  • small n → limited processing capacity (STM)
  • large n → large LTM → overfitting.
  • How to combine large STM with small LTM?

20
Echo State Networks: Jaeger (2003)
  • Keep the connections to and within the recurrent
    layer fixed at random values.
  • The recurrent layer becomes a dynamical
    reservoir: a non-specific STM for the input
    sequence.
  • Some constraints on the dynamical reservoir:
  • large enough
  • sparsely connected (here 15%)
  • weight matrix has spectral radius < 1
  • LTM capacity:
  • In SRNs: O(n²)
  • In ESNs: O(n)
  • So, can ESNs combine large STM with small LTM?
    (A reservoir setup sketch follows below.)

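A hedged numpy sketch of setting up such a reservoir under the
constraints above; the exact spectral radius used here (0.95) is an
assumption, the slide only requires it to be below 1.

    import numpy as np

    def make_reservoir(n, connectivity=0.15, spectral_radius=0.95, seed=0):
        """Fixed random recurrent weights: sparse and scaled to spectral radius < 1."""
        rng = np.random.default_rng(seed)
        W = rng.normal(0.0, 1.0, (n, n))
        W *= rng.random((n, n)) < connectivity         # keep ~15% of the connections
        radius = np.max(np.abs(np.linalg.eigvals(W)))  # largest eigenvalue magnitude
        return W * (spectral_radius / radius)

    W_reservoir = make_reservoir(100)
    # W_reservoir (and the input weights) stay fixed; only the readout is trained.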
21
Network architecture
[Figure: ESN architecture. Input layer: w = 18 units → recurrent hidden layer (dynamical reservoir): n = 20 units, untrained → hidden layer: 10 units → output layer: w = 18 units; only the connections from the recurrent layer onward are trained.]
The STM remains untrained, but the network does
develop internal representations.
22
ESN Results
Positive performance at each word of each test
sentence type, so there is some systematicity,
but less than in an SRN of the same size
23
ESN Results: effect of recurrent layer size
[Figure: performance for the three sentence types (N V N; N V N who V N; N who N V V N) as a function of reservoir size]
Bigger is better: no overfitting, even when
n = 1530!
24
ESN Results: effect of lexicon size (n = 100)
[Figure: performance for the three sentence types (N V N; N V N who V N; N who N V V N) as a function of lexicon size]
Note: with larger w, a smaller percentage of
possible sentences is used for training.
25
Strong systematicity
  • 30 words (boy(s), girl(s), like(s), see(s), who,
    ...)
  • Many sentence types:
  • N V N . (girl sees boys.)
  • N V N who V N . (girl sees boys who like boy.)
  • N who N V V N . (girl who boy sees likes boy.)
  • N who V N who N V . (girls who like boys see boys
    who girl likes.)
  • Unlimited recursion (girls see boy who sees boy
    who sees man who ...)
  • Number agreement between nouns and verbs

26
Strong systematicity
  • In training sentences: females as grammatical
    subjects, males as grammatical objects (girl sees
    boy)
  • In test sentences: vice versa (boy sees girl)
  • Positive performance on all words of four test
    sentence types:
  • N who V N V N . (boy who likes girls sees woman.)
  • N V N who V N . (boy likes girls who see woman.)
  • N who N V V N . (boys who man likes see girl.)
  • N V N who N V . (boys like girl who man sees.)

27
Conclusions
  • ESNs can display both weak and strong
    systematicity
  • Even with few training sentences and many test
    sentences
  • By doing less training, the network can learn
    more:
  • Training fewer connections gives better results
  • Training a smaller part of possible sentences
    gives better results
  • Can connectionism explain systematicity?
  • No, because neural networks do not need to be
    systematic
  • Yes, because they need to adapt to systematicity
    in the training input.
  • The source of systematicity is not the cognitive
    system, but the external world.