Title: 9.012 Brain and Cognitive Sciences II
9.012 Brain and Cognitive Sciences II
Part VIII Intro to Language Psycholinguistics
- Dr. Ted Gibson
Presented by Liu Lab
Fighting for Freedom with Cultured Neurons
Distributed Representations, Simple Recurrent Networks, and Grammatical Structure
Jeffrey L. Elman (1991), Machine Learning
Distributed Representations / Neural Networks
- are meant to capture the essence of neural computation: many small, independent units calculating very simple functions in parallel.
Distributed Representations / Neural Networks
EXPLICIT RULES?
EMERGENCE!
Feedforward Neural Network (from Sebastian's teaching)
Don't forget the nonlinearity!
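As an aside, a tiny sketch of why the nonlinearity matters (hypothetical weights, not from the slides): without a squashing function, stacked layers collapse into a single linear map.

```python
import numpy as np

# Minimal sketch with made-up weights: a hidden layer with and without
# a squashing nonlinearity. Without it, two linear layers are equivalent
# to one matrix multiplication.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
x = rng.normal(size=4)           # a toy input vector
W1 = rng.normal(size=(3, 4))     # input -> hidden weights
W2 = rng.normal(size=(2, 3))     # hidden -> output weights

linear_only = W2 @ (W1 @ x)               # same as (W2 @ W1) @ x: one linear map
with_nonlinearity = W2 @ sigmoid(W1 @ x)  # the sigmoid keeps the layers from collapsing
print(linear_only, with_nonlinearity)
```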
Recurrent Network (also from Sebastian)
Why Apply Network / Connectionist Modeling to Language Processing?
- Connectionist Modeling is Good at What it Does
- Language is a HARD problem
What We Are Going to Do
- Build a network
- Let it learn how to read
- Then test it!
- Give it some words in a reasonably grammatical sentence
- Let it try to predict the next word, based on what it knows about grammar
- BUT we're not going to tell it any of the rules
Methods > Network Implementation > Training
Words we're going to teach it:
- Nouns
  - boy, girl, cat, dog
  - boys, girls, cats, dogs
- Proper nouns
  - John, Mary
- who
- Verbs
  - chase, feed, see, hear, walk, live
  - chases, feeds, sees, hears, walks, lives
- End of sentence
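As a rough sketch, the vocabulary above can be written down directly as data; the grouping and the variable names below are ours, not Elman's code.

```python
# Sketch of the training lexicon from the slide, grouped by category.
# Names and grouping are illustrative, not from Elman (1991).
NOUNS_SG = ["boy", "girl", "cat", "dog"]
NOUNS_PL = ["boys", "girls", "cats", "dogs"]
PROPER   = ["John", "Mary"]
RELATIVE = ["who"]
VERBS_PL = ["chase", "feed", "see", "hear", "walk", "live"]        # go with plural subjects
VERBS_SG = ["chases", "feeds", "sees", "hears", "walks", "lives"]  # go with singular subjects
END      = ["."]   # end-of-sentence marker

VOCAB = NOUNS_SG + NOUNS_PL + PROPER + RELATIVE + VERBS_PL + VERBS_SG + END
print(len(VOCAB))  # 24 words, matching the 24-unit activation patterns on the next slides
```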
Methods > Network Implementation > Training
1. Encode Each Word with Unique Activation Pattern
- boy → 000000000000000000000001
- girl → 000000000000000000000010
- feed → 000000000000000000000100
- sees → 000000000000000000001000
- . . .
- who → 010000000000000000000000
- End sentence → 100000000000000000000000
2. Feed these words sequentially to the network (only feed words in sequences that make good grammatical sense!)
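A minimal sketch of steps 1 and 2, assuming the one-hot scheme shown above; the two training sentences are illustrative stand-ins for Elman's generated corpus.

```python
import numpy as np

# Step 1: one unit per word; each word gets a unique one-hot activation pattern.
VOCAB = ["boy", "girl", "cat", "dog", "boys", "girls", "cats", "dogs",
         "John", "Mary", "who",
         "chase", "feed", "see", "hear", "walk", "live",
         "chases", "feeds", "sees", "hears", "walks", "lives", "."]
INDEX = {word: i for i, word in enumerate(VOCAB)}

def one_hot(word):
    vec = np.zeros(len(VOCAB))
    vec[INDEX[word]] = 1.0
    return vec

# Step 2: feed words one at a time, only in grammatically sensible sequences.
# These two sentences are made up for illustration; Elman generated a large corpus.
training_sentences = [["boy", "chases", "dogs", "."],
                      ["boys", "who", "Mary", "feeds", "live", "."]]

for sentence in training_sentences:
    for current, target in zip(sentence, sentence[1:]):
        x = one_hot(current)   # input pattern: the current word
        y = one_hot(target)    # target pattern: the word that actually comes next
        # ...present x to the network and train it toward y (see the backprop sketch below)
```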
Methods > Network Implementation > Structure
[Diagram, built up layer by layer: an INPUT layer (one unit per word, activated as a one-hot pattern) feeds a HIDDEN layer, which feeds an OUTPUT layer whose one-hot pattern stands for the predicted next word.]
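A sketch of one forward pass through this structure; the layer sizes and weights below are made up for illustration and are not Elman's exact architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

VOCAB_SIZE = 24    # one input unit and one output unit per word (from the slides)
HIDDEN_SIZE = 20   # made-up size for illustration

rng = np.random.default_rng(1)
W_ih = rng.normal(scale=0.1, size=(HIDDEN_SIZE, VOCAB_SIZE))   # INPUT  -> HIDDEN
W_ho = rng.normal(scale=0.1, size=(VOCAB_SIZE, HIDDEN_SIZE))   # HIDDEN -> OUTPUT

x = np.zeros(VOCAB_SIZE)
x[0] = 1.0                       # one-hot pattern for some current word

hidden = sigmoid(W_ih @ x)       # HIDDEN activations
output = sigmoid(W_ho @ hidden)  # OUTPUT activations: one unit per possible next word
print(output.round(2))           # compared during training to the one-hot next-word target
```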
Methods > Network Implementation > Structure
[Diagram: the same INPUT → HIDDEN → OUTPUT network as above.]
If learning word relations, need some sort of memory from word to word!
Methods > Network Implementation > Structure
[Diagram: the INPUT layer (one-hot word pattern) is now joined by a CONTEXT layer holding a copy of the previous hidden activations; both feed the HIDDEN layer, which feeds the OUTPUT layer (predicted next word).]
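The CONTEXT units are what make this a simple recurrent (Elman) network: after each word, the hidden activations are copied into the context units and fed back in alongside the next input. A minimal sketch, again with made-up sizes:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

VOCAB_SIZE, HIDDEN_SIZE = 24, 20   # hidden size is illustrative, not Elman's exact figure
rng = np.random.default_rng(2)
W_ih = rng.normal(scale=0.1, size=(HIDDEN_SIZE, VOCAB_SIZE))    # INPUT   -> HIDDEN
W_ch = rng.normal(scale=0.1, size=(HIDDEN_SIZE, HIDDEN_SIZE))   # CONTEXT -> HIDDEN
W_ho = rng.normal(scale=0.1, size=(VOCAB_SIZE, HIDDEN_SIZE))    # HIDDEN  -> OUTPUT

context = np.zeros(HIDDEN_SIZE)    # the network's "memory from word to word"

def step(x_one_hot):
    """Process one word: return next-word activations and update the context."""
    global context
    hidden = sigmoid(W_ih @ x_one_hot + W_ch @ context)
    context = hidden.copy()        # copy the hidden state back into the CONTEXT units
    return sigmoid(W_ho @ hidden)

# Feeding a word sequence: each step sees the current word plus the saved context.
for i in [0, 17, 4]:               # made-up word indices, purely for illustration
    x = np.zeros(VOCAB_SIZE); x[i] = 1.0
    out = step(x)
```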
Methods > Network Implementation > Structure
BACKPROP!
[Diagram: the same INPUT + CONTEXT → HIDDEN → OUTPUT network, trained by backpropagating the error between the predicted and the actual next word.]
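Putting the pieces together, a minimal training sketch: run a word through the recurrent network, then backpropagate the error between the output and the one-hot pattern of the word that actually came next. The learning rate, layer sizes, and squared-error measure here are illustrative choices, not Elman's exact settings.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

VOCAB_SIZE, HIDDEN_SIZE, LR = 24, 20, 0.1   # sizes and learning rate are illustrative
rng = np.random.default_rng(3)
W_ih = rng.normal(scale=0.1, size=(HIDDEN_SIZE, VOCAB_SIZE))
W_ch = rng.normal(scale=0.1, size=(HIDDEN_SIZE, HIDDEN_SIZE))
W_ho = rng.normal(scale=0.1, size=(VOCAB_SIZE, HIDDEN_SIZE))
context = np.zeros(HIDDEN_SIZE)

def train_step(x, y):
    """One word in, the next word as target; backpropagate the prediction error."""
    global context
    c = context
    hidden = sigmoid(W_ih @ x + W_ch @ c)
    output = sigmoid(W_ho @ hidden)

    # Squared error between the prediction and the one-hot next word, backpropagated.
    delta_out = (output - y) * output * (1 - output)
    delta_hid = (W_ho.T @ delta_out) * hidden * (1 - hidden)
    W_ho -= LR * np.outer(delta_out, hidden)
    W_ih -= LR * np.outer(delta_hid, x)
    W_ch -= LR * np.outer(delta_hid, c)   # context treated like extra input (no backprop through time)

    context = hidden.copy()               # save the hidden state for the next word
    return output

# Usage: for each adjacent word pair (current, next) in the training corpus,
# call train_step(one_hot(current), one_hot(next)), over many passes through the corpus.
```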
Results > Emergent Properties of Network > Subject-Verb Agreement
- After Hearing
- boy.
- Network SHOULD predict next word is
- chases
- NOT
- chase
- Subject and verb should agree!
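One way to read off what the network predicts, as in the bar charts that follow: feed in the word(s), then inspect the output activations for candidate next words. The sketch below uses untrained random weights, so the printed numbers are meaningless; after training, the activation for "chases" should exceed that for "chase" here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

VOCAB = ["boy", "girl", "cat", "dog", "boys", "girls", "cats", "dogs",
         "John", "Mary", "who",
         "chase", "feed", "see", "hear", "walk", "live",
         "chases", "feeds", "sees", "hears", "walks", "lives", "."]
INDEX = {w: i for i, w in enumerate(VOCAB)}
HIDDEN_SIZE = 20                       # illustrative size

rng = np.random.default_rng(4)         # random (untrained) weights: shows the probe only
W_ih = rng.normal(scale=0.1, size=(HIDDEN_SIZE, len(VOCAB)))
W_ch = rng.normal(scale=0.1, size=(HIDDEN_SIZE, HIDDEN_SIZE))
W_ho = rng.normal(scale=0.1, size=(len(VOCAB), HIDDEN_SIZE))

def predict_after(words):
    """Feed a word sequence and return the output activations for the next word."""
    context = np.zeros(HIDDEN_SIZE)
    output = None
    for w in words:
        x = np.zeros(len(VOCAB)); x[INDEX[w]] = 1.0
        hidden = sigmoid(W_ih @ x + W_ch @ context)
        context = hidden
        output = sigmoid(W_ho @ hidden)
    return output

acts = predict_after(["boy"])
print("chases:", acts[INDEX["chases"]], "chase:", acts[INDEX["chase"]])
```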
Results > Emergent Properties of Network > Noun-Verb Agreement
[Bar chart: output activation (0.0 to 1.0) for each class of possible next word after hearing "boy". Classes on the axis: singular noun, plural noun, singular verb (direct object optional / required / impossible), plural verb (direct object optional / required / impossible), "who", end of sentence.]
Results > Emergent Properties of Network > Noun-Verb Agreement
- Likewise, after Hearing
- boys. (or boyz!)
- Network SHOULD predict next word is
- chase
- NOT
- chases
- Again, subject and verb should agree!
Results > Emergent Properties of Network > Noun-Verb Agreement
[Bar chart, same layout as above: predicted-next-word activations after hearing "boys".]
There's a difference between nouns and verbs. There are even different kinds of nouns that require different kinds of verbs.
Results > Emergent Properties of Network > Verb-Argument Agreement
- After Hearing
- chase
- Network SHOULD predict next word is
- some direct object (like boys)
- NOT
- .
- Hey, if a verb needs an argument,
- it only makes sense to give it one!
Results > Emergent Properties of Network > Verb-Argument Agreement
- Likewise, after hearing the verb
- lives
- Network SHOULD predict next word is
- .
- NOT
- dog
- If the verb doesn't make sense with an argument,
- It falls upon us to withhold one from it.
Results > Emergent Properties of Network > Verb-Argument Agreement
[Bar chart, same layout: predicted-next-word activations after hearing "boy chases".]
Results > Emergent Properties of Network > Verb-Argument Agreement
[Bar chart, same layout: predicted-next-word activations after hearing "boy lives".]
There are different kinds of verbs that require different kinds of nouns.
Results > Emergent Properties of Network > Longer-Range Dependence
- After hearing
- boy who mary chases
- Network might predict next word is
- boys
- Since it learned that boys follows mary chases
- But if it's smart
- may realize that chases is linked to boys, not mary
- In which case you need a verb next, not a noun!
- A good litmus test for some intermediate understanding?
Results > Emergent Properties of Network > Verb-Argument Agreement
[Bar chart, same layout: predicted-next-word activations after hearing "boys who Mary".]
Results > Emergent Properties of Network > Subject-Verb Agreement
[Bar chart, same layout: predicted-next-word activations after hearing "boys who mary chases".]
Results > Emergent Properties of Network > Subject-Verb Agreement
[Bar chart, same layout: predicted-next-word activations after hearing "boys who mary chases feed".]
Results > Emergent Properties of Network > Subject-Verb Agreement
[Bar chart, same layout: predicted-next-word activations after hearing "boys who mary chases feed cats".]
Did the Network Learn About Grammar?
- It learned there are different classes of nouns that need singular and plural verbs.
- It learned there are different classes of verbs that have different requirements in terms of direct objects.
- It learned that sometimes there are long-distance dependencies that don't follow from immediately preceding words → relative clauses and constituent structure of sentences.
Once You Have a Successful Network, You Can Examine its Properties with Controlled I/O Relationships
- Boys hear boys.
- Boy hears boys.
- Boy who boys chase chases boys.
- Boys who boys chase chase boys.
What Does it Mean, No Explicit Rules?
- Does it just mean the mapping is too complicated?
- Too difficult to formulate?
- Unknown?
- Possibly just our own failure to understand the mechanism, rather than a description of the mechanism itself.
General Advantages of Distributed Models
- Representation is distributed, which, while not limitless, is less rigid than models with a strict mapping from concept to node.
- Generalizations are captured at a higher level than the input, i.e., abstractly, so generalization to new input is possible.
FOUND / ISOLATED 4-CELL NEURAL NETWORKS
"If you have built castles in the air, your work need not be lost; that is where they should be. Now put the foundations under them."
-- Henry David Thoreau