Title: Connectionist Simulation of the Empirical Acquisition of Grammatical Relations
1 Connectionist Simulation of the Empirical Acquisition of Grammatical Relations
William C. Morris, Jeffrey Elman
- Prepared by Katarzyna Gorczyca and Izabela Wnek
2 Introduction
- Many accounts of L1A assume that grammatical relations and linking rules are innate and universal.
- The main aim of our presentation is the opposite approach: grammatical relations are learnt in a bottom-up fashion during the lg acquisition process.
- The proposal is based on two observations:
  - early child speech is formulaic and only gradually becomes systematic
  - grammatical relations are family-resemblance categories and are too complex to be described by a single parameter
3 This hypothesis is tested by connectionists (Elman) with a Simple Recurrent Network
- The SRN
  - learns to map from sentences to semantic roles
  - develops its notion of subject in its hidden-layer representations
  - makes generalisations and undergeneralisations similar to those made by children
4 Innateness vs bottom-up learning
- Grammatical relations (subject, object) are a problem for the lg acquisition system:
  - semantics / world knowledge (concrete) <-> syntax (abstract)
- One approach to learning syntax: grammatical relations are relegated to the innate endowment the child is born with
  - a single parameter with a binary value (accusative vs ergative) is claimed to be sufficient to account for the various grammatical systems
- BUT cross-linguistically there are no strictly accusative or strictly ergative lgs
5 The connectionist proposal
- Abstractions such as "subject" emerge in two steps:
  - rote learning of particular constructions
  - merging of the separately learnt constructions (mini-grammars)
- The experiment to be presented shows that:
  - a neural net trained on the task of assigning semantic roles to sentence constituents can acquire grammatical relations
  - it associates particular subjecthood properties with the appropriate verb arguments
  - it manages (to a certain extent) to abstract this nominal from its semantic context
6 Shape of grammatical relations
Lg acquisition theories claim that lgs are either
- ACCUSATIVE
  - the subject is the agent of the action, e.g. Max hit Larry and ran away.
  - (it is Max that ran away; the nominal Max controls clause coordination)
- ERGATIVE
  - the subject is the patient of the action, e.g. Max hit Larry and ran away.
  - (it is Larry that ran away; the nominal Larry controls clause coordination: Larry was hit by Max and ran away)
7 BUT! The issue is not merely the identity of the subject. The issue is what properties the various grammatical relations control.
8 Example properties that can be associated with the subject cross-linguistically
- addressee of imperatives: Idalia, listen to us!
- control of reflexivisation: Beata enjoys herself.
- control of coordination: Laura pinched Zaneta and smiled.
9 The grammatical relations of various lgs control various combinations of these (and other) properties.
- This is what we mean by the SHAPE of grammatical relations.
- Examples:
  - English: a highly syntactically accusative lg (most of the properties are controlled by the subject)
  - Dyirbal: a highly syntactically ergative lg (most of the properties are controlled by the ergative subject, or pivot)
  - Kapampangan: a split lg (neither highly ergative nor highly accusative in syntax)
10 For a lg acquisition process to be UNIVERSAL, it must be able to accommodate a variety of lg types.
- Simply setting the identity of the subject is not sufficient.
- Rather, the various control patterns (shapes) must be accommodated.
- An SRN can learn a variety of shapes.
11 A connectionist simulation
- Testing whether a network could build abstract relations corresponding to subjects and objects.
- There is no innate knowledge of lg in the network (no grammatical relations, no features facilitating word displacement, etc.)
- Main assumptions:
  - the system can process sequential data
  - it tries to map sequences of words to semantic roles
12 EXPERIMENT
- The SRN takes in sentences with various patterns.
- At each time step, a word or a full stop is presented.
- After each sentence, an input representing "reset" is presented to zero out the outputs.
- The output patterns represent semantic roles in a slot-based representation (sketched below).
- The input vocabulary: 56 words (25 verbs, 25 nouns, 6 function words).
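The slide does not show the encoding itself; the following is a minimal Python sketch of how such word-by-word input and slot-based role output could be represented. The 25/25/6 vocabulary split follows the slide, but the placeholder word lists, the role inventory (agent, patient, experiencer), and the extra full-stop and reset symbols are our assumptions, not the authors' actual coding scheme.

```python
import numpy as np

# Hypothetical vocabulary matching the slide's counts: 25 verbs, 25 nouns, 6 function words.
# The full stop and the reset signal are treated here as two extra input symbols (an assumption).
VERBS = [f"verb{i}" for i in range(25)]
NOUNS = [f"noun{i}" for i in range(25)]
FUNCTION_WORDS = ["who", "did", "was", "by", "to", "and"]   # illustrative choices only
VOCAB = VERBS + NOUNS + FUNCTION_WORDS + [".", "<reset>"]
WORD_INDEX = {w: i for i, w in enumerate(VOCAB)}

def encode_word(word):
    """One-hot (localist) encoding of a single input word or control symbol."""
    vec = np.zeros(len(VOCAB))
    vec[WORD_INDEX[word]] = 1.0
    return vec

# Slot-based output: one slot per semantic role, each slot holding a noun identifier.
ROLES = ["agent", "patient", "experiencer"]                  # assumed role inventory

def encode_roles(role_fillers):
    """Target pattern: for each role slot, a one-hot noun identifier (all zeros if unfilled)."""
    target = np.zeros((len(ROLES), len(NOUNS)))
    for role, noun in role_fillers.items():
        target[ROLES.index(role), NOUNS.index(noun)] = 1.0
    return target.ravel()

# Example: "noun0 verb0 noun1 ." presented word by word, then the reset symbol;
# the target says agent = noun0, patient = noun1.
sentence = ["noun0", "verb0", "noun1", "."]
inputs = [encode_word(w) for w in sentence] + [encode_word("<reset>")]
target = encode_roles({"agent": "noun0", "patient": "noun1"})
```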
13 Network architecture
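The architecture diagram from the original slide is not reproduced here. As a stand-in, below is a minimal numpy sketch of an Elman-style SRN of the kind the slide refers to: the hidden layer's previous activations are copied into a context layer and fed back as additional input at the next time step. The layer sizes, the sigmoid activations, and the choice to implement "reset" by zeroing the context layer are illustrative assumptions; training code is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SimpleRecurrentNetwork:
    """Elman-style SRN: hidden activations are copied to a context layer and fed back."""

    def __init__(self, n_in, n_hidden, n_out, rng=np.random.default_rng(0)):
        self.W_in = rng.normal(0, 0.1, (n_hidden, n_in))        # input -> hidden
        self.W_ctx = rng.normal(0, 0.1, (n_hidden, n_hidden))   # context -> hidden
        self.W_out = rng.normal(0, 0.1, (n_out, n_hidden))      # hidden -> output
        self.context = np.zeros(n_hidden)

    def reset(self):
        """Zero the context between sentences (a simplification of the slide's reset input)."""
        self.context = np.zeros_like(self.context)

    def step(self, word_vec):
        """Process one word; return the current slot-based role output."""
        hidden = sigmoid(self.W_in @ word_vec + self.W_ctx @ self.context)
        self.context = hidden.copy()                             # copy-back to the context layer
        return sigmoid(self.W_out @ hidden)

# Usage: present a sentence one word at a time, then reset before the next sentence.
# Sizes are illustrative: 58 inputs (56 words + full stop + reset) and 75 outputs
# (3 role slots x 25 noun identifiers) match the encoding sketch above.
net = SimpleRecurrentNetwork(n_in=58, n_hidden=40, n_out=75)
# for word_vec in inputs: output = net.step(word_vec)
# net.reset()
```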
14 The SRN was taught to assign the proper noun identifiers to the appropriate roles for a number of sentence structures.
- Types of sentences (illustrated in the sketch below):
  1. simple declarative intransitives, e.g. Sandy jumped (agent role), Sandy fell (patient role)
  2. simple declarative transitives, e.g. Sandy kissed him (agent, patient), Sandy saw him
  3. simple declarative passives, e.g. Sandy was kissed (patient)
  4. questions, e.g. Who did Sandy kiss? (agent, patient; object questioned), Who kissed Sandy? (agent, patient; subject questioned)
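As a concrete picture of the training task, here is a hypothetical listing of sentence/target pairs mirroring the examples on this slide. The role labels follow the slide's own annotations; the example the slide leaves unannotated (Sandy saw him) is omitted rather than guessed, and filling the questioned argument with the wh-word itself is our guess, not the authors' scheme.

```python
# Hypothetical training pairs mirroring slide 14: each sentence maps to filled role slots,
# which the slot-based output layer must reproduce once the full sentence has been seen.
TRAINING_PAIRS = [
    ("Sandy jumped .",       {"agent":   "Sandy"}),                      # intransitive, agent role
    ("Sandy fell .",         {"patient": "Sandy"}),                      # intransitive, patient role
    ("Sandy kissed him .",   {"agent":   "Sandy", "patient": "him"}),    # transitive
    ("Sandy was kissed .",   {"patient": "Sandy"}),                      # passive
    ("Who did Sandy kiss ?", {"agent":   "Sandy", "patient": "who"}),    # object questioned
    ("Who kissed Sandy ?",   {"agent":   "who",   "patient": "Sandy"}),  # subject questioned
]
```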
15 Generalisation test
- The test involved two systematic gaps, i.e. two types of sentences not present in training (see the sketch below):
  - passive sentences with experiential verbs, e.g. Dominika was seen by Max.
  - questioning embedded subjects in transitive clauses with experiential verbs, e.g. Who did Marta persuade to see Lidka?
16 RESULTS
- The SRN (as the connectionists expected) reacted to the two gaps in different ways:
  - it did not cope with the passive construction
  - it bridged the questioned-embedded-subject gap thanks to a "conspiracy of constructions" (it was provided with sufficiently varied constructions to cope with this gap successfully)
- The same pattern is observed in child L1A.
17 How does the network represent subjects internally (in the hidden layer)?
- Each verb-construction combination has a specific region where the subject is encoded.
- Agents and patients are stored separately, because they can appear together; experiencers are stored very close to agents, since the two never appear together (so the same region of state space can be reused). A sketch of this kind of hidden-layer analysis follows below.
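A sketch of the kind of analysis behind this slide, reusing the SimpleRecurrentNetwork class from the architecture sketch above: record the hidden state just after the subject noun has been presented for each verb/construction combination, then compare those vectors (pairwise distances here; clustering or PCA would serve the same purpose). The function names and the comparison method are our assumptions, not the authors' analysis code.

```python
import numpy as np

def subject_hidden_state(net, sentence_vecs, subject_position):
    """Run the SRN over one sentence and return the hidden state right after the subject noun.

    'net' is a SimpleRecurrentNetwork instance (see the architecture sketch above);
    'sentence_vecs' is the list of one-hot word vectors for the sentence.
    """
    net.reset()
    hidden_after_subject = None
    for t, vec in enumerate(sentence_vecs):
        net.step(vec)
        if t == subject_position:
            hidden_after_subject = net.context.copy()   # context layer holds the latest hidden state
    return hidden_after_subject

def pairwise_distances(states):
    """Euclidean distances between the recorded hidden states of different combinations."""
    labels = list(states)
    return {(a, b): float(np.linalg.norm(states[a] - states[b]))
            for i, a in enumerate(labels) for b in labels[i + 1:]}

# Usage sketch: small distances between agent- and experiencer-taking combinations, and
# larger distances to patient-encoding combinations, would match the pattern on this slide.
# states = {"action transitive": subject_hidden_state(net, kiss_sentence, 0),
#           "experiential transitive": subject_hidden_state(net, see_sentence, 0),
#           "passive": subject_hidden_state(net, passive_sentence, 0)}
# print(pairwise_distances(states))
```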
18 CONCLUSIONS
- The most abstract aspects of lg are learnable.
- The network's ability to abstract away from semantics is shown by its ability to partially bridge the artificial gap in the training set (questioned embedded subjects of experiential verbs).
- The SRN was able to define the position of the subject in terms of a semantically abstract entity.