Title: Jakobson's Grand Unified Theory of Linguistic Cognition
1 Jakobson's Grand Unified Theory of Linguistic Cognition
- Paul Smolensky
- Cognitive Science Department
- Johns Hopkins University
with
- Géraldine Legendre, Alan Prince, Peter Jusczyk, Suzanne Stevenson
- Elliott Moreton, Karen Arnold, Donald Mathis, Melanie Soderstrom
2 Grammar and Cognition
- 1. What is the system of knowledge?
- 2. How does this system of knowledge arise in the mind/brain?
- 3. How is this knowledge put to use?
- 4. What are the physical mechanisms that serve as the material basis for this system of knowledge and for the use of this knowledge?
- (Chomsky 1988, p. 3)
3 Advertisement
- The complete story, forthcoming (2003) from Blackwell:
- The Harmonic Mind: From Neural Computation to Optimality-Theoretic Grammar, Smolensky & Legendre
4 Jakobson's Program
- A Grand Unified Theory for the cognitive science of language is enabled by Markedness: Avoid α
- Structure
  - Alternations eliminate α
  - Typology: inventories lack α
- Acquisition
  - α is acquired late
- Processing
  - α is processed poorly
- Neural
  - Brain damage most easily disrupts α
Formalize through OT?
5 Structure Acquisition Use Neural Realization
- Theoretical: OT (Prince & Smolensky 1991, 1993)
  - Constructs formal grammars directly from markedness principles
  - General formalism/framework for grammars: phonology, syntax, semantics (GB/LFG/)
  - Strongly universalist: inherent typology
6 Structure Acquisition Use Neural Realization
- Theoretical: Formal structure enables OT-general learning algorithms
  - Constraint Demotion: provably correct and efficient (when part of a general decomposition of the grammar learning problem)
    - Tesar 1995 et seq.
    - Tesar & Smolensky 1993, 2000
  - Gradual Learning Algorithm
    - Boersma 1998 et seq.
- Empirical: Initial-state predictions explored through behavioral experiments with infants
7 Structure Acquisition Use Neural Realization
- Theoretical: Theorems regarding the computational complexity of algorithms for processing with OT grammars
  - Tesar 1994 et seq.
  - Ellison 1994
  - Eisner 1997 et seq.
  - Frank & Satta 1998
  - Karttunen 1998
8 Structure Acquisition Use Neural Realization
- Theoretical: OT derives from the theory of abstract neural (connectionist) networks via Harmonic Grammar (Legendre, Miyata & Smolensky 1990)
- For moderate complexity, we now have general formalisms for realizing:
  - complex symbol structures as distributed patterns of activity over abstract neurons
  - structure-sensitive constraints/rules as distributed patterns of strengths of abstract synaptic connections
  - optimization of Harmony
- Construction of a miniature, concrete LAD
9 Program
- Structure
  - OT constructs formal grammars directly from markedness principles
  - Strongly universalist: inherent typology
  - OT allows completely formal markedness-based explanation of highly complex data
- Acquisition
  - Initial-state predictions explored through behavioral experiments with infants
- Neural Realization
  - Construction of a miniature, concrete LAD
10 The Great Dialectic
- Phonological representations serve two masters: FAITHFULNESS and MARKEDNESS, locked in conflict
11 OT from Markedness Theory
- MARKEDNESS constraints: *α ("No α")
- FAITHFULNESS constraints:
  - F(α) demands that /input/ → output leave α unchanged (McCarthy & Prince 1995)
  - F(α) controls when α is avoided (and how)
- Interaction of violable constraints: Ranking
  - α is avoided when *α ≫ F(α)
  - α is tolerated when F(α) ≫ *α
  - M1 ≫ M2 combines multiple markedness dimensions
12 OT from Markedness Theory
- MARKEDNESS constraints: *α
- FAITHFULNESS constraints: F(α)
- Interaction of violable constraints: Ranking
  - α is avoided when *α ≫ F(α)
  - α is tolerated when F(α) ≫ *α
  - M1 ≫ M2 combines multiple markedness dimensions
- Typology: All cross-linguistic variation results from differences in ranking, i.e. in how the dialectic is resolved (and in how multiple markedness dimensions are combined)
13 OT from Markedness Theory
- MARKEDNESS constraints
- FAITHFULNESS constraints
- Interaction of violable constraints: Ranking
- Typology: All cross-linguistic variation results from differences in ranking, in how the dialectic is resolved
- Harmony combines MARKEDNESS and FAITHFULNESS
- A formally viable successor to Minimize Markedness is OT's Maximize Harmony (among competitors)
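The ranking logic of these slides can be sketched computationally: under strict domination, comparing candidates' violation profiles lexicographically (highest-ranked constraint first) is exactly "Maximize Harmony among competitors." The constraints and candidates below are toy stand-ins of our own, not an analysis from the talk.

```python
# Minimal sketch of OT evaluation under strict domination (ranking).
# Constraints are functions returning violation counts; hypothetical examples.

def violations(candidate, constraints):
    """Violation profile: one count per constraint, in ranking order."""
    return tuple(c(candidate) for c in constraints)

def optimal(candidates, constraints):
    """Lexicographic comparison of profiles (highest-ranked constraint
    first) selects the maximal-Harmony candidate."""
    return min(candidates, key=lambda cand: violations(cand, constraints))

# Toy example: a markedness constraint ranked above a toy faithfulness one.
star_alpha = lambda cand: cand.count("a")      # *alpha: penalize marked 'a'
faith      = lambda cand: abs(len(cand) - 3)   # toy F: stay close to input size

print(optimal(["pad", "pa", "pada"], [star_alpha, faith]))  # pad
```

Reranking the two constraints changes which candidate wins, which is the typology-by-ranking idea in miniature.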
14 Structure
- Explanatory goals achieved by OT
- Individual grammars are literally and formally constructed directly from universal markedness principles
- Inherent Typology: within the analysis of phenomenon F in language L is inherent a typology of F across all languages
15 Program
- Structure
  - OT constructs formal grammars directly from markedness principles
  - Strongly universalist: inherent typology
  - OT allows completely formal markedness-based explanation of highly complex data (Friday)
- Acquisition
  - Initial-state predictions explored through behavioral experiments with infants
- Neural Realization
  - Construction of a miniature, concrete LAD
16 Structure: Summary
- OT builds formal grammars directly from markedness: MARK, with FAITH (Friday)
- Inventories consistent with markedness relations are formally the result of OT with local conjunction
- Even highly complex patterns can be explained purely with simple markedness constraints; all the complexity lies in constraint interaction through ranking and conjunction (Lango ATR vowel harmony)
17 Program
- Structure
  - OT constructs formal grammars directly from markedness principles
  - Strongly universalist: inherent typology
  - OT allows completely formal markedness-based explanation of highly complex data
- Acquisition
  - Initial-state predictions explored through behavioral experiments with infants
- Neural Realization
  - Construction of a miniature, concrete LAD
18 Nativism I: Learnability
- Learning algorithm: provably correct and efficient (under strong assumptions)
- Sources:
  - Tesar 1995 et seq.
  - Tesar & Smolensky 1993, 2000
- If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E
19 Constraint Demotion Learning
If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E.
Correctly handles the difficult case of multiple violations in E.
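One error-driven demotion step can be sketched with the stratified hierarchy represented as a list of sets of constraint names, highest-ranked first. This is an illustration of the idea on the slide, not Tesar & Smolensky's full algorithm; the data structures are our own.

```python
def demote(strata, marks_winner, marks_loser):
    """One Constraint Demotion step.
    strata: list of sets of constraint names, highest-ranked first.
    marks_winner / marks_loser: violation counts for the heard form A
    (the winner) and the grammar's current output E (the loser)."""
    rank = {c: i for i, s in enumerate(strata) for c in s}
    # After cancelling shared marks, which constraints prefer each side?
    winner_pref = [c for c in rank if marks_loser.get(c, 0) > marks_winner.get(c, 0)]
    loser_pref  = [c for c in rank if marks_winner.get(c, 0) > marks_loser.get(c, 0)]
    if not winner_pref:
        return strata                         # datum cannot be favored; no change
    top = min(rank[c] for c in winner_pref)   # highest winner-preferring stratum
    target = top + 1                          # stratum just below it
    new = [set(s) for s in strata] + [set()]  # ensure the target stratum exists
    for c in loser_pref:
        if rank[c] <= top:                    # demote only those ranked too high
            new[rank[c]].discard(c)
            new[target].add(c)
    return [s for s in new if s]              # drop empty strata

# Hearing a form that violates F while the expected form violates M
# demotes F below M:
print(demote([{"F"}, {"M"}], {"F": 1}, {"M": 1}))  # [{'M'}, {'F'}]
```

Because each offending constraint moves only to just below the highest winner-preferring stratum, the demotion is minimal in the slide's sense.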
20 Nativism I: Learnability
- M ≫ F is learnable with /inpossible/ → impossible
  - not in- except when followed by
  - the exception that proves the rule; M = NPA
- M ≫ F is not learnable from data if there are no exceptions (alternations) of this sort: e.g., if the lexicon produces only inputs with mp, never np, then M and F never conflict and there is no evidence for their ranking
- Thus the initial state H0 must have M ≫ F
21 The Initial State
- OT-general: MARKEDNESS ≫ FAITHFULNESS
- Learnability demands this (Richness of the Base)
  - (Alan Prince, p.c., 1993; Smolensky 1996a)
- Child production restricted to the unmarked
- Child comprehension not so restricted
  - (Smolensky 1996b)
22 Nativism II: Experimental Test
- Collaborators:
  - Peter Jusczyk
  - Theresa Allocco
- Language Acquisition (2002)
23 Nativism II: Experimental Test
- Linking hypothesis: more harmonic phonological stimuli → longer listening time
- More harmonic:
  - satisfying M ≻ violating M, when equal on F
  - satisfying F ≻ violating F, when equal on M
  - when one must choose, it is more harmonic to satisfy M: M ≫ F
- M = Nasal Place Assimilation (NPA)
24 Experimental Paradigm
- Headturn Preference Procedure (Kemler Nelson et al. 1995; Jusczyk 1997)
- X/Y/XY paradigm (P. Jusczyk)
- Stimuli of the form un...b?...umb?
[Results figure: listening times; FAITH condition, p = .006]
- Highly general paradigm; main result
25-28 4.5 Months (NPA)
[Results figures]
29 Program
- Structure
  - OT constructs formal grammars directly from markedness principles
  - Strongly universalist: inherent typology
  - OT allows completely formal markedness-based explanation of highly complex data
- Acquisition
  - Initial-state predictions explored through behavioral experiments with infants
- Neural Realization
  - Construction of a miniature, concrete LAD
30 The Question
- The nativist hypothesis, central to generative linguistic theory: grammatical principles respected by all human languages are encoded in the genome.
- Questions:
  - Evolutionary theory: how could this happen?
  - Empirical question: did this happen?
  - Today: what concretely could it mean for a genome to encode innate knowledge of universal grammar?
31 UGenomics
- The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
- Proteins → universal grammatical principles?
- Time to willingly suspend disbelief
32 UGenomics
- The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
- Proteins → universal grammatical principles?
- Case study: Basic CV Syllable Theory (Prince & Smolensky 1993)
- Innovation: introduce a new level, an abstract genome, parallel to and encoding the abstract neural network
33 Approach: Multiple Levels of Encoding
[Diagram: Biological Genome]
34 UGenome for CV Theory
- Three levels:
  - Abstract symbolic: Basic CV Theory
  - Abstract neural: CVNet
  - Abstract genomic: CVGenome
35 UGenomics: Symbolic Level
- Three levels:
  - Abstract symbolic: Basic CV Theory
  - Abstract neural: CVNet
  - Abstract genomic: CVGenome
36 Approach: Multiple Levels of Encoding
[Diagram: Biological Genome]
37 Basic Syllabification Function
- Basic CV Syllable Structure Theory
- Basic: no more than one segment per syllable position: .(C)V(C).
- /underlying form/ → surface form
- /CVCC/ → .CV.CVC. with an epenthetic V, e.g. /pædd/ → [pædəd]
- Correspondence Theory (McCarthy & Prince 1995)
- /C1V2C3C4/ → .C1V2.C3VC4.
38 Why Basic CV Syllabification?
- Maps underlying → surface linguistic forms
- Forms are simple but combinatorially productive
- Well-known universals, typical typology
- A mini-component of real natural-language grammars
- A (perhaps the) canonical model of universal grammar in OT
39 Syllabification Constraints (Con)
- PARSE: every element in the input corresponds to an element in the output
- ONSET: no V without a preceding C
- etc.
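As an illustration, violation counting for these two constraints can be written directly. The string encoding of candidates below (syllable boundaries as dots, unparsed input segments in angle brackets) is our own simplification, not the talk's representation.

```python
# Toy violation counters for two Basic CV Theory constraints, applied to
# candidate outputs such as ".CV.CVC." with unparsed segments in <...>.

def parse_violations(candidate):
    """PARSE: each unparsed input segment (here marked <...>) is a violation."""
    return candidate.count("<")

def onset_violations(candidate):
    """ONSET: each V not immediately preceded by a C is a violation."""
    count = 0
    for i, seg in enumerate(candidate):
        if seg == "V" and (i == 0 or candidate[i - 1] != "C"):
            count += 1
    return count

print(onset_violations(".VC."))     # 1: the syllable-initial V has no onset
print(parse_violations(".CV.<C>"))  # 1: one input segment left unparsed
```

Feeding such counters into a ranked evaluation then selects among candidate parses of an input.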
40 UGenomics: Neural Level
- Three levels:
  - Abstract symbolic: Basic CV Theory
  - Abstract neural: CVNet
  - Abstract genomic: CVGenome
41 Approach: Multiple Levels of Encoding
[Diagram: Biological Genome]
42 CVNet Architecture
[Network diagram: input /C1 C2/ mapped to output C1 V C2]
43 Connection Substructure
44 PARSE
- All connection coefficients are +2
45 ONSET
- All connection coefficients are −1
46 Crucial Open Question (Truth in Advertising)
- What is the relation between strict domination and neural networks?
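One familiar partial bridge (a known construction, not a resolution of the open question): if violation counts are bounded, exponentially spaced constraint strengths make weighted Harmony reproduce strict domination. A sketch with illustrative numbers:

```python
def harmony(profile, base=10):
    """profile: violation counts, highest-ranked constraint first.
    With every count < base, weighting constraint k by base**(n-1-k)
    makes Harmony comparison coincide with lexicographic
    (strict-domination) comparison of profiles."""
    n = len(profile)
    return -sum(v * base ** (n - 1 - k) for k, v in enumerate(profile))

# Nine low-ranked violations still lose to one high-ranked violation:
print(harmony((0, 9)) > harmony((1, 0)))  # True
```

The construction breaks down when violation counts are unbounded, which is one reason the question on this slide remains open.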
47 CVNet Dynamics
- Boltzmann machine / Harmony network (Hinton & Sejnowski 1983 et seq.; Smolensky 1983 et seq.)
- Stochastic activation-spreading algorithm: higher Harmony → more probable
- CVNet innovation: connections realize fixed symbol-level constraints with variable strengths
- Learning: modification of the Boltzmann machine algorithm to the new architecture
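The stochastic activation-spreading dynamics can be sketched as a standard Boltzmann-machine (Glauber) update: a unit turns on with probability that grows with its Harmony gain, so higher-Harmony states are more probable. The weights below are illustrative, not CVNet's.

```python
import math
import random

def harmony(a, W):
    """Harmony of binary state a under symmetric weight matrix W."""
    n = len(a)
    return 0.5 * sum(W[i][j] * a[i] * a[j]
                     for i in range(n) for j in range(n))

def update(a, W, i, T=1.0):
    """Stochastic update of unit i at temperature T: the Harmony gain
    from setting a[i] = 1 feeds a logistic on/off probability."""
    gap = sum(W[i][j] * a[j] for j in range(len(a)))
    p_on = 1.0 / (1.0 + math.exp(-gap / T))
    a[i] = 1 if random.random() < p_on else 0
    return a

# Two mutually excitatory units: turning both on maximizes Harmony.
W = [[0, 2], [2, 0]]
print(harmony([1, 1], W))  # 2.0
```

Sweeping such updates over all units while lowering T settles the network near a Harmony maximum, which is the "higher Harmony → more probable" bullet above.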
48 Learning Behavior
- A simplified system can be solved analytically
- The learning algorithm turns out to compute Δsi ∝ ε × (violations of constraint i)
49 UGenomics: Genome Level
- Three levels:
  - Abstract symbolic: Basic CV Theory
  - Abstract neural: CVNet
  - Abstract genomic: CVGenome
50 Approach: Multiple Levels of Encoding
[Diagram: Biological Genome]
51 Connectivity Geometry
52 ONSET
[Diagram: ONSET connectivity geometry]
53 Connectivity: PARSE
- Correspondence units grow north and west and connect with input and output units
- Output units grow east and connect
- Input units grow south and connect
54 To Be Encoded
- How many different kinds of units are there?
- What information is necessary (from the source unit's point of view) to identify the location of a target unit, and the strength of the connection with it?
- How are constraints initially specified?
- How are they maintained through the learning process?
55 Unit Types
- Input units: C, V
- Output units: C, V, x
- Correspondence units: C, V
- 7 distinct unit types
- Each represented in a distinct sub-region of the abstract genome
- We help ourselves to implicit machinery to spell out these sub-regions as distinct cell types, located in a grid as illustrated
56 Direction of Projection Growth
- Topographic organizations are widely attested throughout neural structures
- Activity-dependent growth is a possible alternative
- Orientation information (axes):
  - chemical gradients during development
  - cell age a possible alternative
57 Projection Parameters
- Direction
- Extent
  - Local
  - Non-local
- Target unit type
- Strength of connections (encoded separately)
58 Connectivity Genome
- Contributions from ONSET and PARSE
59 CVGenome: Connectivity
60 Encoding Connection Strength
- Network-level specification
- For each constraint Ci, need to embody:
  - constraint strength si
  - connection coefficients (per source → target cell-type pair)
- The product of these is the contribution of Ci to that source → target connection weight
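The recipe on this slide amounts to a sum of products, sketched below. The constraint names and numbers are illustrative: the +2 and −1 coefficients echo the PARSE and ONSET slides, while the strengths are made up.

```python
def connection_weight(coeffs, strengths):
    """Weight of one source -> target connection: each constraint c
    contributes (learned strength of c) x (fixed coefficient of c),
    and the contributions sum."""
    return sum(strengths[c] * coeffs[c] for c in coeffs)

# A connection where PARSE has coefficient +2 and ONSET has -1:
print(connection_weight({"PARSE": 2, "ONSET": -1},
                        {"PARSE": 1.5, "ONSET": 0.5}))  # 2.5
```

Because the coefficients are fixed by the constraint definitions while only the strengths change, learning can adjust one number per constraint rather than every synapse independently.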
61 Processing
62 Development
63 Learning
64 CVGenome: Connection Coefficients
65 Abstract Gene Map
[Gene-map diagram spanning three regions: General Developmental Machinery; Connectivity (direction, extent, target; unit types C-I, V-I, C-C; CORRESPOND, RESPOND); Constraint Coefficients (entries with values such as 1 and −2 per cell-type pair)]
66 UGenomics
- Realization of processing and learning algorithms in abstract molecular biology, using the types of interactions known to be biologically possible and genetically encodable
67 UGenomics
- A host of questions to address:
  - Will this really work?
  - Can it be generalized to distributed nets?
  - Is the number of genes (770.26) plausible?
  - Are the mechanisms truly biologically plausible?
  - Is it evolvable?
  - How is strict domination to be handled?
68 Hopeful Conclusion
- Progress is possible toward a Grand Unified Theory of the cognitive science of language
  - addressing the structure, acquisition, use, and neural realization of knowledge of language
  - strongly governed by universal grammar
  - with markedness as the unifying principle
  - as formalized in Optimality Theory at the symbolic level
  - and realized via Harmony Theory in abstract neural nets which are potentially encodable genetically
69 Hopeful Conclusion
- Progress is possible toward a Grand Unified Theory of the cognitive science of language
Thank you for your attention (and indulgence).
Still lots of promissory notes, but all in a common currency, Harmony (unmarkedness); hopefully this will promote further progress by facilitating integration of the sub-disciplines of cognitive science.