Title: The Harmonic Mind
1. The Harmonic Mind
- Paul Smolensky
- Cognitive Science Department
- Johns Hopkins University
with
Géraldine Legendre, Donald Mathis, Melanie Soderstrom,
Alan Prince, and Peter Jusczyk
2. Advertisement
The Harmonic Mind: From Neural Computation to Optimality-Theoretic Grammar
Paul Smolensky & Géraldine Legendre
- Blackwell 2002 (?)
- Develop the Integrated Connectionist/Symbolic (ICS) Cognitive Architecture
- Apply it to the theory of grammar
- Present a case study in formalist multidisciplinary cognitive science; show inputs/outputs of ICS
3. Talk Outline
- Sketch the ICS cognitive architecture, pointing to contributions from/to traditional disciplines
- Connectionist processing as optimization
- Symbolic representations as activation patterns
- Knowledge representation: constraints
- Constraint interaction I: Harmonic Grammar; parser
- Explaining productivity in ICS (Fodor et al. 1988 et seq.)
- Constraint interaction II: Optimality Theory (OT)
- Nativism I: learnability theory in OT
- Nativism II: experimental test
- Nativism III: UGenome
4. Processing I: Activation
- Computational neuroscience → ICS
- Key sources
- Hopfield 1982, 1984
- Cohen & Grossberg 1983
- Hinton & Sejnowski 1983, 1986
- Smolensky 1983, 1986
- Geman & Geman 1984
- Golden 1986, 1988
5. Processing I: Activation
Processing (spreading activation) is optimization: Harmony maximization
6. Processing II: Optimization
- Cognitive psychology → ICS
- Key sources
- Hinton & Anderson 1981
- Rumelhart, McClelland, & the PDP Group 1986
Processing (spreading activation) is optimization: Harmony maximization
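The Harmony-maximization claim can be sketched concretely. Below is a minimal illustration (my own toy network and weights, not code from the book): a symmetric net whose asynchronous threshold updates never decrease the Harmony H(a) = ½ aᵀWa.

```python
# Minimal sketch of a Hopfield-style Harmony network (toy weights of my
# choosing). With symmetric W and zero diagonal, each asynchronous update
# raises or preserves H, so spreading activation is Harmony maximization.

def harmony(a, W):
    n = len(a)
    return 0.5 * sum(W[i][j] * a[i] * a[j] for i in range(n) for j in range(n))

def settle(a, W, sweeps=10):
    n = len(a)
    for _ in range(sweeps):
        for i in range(n):
            net = sum(W[i][j] * a[j] for j in range(n))
            a[i] = 1 if net >= 0 else -1   # threshold update; never lowers H
    return a

W = [[0, 1, -1],
     [1, 0, 1],
     [-1, 1, 0]]          # symmetric, zero diagonal
a = [1, -1, -1]
h0 = harmony(a, W)
a = settle(a, W)
assert harmony(a, W) >= h0
```

Settling the network is thus an optimization process: the final activation pattern is a local Harmony maximum.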
8. Two Fundamental Questions
Harmony maximization is satisfaction of parallel, violable constraints
- 2. What are the constraints? (Knowledge representation)
- Prior question:
- 1. What are the activation patterns (data structures; mental representations) evaluated by these constraints?
9. Representation
- Symbolic theory → ICS
- Complex symbol structures
- Generative linguistics → ICS
- Particular linguistic representations
- PDP connectionism → ICS
- Distributed activation patterns
- ICS
- Realization of (higher-level) complex symbolic structures in distributed patterns of activation over (lower-level) units (tensor product representations, etc.)
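Tensor product representations can be sketched in a few lines. The filler and role vectors below are toy choices of my own (the book develops the general construction): each symbol is bound to its structural role by an outer product, the bindings are summed, and orthonormal roles let each filler be recovered exactly.

```python
# Toy tensor product representation: the string "A B" as one activation
# pattern. Assumes orthonormal role vectors so unbinding is exact.
def outer(f, r):
    return [[fi * rj for rj in r] for fi in f]

def add(M, N):
    return [[m + n for m, n in zip(rm, rn)] for rm, rn in zip(M, N)]

def unbind(T, r):
    # With orthonormal roles, multiplying T by role r recovers its filler.
    return [sum(T[i][j] * r[j] for j in range(len(r))) for i in range(len(T))]

A, B = [1.0, 0.0], [0.0, 1.0]        # filler vectors for symbols A, B
r0, r1 = [1.0, 0.0], [0.0, 1.0]      # role vectors: position 0, position 1

AB = add(outer(A, r0), outer(B, r1))  # pattern realizing the structure "A B"
assert unbind(AB, r0) == A
assert unbind(AB, r1) == B
```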
10. Representation
11. Constraints
- Linguistics (markedness theory) → ICS
- ICS → Generative linguistics: Optimality Theory
- Key sources
- Prince & Smolensky 1993: ms., Rutgers TR
- McCarthy & Prince 1993: ms.
- Texts: Archangeli & Langendoen 1997; Kager 1999; McCarthy 2001
- Electronic archive: http://roa.rutgers.edu
12. Constraints
NOCODA: A syllable has no coda
H(.kæt.) = s_NOCODA < 0
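The NOCODA evaluation can be made concrete with a toy scorer (my illustration; the strength value s_NOCODA = −1 is an arbitrary choice, and syllables are taken as dot-delimited strings):

```python
# Toy NOCODA evaluator: a syllable violates NOCODA if it ends in a consonant.
VOWELS = set("aeiouæ")

def nocoda_violations(parse):
    return sum(1 for syl in parse.strip(".").split(".")
               if syl and syl[-1] not in VOWELS)

s_NOCODA = -1.0                        # negative strength: each coda lowers H

def harmony_nocoda(parse):
    return s_NOCODA * nocoda_violations(parse)

assert nocoda_violations(".kæt.") == 1     # the final t is a coda
assert harmony_nocoda(".kæt.") < 0
assert nocoda_violations(".kæ.") == 0      # open syllable: no violation
```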
13. Constraint Interaction I
- ICS → Grammatical theory
- Harmonic Grammar
- Legendre, Miyata, & Smolensky 1990 et seq.
14. Constraint Interaction I
The grammar generates the representation that maximizes H: this best satisfies the constraints, given their differential strengths.
Any formal language can be so generated.
15. Harmonic Grammar Parsing
- Simple, comprehensible network
- Simple grammar G
- X → A B; Y → B A
- Language
- Parsing
16. Simple Network Parser
- Fully self-connected, symmetric network
- Like the previously shown network, except with 12 units; representations and connections shown below
17. Explaining Productivity
- Approaching full-scale parsing of formal languages by neural-network Harmony maximization
- Have other networks that provably compute recursive functions → productive competence
- How to explain?
18. 1. Structured representations
19. 2. Structured connections
20. Proof of Productivity
- Productive behavior follows mathematically from combining
- the combinatorial structure of the vectorial representations encoding inputs & outputs, and
- the combinatorial structure of the weight matrices encoding knowledge
21. Constraint Interaction II: OT
- ICS → Grammatical theory
- Optimality Theory
- Prince & Smolensky 1993
22. Constraint Interaction II: OT
- Differential strength encoded in strict domination hierarchies
- Every constraint has complete priority over all lower-ranked constraints (combined)
- Approximate numerical encoding employs special (exponentially growing) weights
- "Grammars can't count"
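The exponential-weight encoding of strict domination can be checked mechanically. A sketch (my illustration, restricted to 0/1 violation profiles, where the encoding is exact):

```python
# Strict domination via exponentially growing weights: with weight 2**k for
# the constraint ranked k-th from the bottom, one violation of a higher-ranked
# constraint outweighs any combination of single violations below it.
from itertools import product

def weighted_H(violations, n_constraints):
    # violations[k] = marks from the constraint ranked k from the bottom;
    # negative sign because violations lower Harmony
    return -sum(v * (2 ** k) for k, v in enumerate(violations))

def ot_prefers(a, b):
    # Strict domination: compare violation profiles top-ranked constraint first.
    for va, vb in zip(reversed(a), reversed(b)):
        if va != vb:
            return va < vb
    return False

n = 4
for a, b in product(product(range(2), repeat=n), repeat=2):
    # Over all 0/1 profiles, the numerical ordering matches OT's ordering.
    assert ot_prefers(a, b) == (weighted_H(a, n) > weighted_H(b, n))
```

With unbounded violation counts the equivalence requires weights that grow fast enough, which is why the slide calls the numerical encoding approximate.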
23. Constraint Interaction II: OT
- Constraints are universal (Con)
- Candidate outputs are universal (Gen)
- Human grammars differ only in how these constraints are ranked: factorial typology
- First true contender for a formal theory of cross-linguistic typology
- 1st innovation of OT: constraint ranking
- 2nd innovation: Faithfulness
24. The Faithfulness/Markedness Dialectic
- cat: /kæt/ → .kæt. violates NOCODA. Why?
- FAITHFULNESS requires pronunciation = lexical form
- MARKEDNESS often opposes it
- Markedness/Faithfulness dialectic → diversity
- English: FAITH ≫ NOCODA
- Polynesian: NOCODA ≫ FAITH (French)
- Another markedness constraint M:
- Nasal Place Agreement ("Assimilation") (NPA)
ŋg ≻ ŋb, ŋd (velar)
nd ≻ md, ŋd (coronal)
mb ≻ nb, ŋb (labial)
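A minimal tableau calculator (my illustration, with hypothetical violation counts) shows how reranking FAITH and NOCODA yields the English-like vs Polynesian-like outcomes above:

```python
# Toy OT tableau for input /kæt/: ranking FAITH vs NOCODA differently
# selects different optimal outputs.
def violations(cand):
    return {"FAITH": cand["changes"], "NOCODA": cand["codas"]}

candidates = [
    {"form": ".kæt.", "changes": 0, "codas": 1},  # faithful, but has a coda
    {"form": ".kæ.",  "changes": 1, "codas": 0},  # deletes /t/: unfaithful
]

def winner(ranking):
    # Strict domination: minimize the violation tuple, top constraint first.
    return min(candidates,
               key=lambda c: tuple(violations(c)[con] for con in ranking))["form"]

assert winner(["FAITH", "NOCODA"]) == ".kæt."   # English: FAITH >> NOCODA
assert winner(["NOCODA", "FAITH"]) == ".kæ."    # Polynesian: NOCODA >> FAITH
```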
25. Optimality Theory
- Diversity of contributions to theoretical linguistics
- Phonology
- Syntax
- Semantics
- Here: new connections between linguistic theory & the cognitive science of language more generally
- Learning
- Neuro-genetic encoding
26. Nativism I: Learnability
- Learning algorithm
- Provably correct and efficient (under strong assumptions)
- Sources
- Tesar 1995 et seq.
- Tesar & Smolensky 1993, …, 2000
- If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E
27. Constraint Demotion Learning
If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E.
Correctly handles the difficult case: multiple violations in E
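A simplified sketch of Constraint Demotion (my implementation of the statement above; the full Tesar & Smolensky algorithm treats mark cancellation and stratified hierarchies more generally):

```python
# Constraint Demotion over a stratified hierarchy: strata is a list of
# constraint lists, index 0 = highest-ranked stratum.
def demote(strata, marks_heard, marks_expected):
    heard = set(marks_heard) - set(marks_expected)     # uncancelled marks of A
    expected = set(marks_expected) - set(marks_heard)  # uncancelled marks of E
    stratum_of = {c: i for i, s in enumerate(strata) for c in s}
    target = min(stratum_of[c] for c in expected)      # highest E-mark stratum
    for c in heard:
        if stratum_of[c] <= target:                    # minimal demotion:
            strata[stratum_of[c]].remove(c)            # move just below target
            while len(strata) <= target + 1:
                strata.append([])
            strata[target + 1].append(c)
    return [s for s in strata if s]

# Heard form A violates NOCODA; expected form E violates FAITH.
strata = [["NOCODA", "FAITH"]]                 # initially tied in one stratum
strata = demote(strata, ["NOCODA"], ["FAITH"])
assert strata == [["FAITH"], ["NOCODA"]]       # NOCODA demoted below FAITH
```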
28. Nativism I: Learnability
- M ≫ F is learnable with /in+possible/ → impossible
- im-, not in-, except when followed by …
- the exception that proves the rule; M = NPA
- M ≫ F is not learnable from data if there are no exceptions (alternations) of this sort: e.g., if the lexicon provides only inputs with mp, never np, then ✓M and ✓F, no M vs. F conflict, no evidence for their ranking
- Thus must have M ≫ F in the initial state, H0
29. Nativism II: Experimental Test
- Collaborators
- Peter Jusczyk
- Theresa Allocco
- (Elliott Moreton, Karen Arnold)
30. Nativism II: Experimental Test
- Linking hypothesis: more harmonic phonological stimuli → longer listening time
- More harmonic:
- ✓M ≻ *M, when equal on F
- ✓F ≻ *F, when equal on M
- When one must be chosen over the other, it is more harmonic to satisfy M: M ≫ F
- M = Nasal Place Assimilation (NPA)
31. 4.5 Months (NPA)
35. Nativism III: UGenome
- Can we combine
- the connectionist realization of Harmonic Grammar
- OT's characterization of UG
- to examine the biological plausibility of UG as innate knowledge?
- Collaborators
- Melanie Soderstrom
- Donald Mathis
36. Nativism III: UGenome
- The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device; no commitment to its (in)correctness
- Introduce an abstract genome notion parallel to (and encoding) the abstract neural network
- Is connectionist empiricism clearly more biologically plausible than symbolic nativism? No!
37. The Problem
- No concrete examples of such a LAD exist
- Even highly simplified cases pose a hard problem
- How can genes, which regulate production of proteins, encode symbolic principles of grammar?
- Test preparation: Syllable Theory
38. Basic syllabification: Function
- /underlying form/ → [surface form]
- Plural form of dish
- /dɪʃ+z/ → .dɪ.ʃəz.
- /CVCC/ → .CV.CVC. (the medial V is epenthetic)
39. Basic syllabification: Function
- Basic CV Syllable Structure Theory
- Prince & Smolensky 1993: Chapter 6
- Basic: no more than one segment per syllable position: .(C)V(C).
40. Basic syllabification: Function
- Correspondence Theory
- McCarthy & Prince 1995 ("M&P")
- /C1V2C3C4/ → .C1V2.C3VC4. (the unindexed V is epenthetic)
41. Syllabification: Constraints (Con)
- PARSE: Every element in the input corresponds to an element in the output (no deletion; M&P 95: MAX)
42. Syllabification: Constraints (Con)
- FILLV/C: Every output V/C segment corresponds to an input V/C segment; every syllable position in the output is filled by an input segment (no insertion/epenthesis; M&P 95: DEP)
43. Syllabification: Constraints (Con)
- ONSET: No V without a preceding C
44. Syllabification: Constraints (Con)
- NOCODA: No C without a following V
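The four constraints can be illustrated with a toy evaluator over schematic CV candidates (my notation, not the book's: dots mark syllable edges, uppercase segments are underlying, lowercase segments are epenthetic):

```python
# Toy mark assignment for PARSE, FILL, ONSET, NOCODA over CV candidates.
def marks(input_str, cand):
    segs = cand.replace(".", "")
    surface_input = [s for s in segs if s.isupper()]   # underlying segments kept
    return {
        "PARSE":  len(input_str) - len(surface_input),  # deleted segments
        "FILL":   sum(1 for s in segs if s.islower()),  # epenthetic positions
        "ONSET":  sum(1 for syl in cand.strip(".").split(".")
                      if syl and syl.lstrip("Cc") == syl),  # V-initial syllable
        "NOCODA": sum(1 for syl in cand.strip(".").split(".")
                      if syl and syl[-1] in "Cc"),          # C-final syllable
    }

# /CVCC/ → .CV.CvC. : everything parsed, one epenthetic v (cf. dish + z)
assert marks("CVCC", ".CV.CvC.") == {"PARSE": 0, "FILL": 1,
                                     "ONSET": 0, "NOCODA": 1}
# Deleting both final Cs instead incurs two PARSE marks:
assert marks("CVCC", ".CV.")["PARSE"] == 2
```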
45. Network Architecture
(Diagram: input /C1 C2/; output C1 V C2)
46. Connection substructure
47. PARSE
- All connection coefficients are 2
48. ONSET
- All connection coefficients are −1
49. Network Dynamics
50. Crucial Open Question (Truth in Advertising)
- Relation between strict domination and neural networks?
- Apparently not a problem in the case of the CV Theory
51. To be encoded
- How many different kinds of units are there?
- What information is necessary (from the source unit's point of view) to identify the location of a target unit, and the strength of the connection with it?
- How are constraints initially specified?
- How are they maintained through the learning process?
52. Unit types
- Input units: C, V
- Output units: C, V, x
- Correspondence units: C, V
- 7 distinct unit types
- Each represented in a distinct sub-region of the abstract genome
- Help ourselves to implicit machinery to spell out these sub-regions as distinct cell types, located in a grid as illustrated
53. Connectivity geometry
54. Constraint: PARSE
- Input units grow south and connect
- Output units grow east and connect
- Correspondence units grow north & west and connect with input & output units
55. Constraint: ONSET
- Short connections grow north-south between adjacent V output units,
- and between the first V node and the first x node
56. Direction of projection growth
- Topographic organizations widely attested throughout neural structures
- Activity-dependent growth: a possible alternative
- Orientation information (axes)
- Chemical gradients during development
- Cell age: a possible alternative
57. Projection parameters
- Direction
- Extent
- Local
- Non-local
- Target unit type
- Strength of connections encoded separately
58. Connectivity Genome
- Contributions from ONSET and PARSE
59. ONSET
(Genome fragment for ONSET: VO and x0 unit types, growth directions N and S, target x0)
60. Encoding connection strength
- Network-level specification
- For each constraint Ci, need to embody
- constraint strength si
- connection coefficients (for α → β cell-type pairs)
- The product of these is the contribution of Ci to the α → β connection weight
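That product rule can be sketched in a few lines (coefficient values 2 for PARSE and −1 for ONSET as on the earlier slides; the strengths s_i here are hypothetical numbers of my choosing):

```python
# Weight on an alpha -> beta connection = sum over constraints of
# (constraint strength s_i) * (that constraint's coefficient for the pair).
coefficients = {
    # (source cell type, target cell type) -> coefficient, per constraint
    "PARSE": {("input_C", "corr_C"): 2.0},      # PARSE coefficients are 2
    "ONSET": {("output_V", "output_V"): -1.0},  # ONSET coefficients are -1
}
strengths = {"PARSE": 1.0, "ONSET": 3.0}        # hypothetical strengths s_i

def weight(src, tgt):
    return sum(strengths[c] * coefficients[c].get((src, tgt), 0.0)
               for c in coefficients)

assert weight("input_C", "corr_C") == 2.0       # only PARSE contributes
assert weight("output_V", "output_V") == -3.0   # s_ONSET * (-1)
```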
61. Processing
62. Development
63. Learning
64. Learning Behavior
- A simplified system can be solved analytically
- The learning algorithm turns out to be: Δsi ∝ ε × (violations of constraint i)
65. Abstract Gene Map
(Figure: abstract gene map with regions for General Developmental Machinery; Connectivity genes per unit type (C-I, V-I, C-C) specifying direction, extent, and target; and Constraint Coefficient genes, e.g. CORRESPOND: CC (C-I, C-O) = 1, VC (V-I, V-O) = 1)
66. Summary
- Described an attempt to integrate
- connectionist theory of mental processes (computational neuroscience, cognitive psychology)
- symbolic theory of
- mental functions (philosophy, linguistics)
- representations
- general structure (philosophy, AI)
- specific structure (linguistics)
- Informs theory of UG
- form, content
- genetic encoding
67. Thanks for your attention!