Title: The Harmonic Mind
1. The Harmonic Mind
- Paul Smolensky
- Cognitive Science Department
- Johns Hopkins University
with
Géraldine Legendre, Donald Mathis, Melanie Soderstrom,
Alan Prince, and Peter Jusczyk
2. Advertisement
The Harmonic Mind: From Neural Computation to Optimality-Theoretic Grammar
Paul Smolensky & Géraldine Legendre
- Blackwell 2002 (?)
- Develop the Integrated Connectionist/Symbolic (ICS) Cognitive Architecture
- Apply it to the theory of grammar
- Present a case study in formalist multidisciplinary cognitive science; show inputs/outputs of ICS
3. Talk Outline
- Sketch the ICS cognitive architecture, pointing to contributions from/to traditional disciplines
- Connectionist processing as optimization
- Symbolic representations as activation patterns
- Knowledge representation: constraints
- Constraint interaction I: Harmonic Grammar; parser
- Explaining productivity in ICS (Fodor et al. 1988 et seq.)
- Constraint interaction II: Optimality Theory (OT)
- Nativism I: learnability theory in OT
- Nativism II: experimental test
- Nativism III: UGenome
4. Processing I: Activation
- Computational neuroscience → ICS
- Key sources
- Hopfield 1982, 1984
- Cohen & Grossberg 1983
- Hinton & Sejnowski 1983, 1986
- Smolensky 1983, 1986
- Geman & Geman 1984
- Golden 1986, 1988
5. Processing I: Activation
Processing (spreading activation) is optimization: Harmony maximization
6. Processing II: Optimization
- Cognitive psychology → ICS
- Key sources
- Hinton & Anderson 1981
- Rumelhart, McClelland, & the PDP Group 1986
Processing (spreading activation) is optimization: Harmony maximization
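The Harmony-maximization claim can be sketched concretely. Below is a minimal illustration (my own toy network and weights, not code from the book): a symmetric net whose asynchronous threshold updates never decrease the Harmony H(a) = ½ aᵀWa.

```python
# Minimal sketch of a Hopfield-style Harmony network (toy weights of my
# choosing). With symmetric W and zero diagonal, each asynchronous update
# raises or preserves H, so spreading activation is Harmony maximization.

def harmony(a, W):
    n = len(a)
    return 0.5 * sum(W[i][j] * a[i] * a[j] for i in range(n) for j in range(n))

def settle(a, W, sweeps=10):
    n = len(a)
    for _ in range(sweeps):
        for i in range(n):
            net = sum(W[i][j] * a[j] for j in range(n))
            a[i] = 1 if net >= 0 else -1   # threshold update; never lowers H
    return a

W = [[0, 1, -1],
     [1, 0, 1],
     [-1, 1, 0]]          # symmetric, zero diagonal
a = [1, -1, -1]
h0 = harmony(a, W)
a = settle(a, W)
assert harmony(a, W) >= h0
```

Settling the network is thus an optimization process: the final activation pattern is a local Harmony maximum.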
8. Two Fundamental Questions
Harmony maximization is satisfaction of parallel, violable constraints
- 2. What are the constraints? (Knowledge representation)
- Prior question:
- 1. What are the activation patterns (data structures; mental representations) evaluated by these constraints?
9. Representation
- Symbolic theory → ICS
- Complex symbol structures
- Generative linguistics → ICS
- Particular linguistic representations
- PDP connectionism → ICS
- Distributed activation patterns
- ICS
- Realization of (higher-level) complex symbolic structures in distributed patterns of activation over (lower-level) units (tensor product representations, etc.)
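Tensor product representations can be sketched in a few lines. The filler and role vectors below are toy choices of my own (the book develops the general construction): each symbol is bound to its structural role by an outer product, the bindings are summed, and orthonormal roles let each filler be recovered exactly.

```python
# Toy tensor product representation: the string "A B" as one activation
# pattern. Assumes orthonormal role vectors so unbinding is exact.
def outer(f, r):
    return [[fi * rj for rj in r] for fi in f]

def add(M, N):
    return [[m + n for m, n in zip(rm, rn)] for rm, rn in zip(M, N)]

def unbind(T, r):
    # With orthonormal roles, multiplying T by role r recovers its filler.
    return [sum(T[i][j] * r[j] for j in range(len(r))) for i in range(len(T))]

A, B = [1.0, 0.0], [0.0, 1.0]        # filler vectors for symbols A, B
r0, r1 = [1.0, 0.0], [0.0, 1.0]      # role vectors: position 0, position 1

AB = add(outer(A, r0), outer(B, r1))  # pattern realizing the structure "A B"
assert unbind(AB, r0) == A
assert unbind(AB, r1) == B
```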
10. Representation
11. Constraints
- Linguistics (markedness theory) → ICS
- ICS → Generative linguistics: Optimality Theory
- Key sources
- Prince & Smolensky 1993: ms., Rutgers TR
- McCarthy & Prince 1993: ms.
- Texts: Archangeli & Langendoen 1997; Kager 1999; McCarthy 2001
- Electronic archive: http://roa.rutgers.edu
12. Constraints
NOCODA: A syllable has no coda
H(.kæt.) = s_NOCODA < 0
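The NOCODA evaluation can be made concrete with a toy scorer (my illustration; the strength value s_NOCODA = −1 is an arbitrary choice, and syllables are taken as dot-delimited strings):

```python
# Toy NOCODA evaluator: a syllable violates NOCODA if it ends in a consonant.
VOWELS = set("aeiouæ")

def nocoda_violations(parse):
    return sum(1 for syl in parse.strip(".").split(".")
               if syl and syl[-1] not in VOWELS)

s_NOCODA = -1.0                        # negative strength: each coda lowers H

def harmony_nocoda(parse):
    return s_NOCODA * nocoda_violations(parse)

assert nocoda_violations(".kæt.") == 1     # the final t is a coda
assert harmony_nocoda(".kæt.") < 0
assert nocoda_violations(".kæ.") == 0      # open syllable: no violation
```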
13. Constraint Interaction I
- ICS → Grammatical theory
- Harmonic Grammar
- Legendre, Miyata, & Smolensky 1990 et seq.
14. Constraint Interaction I
The grammar generates the representation that maximizes H: this best satisfies the constraints, given their differential strengths.
Any formal language can be so generated.
15. Harmonic Grammar Parsing
- Simple, comprehensible network
- Simple grammar G
- X → A B; Y → B A
- Language
- Parsing
16. Simple Network Parser
- Fully self-connected, symmetric network
- Like the previously shown network, except with 12 units; representations and connections shown below
17. Explaining Productivity
- Approaching full-scale parsing of formal languages by neural-network Harmony maximization
- Have other networks that provably compute recursive functions → productive competence
- How to explain?
18. 1. Structured representations
19. 2. Structured connections
20. Proof of Productivity
- Productive behavior follows mathematically from combining
- the combinatorial structure of the vectorial representations encoding inputs & outputs, and
- the combinatorial structure of the weight matrices encoding knowledge
21. Constraint Interaction II: OT
- ICS → Grammatical theory
- Optimality Theory
- Prince & Smolensky 1993
22. Constraint Interaction II: OT
- Differential strength encoded in strict domination hierarchies
- Every constraint has complete priority over all lower-ranked constraints (combined)
- Approximate numerical encoding employs special (exponentially growing) weights
- "Grammars can't count"
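The exponential-weight encoding of strict domination can be checked mechanically. A sketch (my illustration, restricted to 0/1 violation profiles, where the encoding is exact):

```python
# Strict domination via exponentially growing weights: with weight 2**k for
# the constraint ranked k-th from the bottom, one violation of a higher-ranked
# constraint outweighs any combination of single violations below it.
from itertools import product

def weighted_H(violations, n_constraints):
    # violations[k] = marks from the constraint ranked k from the bottom;
    # negative sign because violations lower Harmony
    return -sum(v * (2 ** k) for k, v in enumerate(violations))

def ot_prefers(a, b):
    # Strict domination: compare violation profiles top-ranked constraint first.
    for va, vb in zip(reversed(a), reversed(b)):
        if va != vb:
            return va < vb
    return False

n = 4
for a, b in product(product(range(2), repeat=n), repeat=2):
    # Over all 0/1 profiles, the numerical ordering matches OT's ordering.
    assert ot_prefers(a, b) == (weighted_H(a, n) > weighted_H(b, n))
```

With unbounded violation counts the equivalence requires weights that grow fast enough, which is why the slide calls the numerical encoding approximate.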
23. Constraint Interaction II: OT
- Constraints are universal (Con)
- Candidate outputs are universal (Gen)
- Human grammars differ only in how these constraints are ranked: factorial typology
- First true contender for a formal theory of cross-linguistic typology
- 1st innovation of OT: constraint ranking
- 2nd innovation: Faithfulness
24. The Faithfulness/Markedness Dialectic
- cat: /kæt/ → .kæt. violates NOCODA. Why?
- FAITHFULNESS requires pronunciation = lexical form
- MARKEDNESS often opposes it
- Markedness/Faithfulness dialectic → diversity
- English: FAITH ≫ NOCODA
- Polynesian: NOCODA ≫ FAITH (French)
- Another markedness constraint M:
- Nasal Place Agreement ("Assimilation") (NPA)
ŋg ≻ ŋb, ŋd (velar)
nd ≻ md, ŋd (coronal)
mb ≻ nb, ŋb (labial)
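A minimal tableau calculator (my illustration, with hypothetical violation counts) shows how reranking FAITH and NOCODA yields the English-like vs Polynesian-like outcomes above:

```python
# Toy OT tableau for input /kæt/: ranking FAITH vs NOCODA differently
# selects different optimal outputs.
def violations(cand):
    return {"FAITH": cand["changes"], "NOCODA": cand["codas"]}

candidates = [
    {"form": ".kæt.", "changes": 0, "codas": 1},  # faithful, but has a coda
    {"form": ".kæ.",  "changes": 1, "codas": 0},  # deletes /t/: unfaithful
]

def winner(ranking):
    # Strict domination: minimize the violation tuple, top constraint first.
    return min(candidates,
               key=lambda c: tuple(violations(c)[con] for con in ranking))["form"]

assert winner(["FAITH", "NOCODA"]) == ".kæt."   # English: FAITH >> NOCODA
assert winner(["NOCODA", "FAITH"]) == ".kæ."    # Polynesian: NOCODA >> FAITH
```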
25. Optimality Theory
- Diversity of contributions to theoretical linguistics
- Phonology
- Syntax
- Semantics
- Here: new connections between linguistic theory & the cognitive science of language more generally
- Learning
- Neuro-genetic encoding
26. Nativism I: Learnability
- Learning algorithm
- Provably correct and efficient (under strong assumptions)
- Sources
- Tesar 1995 et seq.
- Tesar & Smolensky 1993, …, 2000
- If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E
27. Constraint Demotion Learning
If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E.
Correctly handles the difficult case: multiple violations in E
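A simplified sketch of Constraint Demotion (my implementation of the statement above; the full Tesar & Smolensky algorithm treats mark cancellation and stratified hierarchies more generally):

```python
# Constraint Demotion over a stratified hierarchy: strata is a list of
# constraint lists, index 0 = highest-ranked stratum.
def demote(strata, marks_heard, marks_expected):
    heard = set(marks_heard) - set(marks_expected)     # uncancelled marks of A
    expected = set(marks_expected) - set(marks_heard)  # uncancelled marks of E
    stratum_of = {c: i for i, s in enumerate(strata) for c in s}
    target = min(stratum_of[c] for c in expected)      # highest E-mark stratum
    for c in heard:
        if stratum_of[c] <= target:                    # minimal demotion:
            strata[stratum_of[c]].remove(c)            # move just below target
            while len(strata) <= target + 1:
                strata.append([])
            strata[target + 1].append(c)
    return [s for s in strata if s]

# Heard form A violates NOCODA; expected form E violates FAITH.
strata = [["NOCODA", "FAITH"]]                 # initially tied in one stratum
strata = demote(strata, ["NOCODA"], ["FAITH"])
assert strata == [["FAITH"], ["NOCODA"]]       # NOCODA demoted below FAITH
```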
28. Nativism I: Learnability
- M ≫ F is learnable with /in+possible/ → impossible
- im-, not in-, except when followed by …
- the exception that proves the rule; M = NPA
- M ≫ F is not learnable from data if there are no exceptions (alternations) of this sort: e.g., if the lexicon provides only inputs with mp, never np, then ✓M and ✓F, no M vs. F conflict, no evidence for their ranking
- Thus must have M ≫ F in the initial state, H0
29. Nativism II: Experimental Test
- Collaborators
- Peter Jusczyk
- Theresa Allocco
- (Elliott Moreton, Karen Arnold)
30. Nativism II: Experimental Test
- Linking hypothesis: more harmonic phonological stimuli → longer listening time
- More harmonic:
- ✓M ≻ *M, when equal on F
- ✓F ≻ *F, when equal on M
- When one must be chosen over the other, it is more harmonic to satisfy M: M ≫ F
- M = Nasal Place Assimilation (NPA)
31. 4.5 Months (NPA)
35. Nativism III: UGenome
- Can we combine
- the connectionist realization of Harmonic Grammar
- OT's characterization of UG
- to examine the biological plausibility of UG as innate knowledge?
- Collaborators
- Melanie Soderstrom
- Donald Mathis
36. Nativism III: UGenome
- The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device; no commitment to its (in)correctness
- Introduce an abstract genome notion parallel to (and encoding) the abstract neural network
- Is connectionist empiricism clearly more biologically plausible than symbolic nativism? No!
37. The Problem
- No concrete examples of such a LAD exist
- Even highly simplified cases pose a hard problem
- How can genes, which regulate production of proteins, encode symbolic principles of grammar?
- Test preparation: Syllable Theory
38. Basic syllabification: Function
- /underlying form/ → [surface form]
- Plural form of dish
- /dɪʃ+z/ → .dɪ.ʃəz.
- /CVCC/ → .CV.CVC. (the medial V is epenthetic)
39. Basic syllabification: Function
- Basic CV Syllable Structure Theory
- Prince & Smolensky 1993: Chapter 6
- Basic: no more than one segment per syllable position: .(C)V(C).
40. Basic syllabification: Function
- Correspondence Theory
- McCarthy & Prince 1995 ("M&P")
- /C1V2C3C4/ → .C1V2.C3VC4. (the unindexed V is epenthetic)
41. Syllabification: Constraints (Con)
- PARSE: Every element in the input corresponds to an element in the output (no deletion; M&P 95: MAX)
42. Syllabification: Constraints (Con)
- FILLV/C: Every output V/C segment corresponds to an input V/C segment; every syllable position in the output is filled by an input segment (no insertion/epenthesis; M&P 95: DEP)
43. Syllabification: Constraints (Con)
- ONSET: No V without a preceding C
44. Syllabification: Constraints (Con)
- NOCODA: No C without a following V
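The four constraints can be illustrated with a toy evaluator over schematic CV candidates (my notation, not the book's: dots mark syllable edges, uppercase segments are underlying, lowercase segments are epenthetic):

```python
# Toy mark assignment for PARSE, FILL, ONSET, NOCODA over CV candidates.
def marks(input_str, cand):
    segs = cand.replace(".", "")
    surface_input = [s for s in segs if s.isupper()]   # underlying segments kept
    return {
        "PARSE":  len(input_str) - len(surface_input),  # deleted segments
        "FILL":   sum(1 for s in segs if s.islower()),  # epenthetic positions
        "ONSET":  sum(1 for syl in cand.strip(".").split(".")
                      if syl and syl.lstrip("Cc") == syl),  # V-initial syllable
        "NOCODA": sum(1 for syl in cand.strip(".").split(".")
                      if syl and syl[-1] in "Cc"),          # C-final syllable
    }

# /CVCC/ → .CV.CvC. : everything parsed, one epenthetic v (cf. dish + z)
assert marks("CVCC", ".CV.CvC.") == {"PARSE": 0, "FILL": 1,
                                     "ONSET": 0, "NOCODA": 1}
# Deleting both final Cs instead incurs two PARSE marks:
assert marks("CVCC", ".CV.")["PARSE"] == 2
```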
45. Network Architecture
(Diagram: input /C1 C2/; output C1 V C2)
46. Connection substructure
47. PARSE
- All connection coefficients are 2
48. ONSET
- All connection coefficients are −1
49. Network Dynamics
50. Crucial Open Question (Truth in Advertising)
- Relation between strict domination and neural networks?
- Apparently not a problem in the case of the CV Theory
51. To be encoded
- How many different kinds of units are there?
- What information is necessary (from the source unit's point of view) to identify the location of a target unit, and the strength of the connection with it?
- How are constraints initially specified?
- How are they maintained through the learning process?
52. Unit types
- Input units: C, V
- Output units: C, V, x
- Correspondence units: C, V
- 7 distinct unit types
- Each represented in a distinct sub-region of the abstract genome
- Help ourselves to implicit machinery to spell out these sub-regions as distinct cell types, located in a grid as illustrated
53. Connectivity geometry
54. Constraint: PARSE
- Input units grow south and connect
- Output units grow east and connect
- Correspondence units grow north & west and connect with input & output units
55. Constraint: ONSET
- Short connections grow north-south between adjacent V output units,
- and between the first V node and the first x node
56. Direction of projection growth
- Topographic organizations widely attested throughout neural structures
- Activity-dependent growth: a possible alternative
- Orientation information (axes)
- Chemical gradients during development
- Cell age: a possible alternative
57. Projection parameters
- Direction
- Extent
- Local
- Non-local
- Target unit type
- Strength of connections encoded separately
58. Connectivity Genome
- Contributions from ONSET and PARSE
59. ONSET
(Genome fragment for ONSET: VO and x0 unit types, growth directions N and S, target x0)
60. Encoding connection strength
- Network-level specification
- For each constraint Ci, need to embody
- constraint strength si
- connection coefficients (for α → β cell-type pairs)
- The product of these is the contribution of Ci to the α → β connection weight
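That product rule can be sketched in a few lines (coefficient values 2 for PARSE and −1 for ONSET as on the earlier slides; the strengths s_i here are hypothetical numbers of my choosing):

```python
# Weight on an alpha -> beta connection = sum over constraints of
# (constraint strength s_i) * (that constraint's coefficient for the pair).
coefficients = {
    # (source cell type, target cell type) -> coefficient, per constraint
    "PARSE": {("input_C", "corr_C"): 2.0},      # PARSE coefficients are 2
    "ONSET": {("output_V", "output_V"): -1.0},  # ONSET coefficients are -1
}
strengths = {"PARSE": 1.0, "ONSET": 3.0}        # hypothetical strengths s_i

def weight(src, tgt):
    return sum(strengths[c] * coefficients[c].get((src, tgt), 0.0)
               for c in coefficients)

assert weight("input_C", "corr_C") == 2.0       # only PARSE contributes
assert weight("output_V", "output_V") == -3.0   # s_ONSET * (-1)
```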
61. Processing
62. Development
63. Learning
64. Learning Behavior
- A simplified system can be solved analytically
- The learning algorithm turns out to be: Δsi ∝ ε × (violations of constraint i)
65. Abstract Gene Map
(Figure: abstract gene map with regions for General Developmental Machinery; Connectivity genes per unit type (C-I, V-I, C-C) specifying direction, extent, and target; and Constraint Coefficient genes, e.g. CORRESPOND: CC (C-I, C-O) = 1, VC (V-I, V-O) = 1)
66. Summary
- Described an attempt to integrate
- connectionist theory of mental processes (computational neuroscience, cognitive psychology)
- symbolic theory of
- mental functions (philosophy, linguistics)
- representations
- general structure (philosophy, AI)
- specific structure (linguistics)
- Informs theory of UG
- form, content
- genetic encoding
67. Thanks for your attention!