1 Starting With Complex Primitives Pays Off: Complicate Locally, Simplify Globally
ARAVIND K. JOSHI
Department of Computer and Information Science and Institute for Research in Cognitive Science
CogSci 2003, Boston, August 1, 2003
2 Outline
3 Introduction
- Formal systems to specify a grammar formalism
- Start with primitives (basic structures or building blocks) as simple as possible, and then introduce various operations for constructing more complex structures
- Alternatively,
4 Introduction: CLSG
- Start with complex (more complicated) primitives which directly capture some crucial linguistic properties, and then introduce some general operations for composing them
  -- Complicate Locally, Simplify Globally (CLSG)
- The CLSG approach is characterized by localizing almost all complexity in the set of primitives; this is its key property
5 Introduction: CLSG localization of complexity
- Specification of the finite set of complex primitives becomes the main task of a linguistic theory
- CLSG pushes all dependencies to become local, i.e., they arise initially in the primitive structures
6 Constrained formal systems: another dimension
- Unconstrained formal systems -- add linguistic constraints, which become, in a sense, all stipulative
- Alternatively, start with a constrained formal system, just adequate for describing language
  -- formal constraints become universal, in a sense
  -- other linguistic constraints become stipulative
- Convergence: the CLSG approach leads to constrained formal systems
7 The CLSG approach
- The CLSG approach has led to several new insights into
  - Syntactic description
  - Semantic composition
  - Language generation
  - Statistical processing
  - Psycholinguistic properties
  - Discourse structure
- The CLSG approach will be described via a particular class of grammars, tree-adjoining grammars (TAG, lexicalized as LTAG), which illustrate the approach to the fullest
- Simple examples to communicate the interplay between formal analysis and linguistic and processing issues
8 Context-free Grammars
- The domain of locality is the one-level tree -- these are the primitive building blocks

CFG, G:  S -> NP VP        NP -> DET N
         VP -> V NP        VP -> VP ADV
         DET -> the        N -> man | car
         V -> likes        ADV -> passionately
[Tree diagrams: the one-level trees for these rules, and the derived tree for "the man likes the car passionately"]
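A minimal sketch of the point being made: in plain Python (no parsing library; the encoding and identifiers are mine, not from the talk), each CFG rule is exactly a one-level tree, and a derivation just glues these one-level building blocks together top-down.

import random

# One-level trees of the CFG above: parent -> possible child sequences.
RULES = {
    "S":   [["NP", "VP"]],
    "VP":  [["V", "NP"], ["VP", "ADV"]],
    "NP":  [["DET", "N"]],
    "DET": [["the"]],
    "N":   [["man"], ["car"]],
    "V":   [["likes"]],
    "ADV": [["passionately"]],
}

def expand(symbol, depth=0):
    """Glue one-level trees together top-down; terminals are plain strings."""
    if symbol not in RULES:
        return symbol                      # terminal: the, man, likes, ...
    options = RULES[symbol]
    rhs = options[0] if depth > 4 else random.choice(options)  # cap VP recursion
    return (symbol, [expand(s, depth + 1) for s in rhs])

print(expand("S"))   # e.g. ('S', [('NP', ...), ('VP', ...)])

Note how the predicate (likes) and its arguments never appear in the same one-level tree: that is precisely the locality limitation the next slide discusses.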
9 Context-free Grammars
- The arguments of the predicate are not in the same local domain
- They can be brought together in the same domain by introducing a rule
    S -> NP V NP
- However, then the VP structure is lost
- Further, the local domains of a CFG are not necessarily lexicalized
- Two themes: Domain of Locality and Lexicalization
10 Towards CLSG: Lexicalization
- Lexical item: one or more elementary structures (trees, directed acyclic graphs), which are syntactically and semantically encapsulated
- Universal combining operations
- Grammar = Lexicon
11 Lexicalized Grammars
- Context-free grammar (CFG)

CFG, G:  S -> NP VP            (non-lexical)
         VP -> V NP            (non-lexical)
         VP -> VP ADV          (non-lexical)
         NP -> Harry | peanuts (lexical)
         V -> likes            (lexical)
         ADV -> passionately   (lexical)

[Tree diagram: derived tree for "Harry likes peanuts passionately"]
12 Weak Lexicalization
- Greibach Normal Form (GNF): CFG rules are of the form
    A -> a B1 B2 ... Bn    or    A -> a
  This lexicalization gives the same set of strings but not the same set of trees, i.e., not the same set of structural descriptions. Hence, it is a weak lexicalization. For example, G: S -> S S, S -> a has a GNF equivalent S -> a S | a: the same strings (a^n, n >= 1) but only right-branching trees.
13 Strong Lexicalization
- Same set of strings and same set of trees (structural descriptions)
- Tree substitution grammars (TSG)
- Increased domain of locality
- Substitution as the only combining operation
14 Substitution
[Tree diagram: tree b, rooted in X, is substituted at a frontier node labeled X in tree a, yielding the derived tree g]
15 Strong Lexicalization
- Tree substitution grammars (TSG)

CFG, G:  S -> NP VP     NP -> Harry
         VP -> V NP     NP -> peanuts
         V -> likes

[Tree diagrams: TSG, G' -- a1, the tree anchored on likes, S(NP!, VP(V(likes), NP!)), plus the nominal trees NP(Harry) and NP(peanuts)]
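A small sketch of substitution over this TSG, in the same spirit as the CFG sketch above. Trees are (label, children) tuples and a substitution site is a leaf whose label carries a trailing '!'; this encoding is my own illustration, not notation from the talk.

def substitute(tree, init_tree, done=None):
    """Replace the leftmost open substitution site (a leaf whose label is
    init_tree's root label plus '!') with init_tree."""
    done = done if done is not None else [False]
    label, children = tree
    if not done[0] and not children and label == init_tree[0] + "!":
        done[0] = True
        return init_tree
    return (label, [substitute(c, init_tree, done) for c in children])

def yield_of(tree):
    label, children = tree
    return label if not children else " ".join(yield_of(c) for c in children)

# The TSG G' of the slide: a1 anchored on likes, a2 and a3 nominal trees.
a1 = ("S", [("NP!", []), ("VP", [("V", [("likes", [])]), ("NP!", [])])])
a2 = ("NP", [("Harry", [])])
a3 = ("NP", [("peanuts", [])])

print(yield_of(substitute(substitute(a1, a2), a3)))   # Harry likes peanuts

The predicate and both of its argument slots now live in one primitive (a1): the domain of locality has grown, and every primitive is lexicalized.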
16 Insufficiency of TSG
- Formal insufficiency of TSG

CFG, G:  S -> S S   (non-lexical)
         S -> a     (lexical)

[Tree diagrams: a TSG G' with lexicalized trees a1, a2, a3 built from S -> S S and S -> a]
17 Insufficiency of TSG
[Tree diagrams: the TSG G' trees a1, a2, a3, and a tree g of G in which material grows on both sides of the root spine]
g grows on both sides of the root.
G' can generate all strings of G but not all trees of G. CFGs cannot be lexicalized by TSGs, i.e., by substitution alone.
18 Adjoining
[Tree diagram: auxiliary tree b, with root and foot node both labeled X, adjoined at a node labeled X in tree a, yielding g]
Tree b is adjoined to tree a at the node labeled X in tree a.
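A toy sketch of adjoining, in the same (label, children) tuple encoding as the substitution sketch; the foot node of an auxiliary tree is marked with '*'. A real implementation would address nodes by position; this sketch simply takes the leftmost eligible node.

def adjoin(tree, aux, site):
    """Adjoin auxiliary tree `aux` (root `site`, foot `site`+'*') at the
    leftmost internal node of `tree` labeled `site`: the subtree rooted
    there is excised and re-attached at aux's foot node."""
    def plug_foot(t, subtree):
        label, children = t
        if label == site + "*" and not children:
            return subtree                 # excised subtree lands at the foot
        return (label, [plug_foot(c, subtree) for c in children])
    def walk(t, done):
        label, children = t
        if not done[0] and label == site and children:
            done[0] = True
            return plug_foot(aux, t)
        return (label, [walk(c, done) for c in children])
    return walk(tree, [False])

a1 = ("S", [("a", [])])                # initial tree: S -> a
b  = ("S", [("S*", []), ("a", [])])    # auxiliary tree: S -> S* a

t = adjoin(a1, b, "S")                 # S(S(a), a): string "a a"
t = adjoin(t, b, "S")                  # adjoining again grows the tree
print(t)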
19 With Adjoining
G: S -> S S, S -> a

[Tree diagrams: the lexicalized trees a1, a2, a3 and the tree g derived with adjoining]

Adjoining a2 to a3 at the S node (the root node), and then adjoining a1 to the S node of the derived tree, we obtain g.
CFGs can be lexicalized by LTAGs. Adjoining is crucial for lexicalization.
Adjoining arises out of lexicalization.
20 Lexicalized TAG (LTAG)
- Finite set of elementary trees anchored on lexical items
  -- extended projections of lexical anchors
  -- encapsulate syntactic and semantic dependencies
- Elementary trees: Initial and Auxiliary
- Operations: Substitution and Adjoining
- Derivation
  - Derivation tree: how the elementary trees are put together
  - Derived tree
21 Localization of Dependencies
- agreement: person, number, gender
- subcategorization: sleeps (null), eats (NP), gives (NP NP), thinks (S)
- filler-gap: who did John ask Bill to invite e
- word order: within and across clauses, as in scrambling and clitic movement
- function-argument: all arguments of the lexical anchor are localized (see the lexicon sketch below)
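An illustrative fragment of an LTAG-style lexicon in which each anchor carries its subcategorization frame locally, as named tree templates. The template notation and the tree names are mine, chosen to mirror the subcategorization list above.

# Each anchor maps to the elementary trees (supertags) it can anchor;
# '!' marks a substitution slot, so the anchor's arguments are all local.
LEXICON = {
    "sleeps": ["S(NP!, VP(V))"],            # no complement
    "eats":   ["S(NP!, VP(V, NP!))"],       # NP complement
    "gives":  ["S(NP!, VP(V, NP!, NP!))"],  # NP NP complements
    "thinks": ["S(NP!, VP(V, S!))"],        # sentential complement
}

for anchor, trees in LEXICON.items():
    print(anchor, "->", trees)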
22 Localization of Dependencies
- word clusters (flexible idioms), non-compositional aspects: take a walk, give a cold shoulder to
- word co-occurrences
- lexical semantic aspects
- statistical dependencies among heads
- anaphoric dependencies
23 LTAG Examples
[Tree diagrams: two elementary trees for likes -- a1, the transitive tree S(NP!, VP(V(likes), NP!)), and a2, the object-extraction tree S(NP!, S(NP!, VP(V(likes), NP(e))))]
There are some other trees for likes: subject extraction, topicalization, subject relative, object relative, passive, etc.
24 LTAG: A derivation
[Tree diagrams: the elementary trees for "who does Bill think Harry likes" -- a2, the object-extraction tree for likes; b1, the auxiliary tree for think, S(NP!, VP(V(think), S*)); b2, the auxiliary tree for does, S(V(does), S*); and the nominal trees a3 = NP(who), a4 = NP(Harry), a5 = NP(Bill)]
25 LTAG: A Derivation
who does Bill think Harry likes
[Tree diagrams: the same elementary trees a2 (likes), b1 (think), b2 (does), a3 (who), a4 (Harry), a5 (Bill), with arrows marking the substitutions and adjoinings used in the derivation]
26 LTAG: Derived Tree
who does Bill think Harry likes
[Tree diagram: derived tree S(NP(who), S(V(does), S(NP(Bill), VP(V(think), S(NP(Harry), VP(V(likes), NP(e)))))))]
27 LTAG: Derivation Tree
who does Bill think Harry likes

a2 (likes)
 |-- a3 (who)         substitution
 |-- a4 (Harry)       substitution
 |-- b1 (think)       adjoining
      |-- a5 (Bill)   substitution
      |-- b2 (does)   adjoining

Compositional semantics is done on this derivation structure; it is related to dependency diagrams.
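A sketch of the derivation tree as a data structure, read off the slide: the record of which elementary tree attached to which, and by which operation. The class name and field names are mine.

from dataclasses import dataclass, field

@dataclass
class DNode:
    tree: str                    # elementary tree name + anchor
    op: str = "root"             # how this tree attached to its parent
    children: list = field(default_factory=list)

derivation = DNode("a2:likes", children=[
    DNode("a3:who",   "subst"),
    DNode("a4:Harry", "subst"),
    DNode("b1:think", "adjoin", [
        DNode("a5:Bill", "subst"),
        DNode("b2:does", "adjoin"),
    ]),
])

def show(n, indent=0):
    print("  " * indent + f"{n.tree} ({n.op})")
    for c in n.children:
        show(c, indent + 1)

show(derivation)

Because each elementary tree localizes the arguments of its anchor, this derivation tree is essentially a dependency structure over the anchors (likes, who, Harry, think, Bill, does), which is why semantic composition can be run on it directly.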
28 Nested Dependencies
The architecture of the elementary trees a and b determines the nature of the dependencies described by the TAG grammar.
[Tree diagrams: elementary trees of grammar G contributing an a to the left and a b to the right of the S spine; iterated adjoining derives strings such as a a a b b b, with nested dependencies]
29 Crossed dependencies
b
a
S
S
a
a
S
S
b
b
S
Architecture of elementary trees a and b
determines the kinds of dependencies that can
be characterized b is one level below a and to
the right of the spine
30 Topology of Elementary Trees: Crossed dependencies
[Tree diagrams: the derived tree and its linear structure, the string a a b b]
31 Topology of Elementary Trees: Crossed dependencies
- Dependencies are nested on the tree
[Tree diagrams: the same derived tree, with the dependency links drawn on the string a a b b]
- (Linear) crossed dependencies
32 Examples: Nested Dependencies
- Center embedding of relative clauses in English
  (1) The rat1 the cat2 chased2 ate1 the cheese
- Center embedding of complement clauses in German
  (2) Hans1 Peter2 Marie3 schwimmen3 lassen2 sah1
      (Hans saw Peter let/make Marie swim)
- There are important differences between (1) and (2)
33 Examples: Crossed Dependencies
- Center embedding of complement clauses in Dutch
  Jan1 Piet2 Marie3 zag1 laten2 zwemmen3
  (Jan saw Piet let/make Marie swim)
- It is possible to obtain a wide range of complex dependencies, i.e., complex combinations of nested and crossed dependencies. Such patterns arise in word order phenomena such as scrambling and clitic movement, and also due to scope ambiguities.
34 LTAG: Some Formal Properties
- TAGs are more powerful than CFGs, both weakly and strongly, i.e., in terms of both
  -- the string sets they characterize, and
  -- the structural descriptions they support
- TAGs carry over all formal properties of CFGs, modified in an appropriate way
  -- polynomial parsing: O(n^6), as compared to O(n^3) for CFGs
- TAGs correspond to Embedded Pushdown Automata (EPDA) in the same way as PDAs correspond to CFGs (Vijay-Shanker, 1987)
35 LTAG: Some Formal Properties
- An EPDA is like a PDA; however, at each move it can
  -- create a specified (by the move) number of stacks to the left and right of the current stack and push specified information onto them
  -- push or pop on the current stack
  -- at the end of the move, the stack pointer moves to the top of the rightmost stack
  -- if a stack becomes empty, it drops out
(a schematic sketch of one move follows)
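A schematic sketch of one EPDA move as a data-structure operation (illustrative only, not a full automaton: the store is a list of stacks, and the input tape and finite control are omitted). The function name and parameters are mine.

def epda_move(stacks, cur, new_left=(), new_right=(), push=(), pop=0):
    """One move: pop/push on the current stack, create new stacks to its
    left and right, drop empty stacks, and return the new store with the
    pointer on the rightmost stack."""
    stacks = list(stacks)
    stacks[cur] = stacks[cur][:len(stacks[cur]) - pop] + list(push)
    stacks[cur + 1:cur + 1] = [list(s) for s in new_right]  # right of current
    stacks[cur:cur] = [list(s) for s in new_left]           # left of current
    stacks = [s for s in stacks if s]        # empty stacks drop out
    return stacks, len(stacks) - 1           # pointer: top of rightmost stack

store, ptr = [["A", "B"]], 0
store, ptr = epda_move(store, ptr, new_right=[["C"]], pop=1)
print(store, ptr)   # [['A'], ['C']] 1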
36 LTAG: Some Formal Properties
[Figure: an EPDA, with input tape, finite control, and the current stack]
37 LTAG: Some Formal Properties
[Figure: the EPDA after a move -- the old current stack plus the stacks newly created by the move]
38 LTAG: Some Formal Properties
- TAGs (more precisely, the languages of TAGs) belong to the class of languages called mildly context-sensitive languages (MCSL), characterized by
  - polynomial parsing complexity
  - grammars for the languages in this class can characterize a limited set of patterns of nested and crossed dependencies and their combinations
  - languages in this class have the constant growth property, i.e., if the sentences are arranged in increasing order of length, the lengths grow only by a bounded amount
  - this class properly includes the CFLs
39 LTAG: Some Formal Properties
- MCSL hypothesis: natural languages belong to MCSL
- This has generated very fruitful research in
  - comparing different linguistic and formal proposals
  - discovering provable equivalences among formalisms and constrained formal systems
  - providing new perspectives on linguistic theories and processing issues
- In general, it leads to a fruitful interplay of formal frameworks, substantive linguistic theories, and computational and processing paradigms
40 Two alternate perspectives on LTAG
- Supertagging
- Flexible composition
41 Supertagging: supertag disambiguation
Two supertags for likes:
[Tree diagrams: a1, the transitive tree, and a2, the object-extraction tree, as on slide 23]
Elementary trees associated with a lexical item can be regarded as super parts-of-speech (super POS, or supertags) associated with that item.
42 Supertagging: supertag disambiguation
- Given a corpus parsed by an LTAG grammar,
  - we have statistics of supertags -- unigram, bigram, trigram, etc.
  - these statistics combine the lexical statistics as well as the statistics of the constructions in which the lexical items appear
- Apply the statistical disambiguation techniques used for standard parts-of-speech (POS), such as N (noun), V (verb), P (preposition), etc., to supertagging: Joshi and Srinivas (1994), Srinivas and Joshi (1998) (a toy sketch follows)
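A toy sketch of supertag disambiguation as sequence tagging. For brevity this is a bigram Viterbi (the performance numbers quoted later are for a trigram model, which is analogous); the lexicon, supertag names, and all probabilities below are made up for illustration.

def viterbi(words, lexicon, emit, trans, start="<s>"):
    """lexicon: word -> candidate supertags (assumed to cover every word);
    emit[(tag, word)] and trans[(prev_tag, tag)] are made-up smoothed
    log-probabilities, defaulting to -10.0 when unseen."""
    best = {start: (0.0, [])}
    for w in words:
        new = {}
        for tag in lexicon[w]:
            score, path = max(
                (p + trans.get((prev, tag), -10.0) +
                 emit.get((tag, w), -10.0), path)
                for prev, (p, path) in best.items())
            new[tag] = (score, path + [tag])
        best = new
    return max(best.values())[1]

lexicon = {"Harry": ["np"], "peanuts": ["np"],
           "likes": ["a1_trans", "a2_objextr"]}   # two supertags for likes
trans = {("<s>", "np"): -0.1, ("np", "a1_trans"): -0.5,
         ("np", "a2_objextr"): -3.0, ("a1_trans", "np"): -0.2}
print(viterbi(["Harry", "likes", "peanuts"], lexicon, {}, trans))
# -> ['np', 'a1_trans', 'np']: the transitive supertag wins in this context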
43 Supertagging
[Figure: a lattice of candidate supertags (a1 ... a13, b1 ... b4) over the sentence "the purchase price includes two ancillary companies"]
On average, a lexical item has about 15 to 20 supertags.
44 Supertagging
[Figure: the same supertag lattice, with the correct supertag for each word highlighted]
- Select the correct supertag for each word (shown in blue in the original figure)
- The correct supertag for a word means the supertag that corresponds to that word in the correct parse of the sentence
45 Supertagging -- performance
- Training corpus: 1 million words
- Test corpus: 47,000 words
- Baseline (assign the most likely supertag): 77%
- Trigram supertagger: 92%, Srinivas (1997)
- Some recent results: 93%, Chen and Vijay-Shanker (2000)
- Improvement from 77% to 93%
- Compare with standard POS tagging, where the improvement is from over 90% (baseline) to 98%
46 Abstract characterization of supertagging
- Complex (richer) descriptions of primitives (anchors)
- Contrary to the standard mathematical convention of keeping primitives as simple as possible
- Associate with each primitive all the information associated with it
47 Abstract characterization of supertagging
- Making descriptions of primitives more complex
  - increases the local ambiguity, i.e., there are more descriptions for each primitive
  - however, these richer descriptions of primitives locally constrain each other
  - analogy to a jigsaw puzzle -- the richer the description of each piece, the more the pieces constrain one another
48 Complex descriptions of primitives
- Making the descriptions of primitives more complex
  - allows statistics to be computed over these complex descriptions
  - these statistics are more meaningful
  - local statistical computations over these complex descriptions lead to robust and efficient processing
49 Flexible Composition: Adjoining as Wrapping
[Tree diagram: tree a split at a node X into a1, the supertree of a at X, and a2, the subtree of a at X]
50 Flexible Composition: Adjoining as Wrapping
[Tree diagram: a1 (the supertree of a at X) and a2 (the subtree of a at X) wrapped around the auxiliary tree b, yielding g]
a is wrapped around b, i.e., the two components a1 and a2 are wrapped around b.
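A sketch of this dual view of adjoining, in the same tuple-tree encoding as the earlier sketches: split alpha at a node into a supertree (everything above, with a hole) and a subtree, then wrap the two parts around beta. 'HOLE' and the function names are my own devices.

def split(tree, site):
    """Split at the leftmost node labeled `site`: return (supertree with a
    HOLE where the node was, subtree rooted at that node)."""
    label, children = tree
    if label == site:
        return ("HOLE", []), tree
    for i, c in enumerate(children):
        sup, sub = split(c, site)
        if sub is not None:
            return (label, children[:i] + [sup] + children[i + 1:]), sub
    return None, None

def wrap(alpha, site, beta):
    """alpha wrapped around beta: alpha's supertree goes above beta, and
    alpha's subtree lands at beta's foot node (`site` + '*')."""
    sup, sub = split(alpha, site)
    def at_foot(t):
        label, children = t
        if label == site + "*" and not children:
            return sub
        return (label, [at_foot(c) for c in children])
    def fill(t):
        label, children = t
        if label == "HOLE":
            return at_foot(beta)
        return (label, [fill(c) for c in children])
    return fill(sup)

alpha = ("S", [("x", []), ("VP", [("y", [])])])
beta  = ("VP", [("VP*", []), ("z", [])])       # auxiliary tree on VP
print(wrap(alpha, "VP", beta))   # same derived tree as adjoining beta at VP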
51 Flexible Composition: Wrapping as substitutions and adjoinings
[Tree diagrams: the object-extraction tree a for likes, with its NP(wh) component and embedded clause, composed with the auxiliary tree b for think; the substitution and adjoining links are shown]
- We can also view this composition as a wrapped around b
- Flexible composition
52 Flexible Composition: Wrapping as substitutions and adjoinings
[Tree diagrams: a split into its two components a1 and a2, composing with b]
a1 and a2 are the two components of a; a1 is attached (adjoined) to the root node S of b, and a2 is attached (substituted) at the foot node S of b.
This leads to multi-component TAG (MC-TAG).
53 Multi-component LTAG (MC-LTAG)
[Tree diagram: a two-component set a1, a2 attaching to nodes of an elementary tree b]
The two components are used together in one composition step. Both components attach to nodes in b, an elementary tree. This preserves locality. The representation can be used for both
-- predicate-argument relationships
-- non-predicate-argument information, such as scope, focus, etc.
54 Tree-Local Multi-component LTAG (MC-LTAG)
- How can the components of MC-LTAG compose while preserving the locality of LTAG?
- Tree-Local MC-LTAG
  -- components of a set compose only with an elementary tree or an elementary component
- Flexible composition
- Tree-Local MC-LTAGs are weakly equivalent to LTAGs
- However, Tree-Local MC-LTAGs provide structural descriptions not obtainable by LTAGs -- increased strong generative power
55 Scope ambiguities: Example
(every student hates some course)
[Tree diagrams: a3, the hates tree S(NP!, VP(V(hates), NP!)); two-component sets for the quantified NPs -- a scope component (a11 for every, a21 for some, each an auxiliary S tree) and an NP component (a12 = NP(DET(every), N!), a22 = NP(DET(some), N!)); a4 = N(student), a5 = N(course)]
56 Derivation with scope information: Example
(every student hates some course)
[Tree diagrams: the same elementary trees, with the attachments of the derivation marked]
57 Derivation tree with scope information: Example
(every student hates some course)

a3 (hates)
 |-- a11 (scope part of every)   adjoined at 0
 |-- a21 (scope part of some)    adjoined at 0
 |-- a12 (every)   substituted at 1;   a4 (student) at address 2 of a12
 |-- a22 (some)    substituted at 2.2; a5 (course) at address 2 of a22

- a11 and a21 are both adjoined at the root of a3 (hates)
- They can be adjoined in any order, thus representing the two scope readings (an underspecified representation)
- The scope readings are represented in the LTAG derivation itself
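A tiny illustration of this underspecification: when two scope components are multiply adjoined at the same node, each admissible ordering is one scope reading. The predicate-logic strings are illustrative notation, not the talk's semantics.

from itertools import permutations

scope_parts = ["every(x, student(x), .)", "some(y, course(y), .)"]
core = "hates(x, y)"

# Each successive part wraps the reading built so far, so the part
# applied last ends up with widest scope.
for order in permutations(scope_parts):
    reading = core
    for part in order:
        reading = part.replace(".", reading)
    print(reading)
# every(x, student(x), some(y, course(y), hates(x, y)))
# some(y, course(y), every(x, student(x), hates(x, y)))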
58 Tree-Local MC-LTAG and flexible semantics
- Applications to word order variations, including scrambling, clitic movement, and even scope ambiguities
- All word order variations up to two levels of embedding (three clauses in all) can be correctly described by tree-local MC-TAGs with flexible composition
  -- "correctly" means providing appropriate structural descriptions, i.e., correct semantics
- However,
59 Tree-Local MC-LTAG and flexible semantics
- Beyond two levels of embedding, not all patterns of word order variation will be correctly described: Joshi, Becker, and Rambow (2002)
- Thus the class of tree-local MC-TAG has the property that, for any grammar G in this class, if G works up to two levels of embedding, then it fails beyond two levels for at least some patterns of word order
60 Tree-Local MC-LTAG and flexible semantics: Main idea
61 Tree-Local MC-LTAG and flexible semantics
- Three clauses, C1, C2, and C3; each clause can be either a single elementary tree or a multi-component tree set with two components
- The verb in C1 takes the verb in C2 as its argument, and the verb in C2 takes the verb in C3 as its argument
- Flexible composition allows us to compose the three clauses in three ways
62 Tree-Local MC-LTAG and flexible semantics
Three ways of composing C1, C2, and C3:
[Figure: three composition patterns over C1 C2 C3; the composition arrows are lost in this extraction]
- The third mode of composition is crucial for completing the proof for two levels of embedding
- It is not available beyond two levels without violating semantics!
63 Psycholinguistic processing issues
- Supertagging in psycholinguistic models
- Processing of crossed and nested dependencies
- A new twist to the competence-performance distinction
  -- a different perspective on this distinction
64 Supertagging in psycholinguistic models
- Convergence of perspectives on the roles of computational linguistics and psycholinguistics
- Due to a shift to lexical and statistical approaches to sentence processing
- A particular integration by Kim, Srinivas, and Trueswell (2002) from the perspective of LTAG
65 Supertagging in psycholinguistic models
- Supertagging: much of the computational work of linguistic analysis, traditionally viewed as structure building, can be viewed as lexical disambiguation
- Integration of supertagging in a psycholinguistic model
  -- one would predict that many of the initial processing commitments of syntactic analysis are made at the lexical level, in the sense of supertagging
66 Supertagging in psycholinguistic models
- Integration of
  - a constraint-based lexicalist theory (CBL): MacDonald, Pearlmutter, and Seidenberg (1994), Trueswell and Tanenhaus (1984)
  - a lexicon represented as supertags, with their distribution estimated from the supertagging experiments described earlier (Srinivas (1997))
67 Supertagging in psycholinguistic models
- Distinction between PP attachment ambiguities in
  (1) I saw the man in the park with a telescope
  (2) The secretary of the general with red hair
Two supertags for with:
[Tree diagrams: a VP-attaching supertag, VP(VP*, PP(P(with), NP!)), and an NP-attaching supertag, NP(NP*, PP(P(with), NP!))]
68 Supertagging in psycholinguistic models
- Distinction between PP attachment ambiguities in
  (1) I saw the man in the park with a telescope
  (2) The secretary of the general with red hair
- In (1) the ambiguity is lexical in the supertagging sense
- In (2) the ambiguity is resolved at the level of attachment computation (structure building)
- The ambiguity in (1) is resolved at an earlier level of processing, while in (2) it is resolved at a later level of processing
69 Supertagging in psycholinguistic models
(3) The student forgot her name
(4) The student forgot that the homework was due today
In (3) forgot takes an NP complement, while in (4) it takes a that-S complement
-- Thus there will be two different supertags for forgot
-- The ambiguity in (3) and (4) is lexical (in the supertagging sense) and need not be viewed as a structural ambiguity
Kim, Srinivas, and Trueswell (2002) present a neural-net based architecture using supertags and confirm these and related results.
70 Processing of nested and crossed dependencies
- CFG: the associated automata are PDAs
- TAG: the associated automata are EPDAs (embedded PDAs) (Vijay-Shanker (1987))
- EPDAs provide a new perspective on the relative ease or difficulty of processing the crossed and nested dependencies which arise in center-embedded complement constructions
71 Processing of nested and crossed dependencies
- Hans1 Peter2 Marie3 schwimmen3 lassen2 sah1 (German, nested order)
- Jan1 Piet2 Marie3 zag1 laten2 zwemmen3 (Dutch, crossed order)
- Jan saw Peter let/make Mary swim (English, iterated order, no center embedding)
Center embedding of complements: each verb is embedded under a higher verb, except the matrix verb (the top-level tensed verb).
72 Processing of nested and crossed dependencies
- Hans1 Peter2 Marie3 schwimmen3 lassen2 sah1 (German, nested order)
- Jan1 Piet2 Marie3 zag1 laten2 zwemmen3 (Dutch, crossed order)
Bach, Brown, and Marslen-Wilson (1986): stated very simply, they showed that Dutch is easier than German -- the crossed order is easier to process than the nested order.
73 Processing of nested and crossed dependencies
- German and Dutch subjects performed two tasks -- rating comprehensibility, and a test of successful comprehension -- on matched sets of sentences which varied in complexity from a simple sentence to one containing three levels of embedding: there was no difference between Dutch and German for sentences within the normal range (up to one level), but a significant preference emerged for the Dutch crossed order. Bach, Brown, and Marslen-Wilson (1986)
74 Processing of nested and crossed dependencies
(1) Hans1 Peter2 Marie3 schwimmen3 lassen2 sah1 (German, nested order)
(2) Jan1 Piet2 Marie3 zag1 laten2 zwemmen3 (Dutch, crossed order)
- It is not enough to locate a well-formed structure; we also need a place for it to go
  -- In (1) a PDA can locate the innermost N3 and V3, but at this stage we do not know where this structure belongs: we do not yet have the higher verb, V2
- A PDA is inadequate for (1) and, of course, for (2)
75 Processing of nested and crossed dependencies
(1) Hans1 Peter2 Marie3 schwimmen3 lassen2 sah1 (German, nested order)
(2) Jan1 Piet2 Marie3 zag1 laten2 zwemmen3 (Dutch, crossed order)
- An EPDA can precisely model the processing of (1) and (2), consistent with the principle that when a well-formed structure is identified, it is POPPED only if there is a place for it to go, i.e., only if the structure in which it fits has already been POPPED
  -- the Principle of Partial Interpretation (PPI), Joshi (1990), based on Bach, Brown, and Marslen-Wilson (1986)
76 Processing of nested and crossed dependencies
(1) Hans1 Peter2 Marie3 schwimmen3 lassen2 sah1 (German, nested order)
(2) Jan1 Piet2 Marie3 zag1 laten2 zwemmen3 (Dutch, crossed order)
- Measure of complexity: the maximum number of items from the input that have to be held back before the sentence processing (interpretation) is complete
- German is about twice as hard as Dutch (a toy simulation follows)
77 Processing of nested and crossed dependencies
- The Principle of Partial Interpretation (PPI) can be correctly instantiated for both Dutch and German, resulting in a complexity for German about twice that for Dutch
- Among all possible strategies consistent with PPI, we choose the one, say M1, which makes Dutch as hard as possible
- Among all possible strategies consistent with PPI, we choose the one, say M2, which makes German as easy as possible
- Then we show that the complexity of M1 is less than that of M2, by about the same proportion as in Bach et al. (1986)!
78 Processing of nested and crossed dependencies
- Significance of the EPDA modeling of the processing of nested and crossed dependencies
- Precise correspondence between EPDA and TAG
  -- a direct correspondence between processing and grammars
- We have a precise characterization of the computational power of the processing strategy
There is much more recent work, e.g., Gibson (2000), Lewis (2002), Vasishth (2002).
79 Competence-performance distinction: a new twist
- How do we decide whether a certain property is a competence property or a performance property?
- Main point: the answer depends on the formal devices available for describing language!
- In the context of MC-LTAG describing a variety of word order phenomena, such as scrambling, clitic movement, and even scope ambiguities, there is an interesting answer
- We will look at scrambling (e.g., in German)
80 Competence-performance distinction: a new twist
(1) Hans1 Peter2 Marie3 schwimmen3 lassen2 sah1 (Hans saw Peter make Marie swim)
- In (1) the three nouns are in the standard order. In principle they can appear in any order, keeping the verbs in the same order as in (1), for example as in (2):
(2) Hans1 Marie3 Peter2 schwimmen3 lassen2 sah1
In general: P(N1, N2, ..., Nk) Vk Vk-1 ... V1, where P is a permutation of the k nouns.
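The scrambling pattern just stated, enumerated directly: any permutation of the k nouns, with the verbs fixed in the order Vk ... V1.

from itertools import permutations

k = 3
nouns = [f"N{i}" for i in range(1, k + 1)]
verbs = [f"V{i}" for i in range(k, 0, -1)]
for p in permutations(nouns):
    print(" ".join(p), " ".join(verbs))
# 3! = 6 scrambled word orders for k = 3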
81 Competence-performance distinction: a new twist
- (A) Sentences involving scrambling from more than two levels of embedding are difficult to interpret
- (B) This is similar to the difficulty of processing more than two (perhaps even more than one) center embeddings of relative clauses in English: The rat the cat the dog chased bit ate the cheese
- (C) Since the difficulty in (B) is regarded as a performance property, we could declare the difficulty in (A) to be a performance property as well, but WAIT!
82 Competence-performance distinction: a new twist
- We already know that the class of tree-local MC-TAG has the property that, for any grammar G in this class, if G works up to two levels of embedding, then it fails beyond two levels for some patterns of word order, by not being able to assign a correct structural description, i.e., correct semantics
- Inability to assign correct structural descriptions is the reason for the processing difficulty!!
83 Competence-performance distinction: a new twist
- So what should we conclude?
- The claim is not that we must conclude that the difficulty of processing sentences with scrambling from more than two levels of embedding has to be a competence property
- The claim is that we are presented with a choice
  -- the property can be a competence property
  -- or, we can continue to regard it as a performance property
84 Competence-performance distinction: a new twist
- To the best of my knowledge, this is the first example where a particular processing difficulty can be claimed as a competence property
- Hence, whether a property is a competence property or a performance property depends on the formal devices (grammars and machines) available to us for describing language
- What about the difficulty of processing more than two levels (perhaps only one) of center embedding of relative clauses in English?
85 Competence-performance distinction: a new twist
- In order to show that the difficulty of processing sentences with more than two levels of center embedding of relative clauses is a competence property, we would have to exhibit a class of grammars G such that, for any grammar G in G:
  - if G assigns correct structural descriptions (correct semantics) for all sentences up to two levels of embedding,
  - then G fails to assign correct structural descriptions to some sentences with more than two embeddings
86 Competence-performance distinction: a new twist
- For each grammar G in G:
  -- if G works up to two levels, then
  -- G fails beyond two levels
- However, as far as I know, we cannot exhibit such a class G
  -- Finite State Grammars (FSG) will not work
  -- CFGs will not work
  -- TAGs will not work
- So we have no choice but to regard the processing difficulty as a performance property
87 Competence-performance distinction: a new twist
- For center embedding of relative clauses -- we have no choice, so far
- For scrambling of center-embedded complement clauses -- we have a choice; we have an opportunity to claim the property as a competence property
- The two constructions are quite different
- The traditional assumption that all such properties have to be performance properties is not justified at all!
88 Summary
89 Tree-Local Multi-component LTAG (MC-LTAG)
- How can the components of MC-LTAG compose while preserving the locality of LTAG?
- Tree-Local MC-LTAG
  -- components of a set compose only with an elementary tree or an elementary component
- Non-directional composition
- Tree-Local MC-LTAGs are weakly equivalent to LTAGs
- However, Tree-Local MC-LTAGs provide structural descriptions not obtainable by LTAGs -- increased strong generative power
90 Scrambling: N3 N2 N1 V3 V2 V1
[Tree diagrams: VP elementary trees and multi-component sets for the verbs V1, V2, V3, with the nouns N1, N2, N3 attached, deriving the scrambled order N3 N2 N1 V3 V2 V1]
91 Scrambling: N3 N2 N1 V3 V2 V1 (non-directional composition, semantics of attachments)
[Tree diagrams: the same derivation, with the substitution and adjoining attachments marked]
92 Scrambling: N0 N3 N2 N1 V3 V2 V1 V0 (breakdown after two levels of embedding)
[Tree diagrams: the attempted derivation for N0 N3 N2 N1 V3 V2 V1 V0, with the substitution and adjoining attachments marked]
93 Scrambling: N0 N3 N2 N1 V3 V2 V1 V0
-- Beyond two levels of embedding, semantically coherent structural descriptions cannot be assigned to all scrambled strings
-- the multi-component tree for V0 is forced to combine with the VP component of the V2 tree
-- the V0 tree cannot be combined with the V1 tree, because the composition has to be tree-local
-- Similar results hold for clitic movement
94 Semantics
Harry eats fruit for breakfast
[Derivation tree: a1 (eats), with a2 (Harry) substituted at address 1, a3 (fruit) substituted at 2.2, b1 (for) adjoined at 2, and a4 (breakfast) substituted at address 2.2 of b1; substitution and adjoining links as in the slide-27 legend]
eats(x, y, e) & Harry(x) & fruit(y) & for(e, z) & breakfast(z)
l1: eats(x, y, e), l2: Harry(x), l3: fruit(y), l4: for(e, z), l5: breakfast(z)
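A minimal sketch of the idea behind this flat, labeled representation: each elementary tree contributes one labeled predication, and composition on the derivation tree amounts to conjoining the labels. The dictionary encoding is my illustration, not any particular LTAG semantics implementation.

# One labeled elementary predication per elementary tree in the derivation.
predications = {
    "l1": "eats(x, y, e)",
    "l2": "Harry(x)",
    "l3": "fruit(y)",
    "l4": "for(e, z)",
    "l5": "breakfast(z)",
}
print(" & ".join(predications[l] for l in sorted(predications)))
# eats(x, y, e) & Harry(x) & fruit(y) & for(e, z) & breakfast(z)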
95 Semantics
Harry eats fruit for breakfast
[Tree diagrams: the elementary trees a1 = S(NP!, VP(V(eats), NP!)), b1 = VP(VP*, PP(P(for), NP!)), a2 = NP(Harry), a3 = NP(fruit), a4 = NP(breakfast)]