Title: Experimental Syntax
1Experimental Syntax
- Dora Alexopoulou
- Research Centre for English and Applied
Linguistics - Cambridge
- ta259_at_cam.ac.uk
5- What is experimental syntax?
- Broad sense: a program of applying formal experimental methodology(ies?) to investigate hypotheses relevant to current syntactic theory.
- Narrow sense: focus on acceptability judgements.
- Cowart W, 1997, Experimental Syntax: applying objective methods to sentence judgements, Thousand Oaks, CA: Sage Publications.
6- Course Structure
- Lecture I
- Magnitude Estimation: from psychophysics to linguistic acceptability
- Lecture II
- Experimental syntax and the study of interfaces
- Beyond linguistic acceptability
- Lecture III
- Gradience and syntactic theory
7Grammaticality/acceptability judgements
- Intuitions about grammatical/acceptable and ungrammatical/unacceptable sentences indirectly reveal rules/principles of grammars.
- English: (i) *met Mary John
- (ii) *loves you.
8- But acceptability judgements are gradient
- (i) Who does John think Mary will fire t?
- (ii) ?Who did Mary wonder whether they will fire t?
- (iii) *Who did John meet the girl who will marry t?
- What is the status of sentences of intermediate acceptability like (ii)?
9- (i) Who does John think Mary will fire t?
- (ii) ?Who did Mary wonder whether they will fire t?
- (iii) *Who did John meet the girl who will marry t?
- (v) Who were you wondering if we should see? (Chung & McCloskey 1983).
10Why do we care about gradient judgements? (I)
- Are such discrepancies an artefact of the
absence of an unambiguous notational system and
the lack of a systematic way of quantifying
linguistic intuitions, or do they represent real
disagreements about the acceptability of the
structures in question?
11Why do we care about gradient judgements? (II)
- an adequate linguistic theory will have to
recognise degrees of grammaticalness (Chomsky
1975).
12Why do we care about gradient judgements? (II)
- critics of generative grammar might take the
existence of gradient well-formedness judgments
as an indication that the entire enterprise is
misconceived. In this eliminativist view,
gradient well-formedness judgments constitute
evidence that generative linguistics must be
replaced by something very different, something
much fuzzier (Hayes 2000: 88).
13Why do we care about gradient judgements? (III)
- Is gradience a consequence of grammatical principles of different strength?
- Are there different types of constraints? (soft vs. hard)
- Where does gradience live?
14- Gradience and crosslinguistic variation.
- Is the type of a constraint (hard, soft) crosslinguistically constant?
- Is quantitative variation characteristic of soft constraints only?
15- Theoretical accommodation: is gradience the result of the interaction of ranked/weighted constraints (Stochastic OT) or the consequence of the interaction of a basically categorical grammar with the interfaces?
- Can gradient variation be accommodated by the P&P model of parametric variation?
17- Magnitude Estimation
- From psychophysics to Linguistic Acceptability
- Bard E.G., D Robertson and A Sorace, 1996, Magnitude Estimation of Linguistic Acceptability, Language, 72(1).32-68.
- Cowart W, 1997, Experimental Syntax: applying objective methods to sentence judgements, Thousand Oaks, CA: Sage Publications.
- Schütze C, 1996, The empirical base of linguistics: grammaticality judgements and linguistic methodology, Chicago: University of Chicago Press.
18The validity problem
- How many distinctions?
- Absolute vs. relative judgments.
- Absolute judgments require a decision as to whether (or to what extent) a stimulus has a particular property. People tend to use their own implicit reference point.
- Relative judgments require a comparison between two or more stimuli with respect to a particular property.
- People are psychometrically better at giving relative judgments.
19The reliability problem
- Consistency across and within subjects
- Do intermediate points reflect indeterminacy or real gradience? (Do speakers agree in their judgement of intermediate sentences, or can they only give reliable judgments for structures at the end points of the acceptability continuum?)
20Robustness
- No application of robust experimental design
factoring out extragrammatical factors such as
processing complexity, appropriateness of
discourse context, frequency, mode of
presentation, order of presentation, etc.;
distraction with fillers, etc.
21Limited ordinal scales
- Limited in their range of values
- How many distinctions?
- √ Completely acceptable and natural
- ? Acceptable, but perhaps somewhat unnatural
- ?? Doubtful, but perhaps acceptable
- ?* Marginal, but not totally unacceptable
- * Thoroughly unacceptable
- ** Horrible
22Ordinal scales
- Difficult to interpret: what do adjacent points in an ordinal scale mean?
- They cannot capture the relative strength of grammatical violations.
- Nor can they express precise differences in acceptability between sentences.
23ME in psychophysics
- Can human subjects make reliable proportional judgements of physical stimuli (e.g. brightness, length, loudness etc.)?
- Stevens (1975): magnitude estimation
24- Task: subjects are required to associate a numerical judgement with a physical stimulus. Once the initial stimulus, or modulus, is presented and a number associated with it, subjects are asked to assign to each successive stimulus a number reflecting the relationship between that stimulus and the modulus.
25- No restriction on number values means that subjects can indicate as many distinctions as they perceive.
- Since a ratio scale is involved, numerical differences between stimuli reflect differences in impressions.
- Scaling is not about absolute accuracy of judgements, but about the relative relationships between judgements of stimuli of different intensities.
- In psychophysics, ME numerical values can be directly compared with measures of the physical stimuli giving rise to impressions.
26- Stevens' Power Law: equal ratios on the physical dimension give rise to equal ratios of judgements
- (e.g. in judgements of line length, doubling physical line length doubles subjective line length; in judgements of brightness, every time the stimulus energy doubles, the subjective brightness becomes 1.5 times larger).
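The power law on this slide can be sketched numerically. The snippet below is a minimal illustration, not taken from the lecture: it assumes the usual form of the law, judgement = k * stimulus ** exponent, and the function names are hypothetical. It derives the exponent implied by each "doubling" example quoted on the slide and checks that equal stimulus ratios yield equal judgement ratios.

```python
import math


def perceived_magnitude(stimulus, exponent, k=1.0):
    """Assumed form of Stevens' power law: judgement = k * stimulus ** exponent."""
    return k * stimulus ** exponent


def exponent_from_doubling(ratio_when_doubled):
    """Exponent implied when doubling the stimulus multiplies the judgement by this ratio."""
    return math.log(ratio_when_doubled) / math.log(2)


# Line length: doubling the stimulus doubles the judgement.
line_exponent = exponent_from_doubling(2.0)
# Brightness: doubling the stimulus energy gives 1.5 times the judgement (slide figure).
brightness_exponent = exponent_from_doubling(1.5)

# Equal stimulus ratios give equal judgement ratios, whatever the starting intensity.
ratio_low = perceived_magnitude(20, brightness_exponent) / perceived_magnitude(10, brightness_exponent)
ratio_high = perceived_magnitude(200, brightness_exponent) / perceived_magnitude(100, brightness_exponent)

print(line_exponent, brightness_exponent, ratio_low, ratio_high)
```

The constant k cancels in every ratio, which is why only relative relationships between judgements matter in scaling.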
27Are acceptability judgments like any other kind
of human judgment?
- Position 1: no, acceptability judgments derive from a special cognitive faculty characterized by properties that are not shared by other kinds of behaviour.
- Position 2: yes, acceptability judgments obey the same constraints as any other kind of human judgment.
28From psychophysics to linguistics
- Problem: linguistic acceptability has no obvious physical continuum against which to compare judgements.
- Solution: cross-modality matching (Lodge 1981).
- Bard, Robertson and Sorace 1996: cross-modality validation study.
29An ME experiment
- Instructions
- Training phase with line length
- Practice phase with sentences
- Main experimental phase
30ME instructions
- Instructions for practice phase explaining the notion of proportionality using line length.
- Instructions explaining that linguistic acceptability can be judged in the same way as line length.
31ME instructions
- Your task is to judge how good or bad each sentence is by assigning a number to it.
- You can use any number that seems appropriate to you. For each sentence after the first, assign a number to show how good or bad that sentence is in proportion to the reference sentence.
32ME instructions
- For example, if the first sentence was
- (1) cat the mat on sat the
- and you gave it a 1, and if the next example
- (2) The dog the bone ate.
- seemed 20 times better, you'd give it twenty. If it seems half as good as the reference sentence, give it the number 0.5.
33ME instructions
- You can use any range of positive numbers including, if necessary, fractions or decimals.
- There are no correct answers, so whatever seems right to you is a valid response.
- We are interested in first impressions, so don't spend too long thinking about your judgement.
34Experimental phase
- Modulus: a sentence of intermediate acceptability
- Experimental items in random order and interspersed with filler items
- Setting time limits to intervals between sentences may reduce the likelihood of prescriptive/metalinguistic responses.
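One simple way to obtain a random order with fillers interspersed is to pool both item types and shuffle them separately for each subject. This is only a sketch under that assumption: the function name, the per-subject seed and the filler sentence are hypothetical, and real designs often add constraints (for example, no two experimental items in a row) that are not handled here.

```python
import random


def build_presentation_order(experimental_items, filler_items, seed=None):
    """Pool experimental and filler items and return them in one shuffled order.

    Each entry is a (kind, sentence) pair so the item type can be recovered
    at analysis time. Passing a seed makes a subject's order reproducible.
    """
    rng = random.Random(seed)
    pooled = [("experimental", s) for s in experimental_items]
    pooled += [("filler", s) for s in filler_items]
    rng.shuffle(pooled)
    return pooled


# Experimental sentences from the instruction slides; the filler is invented.
experimental = ["cat the mat on sat the", "The dog the bone ate."]
fillers = ["The children are playing in the garden."]

order_subject_1 = build_presentation_order(experimental, fillers, seed=1)
order_subject_1_again = build_presentation_order(experimental, fillers, seed=1)
```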
35Data analysis: normalisation
- Dividing each numerical value by the value assigned by a given subject to the modulus creates a common scale. Analyses are then carried out on log-transformed judgements.
- Advantage: parametric tests
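The normalisation step can be sketched as follows. The data are invented for illustration, the names are hypothetical, and base-10 logarithms are an assumption (the slide only says "log-transformed"). Two subjects who use very different number ranges but the same proportions end up on the same scale.

```python
import math


def normalise(ratings, modulus_value):
    """Divide each rating by the subject's modulus value, then take log10.

    The base of the logarithm is an assumption; only positive numbers are
    valid, as in the ME instructions.
    """
    if modulus_value <= 0 or any(r <= 0 for r in ratings):
        raise ValueError("ME judgements must be positive numbers")
    return [math.log10(r / modulus_value) for r in ratings]


# Invented raw data: each subject's modulus value and ratings of three items.
raw = {
    "subject_a": {"modulus": 10.0, "ratings": [20.0, 5.0, 10.0]},
    "subject_b": {"modulus": 50.0, "ratings": [100.0, 25.0, 50.0]},
}

normalised = {
    subject: normalise(data["ratings"], data["modulus"])
    for subject, data in raw.items()
}
```

After this step an item judged equal to the modulus scores 0, items judged better score above 0, and items judged worse score below 0.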
36Some examples
- Which friend Thomas has painted a picture of? (INV)
- Which friend have Thomas painted a picture of? (AGR)
- Which picture has Thomas painted a picture of her? (RES)
- Which friend has Thomas painted a picture of?
38- Soft Constraints
- ?Which friend has Thomas painted the picture of? (DEF)
- ?Which picture has Thomas torn up a picture of? (EXIST)
- ?How many friends has Thomas painted a picture of? (REF)
39- Power law for linguistic stimuli
40Is ME superior to n-point scales?
- Number of distinctions
- Bard et al. found that subjects may indicate more than the 7 distinctions available in 7-point Likert scales.
- But Weskott and Fanselow (2008) report that this may not be so.
41Is ME superior to n-point scale and binary
judgement tasks?
- Informativity
- There is greater inherent variability in ME experiments than in binary or n-point scale tasks (Weskott and Fanselow 2008).
- Two experiments comparing SO/OS patterns for accusative and dative objects in German.
42- But note that the relevant experiments involved only two conditions.
- ME is suitable for investigating interactions between different types of constraints.
43An example word order in Greek
- (i) I Maria tha diavasi to vivlio
- the-nom Maria will read-3sg the book
- Maria will read the book
- (ii) To vivlio tha diavasi I Maria
- (iii) Tha diavasi I Maria to vivlio
- SVO, OVS, VSO
- Accent on first or rightmost NP
44- All focus
- What happened?
- Accent on the rightmost constituent
- V or S initial
- vsO, svO
45- Object focus
- What did Maria read?
- Accent on object NP
- Not v-initial
- svO, Ovs
46- All focus: VSO as good as SVO
47- Object focus: accent placement is the strongest cue
52What is the effect of the modulus?
- The modulus is, in effect, a unit of measurement. But there is no zero.
- If subjects are calculating ratios, then they must be using implicit zeros. But then distances should vary across participants.
- Do subjects ignore instructions and use the modulus as a single reference point along a linear scale of acceptability? (Featherston 2007, 2008).
53- Sprouse (2008): investigation of the effect of the modulus on ME judgements.
- G: What does Bill think that you are cooking tonight?
- ADJ: Who did Mary hide her face because she recognised?
- CSC: What does Jane think that you should eat carrots and?
- FSS: Who did Frank danced with shock the guests?
- ISS: What can to see be scary for a child?
- LBC: Whose did John think that you saw father yesterday?
- NC: What did you doubt the claim that Jessica invented?
- RC: Who does Erin trust the nurse who cared for?
- WH: Who do you wonder whether Mike met on vacation?
54- Type of modulus
- IF-Island
- What do you wonder if Larry bought?
- CSC-violation
- What do you think that Larry bought a shirt and?
- NC-island
- What did you start the rumour that Larry bought?
- RC-island
- What did Larry help the customer who bought?
55- The modulus appears to have no effect!
- Subjects show astonishing consistency in their numerical judgments across the four experiments!
- Potential for developing an inventory of controls or anchors to be included standardly in ME studies for standardising comparisons between ME experiments.
59- So it seems that participants are rating each item identically across these experiments, regardless of the modulus. This is a striking result. In essence, we are giving a sample of people a list of 50 sentences, and asking them to assign numbers to these sentences using a task that is technically impossible for them to perform.
- Whether they attempt to perform the impossible task or not, they all coalesce on the same response pattern: they give the sentences the same numerical rating (modulo normal variation assumed by the statistical tests), and they all use the value assigned to the impossible-to-use modulus as an upper bound for ungrammatical sentences (Sprouse 2008, p. 22).
60- Whatever the answer to the modulus mystery, the cognitive effect measured by ME studies is a very robust one.
- Potential for inventory of control conditions/anchors to enable comparisons across experiments of different design.
61- If judgments are so consistent and stable, what about satiation or the linguist's disease?
- Snyder 2000: An experimental investigation of syntactic satiation effects, LI 31, 575-582.
62WebExp2
- WebExp 2 (Keller et al. 1998, Mayo et al. 2005)
- http://www.webexp.info
63WebExp main features
- Different experimental paradigms supported.
- Automatic authentication of subjects' details and e-mail address.
- Automatic randomisation of experimental materials for each subject.
- Response times recorded: automatic checks can be carried out on both onset and completion times.
- Response data stored in a form easily processed by standard statistics packages.
- Validation studies indicate high correlation with lab-based data.
64Conclusion
- Magnitude Estimation can be used successfully by naïve subjects to judge linguistic acceptability.
- ME results are highly replicable between experiments.
- ME allows us to investigate sophisticated hypotheses derived from linguistic theory and can be a useful tool for reliable comparative work.
65- Bard E.G., D Robertson and A Sorace, 1996, Magnitude Estimation of Linguistic Acceptability, Language, 72(1).32-68.
- Cowart W, 1997, Experimental Syntax: applying objective methods to sentence judgements, Thousand Oaks, CA: Sage Publications.
- Featherston S, 2005, Magnitude Estimation and what it can do for your syntax: some Wh-constraints in German, Lingua 115(11).1525-1550.
- Featherston S, 2007, Data in Generative Grammar: the stick and the carrot, Theoretical Linguistics 33(3).269-318.
- Keller F, 2000, Gradience in Grammar: Experimental and Computational Aspects of Degrees of Grammaticality, PhD thesis, University of Edinburgh.
- Keller F and T Alexopoulou, 2001, Phonology competes with syntax: experimental evidence for the interaction of word order and accent placement in information structure, Cognition, 79(3).301-372.
- Reips U-D, 2002, Standards for internet-based experimenting, Experimental Psychology 49.243-256.
- Schütze C, 1996, The empirical base of linguistics: grammaticality judgements and linguistic methodology, Chicago: University of Chicago Press.
- Snyder W, 2000, An experimental investigation of syntactic satiation effects, LI 31.575-582.
- Sprouse J, 2008, Evaluating Linguistic Magnitude Estimation, Proceedings of WCCFL 2008.
- Sprouse J, 2007, A program for experimental syntax, PhD thesis, UMD.
- Sorace A, Acceptability judgements and magnitude estimation in experimental linguistic research, lectures at EMLAR 2006.
- Weskott T and G Fanselow, Variance and Informativity in Different Measures of Linguistic Acceptability, Proceedings of WCCFL 2008.