Title: Acceptability and grammaticality
Slide 1: Acceptability and grammaticality
- Research Methods in Theoretical Linguistics
- Caroline Heycock
- (partly based on a presentation by Antonella Sorace)
Slide 2: Outline
- Why do we need acceptability judgments?
- What are the problems with acceptability judgments?
- Can Magnitude Estimation help with any of these problems?
- Exemplification from ongoing studies on Faroese (and related languages)
Slide 4: Why do we need judgment data?
Need Problems ME Examples
- There is no direct way to access I-language (the speaker's knowledge of their language), so we need to triangulate from all available sources of data.
- Corpus data typically
  - aggregate across speakers
  - include performance errors
  - allow no straightforward distinction between non-occurring and ungrammatical
  - may not exist
Slide 5: Outline
- Why do we need acceptability judgments?
- What are the problems with acceptability judgments?
- How can Magnitude Estimation help with any of these problems?
- Exemplification from ongoing studies on Faroese (and related languages)
Slide 6: Eliminating independent factors
- English topicalization
- Do these examples show that topicalization (the fronting of a constituent to the beginning of a sentence) in English is ungrammatical?
- 1. a. I have never owned any books on French cookery.
- b. Any books on French cookery, I have never owned.
Slide 7: Establishing minimal contrasts
- Japanese quantifier scope
- Here are three possible pairs of examples from Japanese. Each pair might be argued to show that quantifier scope in Japanese is determined by surface order when the arguments are in the canonical SOV order, but that the scrambled OSV order gives rise to scope ambiguity. What is the best pair to make this point?
Slide 8
- 2. a. dareka-ga dono gakusei-mo hometa
  someone-nom every student praised
  There is someone who praised every student.
  Every student was praised by someone (possibly a different person for each student).
- b. dono-gakusei mo dareka-ga hometa
  every student someone-nom praised
  There is someone who praised every student.
  Every student was praised by someone (possibly a different person for each student).
- 3. a. dareka-ga subete-no gakusei-o hometa
  someone-nom all-gen student-acc praised
  There is someone who praised every student.
  Every student was praised by someone (possibly a different person for each student).
- b. dono-gakusei mo dareka-ga hometa
  every student someone-nom praised
  There is someone who praised every student.
  Every student was praised by someone (possibly a different person for each student).
- 4. a. dareka-ga dono gakusei-mo hometa
  someone-nom every student praised
  There is someone who praised every student.
  Every student was praised by someone (possibly a different person for each student).
- b. dareka-o dono gakusei-mo hometa
  someone-acc every student praised
  There is someone who praised every student.
  Every student was praised by someone (possibly a different person for each student).
Slide 9: Eliminating other sources of unacceptability
- English Noun Phrases
- Is this a good example to show that the syntactic rules of English disallow the iteration/recursion of determiners?
- 5. that this pen
- Why (not)? Could one do better?
Slide 10: Example from the literature 1
- Rizzi 1990: Adjuncts resist long extraction out of wh-islands
- 1. a. ?Which problem_i were you wondering whether to tackle t_i?
- b. *How_i were you wondering which problem to tackle t_i?
- But this is not because they are adjuncts (rather than complements), but rather because they are not referential. Evidence: there are complements that are nonreferential that also resist extraction out of wh (and negative) islands
- 2. a. How much_i did Bill say that the book cost t_i?
- b. *How much did Bill wonder whether the book cost t_i?
- 3. a. How much_i did it cost t_i?
- b. *How much didn't it cost t_i?
- Kroch 1998: the problem is that the presuppositions of the questions in (2b) and (3b) make the sentences unusable under most discourse circumstances.
Slide 11: Example from the literature 1
- How many points are the judges arguing about whether to deduct?
- 5. A: How much have beans been costing lately?
- B: The price has been jumping around so much, you'd do better to ask 'How much haven't they cost?'
Slide 12: Example from the literature 2
- Freidin 1986, Lebeaux 1988: Complements in wh-phrases seem to act with respect to Condition C (non-anaphoric, non-pronominal noun phrases must not be c-commanded by a coreferring expression) as though they were in their base position; adjuncts don't.
- Freidin 1986
- 1. a. *Which report that John_i was incompetent did he_i submit?
- b. Which report that John_i revised did he_i submit?
- 2. *He_i submitted the report that John_i revised.
- More examples from Lebeaux 1988
- 3. a. *Whose claim that John_i is nice did he_i believe?
- b. Which story that John_i wrote did he_i like?
Slide 13: Example from the literature 2
- But (Kuno 1997)
- 4. a. Whose allegation that John_i was less than truthful did he_i refute vehemently?
- b. Whose claim that the Senator_i had violated the campaign finance regulations did he_i dismiss as politically motivated?
- Lasnik 2003 on (1a)
- 1. a. *Which report that John_i was incompetent did he_i submit?
- There might also be an interfering pragmatic factor in Freidin's example. It is not customary for an individual (say, John) to be in a position where he would submit reports (even more peculiarly, one selected out of several) on his own incompetence
Slide 14: Example from the literature 2
- On (3a)
- 3. a. *Whose claim that John_i is nice did he_i believe?
- it is at least somewhat unusual for someone (John in this case) to rely on others' claims in order to determine his or her own personality characteristics (niceness in this instance). Further, it is not easy to imagine a situation where a set of claims that John is nice can be sufficiently individuated that some can be believed and others not. To illustrate this point, I present the following one scene play, with three characters:
- Susan: John is nice.
- Mary: John is nice.
- John: I believe Susan, but I don't believe Mary.
- John's line of dialogue is very strange in this context. But if this exchange is not the kind of situation that would make (3a) felicitous, what would be?
Slide 15: Validity
- Judgments are also a type of behaviour, known to be affected by
  - processing constraints
  - personality and mental state
  - presentation (order, context, mode)
  - absolute vs relative task
  - linguistic training
Slide 16: Reliability
- Interspeaker variation
  - This may or may not be considered a problem of reliability, depending on assumptions about individuals' grammars, but it is at least a methodological problem
- Intraspeaker inconsistency
Slide 17: Conventional measurements of acceptability
- Judgments of linguistic acceptability are usually given on category scales (ok/*) or limited ordinal scales (ok/?/?*/*), (1,2,3,4,5)
- These scales require absolute rating judgments, rather than relative ranking judgments
- Ordinal scales provide no information about the relative distance between adjacent points on the scale
Slide 18: Problems arising with conventional scales for acceptability judgments
- Limited in their range of values
- Lack of statistical power
  - These scales cannot be analysed using parametric statistics, because this type of analysis requires the data to be on at least an interval scale.
- Inconsistency
  - Even trained linguists use diacritics in different ways. Comparison between different studies is extremely difficult.
- Uninterpretability
  - What do the middle points on a rating scale actually mean?
  - How can we distinguish between lack of certainty and intermediate acceptability?
Slide 19: Outline
- Why do we need acceptability judgments?
- What are the problems with acceptability judgments?
- How can Magnitude Estimation help with any of these problems?
- Exemplification from ongoing studies on Faroese (and related languages)
Slide 20: Magnitude Estimation in psychophysics
- ME is an experimental technique used to determine quickly and easily how much of a given sensation a person is having.
- In an ME experiment subjects are presented with a standard stimulus (a modulus) and are asked to express its magnitude as a number.
- They are then presented with a series of stimuli that vary in intensity and are asked to assign each of the stimuli a number relative to the modulus.
Slide 21: ME in psychophysics
- Subjects assign a number
  - to the modulus to reflect magnitude of pertinent characteristics (length, loudness, brightness)
  - to each successive stimulus to indicate apparent magnitude relative to the first (or to a previous stimulus)
Slide 22: ME in psychophysics: Scaling
- Scaling in ME is not about absolute accuracy of judgments
- Scaling is about the relative relationships between judgments of stimuli of different intensities.
Slide 23: ME in psychophysics: modalities
- The numerical modality is the most common but other modalities are possible (e.g. line length).
- Other modalities can be more user-friendly, particularly if you are testing people who (think they) are numerically challenged.
Slide 24: ME in psychophysics: can people do it?
- Many magnitude estimation experiments use a control condition in which subjects are asked to perform magnitude estimations of the length of a line.
- Magnitude estimations of line length have been shown to be proportional to the actual length of the lines.
Slide 25: ME in psychophysics: can people do it?
- If you can show that for a group of subjects magnitude estimations increased proportionally with the length of lines, you have established that the subjects do indeed understand the instructions they have been given and can assign numbers to their sensations systematically.
Slide 26: ME in psychophysics: can people do it?
- ME provides measurements of subjective impressions on a numerical scale which can be plotted against the objective measure of the physical stimuli giving rise to the impressions.
- It does not restrict the number of values which can be used.
- Linear regression of estimates against physical measures in log-log coordinates produces a straight line with a slope characteristic of the physical property being assessed: equal ratios on the physical dimension give rise to equal ratios of judgments (Stevens' Power Law).
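The log-log regression described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the line lengths and estimates are made-up values, and `fit_power_law` is a hypothetical helper name, not part of any ME software. It fits log(estimate) = log(k) + b * log(physical) by ordinary least squares, so that b is the slope (the exponent of the power law).

```python
import math

def fit_power_law(physical, estimates):
    """Least-squares fit of log(estimate) = log(k) + b * log(physical).

    Returns (b, k): the log-log slope (the exponent) and the scale factor.
    """
    xs = [math.log(p) for p in physical]
    ys = [math.log(e) for e in estimates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)

# Hypothetical line-length control data: the estimates are exactly
# proportional to the physical lengths (each estimate is half the length).
line_lengths_mm = [10, 20, 40, 80, 160]
estimates = [5, 10, 20, 40, 80]

exponent, k = fit_power_law(line_lengths_mm, estimates)
print(f"exponent = {exponent:.2f}, k = {k:.2f}")  # exponent = 1.00, k = 0.50
```

A slope close to 1 on such a control task is what proportional line-length estimates would look like; with real data the points scatter around the fitted line rather than falling exactly on it.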
Slide 27: ME in Linguistics
- Unlike other dimensions, linguistic acceptability has no obvious physical continuum to plot against subjects' impressions.
- However, Bard, Robertson and Sorace (1996) applied standard cross-modality matching techniques and were able to show that the technique is reliable.
Slide 28: Typical instructions
- Here's an example of what the instructions look like...
Slide 29: Instructions
- The purpose of this exercise is to get you to judge the acceptability of some English sentences. You will see a series of sentences on the screen. These sentences are all different. Some will seem perfectly okay to you, but others will not. What we're after is not what you think of the meaning of the sentence, but what you think of the way it's constructed.
Slide 30
- Your task is to judge how good or bad each sentence is by assigning a number to it.
- You can use any number that seems appropriate to you. For each sentence after the first, assign a number to show how good or bad that sentence is in proportion to the reference sentence.
Slide 31
- For example, if the first sentence was
- (1) cat the mat on sat the.
- and you gave it a 1, and if the next example
- (2) the dog the bone ate.
- seemed 20 times better, you'd give it twenty. If it seems half as good as the reference sentence, give it the number 0.5.
Slide 32
- You can use any range of positive numbers you like including, if necessary, fractions or decimals.
- You should not restrict your responses to, say, an academic marking scale.
- You may not use minus numbers or zero, of course, because they aren't proper multiples or fractions of positive numbers.
- If you forget the reference sentence, don't worry: if each of your judgments is in proportion to the first, you can judge the new sentence relative to any of them that you do remember.
Slide 33
- There are no 'correct' answers, so whatever seems right to you is a valid response. Nor is there a 'correct' range of answers or a 'correct' place to start.
- Any convenient positive number will do for the reference.
- We are interested in your first impressions, so don't spend too long thinking about your judgment.
Slide 34
- Remember
  - Use any number you like for the first sentence.
  - Judge each sentence in proportion to the reference sentence.
  - Use any positive numbers you think appropriate.
Slide 35: Choices about the modulus: face validity
- The experimenter has the option of assigning a fixed number to the modulus.
- Another option is to leave the modulus in sight throughout the experiment.
- This option has good face validity, but it isn't clear to what extent it affects the ultimate reliability of the estimates.
- People don't need to remember the modulus: if they are making judgments proportionally, the reference point shifts as they move on.
Slide 36: Advantages of quasi-randomization
- The experimenter can impose constraints on the randomization to prevent certain experimental items from occurring consecutively.
- The modulus can be chosen to represent an intermediate degree of acceptability.
- A number (or a line) of intermediate size can be assigned to the modulus by the experimenter.
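One simple way to impose such a constraint is rejection sampling: shuffle, check, and reshuffle until no two consecutive items come from the same condition. The sketch below is an assumption about how this could be done, not a description of the procedure used in the studies discussed here; the item labels, the conditions "A", "B", "C" and the function name `quasi_randomize` are all hypothetical.

```python
import random

def quasi_randomize(items, key, rng, max_tries=10_000):
    """Shuffle items so that no two consecutive items share the same key.

    Uses plain rejection sampling: reshuffle until the constraint holds.
    """
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(key(a) != key(b) for a, b in zip(order, order[1:])):
            return order
    raise ValueError("no ordering satisfying the constraint was found")

# Hypothetical item list: three conditions with three items each,
# stored as (item label, condition) pairs.
items = [(f"{cond}-{i}", cond) for cond in ("A", "B", "C") for i in range(1, 4)]

# A seeded generator makes the presentation order reproducible.
order = quasi_randomize(items, key=lambda item: item[1], rng=random.Random(1))
for label, cond in order:
    print(label, cond)
```

Rejection sampling is easy to verify but can be slow when the constraints are tight; for larger or more constrained designs a constructive procedure may be preferable.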
Slide 37: Timed vs untimed ME
- Timing the intervals between sentences may reduce the likelihood that people consult metalinguistic or prescriptive knowledge.
- Intervals have to be different for non-native speakers; they have to be piloted carefully.
Slide 38: Varying the instructions
- There is a tendency in some people to use a fixed (usually 10-point) scale. This is possibly because of familiarity with school marking systems.
- If the instructions contain an explicit warning against using a restricted range of numbers, the tendency is much reduced.
- People are very sensitive to instructions; these have to be as explicit and clear as possible.
- A detailed practice session is essential!
Slide 39: Advantages
- ME yields interval scales, which allow the use of parametric statistics
- Mathematical operations can be applied to the estimates, allowing
  - a direct indication of the speaker's ability to discriminate between more or less acceptable sentences
  - a direct measure of the strength of speakers' preferences
Slide 40: Advantages
- Informants are enabled to express their intuitions without any restrictions of the judgment scale.
- They are asked to provide purely comparative judgments: these are relative both to a reference item and to the individual subject's own previous judgments.
- At no point is an absolute criterion of grammaticality applied.
- The subjects themselves fix the value of the reference item relative to which subsequent judgments are made.
Slide 41: Advantages
- The scale used by informants is open-ended and has no minimum division: subjects can always add a further highest score or produce an additional intermediate rating.
- The result is that subjects are able to produce judgments which distinguish all and only the differences they perceive.
Slide 42: Data analysis: normalisation
- ME data need to be normalized because people use different ranges of estimates.
- Raw magnitude values are generally transformed into logs in order to yield a normal distribution.
- Each number is divided by the modulus that the subject had assigned to the reference sentence, or alternatively the z-scores are used.
- Any statistical package can easily do these transformations.
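The two normalisation routes named on this slide can be sketched as follows. The details are assumptions made for illustration: base-10 logs are used, z-scores are computed per subject on the log estimates, and the two subjects and their numbers are invented. The point of the example is that subjects who use different numerical ranges end up on a common scale.

```python
import math
import statistics

def normalise_by_modulus(estimates, modulus):
    """Divide each raw estimate by the subject's modulus, then take log10."""
    return [math.log10(e / modulus) for e in estimates]

def z_scores(values):
    """Standardise one subject's values: subtract the mean, divide by the SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Two hypothetical subjects who make the same relative judgments
# but use different ranges of numbers.
subject_a = {"modulus": 10, "estimates": [1, 10, 100]}
subject_b = {"modulus": 50, "estimates": [5, 50, 500]}

# Route 1: divide by the modulus, then log-transform.
norm_a = normalise_by_modulus(subject_a["estimates"], subject_a["modulus"])
norm_b = normalise_by_modulus(subject_b["estimates"], subject_b["modulus"])

# Route 2: z-scores of the log estimates, computed separately per subject.
z_a = z_scores([math.log10(e) for e in subject_a["estimates"]])
z_b = z_scores([math.log10(e) for e in subject_b["estimates"]])

print(norm_a, norm_b)
print(z_a, z_b)
```

In this constructed case both routes give the two subjects the same normalised values, because their judgments differ only by a constant factor.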
Slide 43: Outline
- Why do we need acceptability judgments?
- What are the problems with acceptability judgments?
- How can Magnitude Estimation help with any of these problems?
- Exemplification from ongoing studies on Faroese (and related languages)
Slide 44: Faroese
- Question
  - Do modern speakers of Faroese have V-to-I as part of their competence grammar(s)?
- Initial pilot: 24 speakers, 10 from Suðuroy, 14 from the Tórshavn area (but no effect of dialect area was found)
- 3x3 design (clause type x V Adv order)
Slide 47: References
- Bard, E., D. Robertson, and A. Sorace. (1996). Magnitude Estimation of linguistic acceptability. Language, 72: 32-68.
- Cowart, W. (1997). Experimental Syntax: Applying objective methods to sentence judgments. Sage.
- Freidin, R. (1986). Fundamental issues in the theory of binding. In Lust, B., editor, Studies in the Acquisition of Anaphora. Reidel, Dordrecht.
- Keller, F. (2000). Gradience in Grammar: Experimental and Computational Aspects of Degrees of Grammaticality. PhD thesis, University of Edinburgh.
- Kroch, A. (1998). Amount quantification, referentiality, and long wh-movement. In Dimitriadis, A., Lee, H., Moisset, C., and Williams, A., editors, University of Pennsylvania Working Papers in Linguistics, volume 5.2, pages 21-36. (Originally circulated in 1989).
- Lasnik, H. (2003). Minimalist Investigations in Linguistic Theory. Routledge, London/New York.
- Lebeaux, D. (1988). Language Acquisition and the Form of the Grammar. PhD thesis, University of Massachusetts, Amherst.
- Rizzi, L. (1990). Relativized Minimality, volume 16 of Linguistic Inquiry Monograph. MIT Press, Cambridge, Mass.
- Sorace, A. and F. Keller. (2005). Gradience in linguistic data. Lingua, 115(11): 1497-1524.
- Sprouse, J. (2007). A program for experimental syntax: Finding the relationship between acceptability and grammatical knowledge. PhD thesis, University of Maryland, College Park.