Week 6. Optimality Theory and acquisition. - PowerPoint PPT Presentation

Provided by: paulha64
Learn more at: https://www.bu.edu

Transcript and Presenter's Notes


1
GRS LX 865 Topics in Linguistics
  • Week 6. Optimality Theory and acquisition.

2
Optimality Theory
  • Grammar involves constraints on the
    representations (e.g., SS, LF, PF, or perhaps a
    combined representation).
  • The constraints exist in all languages.
  • Where languages differ is in how important each
    constraint is with respect to each other
    constraint.

3
Optimality Theory
  • In our analysis, one constraint is Parse-T, which
    says that tense must be realized in a clause. A
    structure without tense (where TP has been
    omitted, say) will violate this constraint.
  • Another constraint is *F ("Don't have a
    functional category"). A structure with TP will
    violate this constraint.

4
Optimality Theory
  • Parse-T and *F are in conflict: it is impossible
    to satisfy both at the same time.
  • When constraints conflict, the choice made (on a
    language-particular basis) of which constraint is
    considered to be more important (more highly
    ranked) determines which constraint is satisfied
    and which must be violated.

5
Optimality Theory
  • So if *F >> Parse-T, TP will be omitted.
  • And if Parse-T >> *F, TP will be included.
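The effect of ranking can be sketched in a few lines of Python. This is only an illustration, not the analysis's own formalism: the candidate dictionaries and the function names `violations` and `optimal` are assumptions made for the example, and the "don't have a functional category" constraint is written as `*F`.

```python
def violations(candidate):
    """Return the set of constraints that a candidate structure violates."""
    marks = set()
    if candidate["has_TP"]:
        marks.add("*F")        # a functional projection (TP) is present
    else:
        marks.add("Parse-T")   # tense is not realized
    return marks


def optimal(candidates, ranking):
    """Pick the winner under a ranking (highest-ranked constraint first)."""
    def profile(candidate):
        marks = violations(candidate)
        # One 0/1 entry per constraint, in ranking order. Tuples compare
        # left to right, so a violation of a higher-ranked constraint
        # always outweighs violations of lower-ranked ones.
        return tuple(int(c in marks) for c in ranking)
    return min(candidates, key=profile)


CANDIDATES = [
    {"name": "with TP", "has_TP": True},
    {"name": "without TP", "has_TP": False},
]

print(optimal(CANDIDATES, ["*F", "Parse-T"])["name"])  # without TP
print(optimal(CANDIDATES, ["Parse-T", "*F"])["name"])  # with TP
```

The same two candidates compete under both rankings; only the order of the constraints changes, and with it the winner.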

6
Optimality Theory: big picture
  • Universal Grammar is the constraints that
    languages must obey.
  • Languages differ only in how those constraints
    are ranked relative to one another. (So, a
    "parameter" is a ranking.)
  • The kid's job is to re-rank constraints until
    they match the order that generated the input
    that s/he hears.

8
The idea
  • Kids are subject to conflicting constraints:
  • Parse-T: Include a projection for tense.
  • Parse-Agr: Include a projection for agreement.
  • *F: Don't complicate your tree with functional
    projections.
  • *F2: Don't complicate your tree so much as to
    have two functional projections.

9
The idea
  • Sometimes Parse-T beats out *F, and then there's
    a TP. Or Parse-Agr beats out *F, and then there's
    an AgrP. Or both Parse-T and Parse-Agr beat out
    *F2, and so there's both a TP and an AgrP.
  • But what does "sometimes" mean?

10
Floating constraints
  • The innovation in Legendre et al. (2000) that
    gets us off the ground is the idea that as kids
    re-rank constraints, the position of a
    constraint in the hierarchy can get somewhat
    fuzzy, such that the positions of two
    constraints can overlap.
  • (Diagram: overlapping ranges for *F and Parse-T)

11
Floating constraints
  • (Diagram: overlapping ranges for *F and Parse-T)
  • When the kid evaluates a form in the constraint
    system, the position of Parse-T is fixed
    somewhere in its range, and so it winds up
    sometimes outranking, and sometimes outranked
    by, *F.

12
Floating constraints
  • (Diagram: overlapping ranges for *F and Parse-T)
  • (Under certain assumptions) this predicts that we
    would see TP in the structure 50% of the time,
    and see structures without TP the other 50% of
    the time.
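One set of assumptions that yields the 50/50 prediction can be sketched as follows. Everything numeric here is hypothetical: *F is treated as a fixed point on a ranking scale, Parse-T is drawn uniformly from a range centered on that point, and larger numbers mean higher rank. The function names `prob_tp` and `simulate` are invented for the illustration.

```python
import random


def prob_tp(f_pos, parse_lo, parse_hi):
    """Probability that Parse-T outranks *F (so TP is projected).

    Assumes Parse-T's position is drawn uniformly from
    [parse_lo, parse_hi] and that *F sits at the fixed point f_pos.
    """
    if f_pos <= parse_lo:
        return 1.0   # Parse-T always outranks *F
    if f_pos >= parse_hi:
        return 0.0   # Parse-T never outranks *F
    return (parse_hi - f_pos) / (parse_hi - parse_lo)


def simulate(f_pos, parse_lo, parse_hi, n=10_000, seed=0):
    """Estimate the same probability by fixing Parse-T afresh n times."""
    rng = random.Random(seed)
    hits = sum(rng.uniform(parse_lo, parse_hi) > f_pos for _ in range(n))
    return hits / n


print(prob_tp(5, 4, 6))  # 0.5: *F sits in the middle of Parse-T's range
```

If *F sat nearer one end of the range, the predicted proportion of TP structures would shift accordingly, which is what lets a model of this kind predict percentages other than 50.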

13
French kid data
  • We looked at three French kids from CHILDES.
  • We broke development into stages using a modified
    MLU-type measure based on how long most of their
    utterances were (2 words, more than 2 words) and
    how many of the utterances contain verbs.
  • We looked at tense and agreement in each of the
    three stages represented in the data.

14
French kid data
  • Kids start out using 3sg agreement and present
    tense for practically everything (correct or
    not).
  • We took this to be a default:
  • (No agreement? Pronounce it as 3sg. No tense?
    Pronounce it as present. Neither? Pronounce it as
    an infinitive.)

15
French kid data
  • This means that if a kid uses 3sg or present
    tense, we can't tell if they are really using 3sg
    (they might be) or if they are not using agreement
    at all and just pronouncing the default.
  • So, we looked at non-present tense forms and
    non-3sg forms only to avoid the question of the
    defaults.

16
French kid data
  • We found that tense and agreement develop
    differently: specifically, in the first stage we
    looked at, kids were using tense fine, but then
    in the next stage, they got worse at tense as
    agreement improved.
  • The middle stage looks like competition between
    T and Agr for a single node.

17
A detail about counting
  • We counted non-3sg and non-present verbs.
  • In order to see how close kids' utterances were
    to adults' utterances, we need to know how often
    adults use non-3sg and non-present, and then see
    how close the kids are to matching that level.
  • Adults use non-present tense around 31% of
    the time, so when a kid uses 31% non-present
    tense, we take that to be 100% success.
  • In the last stage we looked at, kids were
    basically right at the 100% success level for
    both tense and agreement.
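The normalization described above is just a ratio of the child's rate to the adult rate. A minimal sketch, with `success_level` as a name invented for the example:

```python
def success_level(child_rate, adult_rate):
    """Child's rate of a form as a fraction of the adult rate.

    A value of 1.0 means the child uses the form as often as adults do.
    Both rates must be in the same units (for example, percentages).
    """
    if adult_rate <= 0:
        raise ValueError("adult_rate must be positive")
    return child_rate / adult_rate


# Adults use non-present tense about 31% of the time, so a child who
# also uses it 31% of the time counts as fully adult-like.
print(success_level(31, 31))  # 1.0
```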

18
Proportion of non-present and non-3sg verbs

19
Proportion of non-finite root forms

20
A model to predict the percentages
  • Stage 3b (first stage)
  • no agreement
  • about 1/3 NRFs (non-finite root forms), 2/3
    tensed forms
  • (Ranking diagram over *F2, *F, ParseT, ParseA)

21
A model to predict the percentages
  • Stage 4b (second stage)
  • non-3sg agreement and non-present tense each
    about 15% (about 40% agreeing, 50% tensed)
  • about 20% NRFs
  • (Ranking diagram over *F2, *F, ParseT, ParseA)

22
A model to predict the percentages
  • Stage 4c (third stage)
  • everything appears to have tense and agreement
    (adult-like levels)
  • (Ranking diagram over *F2, *F, ParseT, ParseA)

23
Predicted vs. observed: tense

24
Predicted vs. observed: agreement

25
Predicted vs. observed: NRFs

26
Various things (homework)
  • Is the OT model just proposed a
    structure-building or full competence model?
  • How does the OT model fit in the overall big
    picture with the ATOM model?

27
Various things (homework)
  • For French, we assumed that NRFs appear when both
    TP and AgrP are missing. Yet Schütze & Wexler
    (1996) claimed that root infinitives appear when
    either TP or AgrP is missing.
  • Which one is it?

28
French v. English
  • English T+Agr is pronounced like:
  • /s/ if we have the features [3, sg, present]
  • /ed/ if we have the feature [past]
  • /Ø/ otherwise
  • French T+Agr is pronounced like:
  • danser: NRF
  • a dansé: (3sg) past
  • je danse: 1sg (present)
  • j'ai dansé: 1sg past
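The English rules above amount to an ordered list of spell-out rules with an "otherwise" case. A minimal sketch, assuming features are represented as a Python set of strings (a representation chosen for the example, as is the function name `english_t_agr`):

```python
def english_t_agr(features):
    """Spell out English T+Agr from a set of features.

    Rules are tried in order; the last one is the elsewhere case.
    """
    if {"3", "sg", "present"} <= features:
        return "s"
    if "past" in features:
        return "ed"
    return ""  # the null ending, /Ø/


print(repr(english_t_agr({"3", "sg", "present"})))  # 's'
print(repr(english_t_agr({"1", "sg", "past"})))     # 'ed'
print(repr(english_t_agr({"1", "sg", "present"})))  # ''
```

Note that a structure missing its agreement features, such as `{"present"}`, also falls through to the null ending, so the ending alone does not show which features are present.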

30
What we're doing
  • The driver who my neighbor who I trust suggested
    took me to the airport.
  • The driver who my neighbor who my boss trusts
    suggested took me to the airport.
  • Overarching hypothesis: Sentence difficulty has
    to do with holding onto several unsatisfied
    dependencies. Longer ones are harder to hold.
  • Question: What measures length?
  • Hypothesis: New referents.

31
How do we see if that's right?
  • Center-embedded sentences are the most taxing:
    several dependencies have been started, and the
    center-most element is triple-counted.
  • The driver who my neighbor who I trust ...
  • That's the most sensitive point; it seems to be
    near the critical point of processability.

32
Experimenting
  • Does it matter whether we have a known referent
    (I, you) or a new referent (my neighbor)?
  • To know for sure, we try holding everything
    constant except the most embedded subject and see
    if there are differences (which can then be
    attributed to the only thing that's different,
    the properties of the most embedded subject).

33
Building the items
  • The driver who my neighbor who I trust suggested
    took me to the airport.
  • The driver who my neighbor who John trusts
    suggested took me to the airport.
  • The driver who my neighbor who the housekeeper
    trusts suggested took me to the airport.
  • The driver who my neighbor who they trust
    suggested took me to the airport.

34
Planning the experiment
  • Each set of four sentences constitutes a token
    set (a.k.a. item).
  • Each item has four conditions (1st/2nd-person
    pronoun, name, definite description, 3rd-person
    pronoun).
  • Counterbalancing rules:
  • Each subject will judge no more than one sentence
    from each token set.
  • Each subject will judge all conditions and will
    see equal numbers of sentences from each
    condition.
  • Every sentence in every token set will be judged
    by some subject.

35
Trial lists
  • We have four conditions, so we need:
  • Four different scripts (versions of the lists)
  • Some multiple of four token sets
  • E.g., items 1-4, each with conds a-d:
  • Subj W: 1a, 2b, 3c, 4d (script 1)
  • Subj X: 1b, 2c, 3d, 4a (script 2)
  • Subj Y: 1c, 2d, 3a, 4b (script 3)
  • Subj Z: 1d, 2a, 3b, 4c (script 4)
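The rotation shown in the four lists above is a Latin square: each script shifts the condition assigned to every item by one. A minimal sketch that reproduces those lists (the function name `latin_square_lists` is invented for the example):

```python
def latin_square_lists(n_items, conditions):
    """Build one trial list per script by rotating conditions over items.

    Script s (counting from 0) gives item i the condition at position
    (i - 1 + s) modulo the number of conditions.
    """
    k = len(conditions)
    lists = []
    for script in range(k):
        trial_list = [
            f"{item}{conditions[(item - 1 + script) % k]}"
            for item in range(1, n_items + 1)
        ]
        lists.append(trial_list)
    return lists


for number, trial_list in enumerate(latin_square_lists(4, "abcd"), start=1):
    print(f"script {number}:", ", ".join(trial_list))
# script 1: 1a, 2b, 3c, 4d
# script 2: 1b, 2c, 3d, 4a
# script 3: 1c, 2d, 3a, 4b
# script 4: 1d, 2a, 3b, 4c
```

As long as the number of items is a multiple of the number of conditions, every list contains each item once and each condition equally often, which satisfies the counterbalancing rules.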

36
Our experiment
  • We will have 20 items (picked from the ones you
    submitted) and 20 fillers.
  • (Note: That's on the small side for a real
    experiment.)
  • Next steps:
  • Create the lists of test sentences for the four
    different scripts.
  • Spec out and pseudocode our experiment
  • Investigate PsyScript
  • Run the experiment
  • Deal with the data

37
Creating the scripts
  • Our sentences are made of very predictable
    components:
  • The X who/that the Y who/that Z VP1 VP2 VP3
  • The only thing that changes across conditions is
    Z, while the rest changes across token sets.
  • We can use Excel to build these from their
    pieces, to avoid unnecessary errors.
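The same assembly can be sketched outside Excel. This is an illustration only: the dictionary keys follow the component names on the worksheet, and the function name `build_item` is invented. In the example items the innermost verb agrees with the innermost subject ("I trust" vs. "John trusts"), so the sketch stores each condition's subject together with its verb form.

```python
def build_item(frame, innermost):
    """Assemble the eight regions of each condition for one token set.

    frame     -- the parts shared by all conditions of the item
    innermost -- maps a condition label to (innermost subject, its verb)
    """
    sentences = {}
    for cond, (subj3, vp3) in innermost.items():
        sentences[cond] = [
            frame["subj1"], frame["rel1"],
            frame["subj2"], frame["rel2"],
            subj3, vp3,
            frame["vp2"], frame["vp1"],
        ]
    return sentences


FRAME = {
    "subj1": "The driver", "rel1": "who",
    "subj2": "my neighbor", "rel2": "who",
    "vp2": "suggested", "vp1": "took me to the airport.",
}
INNERMOST = {
    "a": ("I", "trust"),
    "b": ("John", "trusts"),
    "c": ("the housekeeper", "trusts"),
    "d": ("they", "trust"),
}

for cond, regions in build_item(FRAME, INNERMOST).items():
    print(cond, " ".join(regions))
```

Building every sentence from one shared frame guarantees that the four versions of an item differ only in the innermost clause.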

38
Worksheets
  • Components
  • Subj1
  • Rel1
  • Subj2
  • Rel2
  • Subj3a
  • Subj3b
  • Subj3c
  • VP3
  • VP2
  • VP1
  • Answer
  • Question
  • Fillers
  • Question
  • Answer
  • Regions
  • The way I've set it up, everything needs to be
    exactly 8 regions long (even the fillers).

39
Worksheets
  • Constructed
  • Computes item (token group) and condition based
    on row number, and comes up with a code like I5V2
    (fifth token group, version 2). Builds the
    sentence region by region based on the condition
    number.
  • Tables
  • Keeps track of what will be on each script.
  • Scripts are divided into blocks, and each block
    has one of each condition and four fillers,
    randomized.
  • Sort column is 2*block plus a random number (to
    order the blocks, but randomly within each block).
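Both spreadsheet tricks on this slide can be sketched in Python. The details are assumptions: the sketch counts rows from 0 with four consecutive rows per token group (the actual worksheet's row offsets are not given), and it takes the random number to lie between 0 and 1 so that the sort key never crosses into another block. The names `item_code` and `order_trials` are invented.

```python
import random


def item_code(index, n_conditions=4):
    """Code such as 'I5V2' from a 0-based row index.

    Assumes rows are grouped so that each token group occupies
    n_conditions consecutive rows.
    """
    item = index // n_conditions + 1
    version = index % n_conditions + 1
    return f"I{item}V{version}"


def order_trials(trials, seed=None):
    """Shuffle trials within blocks while keeping the blocks in order.

    trials -- list of (block number, label) pairs
    The sort key is 2 * block plus a random number in [0, 1).
    """
    rng = random.Random(seed)
    keyed = [(2 * block + rng.random(), block, label) for block, label in trials]
    keyed.sort()
    return [(block, label) for _, block, label in keyed]


print(item_code(17))  # I5V2
print(order_trials([(1, "I1V1"), (1, "F1"), (2, "I2V2"), (2, "F2")], seed=1))
```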

40
Worksheets
  • Script
  • The master script sheet
  • This generates a script based on the columns you
    put into I1 and J1. (The columns refer to the
    tables sheet, where the item and condition
    numbers will be found.)
  • B and C for script 1
  • D and E for script 2
  • F and G for script 3
  • H and I for script 4
  • Script a through script d
  • Actual scripts.
  • Select the part of the script sheet that has data
    (A1:O41) and copy.
  • Go to the script a sheet.
  • Paste Special and choose Values (so we don't copy
    formulas, only results).
  • Delete columns B-D (item, cond, row), select rows
    2-41, hit the sort button, then delete column A
    (sort) and row 1 (labels).
  • Save as tab-delimited text.

41
The scripts are ready
  • So, we have the data that we're going to use.
  • The next thing is to figure out how we're going
    to test these.
  • The goal is to test reading time on each region
    of the sentence by presenting the sentence region
    by region.

42
Thinking through the experiment
  • What do we want to have happen?
  • Display some instructions
  • Do some practice trials
  • Display "practice is over" message
  • Do some real trials
  • Display "thanks!"
  • The trials:
  • Show fully obscured sentence, wait for a key
  • Reveal next word, wait for a key, until done
  • Ask question, wait for response
  • Give sound feedback about correctness
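The sequence of displays within a trial can be sketched as below. This is not the PsyScript implementation; it only computes what would be on screen at each key press. It assumes a moving-window display in which regions already read are masked again, which the outline above does not state explicitly, and the names `mask` and `moving_window_frames` are invented.

```python
def mask(text):
    """Replace every non-space character with a dash."""
    return "".join(" " if ch == " " else "-" for ch in text)


def moving_window_frames(regions):
    """List the successive displays for one sentence.

    The first frame is fully obscured; each later frame reveals exactly
    one region and masks all the others.
    """
    frames = [" ".join(mask(r) for r in regions)]
    for shown in range(len(regions)):
        frames.append(
            " ".join(r if i == shown else mask(r) for i, r in enumerate(regions))
        )
    return frames


for frame in moving_window_frames(["The driver", "who", "I", "trust"]):
    print(frame)
# --- ------ --- - -----
# The driver --- - -----
# --- ------ who - -----
# --- ------ --- I -----
# --- ------ --- - trust
```

The time between consecutive key presses is then the reading time for the region that was visible during that interval.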

43
PsyScript
  • To do this, we'll use PsyScript, an environment
    for creating psychology experiments on the Mac.
  • (It's basically the only freely available
    software of this type that has promise for
    working in the future. If PsyScope had not
    become commercial as E-Prime, we'd be learning
    that instead.)

44
AppleScript
  • The underlying machinery behind PsyScript is
    something called AppleScript.
  • This has been part of the Mac OS for about the
    past 10 years, although it is gaining power and
    popularity recently.
  • AppleScript is a means by which you can tell
    other programs what to do.
  • For example, tell Internet Explorer to go to a
    particular web page, tell Word to create a new
    document and type the date, and so on.
  • Until you have an actual need for this, it
    doesn't seem very exciting.

45
AppleScript
  • AppleScript is a sophisticated high-level
    programming language designed to be human
    readable (and kind of human writable). It's
    supposed to look a lot like English.
  • PsyScript itself is an application that can be
    bossed around by AppleScript, and has the
    features that are useful in psycholinguistic
    experiments, such as timing, drawing, input, data
    recording functions.

46
Getting started
  • To write (and use) AppleScript, we use Script
    Editor.
  • Easiest way to do this: find the "end experiment"
    script and double-click on it.
  • tell application "PsyScript"
  • end experiment
  • end tell

47
Note about PsyScript
  • PsyScript runs faster from the Script Editor
  • If you run PsyScript from the Script Editor you
    have to manually tell it where your script is.
  • To do this, find the line that says "tell
    fileHelper to setContainer" and change the thing
    in parentheses to what you see when you
    Command-click on the name of the script in the
    title bar of the Script Editor window, listed
    bottom to top, each separated by a colon (:), and
    not including the actual name of the script. E.g.,
  • setContainer("Station 5:Desktop Folder:PsyScript")

48
Movingwindow
  • I wrote a script called movingwindow to do what
    we're going to do today.
  • The stimuli and instructions files are in a
    folder called resources, in the same folder as
    the script. The names of these files are set
    at the top of the script; in mine, they are:
  • Mwstimuli.txt: sentence list as exported from
    Excel (tab-delimited text, exporting e.g. script
    a)
  • Mwpractice.txt: sentence list for the practice
    items
  • Mwinstruc.txt: initial instructions
  • Mwready.txt: post-practice instructions
  • Mwthanks.txt: end-of-experiment debriefing
  • Results are stored in the results folder.

49
Sentence lists
  • To generate the sentence lists in the right
    format for movingwindow, go to one of the script
    a-d pages, do Save As from Excel, and choose
    tab-delimited text.
  • Columns should be: code, question, answer,
    sentence (in eight columns).
  • The end results will come out in a file that you
    can load back into Excel (a tab-delimited file).
  • Columns are: code, region number, time for
    region, correct answer (1/0), text of region.
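A results file in that five-column layout can also be summarized without Excel. The sketch below assumes there is no header row, that the time column is numeric, and that the sample rows (which are made up for the illustration, not real data) stand in for an actual results file. The function name `mean_time_by_region` is invented.

```python
import csv
import io
from collections import defaultdict


def mean_time_by_region(tsv_text):
    """Average the time column for each region number.

    Expects tab-delimited rows of:
    code, region number, time for region, correct (1/0), text of region
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        code, region, time, correct, text = row[:5]
        sums[int(region)] += float(time)
        counts[int(region)] += 1
    return {region: sums[region] / counts[region] for region in sorted(sums)}


# Hypothetical rows, for illustration only.
SAMPLE = (
    "I5V2\t1\t400\t1\tThe driver\n"
    "I5V2\t2\t300\t1\twho\n"
    "I6V1\t1\t600\t1\tThe cook\n"
)
print(mean_time_by_region(SAMPLE))  # {1: 500.0, 2: 300.0}
```

Splitting the averages further by the version part of the code (V1 to V4) would give the per-condition reading times that the experiment is designed to compare.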
